OpenAI Launches GPT-4, a Multimodal AI with Image Support


ChatGPT is all anyone can talk about lately. Powered by the GPT-3 and GPT-3.5 (for Plus subscribers) language models, the AI chatbot has grown by leaps and bounds in what it can do. However, a lot of people have been waiting with bated breath for an upgraded model that pushes the envelope. Well, OpenAI has now made that a reality with GPT-4, its latest multimodal LLM, which comes packed with improvements over its predecessor. Check out all the details below!

GPT-4 Is Multimodal and Outperforms 3.5

The newly announced GPT-4 model by OpenAI is a big deal in artificial intelligence. The most important thing to mention is that GPT-4 is a large multimodal model. This means that it can accept both image and text inputs, while its outputs are text. OpenAI mentions that even though the new model is less capable than humans in many real-world scenarios, it can still exhibit human-level performance on various professional and academic benchmarks.

GPT-4 is also deemed to be a more reliable, creative, and efficient model than its predecessor, GPT-3.5. For instance, the new model passed a simulated bar exam with a score around the top 10% of test takers (roughly the 90th percentile), while GPT-3.5 scored around the bottom 10%. GPT-4 is also capable of handling more nuanced instructions than the 3.5 model. OpenAI compared both models across a variety of benchmarks and exams, and GPT-4 came out on top.

GPT-4 and Visual Inputs

As mentioned above, the new model can accept prompts containing both text and images. Compared to a text-only input, this lets GPT-4 work with material that mixes the two. Its capabilities remain consistent across a range of inputs, including documents with text and photos, diagrams, and even screenshots.

[Image: GPT-4 multimodal example]

OpenAI showcased this by feeding GPT-4 an image along with a text prompt asking it to describe what’s funny about the image. As seen above, the model was able to read an image taken from Reddit, respond to the user’s prompt, and identify the humorous element. However, GPT-4’s image inputs are not yet publicly available and remain a research preview.

Prone to Hallucination and Limited Data

While GPT-4 is a sizeable leap from its previous iteration, some problems still exist. For starters, OpenAI mentions that it is still not fully reliable and is prone to hallucination. This means that the AI can make up facts and commit reasoning errors, so its outputs should be used with great care and reviewed by a human where the stakes are high. It might also be confidently wrong in its predictions, which can lead to errors. However, GPT-4 does reduce hallucination compared to previous models. To be specific, the new model scores 40% higher than GPT-3.5 on the company’s internal adversarial factuality evaluations.

Another downside that many were hoping would be fixed with GPT-4 is the limited dataset. Unfortunately, GPT-4 still lacks knowledge of events that occurred after September 2021, which is disappointing. It also does not learn from its experience, and it can still make the simple reasoning errors mentioned above. Moreover, just like humans, GPT-4 can fail at hard problems, such as introducing security vulnerabilities into the code it produces. The knowledge cutoff is less of a worry if you use Microsoft Bing AI, which runs on the GPT-4 model. Yeah, you can try out the new AI model with the backing of real-time internet data on Bing. Bing AI chat can also be accessed in browsers other than Edge.

Access GPT-4 with ChatGPT Plus

GPT-4 is available to ChatGPT Plus subscribers with a usage cap. OpenAI mentions that it will adjust the exact usage cap depending on demand and system performance. Furthermore, the company might even introduce a ‘new subscription tier’ for higher-volume GPT-4 usage. Free users, on the other hand, will have to wait, as the company hasn’t mentioned any specific plans and only ‘hopes’ that it can offer some amount of free GPT-4 queries to those without a subscription.

From the looks of it, GPT-4 will shape up to be an extremely appealing language model even with some chinks in its armor. For those looking for even more detailed information, we already have something in the works. So stay tuned for more.
