6 Cool Things ChatGPT 4o Can Do That OpenAI Didn’t Highlight

Updated: May 14, 2024

OpenAI recently released its next flagship model, GPT-4o, and showed off some impressive demos. The human-like voice chat has become the headline feature, but there is more to it. OpenAI didn’t highlight many of the cool things ChatGPT 4o is capable of. The details are available on OpenAI’s page, and I went through all of them. On that note, let’s look at the cool new capabilities of ChatGPT 4o.

1. Accurate Text Generation in Images

We know that diffusion models struggle to generate text in images. DALL-E 3 still fails to render the given text reliably. However, GPT-4o, which is an end-to-end multimodal model, can render text accurately. OpenAI didn’t mention this in the presentation, but you can find examples on OpenAI’s page where the company explores the model’s capabilities.

GPT-4o text rendering capability in image generation (Image Courtesy: OpenAI)

It can generate and add text to images effortlessly, and the consistency across many samples is remarkable. You can also attach images and ask it to generate the same character from different angles, and it maintains consistency across all scenarios. It can also generate views of an object that you can combine to create a 3D render. Not to mention, it can generate fonts too.

Image Courtesy: OpenAI

Keep in mind that these capabilities are not available in ChatGPT yet. It still uses DALL-E 3 to generate images. OpenAI may unlock these features in the near future.

2. GPT-4o Can Process Videos Too

Image Courtesy: OpenAI

OpenAI didn’t mention in the presentation that GPT-4o can handle videos too. On the model page, however, OpenAI demonstrates that you can upload a video and ask GPT-4o to summarize it. From transcription to a bullet-point summary, it does everything. So it seems Gemini 1.5 Pro is not the only model that can process videos.

3. GPT-4o Can Be Your Tutor

In a presentation with Khan Academy’s Sal Khan, OpenAI showcased a fascinating demo using the GPT-4o model. Basically, on an iPad, you can share your screen with ChatGPT 4o, and it can see everything on your screen.

Did you hear? @OpenAI's newest model can reason across audio, vision, and text in real time.

How does GPT-4o do with math tutoring?🤔@salkhanacademy and his son test it on a Khan Academy math problem.

You can get AI-powered math tutoring right now with Khanmigo:… pic.twitter.com/8NXoh0SwtU (Khan Academy, @khanacademy, May 13, 2024)

You can now ask it to explain and help you find solutions to a problem. Be it mathematics, sciences, charts, maps, or anything else, ChatGPT 4o will be your personal teacher guiding you throughout your study session. That’s such a great application of AI, powered by GPT-4o’s multimodal vision capability. By the way, it also works with the ChatGPT desktop app for macOS.

4. ChatGPT 4o Can Be Your Meeting Companion

Meeting AI with GPT-4o pic.twitter.com/rHkQ316MYj (OpenAI, @OpenAI, May 13, 2024)

In one of the demos, OpenAI showed that you can have ChatGPT 4o as a live companion during meetings. You can share your screen with ChatGPT 4o, and it can see and hear all the participants. It can offer input, and participants can ask it questions. It replies spontaneously and stays engaged in the conversation. At the end, you can ask it to summarize the meeting as well. How cool is that?

5. Improved Non-English Language Performance

OpenAI has improved GPT-4o’s performance not just in English but also in regional languages. It has significantly improved the tokenizer, which now represents non-English text in far fewer tokens, so more text fits within the same token limit.

Image Courtesy: OpenAI

To give some examples, Gujarati takes 4.4x fewer tokens, Hindi 2.9x fewer, Telugu 3.5x fewer, Urdu 2.5x fewer, Russian 1.7x fewer, and so on. Basically, for regional languages, ChatGPT 4o has become even more powerful.
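To make those ratios concrete, here is a minimal Python sketch that estimates how many tokens a piece of text might need with the new tokenizer. The ratios are the ones quoted above; the 1,000-token baseline and the `estimated_tokens` helper are hypothetical and only there for illustration, not a measurement of any real text.

```python
# Token reduction ratios for GPT-4o's tokenizer, as quoted in this article.
REDUCTION = {
    "Gujarati": 4.4,
    "Hindi": 2.9,
    "Telugu": 3.5,
    "Urdu": 2.5,
    "Russian": 1.7,
}


def estimated_tokens(old_tokens: int, ratio: float) -> int:
    """Estimate the new token count for text that used to need old_tokens.

    "N times fewer tokens" is read here as old_tokens / N (an assumption).
    """
    return round(old_tokens / ratio)


if __name__ == "__main__":
    baseline = 1000  # hypothetical token count under the older tokenizer
    for language, ratio in REDUCTION.items():
        new_count = estimated_tokens(baseline, ratio)
        print(f"{language}: {baseline} tokens -> about {new_count} tokens")
```

Under these assumptions, a Gujarati prompt that previously needed 1,000 tokens would need roughly 227, which leaves more room in the context window for the same text.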

6. ChatGPT 4o Beats All Other AI Models

OpenAI didn’t discuss benchmark numbers and focused on delivering new experiences. However, ChatGPT 4o’s benchmark numbers overshadow those of other AI models from Google, Anthropic, Meta, and others. In fact, it performs better than OpenAI’s own GPT-4 Turbo model, which was released a few months back.

Image Courtesy: OpenAI

From MMLU to HumanEval, GPQA, and DROP, ChatGPT 4o outranks both proprietary and open-source models. In the LMSYS Chatbot Arena too, the mysterious im-also-a-good-gpt2-chatbot model (which is actually the ChatGPT 4o model) got an overall Elo score of 1310, much higher than other AI models.
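For a rough sense of what an Elo gap means, here is a small sketch using the classic Elo expected-score formula with the usual 400-point scale. The 1310 figure is from the article; the 1250 opponent rating is a hypothetical number picked purely for illustration, and the leaderboard’s own rating method may differ from this textbook formula.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Classic Elo expected score of player A against player B.

    Uses the standard logistic curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    gpt4o_rating = 1310     # Elo score quoted in the article
    opponent_rating = 1250  # hypothetical rating, for illustration only
    share = expected_score(gpt4o_rating, opponent_rating)
    print(f"Expected share of head-to-head wins: {share:.1%}")
```

Under these assumptions, a 60-point lead works out to being preferred in a little under 59% of head-to-head comparisons, so a visible gap on the leaderboard still leaves plenty of individual matchups going the other way.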

Arjun Sha
