OpenAI has introduced a pathbreaking vision capability (GPT-4V) in ChatGPT. You can now upload and analyze images within ChatGPT. It had already received powerful features like Code Interpreter and the ability to connect to the internet on ChatGPT in the past. And with the new “Chat with images” feature, ChatGPT has become even more versatile and useful for users. Essentially, the GPT-4 model can now see, hear, and even speak with remarkable ease. So if you want to try ChatGPT’s new image analysis feature, follow our tutorial below.
Note: To use ChatGPT’s new image analysis feature, you must be subscribed to ChatGPT Plus, which costs $20 per month. Its knowledge cut-off is September 2021, the same as GPT-4.
Use ChatGPT’s Image Analysis Feature on the Web
1. Go ahead and open ChatGPT (visit) and log in to your account.
2. Next, move to the “GPT-4” model.
3. Hover your mouse over “GPT-4” and a drop-down menu will appear. Make sure you are in “Default” mode.
4. Now, as shown below, a “Chat with images” option will appear at the bottom left of the message box.
5. Click on the “image” button and upload an image. Now, ask questions to ChatGPT about the image.
6. For example, I uploaded an image of a hard disk and asked it to find the interface name and if I could use an SSD in place. It correctly identified the interface and informed me about the kind of SSD I could use as a replacement.
7. In another instance, I gave it a historical document with illegible handwriting and it did a great job of deciphering the text. It also pointed out the significance of the document in detail. There are many incredible use cases of GPT-4’s vision capability, which you can explore endlessly.
Use ChatGPT’s Image Feature on Android and iOS
The image capabilities of ChatGPT are not limited to the desktop website. You can also use the official ChatGPT app to upload images and ask questions with ease. Here is how it works:
2. Next, sign in with your OpenAI account and move to the “GPT-4” model.
3. Here, you will find a “+” button at the bottom-left corner. Tap on it.
4. You can then tap on the “camera” icon to take a live photo instantly or tap on the “image” icon to upload a photo from your gallery.
5. I took a live photo of a car’s tire and asked ChatGPT to explain the tire replacement process.
6. The GPT-4V model gave clear, step-by-step instructions on how to change the tire along with the tools I will need for this task.
7. Next, I uploaded an image to ChatGPT and asked it to explain the medical report. It recognized the text and correctly explained the findings. That said, do not rely on ChatGPT for medical diagnosis and consult a doctor instead.
So this is how you can use ChatGPT’s image analysis feature, both on your computer and smartphone. I found the GPT-4’s Vision model incredibly powerful and it’s less prone to hallucination, unlike Bard’s image processing capability. In some cases, it failed to identify texts from popular books, most likely due to copyright issues. You can read about the shortcomings of the vision feature in GPT-4V’s technical paper. Nevertheless, the “Chat with images” feature on ChatGPT is remarkable, and you should definitely try it.