While OpenAI has yet to release GPT-4’s much-anticipated multimodal feature, which lets you upload images and ask questions about them, Microsoft has already rolled out early access to image upload in Bing Chat. Yes, you can now upload images to Bing Chat and discuss them with the GPT-4 model. It works just like OpenAI demonstrated during the GPT-4 launch.
With the multimodal feature, Bing Chat has essentially gained vision capabilities, so it can now understand images as well as text. You can use it to study medical reports, get nutritional data about food, solve mathematical questions, and much more. To learn how to use GPT-4’s multimodal capability in Bing Chat, follow along with this tutorial.
1. First, launch Microsoft Edge and open Bing on your computer. You can also install the Bing app (Android and iOS, free) on your smartphone.
2. Next, click on “Chat” in the top-left corner.
3. Once you are here, switch to the “Creative” mode, as it lets you chat with the GPT-4 model for free.
4. Now, you will find an “image” button in the text field below. This will allow you to upload an image and access the GPT-4 multimodal feature.
5. Click on the image button and upload an image file. You can also paste the image URL if you want.
6. I have uploaded a photo of a website layout that I quickly sketched on a piece of paper. Now, let’s ask Bing Chat to create a website like this and generate the HTML and CSS code for it.
7. And well, there you have it. Based on GPT-4, Bing Chat uses its multimodal capabilities to generate the HTML and CSS code right away.
8. After pasting the code into a file and opening it in a browser, here is the website you get. Not bad, right? It read my handwriting correctly, and the layout is similar too. And that’s how GPT-4’s multimodal capability in Bing Chat works.
9. In another example, I uploaded a complex CAD design of a house and asked it several questions, ranging from the quantity of iron needed to design-related queries, and it answered them well.
10. Next, I asked Bing Chat to solve two mathematical questions, and it solved both of them correctly.
11. Finally, to round things off, I uploaded a funny cartoon and asked Bing Chat to explain the joke. This time, it failed to get the joke. Nevertheless, GPT-4’s multimodal feature is powerful, and there are plenty of use cases you can try.
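If you want to try step 8 yourself, here is a minimal Python sketch for saving the code that Bing Chat generates and opening it in your browser. The function names (`save_page`, `preview`), the `bing_site` folder, and the sample markup are all placeholders made up for illustration. They are not the code Bing Chat produced in this tutorial, so swap in whatever you copy from the chat window.

```python
from pathlib import Path
import webbrowser


def save_page(html: str, css: str, out_dir: str = "bing_site") -> Path:
    """Write the pasted HTML and CSS into out_dir and return the HTML file path."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    # Assumption: the HTML links to a stylesheet named style.css.
    (folder / "style.css").write_text(css, encoding="utf-8")
    index = folder / "index.html"
    index.write_text(html, encoding="utf-8")
    return index


def preview(index: Path) -> bool:
    """Open the saved page in the default browser."""
    return webbrowser.open(index.resolve().as_uri())


# Placeholder markup, NOT the code Bing Chat generated in the article.
SAMPLE_HTML = """<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>My Website</title>
  <link rel="stylesheet" href="style.css">
</head>
<body>
  <h1>My Website</h1>
</body>
</html>
"""
SAMPLE_CSS = "h1 { text-align: center; }\n"
```

To see the result, call `preview(save_page(SAMPLE_HTML, SAMPLE_CSS))`. If Bing Chat puts the CSS inside a `<style>` tag instead of a separate file, pass an empty string for `css`.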