Google has been quietly improving Bard and adding new features every few weeks, bringing its capabilities up to par with ChatGPT. Now, the company has added the ability to upload images to Bard, extending the experience beyond text. Make no mistake, Google Bard is still a text-only large language model. However, the search giant has integrated Google Lens, reverse image search, and a few Visual Question Answering (VQA) systems to make Bard feel like a multimodal model. Even so, Bard’s current vision capability is somewhat surprising, and we have tested it below to see what it can do. On that note, let’s take a look at some cool examples of image uploads in Google Bard.
1. Extract Text from Images
The most useful application of Bard’s image handling is text extraction. Click the (+) button to upload an image, and Bard automatically performs OCR and grabs the text with good accuracy. That said, despite Bard’s long list of supported languages, the OCR functionality currently works only for English. I tried multiple international and regional languages, but it failed to grab text from scanned images in them. Nevertheless, for quick text extraction from images, Bard can be very helpful.
2. Extract Tables with Formatting Intact
We all struggle when we have to extract tables from scanned images or documents. However, Google Bard can effortlessly extract tables with the formatting intact. In fact, you can export the table to Google Sheets as well and do further editing or data crunching. How cool is that? Having said that, Bard currently hallucinates a lot, and in some cases, it fills cells with the wrong data, so make sure to verify the table against the original image before exporting it.
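Part of that verification can be automated. Below is a minimal Python sketch, assuming you have saved the extracted table as CSV text (for instance, via the CSV download option in Google Sheets). The function name `find_table_issues` and the sample data are hypothetical and only for illustration. It flags rows with missing cells and non-numeric values in columns that should hold numbers.

```python
import csv
import io

def find_table_issues(csv_text, numeric_columns):
    """Return a list of human-readable problems found in an extracted table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["table is empty"]
    header, body = rows[0], rows[1:]
    issues = []
    # Only check numeric columns that actually exist in the header.
    indices = {}
    for name in numeric_columns:
        if name in header:
            indices[name] = header.index(name)
        else:
            issues.append(f"column {name!r} is missing from the header")
    for line_no, row in enumerate(body, start=2):
        if len(row) != len(header):
            issues.append(f"row {line_no}: expected {len(header)} cells, got {len(row)}")
            continue
        for name, idx in indices.items():
            cell = row[idx].strip()
            try:
                float(cell)
            except ValueError:
                issues.append(f"row {line_no}: {name!r} is not a number: {cell!r}")
    return issues

# Hypothetical sample table, not real Bard output
sample = "Item,Quantity\nPens,12\nNotebooks,twelve\nErasers\n"
for problem in find_table_issues(sample, ["Quantity"]):
    print(problem)
```

A check like this only catches structural problems. A cell that holds a plausible but wrong number will pass, so you still need to compare the values against the source image.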
3. Generate Code for Websites/Apps Using Mockups
To showcase GPT-4’s multimodal capability, OpenAI demonstrated in March 2023 how the model could read a sketch scribbled on a piece of paper and quickly turn it into a working website. While the multimodal feature is yet to come to GPT-4, Google Bard is already able to generate code that matches a mockup. Keep in mind that Bard is not a multimodal model but uses image segmentation via Google Lens to understand the image. Nonetheless, Bard surprised us with its results.
I uploaded a screenshot of the Facebook landing page, and it quickly generated HTML and CSS code that looked somewhat similar. I also uploaded an image of a simple website that I drew on paper, and Google Bard did a good enough job of recreating it. You can use the same method to recreate UIs for smartphone apps and other websites.
4. Google Bard Can Explain Images
Google Bard is good at explaining images and summarizing what is going on in them. You can upload obscure images, and it can produce reliable information quickly. I uploaded a low-quality image of a biological mechanism, and it correctly identified it as cell mitosis. It then explained the process step by step.
In another example, I uploaded a chart, and it correctly understood the image and explained the data. It even created a table of the data points so that I could work on it in Google Sheets. Particularly for students, Bard can be helpful in understanding concepts in science and other topics. You can simply upload an image and ask Bard about it.
5. Get Nutritional Information from Images
Using Bard’s image-handling capability, you can get the nutritional values of food. Simply upload an image of the food on your plate, and it will return calorie information within seconds. This can be immensely helpful for people who are on a regulated diet.
In my testing, it couldn’t gauge the portion size but gave per-serving examples so that you could calculate the total calorie intake yourself. It seems Google is using image segmentation to categorize food items and come up with nutritional information.
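Since Bard leaves the portion size to you, the final arithmetic is easy to do yourself. Here is a minimal Python sketch of that calculation. The function name `total_calories` and the per-100 g figures are hypothetical placeholders for illustration, not values returned by Bard, so substitute the numbers from Bard’s response or a nutrition label.

```python
def total_calories(portions_g, kcal_per_100g):
    """Sum calories for a plate from portion weights (grams) and kcal per 100 g."""
    total = 0.0
    for item, grams in portions_g.items():
        if item not in kcal_per_100g:
            raise KeyError(f"No calorie figure for {item!r}")
        total += grams * kcal_per_100g[item] / 100.0
    return total

# Hypothetical placeholder values, not real nutrition data
kcal_per_100g = {"rice": 130.0, "lentils": 110.0}
portions_g = {"rice": 200.0, "lentils": 150.0}

print(total_calories(portions_g, kcal_per_100g))  # prints 425.0
```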
6. Improvise Food Recipes
Another excellent use case is to upload an image of raw ingredients and ask Google Bard to come up with recipes. You can also upload images of the food items in your refrigerator, and it will effortlessly create personalized recipes for you. Furthermore, you can ask Bard for particular cuisines from various parts of the world. And if you are on a diet, you can ask Google Bard to create fat-free, low-calorie recipes that still keep you full.
7. Solve Mathematical Questions
You can use Google Bard to solve mathematical questions as well. Upload an image of your math problems to Bard, and it will try to solve them for you. In my testing, Bard’s approach was right, but due to notation issues, it consistently came up with wrong answers. I think it will require an update to its vision system to make Bard more suitable for handling mathematical notation and questions.
8. Explain Memes and Jokes
Google Bard can also explain memes and jokes. You can upload images of funny memes and cartoons and ask Bard what is funny about them, and it will provide its own interpretation. I uploaded the same image that OpenAI demonstrated during the GPT-4 unveiling, and Bard rightly understood the hilarious absurdity behind the image.
In another instance, I uploaded a cartoon from The New Yorker to Google Bard and asked it to explain the joke. However, this time, it simply explained the scene and couldn’t tell why the image was funny. It entirely missed the email phrase that is commonly used in workplaces. I suggest you try Google Bard yourself and check if it’s intelligent enough to understand wit and humor.
9. Translate Equations to LaTeX
It’s no secret that many people find it hard to write in LaTeX and prefer to use word processors. However, for scientific research papers and academic writing, LaTeX is often required for adding complex equations and high-quality typesetting. In such a scenario, Google Bard can be helpful. You can upload images of equations, and Bard can translate them to LaTeX code. That’s amazing, right? So, go ahead and translate your equations to LaTeX code in no time.
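For illustration, this is the kind of LaTeX source you would expect back for a photographed quadratic formula. It is a hand-written example, not an actual Bard response, so always compare the rendered output against your original image.

```latex
\[
  x = \frac{-b \pm \sqrt{b^{2} - 4ac}}{2a}
\]
```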
10. Upload Medical Reports and Ask Questions
Finally, you can upload images or scans of your medical reports to Google Bard. You can then ask medical questions based on them. Some physicians on Twitter have shown that Bard is quite decent for differential diagnosis. It can also help users understand their health and make sense of medical reports.
That said, do keep in mind that Google Bard runs on a general-purpose LLM called PaLM 2. The search giant has developed a separate medical-domain model, Med-PaLM 2, which is quite accurate and advanced, but it’s not available to general users yet. So I recommend that users stay away from any kind of self-diagnosis using Bard and consult a doctor instead. And finally, if you upload your personal medical reports to Bard, make sure to delete those Bard chats to protect your privacy.