Ever since researchers started working on artificial intelligence (AI), accurate image captioning has been one of the field's long-standing goals. Many companies are investing heavily in AI to build better products. Now, Microsoft has come up with a new AI system that it says can caption and describe images as accurately as humans can, and in some cases more accurately.
The Redmond giant recently announced this breakthrough in an official blog post. Although image captioning is one of the hardest tasks for an AI system to master, Microsoft says that its new “Enhanced Image Captioning” AI describes images as well as humans do. The company expects the breakthrough to improve its products and services.
Image Captioning at Its Best
Now, automatic image captioning might not sound like an important feat, but believe me, it is. This nifty technology helps users access the content of an image, whether it sits in your gallery or somewhere in a 5-page document. For instance, when you search for “dog” in your image gallery, the app uses its image recognition capabilities to find every picture that has a dog in it and returns only those as your results. This is one of the many tasks that require a system to have strong image recognition capabilities.
Microsoft’s new model generates far better captions for images than its predecessors did. And these captions are, indeed, similar to what a human would write to describe the images.
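The gallery search described above can be sketched in a few lines. This is only an illustration, not how any particular gallery app works: it assumes a recognition step has already produced a list of labels for each photo, and the file names, labels, and function names (`build_label_index`, `search_gallery`) are all hypothetical.

```python
from collections import defaultdict


def build_label_index(photo_labels):
    """Map each lowercase label to the set of photos that carry it."""
    index = defaultdict(set)
    for photo, labels in photo_labels.items():
        for label in labels:
            index[label.lower()].add(photo)
    return index


def search_gallery(index, query):
    """Return the photos tagged with the query term, sorted by file name."""
    return sorted(index.get(query.strip().lower(), set()))


# Hypothetical output of an image recognition step: file name -> labels.
photo_labels = {
    "beach.jpg": ["sea", "sand", "dog"],
    "park.jpg": ["Dog", "grass", "ball"],
    "dinner.jpg": ["table", "food"],
}

index = build_label_index(photo_labels)
print(search_gallery(index, "dog"))
```

The better the recognition step is at labeling and describing photos, the more useful a simple lookup like this becomes.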
So, the new “Enhanced Image Captioning” AI is much more accurate in its description of an image than before. Moreover, this new model can even recognize the context of an image.
In one example image, the previous system gave a vague description without saying what the players were doing. The new model, however, recognizes that the players are celebrating and that they are actually football players, not baseball players!
Accessibility: For the Visually Impaired
Now, this image captioning capability is useful for all users, but the people for whom this technology matters most are those who are blind or have low vision. They rely on screen readers and spoken descriptions when navigating computer systems. So, image captioning helps them browse social media or messages more easily.
“The use of image captioning to generate a photo description, known as alt text, in a web page or document is especially important for people who are blind or have low vision,” said Saqib Shaikh, a software engineering manager at Microsoft’s AI division in Redmond.
As a result, the Windows maker is now integrating this new image captioning AI system into its talking-camera app, Seeing AI, which is made especially for the visually impaired. This app uses the image captioning capabilities of the AI to describe pictures on users’ mobile devices, and even in social media profiles.
Apart from integrating it into the Seeing AI app, Microsoft has also made the new AI system available to Azure AI customers. It is now part of the Computer Vision service in Azure Cognitive Services, so developers can use its capabilities in their own apps and services.
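For developers curious what calling the service might look like, here is a rough sketch using only Python's standard library. It assumes the REST "describe" operation of Computer Vision, with the `/vision/v3.1/describe` path, the `Ocp-Apim-Subscription-Key` header, and a response containing `description.captions`. The API version, the endpoint and key you pass in, and the sample response values are assumptions for illustration, so check the current Azure documentation before relying on any of them.

```python
import json
import urllib.request


def build_describe_request(endpoint, key, image_url, max_candidates=1):
    """Build a POST request for the Computer Vision 'describe' operation.

    The URL path and API version below are assumptions and may differ
    for your Azure resource.
    """
    url = f"{endpoint.rstrip('/')}/vision/v3.1/describe?maxCandidates={max_candidates}"
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def best_caption(response_json):
    """Return (text, confidence) of the highest-confidence caption, or None."""
    captions = response_json.get("description", {}).get("captions", [])
    if not captions:
        return None
    top = max(captions, key=lambda c: c.get("confidence", 0.0))
    return top["text"], top.get("confidence", 0.0)


def describe_image(endpoint, key, image_url):
    """Send the request and return the best caption.

    Needs network access plus a real endpoint and subscription key.
    """
    request = build_describe_request(endpoint, key, image_url)
    with urllib.request.urlopen(request) as response:
        return best_caption(json.load(response))


# Made-up response in the assumed shape, so the parsing can be tried offline.
sample_response = {
    "description": {
        "tags": ["person", "sport"],
        "captions": [
            {"text": "a group of football players celebrating", "confidence": 0.87},
        ],
    }
}
print(best_caption(sample_response))
```

An app could use the returned caption as alt text for the image, which is the accessibility use case Shaikh describes.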
Moreover, the AI image captioning tech will also make its way to Microsoft Office apps, such as Microsoft Word, PowerPoint, and Outlook, later this year.