Microsoft Builds AI That Auto-Captions Images for the Visually Impaired
Microsoft has developed a new image-captioning algorithm that exceeds human accuracy in certain limited tests. The AI system has been used to update Seeing AI, the company’s existing assistant app for the visually impaired, and will soon be incorporated into other Microsoft products such as Word, PowerPoint and Outlook.
Ideally, everyone would include alt text for all images in documents, on the web and in social media, as this enables people who are blind to access the content and participate in the conversation. But, alas, people don’t. So there are several apps that use image captioning as a way to fill in alt text when it’s missing.
Seeing AI was first released in 2017. It uses computer vision to describe the world, as seen through a smartphone camera, for people who are visually impaired. It can identify household items, read and scan text, describe scenes, and even describe your relationship to people. It can also describe images in other apps, including email clients, social media apps and messaging apps like WhatsApp. The new image-captioning algorithm is claimed to be twice as good as its predecessor.
The new algorithm is said to improve the performance of Seeing AI significantly, as it is able not only to identify objects but also to describe the relationships between them more precisely. So the algorithm can look at a picture and say not just what the image contains (e.g. a person, a chair, a guitar) but also how those things are interacting (e.g. a person sitting on a chair playing a guitar).
The new image-captioning AI. Image Credit: Infotechnews
The algorithm, which was described in a pre-print paper published in September, achieved the highest-ever scores on an image-captioning benchmark known as “nocaps.” This is an industry-leading scoreboard for image captioning, though it has its own constraints.
The nocaps benchmark consists of more than 166,000 human-generated captions describing some 15,100 images taken from the Open Images Dataset. These images span a range of different scenarios, from sports to food to holidays and more. However, this benchmark captures only a tiny sliver of the complexity of image captioning as a general task.
This breakthrough in image description improves the quality of alt text on images in Microsoft 365 and makes the visual world more accessible to people who are blind.
Article Thumbnail Credit: https://eminetra.co.nz/microsoft-is-building-image-to-caption-ai-to-help-visually-impaired-colleagues-truly-understand-their-bosss-dislike-of-powerpoint/51227/