I’ve used the Computer Vision API in the past to automatically extract human-readable tags, descriptions, and captions from social media content on platforms like Twitter and Instagram. For example, take the following image:
The Computer Vision API identifies “a woman holding a bottle posing for the camera”:
This information was then used to add further reporting data points for visual content.
A recent announcement that caught my attention introduced further capabilities in the image description feature of the Computer Vision API. You can see an example of this below:
Prior to the new capability, the description generated would have been “a close up of a cat”. Now a more detailed description is generated: “a grey cat with its eyes closed”.
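To make this concrete, here is a minimal sketch of how a description like this can be requested from the Computer Vision REST API’s describe endpoint. The endpoint URL and subscription key below are placeholders, and the small helper that pulls the top caption out of the response is my own illustration, not part of the API:

```python
import json
import urllib.request

# Placeholders: substitute your own Computer Vision resource endpoint and key.
ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
KEY = "<your-subscription-key>"


def describe_image(image_url: str) -> str:
    """Ask the describe endpoint for a caption of the image at image_url."""
    req = urllib.request.Request(
        f"{ENDPOINT}/vision/v3.2/describe?maxCandidates=1&language=en",
        data=json.dumps({"url": image_url}).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": KEY,
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return extract_caption(json.load(resp))


def extract_caption(payload: dict) -> str:
    """Illustrative helper: return the first caption text from a describe response."""
    captions = payload.get("description", {}).get("captions", [])
    return captions[0]["text"] if captions else ""
```

The response JSON contains a `description.captions` array of candidate captions with confidence scores; for the cat example above, the top caption would be the text “a grey cat with its eyes closed”.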
Richer descriptions like these can also help improve accessibility, for example by generating alt text that can be read aloud by screen readers.
You can find out more information about the new capabilities that have shipped in this release of the Computer Vision API here.