Software Architect / Microsoft MVP (AI) and Technical Author

AI, Azure Functions, Cognitive Services, Computer Vision, Social Opinion

How To: Implementing Alt Text in SaaS Using Azure Computer Vision and Azure Functions

Recently I had to find a way to implement alt text in a SaaS application that I built and maintain.

 

There were two main problems I had to solve.

Problem 1: Content Scheduling and Alt Text

The first area where alt text had to be implemented is within a content scheduler.

The content scheduler lets you supply text and upload images.

The content can then be posted to Twitter, LinkedIn, Facebook, or Instagram.

You write it once, and it is automatically cross-posted to the other platforms.

Here you can see the content scheduler in action:

The 07:59 time slot contains sample text and an image. All images are stored in Microsoft Azure using Blob Storage. You can see a list of images here:

Problem 2: Existing Evergreen Content

A second problem is that the content scheduler lets customers repurpose media through what is known as evergreen content. This content is often used to save time or for advertisements.

Evergreen content includes the text to be published and all associated images. These images are also stored in Azure Blob Storage.

 

To summarise, we have two main problems:

  1. The user interface in the content scheduler does not allow customers to supply alt text when scheduling content.
  2. Existing images in evergreen content are not paired with accompanying alt text.

 

Two things are needed:

  1. A way for customers to automatically generate alt text for any content that is scheduled.
  2. A solution which runs in the background and generates alt text for existing evergreen content.

~

Example API Requests and Implementation Details: Creating an Image

Before digging into the solution, it’s useful to see how a typical post is created for one of the integrations and how alt text can be supplied in the form of metadata.

The following endpoint is used to programmatically create tweet media using the Twitter API:

POST https://upload.twitter.com/1.1/media/upload.json?

A response from this endpoint typically resembles the following:

{
    "media_id": 710511363345354753,
    "media_id_string": "710511363345354753",
    "media_key": "3_710511363345354753",
    "size": 11065,
    "expires_after_secs": 86400,
    "image": {
        "image_type": "image/jpeg",
        "w": 800,
        "h": 320
    }
}

 

A key field is media_id. A separate endpoint uses the media_id to attach one or more images to a tweet.
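
As a rough illustration of reading that field, here is a minimal C# sketch that pulls the ID out of the response above using System.Text.Json. The class and method names (MediaUploadResponse, GetMediaId) are my own for illustration and are not part of the Twitter API. It reads media_id_string rather than media_id, which sidesteps precision problems in any client that treats large JSON numbers as floating point.

```csharp
using System.Text.Json;

// Hypothetical helper, named for illustration only.
public static class MediaUploadResponse
{
    // Returns the media_id_string field from a media upload response body.
    public static string GetMediaId(string responseJson)
    {
        using JsonDocument doc = JsonDocument.Parse(responseJson);

        // GetProperty throws KeyNotFoundException if the field is missing.
        return doc.RootElement.GetProperty("media_id_string").GetString()
            ?? throw new JsonException("media_id_string was null.");
    }
}
```

The returned value is what you would pass as media_id when setting the alt text.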

 

Setting ALT Text for Media

To set the alt text for an image, the following endpoint can be used:

POST https://upload.twitter.com/1.1/media/metadata/create.json

To use this endpoint, a JSON object with the structure below is needed:

{
  "media_id":"692797692624265216",
  // image alt text metadata
  "alt_text": {
    "text":"dancing cat" // Must be <= 1000 chars
  }
}

 

This JSON object requires you to send the media ID (generated from the previous endpoint) and gives you the option of supplying alt text.
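
To show how that object might be put together in code, here is a small C# sketch that builds the request body with System.Text.Json. The AltTextMetadata class and BuildPayload method are hypothetical names of mine, and the 1000 character limit is taken from the comment in the JSON above. Truncating the text is just one way to stay under the limit, and it cuts by UTF-16 code units, which is a simplification.

```csharp
using System.Text.Json;

// Hypothetical helper, named for illustration only.
public static class AltTextMetadata
{
    // Upper limit taken from the "Must be <= 1000 chars" comment above.
    private const int MaxAltTextLength = 1000;

    // Builds the JSON body for the media/metadata/create endpoint.
    public static string BuildPayload(string mediaId, string altText)
    {
        // Trim over-long alt text (assumption: truncation is acceptable).
        string text = altText.Length <= MaxAltTextLength
            ? altText
            : altText.Substring(0, MaxAltTextLength);

        // Anonymous types serialise with the property names used here.
        var payload = new { media_id = mediaId, alt_text = new { text } };
        return JsonSerializer.Serialize(payload);
    }
}
```

Calling AltTextMetadata.BuildPayload("692797692624265216", "dancing cat") should produce the same structure as the example object, without the comments.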

~

Using Azure Computer Vision to Generate Alt Text

The latest release of the Azure Computer Vision API can help generate alt text and has support for local or online images.

 

At the time of writing, the API endpoint that can generate alt text is only available in a handful of regions.

Creating the Resource

The service is created like any other service in Microsoft Azure. This is the one that you’ll need:

After creating the service, obtain the endpoint URL and your API key from the Azure Portal:

~

Demo: Using Computer Vision with Azure Functions and Postman

An example is detailed here. In this example, the method GenerateAltText accepts an image URL and returns a caption to use as alt text:

using System;
using System.Threading.Tasks;
using Azure;
using Azure.AI.Vision.Common;
using Azure.AI.Vision.ImageAnalysis;

public static async Task<string> GenerateAltText(string imageUrl)
{
    // The endpoint and key come from the Computer Vision resource in the Azure Portal.
    var serviceOptions = new VisionServiceOptions(
        Environment.GetEnvironmentVariable("VISION_ENDPOINT"),
        new AzureKeyCredential(Environment.GetEnvironmentVariable("VISION_API_KEY")));

    var imageSource = VisionSource.FromUrl(new Uri(imageUrl));

    // Request a caption (used as the alt text) and any text found in the image.
    var analysisOptions = new ImageAnalysisOptions()
    {
        Features = ImageAnalysisFeature.Caption | ImageAnalysisFeature.Text,
        Language = "en"
    };

    using var analyzer = new ImageAnalyzer(serviceOptions, imageSource, analysisOptions);

    var result = await analyzer.AnalyzeAsync();

    if (result != null)
    {
        if (result.Reason == ImageAnalysisResultReason.Analyzed && result.Caption != null)
        {
            return result.Caption.Content;
        }

        if (result.Reason == ImageAnalysisResultReason.Error)
        {
            // The error details are available here if you want to log them.
            var errorDetails = ImageAnalysisErrorDetails.FromResult(result);
        }
    }

    // Fall back to a placeholder when no caption could be generated.
    return "No caption.";
}

This method is called from within an Azure Function that is triggered by a GET request:

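[FunctionName("AltTextProcessor")]
public static async Task<IActionResult> Run(
    [HttpTrigger(AuthorizationLevel.Function, "get", Route = null)] HttpRequest req,
    ILogger log)
{
    log.LogInformation("C# HTTP trigger function processed a request.");

    // The image URL is passed in the "imageurl" query string parameter.
    string imageUrl = req.Query["imageurl"];

    if (string.IsNullOrWhiteSpace(imageUrl))
    {
        return new BadRequestObjectResult("Please supply an imageurl query string parameter.");
    }

    string altText = await GenerateAltText(imageUrl);

    return new OkObjectResult(altText);
}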
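
If you would rather call the function from code than from a tool, the sketch below shows one way to do it in C#. It is only a sketch: AltTextClient and its methods are hypothetical names, and the function URL and key are values you would supply yourself. The image URL is escaped before it is added to the imageurl query string parameter, and the function key is sent in the x-functions-key header because the function uses AuthorizationLevel.Function.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical client, named for illustration only.
public static class AltTextClient
{
    // Appends the escaped image URL as the "imageurl" query string parameter.
    public static string BuildRequestUrl(string functionUrl, string imageUrl) =>
        $"{functionUrl}?imageurl={Uri.EscapeDataString(imageUrl)}";

    // Sends the GET request and returns the response body (the alt text).
    public static async Task<string> GetAltTextAsync(
        HttpClient http, string functionUrl, string functionKey, string imageUrl)
    {
        using var request = new HttpRequestMessage(
            HttpMethod.Get, BuildRequestUrl(functionUrl, imageUrl));

        // Function-level authorization accepts the key in this header.
        request.Headers.Add("x-functions-key", functionKey);

        using HttpResponseMessage response = await http.SendAsync(request);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}
```

When running locally, functionUrl would typically look something like http://localhost:7071/api/AltTextProcessor, though the exact host and port depend on your setup.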

 

For reference, you can see it being tested in Postman using the following image: