I first started learning about Bayes' Theorem and Sentiment Analysis back in 2013 as part of a Master's degree I was studying for in my spare time. Some of the ideas I picked up along the way can be found in my earliest blog posts.
During that time, I built an API in C# which featured a Bayesian Classifier to help me parse social media data and placed a Windows Forms application on the front of that API.
One of the biggest problems I found was that whilst I could infer the general sentiment of a sentence, my API struggled to understand the nuance of multiple topics being discussed and different ranges of sentiment.
I often had to source new training data, cleanse it, retrain the classifier and switch over to the updated model. Even then, the predictive accuracy of my API ranged anywhere between 60% and 80%.
Shortly after creating this API, the Azure Cognitive Services Text Analytics API caught my attention. I ended up swapping out my custom API for a few Text Analytics API calls. This removed some headaches in terms of maintaining my own training data and retraining my classifier, but it still didn’t let me drill down into the low-level detail that I wanted.
An Example of Regular Sentiment Analysis
Here we can see an example of the Text Analytics API in action. I’m using Postman to send the following document:
{
  "documents": [
    {
      "id": "1",
      "text": "the new iphone is a great phone but the speaker is bad.My Android is better"
    }
  ]
}
There are quite a few things being mentioned and different levels of sentiment being expressed in the above sentence. When we run this through the Text Analytics API, we get the following response:
{
  "documents": [
    {
      "id": "1",
      "sentiment": "mixed",
      "confidenceScores": { "positive": 0.5, "neutral": 0.0, "negative": 0.5 },
      "sentences": [
        {
          "sentiment": "negative",
          "confidenceScores": { "positive": 0.0, "neutral": 0.0, "negative": 1.0 },
          "offset": 0,
          "length": 55,
          "text": "the new iphone is a great phone but the speaker is bad."
        },
        {
          "sentiment": "positive",
          "confidenceScores": { "positive": 1.0, "neutral": 0.0, "negative": 0.0 },
          "offset": 56,
          "length": 20,
          "text": "My Android is better"
        }
      ],
      "warnings": []
    }
  ],
  "errors": [],
  "modelVersion": "2020-04-01"
}
The above JSON response contains the following 2 sentences:
- the new iphone is a great phone but the speaker is bad. (negative)
- My Android is better (positive)
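As a rough illustration, the per-sentence labels listed above could be pulled out of the response with a few lines of Python. This is only a sketch: the JSON is a trimmed copy of the response shown earlier (confidence scores left out), and `sentence_sentiments` is a hypothetical helper name rather than anything from an Azure SDK.

```python
import json

# Trimmed copy of the response shown above (confidence scores omitted).
RESPONSE_JSON = """
{
  "documents": [
    {
      "id": "1",
      "sentiment": "mixed",
      "sentences": [
        {
          "sentiment": "negative",
          "offset": 0,
          "length": 55,
          "text": "the new iphone is a great phone but the speaker is bad."
        },
        {
          "sentiment": "positive",
          "offset": 56,
          "length": 20,
          "text": "My Android is better"
        }
      ]
    }
  ],
  "errors": []
}
"""


def sentence_sentiments(response):
    """Return (text, sentiment) pairs for every sentence in every document."""
    return [
        (sentence["text"], sentence["sentiment"])
        for document in response["documents"]
        for sentence in document["sentences"]
    ]


response = json.loads(RESPONSE_JSON)
for text, sentiment in sentence_sentiments(response):
    print(f"{sentiment:>8}: {text}")
```

Notice that nothing in this structure says which product or feature each label is about, which is the gap described next.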
In terms of classifying the overall sentiment, the Text Analytics API has done an OK job. However, it doesn’t tell us exactly what’s being discussed in the text, for example “iphone” or “speaker”. The new version of the Text Analytics API lets you go deeper than this.
Beyond Sentiment Analysis with Opinion Mining (preview)
The preview version performs Aspect-based Sentiment Analysis and provides more fine-grained information about products or services being discussed in a body of text.
Nouns or verbs that appear in a body of text are labelled as Targets.
Adjectives that describe those targets are labelled as Assessments.
An Example of Sentiment Analysis with Opinion Mining
We can take a slightly reworded version of our original text, “the new iphone is a great but the speaker is terrible. My Android is better”, and send it to the new Text Analytics API endpoint.
To get opinion mining information in the response we set the parameter opinionMining=true as part of the request to the Text Analytics API.
We can see this here:
https://[your-instance].cognitiveservices.azure.com/text/analytics/v3.1-preview.4/sentiment?opinionMining=true
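For anyone who would rather script the call than use Postman, here is a minimal Python sketch of how such a request might be assembled with the standard library. The instance name and key are placeholders, `build_opinion_mining_request` is a hypothetical helper, and I'm assuming the usual Cognitive Services `Ocp-Apim-Subscription-Key` header for authentication. The sketch only builds the request; it doesn't send it.

```python
import json
import urllib.request


def build_opinion_mining_request(endpoint, subscription_key, text):
    """Build (but do not send) a sentiment request with opinion mining enabled."""
    url = (
        f"{endpoint.rstrip('/')}"
        "/text/analytics/v3.1-preview.4/sentiment?opinionMining=true"
    )
    body = {"documents": [{"id": "1", "text": text}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed authentication header; the key value is a placeholder.
            "Ocp-Apim-Subscription-Key": subscription_key,
        },
        method="POST",
    )


request = build_opinion_mining_request(
    "https://example-instance.cognitiveservices.azure.com",  # placeholder instance
    "<your-key>",  # placeholder key
    "the new iphone is a great but the speaker is terrible. My Android is better",
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it, given a real instance and key.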
We get the following response from the Text Analytics API after sending the request:
{
  "documents": [
    {
      "id": "1",
      "sentiment": "mixed",
      "confidenceScores": { "positive": 0.5, "neutral": 0.0, "negative": 0.5 },
      "sentences": [
        {
          "sentiment": "negative",
          "confidenceScores": { "positive": 0.0, "neutral": 0.0, "negative": 1.0 },
          "offset": 0,
          "length": 54,
          "text": "the new iphone is a great but the speaker is terrible.",
          "targets": [
            {
              "sentiment": "positive",
              "confidenceScores": { "positive": 1.0, "negative": 0.0 },
              "offset": 8,
              "length": 6,
              "text": "iphone",
              "relations": [
                { "relationType": "assessment", "ref": "#/documents/0/sentences/0/assessments/0" }
              ]
            },
            {
              "sentiment": "negative",
              "confidenceScores": { "positive": 0.0, "negative": 1.0 },
              "offset": 34,
              "length": 7,
              "text": "speaker",
              "relations": [
                { "relationType": "assessment", "ref": "#/documents/0/sentences/0/assessments/1" }
              ]
            }
          ],
          "assessments": [
            {
              "sentiment": "positive",
              "confidenceScores": { "positive": 1.0, "negative": 0.0 },
              "offset": 20,
              "length": 5,
              "text": "great",
              "isNegated": false
            },
            {
              "sentiment": "negative",
              "confidenceScores": { "positive": 0.0, "negative": 1.0 },
              "offset": 45,
              "length": 8,
              "text": "terrible",
              "isNegated": false
            }
          ]
        },
        {
          "sentiment": "positive",
          "confidenceScores": { "positive": 1.0, "neutral": 0.0, "negative": 0.0 },
          "offset": 55,
          "length": 20,
          "text": "My Android is better",
          "targets": [],
          "assessments": []
        }
      ],
      "warnings": []
    }
  ],
  "errors": [],
  "modelVersion": "2020-04-01"
}
There is a lot of information to unpack here. We can look at each part in turn.
The overall score for the text (document) we sent
Here we can see the overall sentiment for the text we sent: “the new iphone is a great but the speaker is terrible. My Android is better”
"documents": [ { "id": "1", "sentiment": "mixed", "confidenceScores": { "positive": 0.5, "neutral": 0.0, "negative": 0.5 },
Sentences
The Text Analytics API has identified the existence of two sentences. The sentiment for each is also included:
"sentences": [ { "sentiment": "negative", "confidenceScores": { "positive": 0.0, "neutral": 0.0, "negative": 1.0 }, "offset": 0, "length": 54, "text": "the new iphone is a great but the speaker is terrible.", …… }, { "sentiment": "positive", "confidenceScores": { "positive": 1.0, "neutral": 0.0, "negative": 0.0 }, "offset": 55, "length": 20, "text": "My Android is better", }
You can see from the above that the first sentence isn’t entirely negative: the text says the phone is great and that the speaker is the issue.
Despite this mixed opinion, the first sentence as a whole has been classified as negative with a confidence score of 1.0. That doesn’t reflect the positive remark about the phone.
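The `offset` and `length` values attached to each sentence can be used to cut the matching span back out of the text that was sent. The sketch below assumes plain ASCII input, where the offsets line up with Python string indices (that may not hold for text containing emoji or other multi-unit characters), and `slice_span` is a hypothetical helper name.

```python
# The document text that was sent, and the sentence positions from the response.
DOCUMENT_TEXT = (
    "the new iphone is a great but the speaker is terrible. My Android is better"
)

SENTENCES = [
    {"sentiment": "negative", "offset": 0, "length": 54},
    {"sentiment": "positive", "offset": 55, "length": 20},
]


def slice_span(text, item):
    """Return the substring described by an item's offset and length."""
    start = item["offset"]
    return text[start:start + item["length"]]


for sentence in SENTENCES:
    print(sentence["sentiment"], "->", repr(slice_span(DOCUMENT_TEXT, sentence)))
```

The same slicing applies to any other item in the response that carries an `offset` and a `length`.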
Targets
These are the nouns or verbs that exist in the text. This is where the preview version of the Text Analytics API really shines.
We can see the targets that have been identified (iphone and speaker), along with the sentiment expressed towards each one:
"text": "the new iphone is a great but the speaker is terrible.", "targets": [ { "sentiment": "positive", "confidenceScores": { "positive": 1.0, "negative": 0.0 }, "offset": 8, "length": 6, "text": "iphone", "relations": [ { "relationType": "assessment", "ref": "#/documents/0/sentences/0/assessments/0" } ] }, { "sentiment": "negative", "confidenceScores": { "positive": 0.0, "negative": 1.0 }, "offset": 34, "length": 7, "text": "speaker", "relations": [ { "relationType": "assessment", "ref": "#/documents/0/sentences/0/assessments/1" } ] } ]
Assessments
These are the adjectives used to describe the targets we’ve just seen in the text “the new iphone is a great but the speaker is terrible.”
You can see these here along with the confidence scores for the sentiment being expressed:
"assessments": [ { "sentiment": "positive", "confidenceScores": { "positive": 1.0, "negative": 0.0 }, "offset": 20, "length": 5, "text": "great", "isNegated": false }, { "sentiment": "negative", "confidenceScores": { "positive": 0.0, "negative": 1.0 }, "offset": 45, "length": 8, "text": "terrible", "isNegated": false } ]
By combining the overall sentiment for the document with the deeper insights provided by this preview version of the Text Analytics API you can really dig into the nuance of the written word.
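To make that a little more concrete, each target's `relations` entry holds a `ref` that points at an assessment elsewhere in the same response. A minimal Python sketch of following those references might look like this. The `RESPONSE` dictionary is a trimmed copy of the JSON above, and `resolve_ref` and `target_opinions` are hypothetical helper names of my own.

```python
# Trimmed copy of the opinion mining response (scores and offsets omitted).
RESPONSE = {
    "documents": [
        {
            "id": "1",
            "sentiment": "mixed",
            "sentences": [
                {
                    "sentiment": "negative",
                    "targets": [
                        {"sentiment": "positive", "text": "iphone", "relations": [
                            {"relationType": "assessment",
                             "ref": "#/documents/0/sentences/0/assessments/0"}]},
                        {"sentiment": "negative", "text": "speaker", "relations": [
                            {"relationType": "assessment",
                             "ref": "#/documents/0/sentences/0/assessments/1"}]},
                    ],
                    "assessments": [
                        {"sentiment": "positive", "text": "great", "isNegated": False},
                        {"sentiment": "negative", "text": "terrible", "isNegated": False},
                    ],
                },
                {"sentiment": "positive", "targets": [], "assessments": []},
            ],
        }
    ]
}


def resolve_ref(response, ref):
    """Walk a reference such as '#/documents/0/sentences/0/assessments/0'."""
    node = response
    for part in ref.lstrip("#/").split("/"):
        # List positions are numeric; everything else is a dictionary key.
        node = node[int(part)] if isinstance(node, list) else node[part]
    return node


def target_opinions(response):
    """Return (target, assessment, sentiment) triples for every target."""
    triples = []
    for document in response["documents"]:
        for sentence in document["sentences"]:
            for target in sentence.get("targets", []):
                for relation in target["relations"]:
                    if relation["relationType"] == "assessment":
                        assessment = resolve_ref(response, relation["ref"])
                        triples.append(
                            (target["text"], assessment["text"], assessment["sentiment"])
                        )
    return triples


for target, assessment, sentiment in target_opinions(RESPONSE):
    print(f"{target}: {assessment} ({sentiment})")
```

For the sample response this pairs "iphone" with "great" and "speaker" with "terrible", which is exactly the detail the sentence-level label hides.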
There are many use cases for this such as:
- Augment chatbot intelligence – better understand how customers are interacting with conversational AI assistants and react accordingly.
- AdTech and MarTech – create real-time programmatic creative relevant to what’s being discussed.
- Customer Segmentation and Analytics – better understand what your customers care about and what they dislike.
I might introduce some of these capabilities into www.socialopinion.co.uk to support audience generation and enrichment.
Summary
In this blog post we’ve seen some of the new features shipping with the preview version of the Text Analytics API in Azure Cognitive Services. We’ve seen how you can surface the nuance of what’s being discussed at document level, at sentence level and down to the individual tokens (words) that form each sentence.
Find out more about this preview version of the Text Analytics API here.
Find out more information about Aspect-based Sentiment Analysis (ABSA) here.