Introduction
In a previous post, I successfully integrated Microsoft Cognitive Services “LUIS” with C# and SQL Server. To recap, I have been attempting to assign probabilities to tweets that express “commercial intent” as part of a solution I’m building for the Twitter Innovation Challenge 2017 – #Promote.
Users who share these tweets are then added to an “Audience List”, which can subsequently be targeted with relevant ads (creatives) in their Twitter timeline. This post is brief (I’m on holiday!) but details:
- key data fields
- how this information is rendered
- what’s made its way into the “final” UI (it’s never finished)
- some of the key features
Raw Data
I recently shared an image on LinkedIn that detailed some of the fields that are being persisted by the service responsible for harvesting tweets and associated user metadata.
You can see some of the key data fields in the image below.
Some points regarding this data:
- Microsoft Cognitive Services LUIS is used to identify “commercial intent” from the tweet content and assigns a value to the ProbabilitySalesLead field.
- The ProbabilityPositive value is calculated using different technology and is something that I studied during my Master’s degree research (machine learning and sentiment analysis).
- The user’s Latitude and Longitude are captured for the purposes of mapping. In most cases, users disable GPS to preserve their device’s battery or, more importantly, their privacy, so quite often the location is inferred via other means.
- The content of each tweet is also tokenized and has POS (Part of Speech) Tagging applied to further enrich the data and provide additional analytical insights. I covered POS Tagging and NLP (Natural Language Processing) in an earlier blog post.
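To make the shape of this data a little more concrete, here is a minimal Python sketch of one persisted record (the service itself is written in C#). Only the two probabilities and the latitude/longitude come from the points above; `screen_name`, `text`, the `tokenize` helper and its regular expression are assumptions added for illustration, and POS tagging is left out entirely.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TweetRecord:
    # Hypothetical fields, added only so the example is self-contained.
    screen_name: str
    text: str
    # Fields named in the post (ProbabilitySalesLead, ProbabilityPositive).
    probability_sales_lead: float
    probability_positive: float
    # Latitude/Longitude are optional because many users disable GPS.
    latitude: Optional[float] = None
    longitude: Optional[float] = None

    def has_location(self) -> bool:
        """True only when both coordinates are available for mapping."""
        return self.latitude is not None and self.longitude is not None


def tokenize(text: str) -> List[str]:
    """Very simple tokenizer (an assumption, not the service's tokenizer).

    Lower-cases the text and keeps words, #hashtags and @mentions.
    """
    return re.findall(r"[#@]?\w+(?:'\w+)?", text.lower())


record = TweetRecord(
    screen_name="example_user",
    text="Looking to buy a new #laptop, any tips @bob?",
    probability_sales_lead=0.91,
    probability_positive=0.60,
)
tokens = tokenize(record.text)
```

A real pipeline would pass these tokens to a POS tagger; the sketch stops at tokenization.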
How does it look?
You can see how this data gets rendered in the following screenshot.
The grid is self-explanatory, but you will notice two blue buttons. Their labels may be hard to read on some devices, so here is an explanation of each.
Refresh Target Users
- The SocialOpinion service generates an Audience List and adds users every 15 minutes.
- Users are only added to the Audience List based on the probability of them being a “sales lead”.
- When you click this button, SocialOpinion sends these user accounts directly to Twitter via the Ads API, where they are added to your Tailored Audience on the Twitter servers.
- A number of other things must be in place before this can happen, but they are out of scope for this post. I’ll cover them in a subsequent post.
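The selection step described above can be sketched roughly as follows. This is an illustrative Python sketch, not the SocialOpinion code (which is C#): the `0.8` threshold, the dictionary keys and the `build_audience` name are all assumptions, and the actual upload to the Twitter Ads API is deliberately not shown.

```python
from typing import Dict, Iterable, List

# Hypothetical cut-off; the post does not state the real value.
SALES_LEAD_THRESHOLD = 0.8


def build_audience(
    tweets: Iterable[Dict], threshold: float = SALES_LEAD_THRESHOLD
) -> List[str]:
    """Return de-duplicated user names whose best score meets the threshold.

    Each tweet is assumed to be a dict with 'screen_name' and
    'probability_sales_lead' keys (hypothetical names).
    """
    best_score: Dict[str, float] = {}
    for tweet in tweets:
        user = tweet["screen_name"].lower()
        score = tweet["probability_sales_lead"]
        if score >= threshold:
            best_score[user] = max(score, best_score.get(user, 0.0))
    # Highest-scoring users first; ties broken alphabetically.
    return sorted(best_score, key=lambda u: (-best_score[u], u))


sample_tweets = [
    {"screen_name": "Alice", "probability_sales_lead": 0.92},
    {"screen_name": "bob", "probability_sales_lead": 0.40},
    {"screen_name": "alice", "probability_sales_lead": 0.85},
    {"screen_name": "carol", "probability_sales_lead": 0.80},
]
audience = build_audience(sample_tweets)  # ['alice', 'carol']
```

Keeping the threshold as a parameter makes it easy to tune once the predicted probabilities have been validated.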
Download JiT Audience
- Downloads all users in the Audience List and respective metadata in CSV format.
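A CSV export along these lines could look like the Python sketch below. The column names and the `audience_to_csv` helper are assumptions for illustration; the post does not list the exact columns in the download.

```python
import csv
import io
from typing import Dict, Iterable, List


def audience_to_csv(users: Iterable[Dict[str, object]], columns: List[str]) -> str:
    """Serialise audience rows to CSV text with a header row.

    Keys not listed in `columns` are ignored; missing values are left blank.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    for user in users:
        writer.writerow(user)
    return buffer.getvalue()


# Hypothetical column names and rows.
columns = ["screen_name", "probability_sales_lead", "latitude", "longitude"]
users = [
    {"screen_name": "alice", "probability_sales_lead": 0.92,
     "latitude": 54.6, "longitude": -5.9, "internal_id": 1},
    {"screen_name": "carol", "probability_sales_lead": 0.8},
]
csv_text = audience_to_csv(users, columns)
```

Using `csv.DictWriter` rather than joining strings by hand means values containing commas or quotes are escaped correctly.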
Where is the old user interface?
If you’ve read some of the earlier posts, you will have noticed that the interface has changed and may be wondering why. It was simply best to integrate the “sales lead” functionality as a new module that sits alongside other features in the existing SocialOpinion stack (rather than build a new web solution).
Doing this will allow me to focus on the AI/ML elements to improve the accuracy of LUIS.
I may also add an option to disable LUIS (as it carries a commercial cost) and use other technology, such as a Bayesian classifier, to identify commercial intent.
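For readers unfamiliar with the idea, here is a minimal sketch of what a Bayesian classifier for commercial intent might look like. It assumes “Bayesian classifier” means a multinomial naive Bayes model with add-one smoothing, which is one common interpretation and not necessarily what would be built; the class name, labels and tiny training set are invented purely for illustration.

```python
import math
import re
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


class NaiveBayesIntent:
    """Toy multinomial naive Bayes text classifier (illustrative only)."""

    def __init__(self) -> None:
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of tweets
        self.vocab = set()

    @staticmethod
    def tokenize(text: str) -> List[str]:
        return re.findall(r"[a-z']+", text.lower())

    def train(self, examples: Iterable[Tuple[str, str]]) -> None:
        for text, label in examples:
            self.doc_counts[label] += 1
            for word in self.tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def predict_proba(self, text: str) -> Dict[str, float]:
        total_docs = sum(self.doc_counts.values())
        # Words never seen in training are skipped.
        words = [w for w in self.tokenize(text) if w in self.vocab]
        log_scores = {}
        for label in self.doc_counts:
            total_words = sum(self.word_counts[label].values())
            score = math.log(self.doc_counts[label] / total_docs)  # prior
            for word in words:
                count = self.word_counts[label][word]
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            log_scores[label] = score
        # Normalise log scores into probabilities that sum to 1.
        peak = max(log_scores.values())
        exp_scores = {l: math.exp(s - peak) for l, s in log_scores.items()}
        norm = sum(exp_scores.values())
        return {l: v / norm for l, v in exp_scores.items()}


# Invented training examples, far too few for real use.
training = [
    ("looking to buy a new laptop", "sales_lead"),
    ("where can i buy cheap flights", "sales_lead"),
    ("need a recommendation for a good phone", "sales_lead"),
    ("lovely weather at the beach today", "other"),
    ("watching the match with friends", "other"),
    ("great talk at the conference today", "other"),
]
model = NaiveBayesIntent()
model.train(training)
probs = model.predict_proba("want to buy a new phone")
```

In this sketch, `probs["sales_lead"]` would play the same role as the ProbabilitySalesLead value that LUIS currently supplies.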
What about the SignalR component?
The SignalR component was a nice feature, but you aren’t really supposed to stop/start the Streaming API, which was a problem.
Why?
Users can create multiple “Campaigns” via the web interface but will have no easy way to restart the Streaming API Windows Service without affecting other users of the system. It would have worked well for one user and one Campaign for specific keywords but simply can’t scale in this use case.
That said, SignalR is something that could be used if the data were being pushed from the local system database. It may still have a place.
Summary
I’m getting close to the end of this MVP now, with some final integrations to make and tests to undertake. The next main task is to find time to check that the predicted probabilities are accurate.