Dialogues are one of the key features of the Bot Framework SDK. They help you model conversations and promote code maintainability, readability and reuse.
When building a chatbot, it's common to encapsulate conversational logic into collections of dialogues, each responsible for a specific task such as MakeABooking or PayMyBill.
You typically create each dialogue, add waterfall steps and prompts to collect user data, and perhaps branch off into child dialogues to carry out discrete tasks. Being able to model conversational logic in this manner is great, but the code can be relatively static.
I've experimented with ways of making dialogues more dynamic in the past. Ideally, I've wanted to store a series of actions in a database table, load them at run-time and hook them up to a given intent or event after the user has supplied an utterance.
I'd struggled to build a generic solution around this… that was, until I started looking into Adaptive Dialogue!
In this blog post I:
- Introduce Adaptive Dialogue
- Explore the key features of Adaptive Dialogue
- Share some ideas on how Adaptive Dialogue can be used to build database-driven chatbot conversations
Introducing Adaptive Dialogue
Adaptive Dialogue is new to the Bot Framework SDK and, at the time of writing, is in Preview. It augments existing functionality in the SDK with a rich new set of features and gives you a new way to model conversations. It's built on several new concepts that pave the way for chatbots that can dynamically adjust to the context of a conversation.
At the heart of this functionality are 5 concepts:
- Recognizers
- Language Files and Generator
- Triggers
- Actions
- Inputs
Let’s look at these in a bit more detail.
Recognizer
A Recognizer helps you extract data from an utterance supplied by the user. Recognizers can raise events, for example, when an Intent has been identified. You have a few options when it comes to implementing a recognizer:
- RegEx
- LUIS
- Multi-language
Tip: Implementing the RegEx recognizer is a quick way to build a bot and saves you from creating a LUIS application during the early stages of bot development. Once you're happy that your main intents have been identified and are being used successfully by your bot, you can easily swap the RegEx recognizer out.
Here you can see a sample RegEx recognizer that can identify two commands:
var rootDialog = new AdaptiveDialog("rootDialog")
{
    Recognizer = new RegexRecognizer()
    {
        Intents = new List<IntentPattern>()
        {
            new IntentPattern() { Intent = "HelpIntent", Pattern = "(?i)help" },
            new IntentPattern() { Intent = "PayBill", Pattern = "(?i)pay|pay bill" }
        }
    }
};
Find out more about recognizers here.
Triggers
Triggers help you catch and respond to events, with the OnEvent Trigger sitting at the top of the chain. There are many others, but the main ones I've found myself using are:
OnIntent
This Trigger fires when the Recognizer has successfully identified an Intent. As a side point, all built-in Recognizers emit the OnIntent Trigger when an Utterance has been processed successfully.
OnUnknownIntent
This Trigger fires when the user has supplied an Utterance that doesn't match any known Intents. You can use it to offer the user friendly help or remind them of the main tasks your bot can perform. When a Trigger has been fired, you can dynamically assign one or more Actions.
// Create a Trigger with the intent name
var helpTrigger = new OnIntent("HelpIntent");

// Create Actions to run when the HelpIntent Trigger fires
var helpActions = new List<Dialog>();
helpActions.Add(new SendActivity("Hi, I'm a bot, I can help you out with your credit card!"));
helpTrigger.Actions = helpActions;

// Add the help trigger to the root dialog
rootDialog.Triggers.Add(helpTrigger);
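For completeness, here's a minimal sketch of the OnUnknownIntent Trigger described above. The fallback message is just an illustrative placeholder:

// Catch anything the Recognizer couldn't map to a known Intent
var unknownIntentTrigger = new OnUnknownIntent();
unknownIntentTrigger.Actions = new List<Dialog>()
{
    // Illustrative fallback message - remind the user what the bot can do
    new SendActivity("Sorry, I didn't catch that. Try saying \"help\" or \"pay bill\".")
};

// Add the fallback trigger to the root dialog
rootDialog.Triggers.Add(unknownIntentTrigger);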
Find out more about Triggers here.
Actions
These are the building blocks of your conversation.
When a given Trigger has identified a specific event (e.g. OnIntent=“PayMyBill”), you can run a collection of Actions associated with that Trigger.
For example, a first action might send an activity to capture the user's credit card number, a second action might make an HTTP request to validate the credit card number, and so on.
An important point to note is that, unlike a Waterfall dialog where each step is a function, each Action in an Adaptive Dialogue is a dialog in its own right.
var payBillDialogue = new AdaptiveDialog("payBillDialogue");
payBillDialogue.Triggers.Add(new OnIntent()
{
    Intent = "PayBill",
    Actions = new List<Dialog>()
    {
        new TextInput()
        {
            Property = "user.name",
            Prompt = new ActivityTemplate("What is your name?")
        },
        new SendActivity("Hello, @{user.name}")
    }
});
In my opinion, actions are where the real power comes into play as they let you build conversations programmatically and lend themselves to being created at run-time, whether it be via JSON configuration files or settings you load from a data model.
You can create a whole range of actions, from sending regular text, to manipulating variables in memory, executing business rules, or even making calls to web services through the HttpRequest Action.
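To give a flavour of the HttpRequest Action, here's a minimal sketch. The endpoint URL and memory property names are hypothetical, and I'm assuming the property shape used in the preview samples:

// Call an (illustrative) card validation service and store the response in memory
var validateCardAction = new HttpRequest()
{
    Url = "https://example.com/api/validatecard?number=@{user.creditCardNumber}", // hypothetical endpoint
    Method = HttpRequest.HttpMethod.GET,
    ResultProperty = "dialog.cardValidationResponse" // later Actions can read this from memory
};

A later Action (a SendActivity or a conditional expression) can then inspect dialog.cardValidationResponse and decide what to do next.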
Find out more about Actions here.
Inputs
These are wrappers around existing Bot Framework prompts. You add Inputs to Actions, and they let you ask the user for information. Inputs ship with a few built-in features to help you validate the user's input, such as integer ranges.
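As an example of that built-in validation, here's a minimal sketch of a NumberInput that only accepts values within a range. The property name and prompt wording are my own illustrations, and the Validations expressions follow the syntax used in the preview samples:

// Ask for a value and only accept it when every validation expression is true
var expiryMonthInput = new NumberInput()
{
    Property = "user.expiryMonth", // where the answer is stored in memory
    Prompt = new ActivityTemplate("Which month does your card expire? (1-12)"),
    Validations = new List<string>()
    {
        "int(this.value) >= 1",
        "int(this.value) <= 12"
    },
    InvalidPrompt = new ActivityTemplate("Please enter a month between 1 and 12.")
};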
Find out more about Inputs here.
Language File
This deserves a blog post in its own right as the things you can do with it are impressive.
Language generation files let you extract (often hard-coded) strings from your code and place them into one or many ".lg" files within your bot solution. They're a good way to separate UI-type artefacts from the conversational logic that resides in your chatbot code (although you can also execute basic business logic within a Language File too!)
The content of language generation files can even reference the bot's memory, making it possible to dynamically inject in-memory variables. You can see an example of this in the following code snippet (courtesy of Microsoft):
# greetingTemplate
- Hello @{user.name}, how are you?
Find out more about Language Files here.
Generator
The final piece of the puzzle is the Generator. The Generator is responsible for mapping a given Language Generation file to a given Adaptive Dialogue. You point the Generator at a ".lg" file and go.
Coupled with the Recognizer, the Generator helps separate the language understanding and language generation assets. In theory, it means you could have one (less technical) person responsible for handling the overall tone of your chatbot's conversation and another person responsible for coding the back-end logic and NLP elements.
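Wiring the Generator up is only a couple of lines. Here's a minimal sketch, assuming a file called RootDialog.lg sits alongside the bot; note that the exact constructor and template-reference syntax have shifted slightly between preview releases:

// Point the Adaptive Dialogue at its ".lg" file
var lgFilePath = Path.Combine(".", "RootDialog.lg"); // hypothetical file name / location
rootDialog.Generator = new TemplateEngineLanguageGenerator(Templates.ParseFile(lgFilePath));

// Any SendActivity or Prompt can now reference a template from that file, e.g.
// new SendActivity("@{greetingTemplate()}")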
How does it all hang together?
We've run through the main components that form an Adaptive Dialogue. To recap, you have:
- Recognizer
- Dialogue
- Triggers
- Actions
- Inputs
- Language File & Generator
So, how does the run-time use all these?
High Level Runtime Process
Staying with our credit card theme from earlier on, imagine our user wants to pay their credit card bill. One of the first things the bot will need to do is run through security and ask for the user's credit card number and expiry date. The bot can then use this information to find the correct account.
All of this takes place in a Dialogue called Root, and the bot is delivered over the Web Chat channel.
The following flow would be executed:
- Recognizer identifies the intent “PayMyBillIntent”
- The Trigger “PayMyBillIntent” is fired
- Any Actions associated with the Trigger “PayMyBillIntent” are run.
- The user supplies the relevant information in each Action, the bot responds to user Input until the last Action has been reached.
- End Dialogue is called
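Pulling those pieces together, a stripped-back version of that Root dialogue might look something like the sketch below. The intent pattern, property names and prompt wording are all just illustrative:

var rootDialog = new AdaptiveDialog("rootDialog")
{
    // 1. The Recognizer identifies the intent
    Recognizer = new RegexRecognizer()
    {
        Intents = new List<IntentPattern>()
        {
            new IntentPattern() { Intent = "PayMyBillIntent", Pattern = "(?i)pay|pay bill" }
        }
    }
};

// 2. The matching Trigger fires and 3. its Actions run in order, prompting for Input as they go
rootDialog.Triggers.Add(new OnIntent("PayMyBillIntent")
{
    Actions = new List<Dialog>()
    {
        new TextInput()
        {
            Property = "user.creditCardNumber",
            Prompt = new ActivityTemplate("What is your credit card number?")
        },
        new TextInput()
        {
            Property = "user.expiryDate",
            Prompt = new ActivityTemplate("And the expiry date?")
        },
        new SendActivity("Thanks - looking up your account now."),
        // 4. The dialogue ends once the last Action completes
        new EndDialog()
    }
});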
That’s a quick overview of how the Bot Framework will process the Adaptive Dialogue. Let’s look at how you can leverage this to help implement a database driven conversation.
Use Adaptive Dialogue to build database driven conversations
You can think of the relationships between the key components like this:
- 1 Dialogue can have many Triggers
- 1 Trigger can have many Actions (each Action can be of a different Type)
- 1 Action can have many Inputs
- 1 Input can be of many Types
- 1 Input can have many Input Choice Items (optional but useful for storing Button options etc)
Taking this into account, you can build a data model to persist Dialogue, Trigger, Action and Input configuration data. What would that look like?
Database
The following diagram gives you an overview of a schema that can store the necessary data for some of the concepts we've just looked at:
Having the data in this format paves the way for code that can hydrate conversational logic from the database. It opens up a whole range of possibilities and lets you programmatically construct conversations at run-time which can:
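In case the diagram is hard to make out, a rough sketch of the shapes involved is below, expressed as the C# DTO classes such a schema might map to. The exact column names and types are my own assumptions:

// One row per dialogue; a dialogue owns many triggers
public class AdaptiveDialogueDto
{
    public int DialogueId { get; set; }
    public string Name { get; set; }
    public List<AdaptiveTriggerDto> Triggers { get; set; }
}

// One row per trigger; e.g. an intent name plus the RegEx pattern that identifies it
public class AdaptiveTriggerDto
{
    public int TriggerId { get; set; }
    public string IntentName { get; set; }
    public string RegExPattern { get; set; }
    public List<AdaptiveTriggerActionDto> Actions { get; set; }
}

// One row per action; ActionType distinguishes SendActivity, TextInput, HttpRequest etc.
public class AdaptiveTriggerActionDto
{
    public int ActionId { get; set; }
    public string ActionType { get; set; }
    public string LanguageTemplateName { get; set; } // e.g. [Prompt_ForCreditCardNumber]
    public List<AdaptiveInputDto> Inputs { get; set; }
}

// One row per input; InputType distinguishes Text, Number, Choice etc.
public class AdaptiveInputDto
{
    public int InputId { get; set; }
    public string InputType { get; set; }
    public string MemoryProperty { get; set; } // e.g. user.nextAction
    public List<AdaptiveChoiceItemDto> ChoiceItems { get; set; }
}

// Optional choice items for Choice inputs (button options etc.)
public class AdaptiveChoiceItemDto
{
    public int ChoiceId { get; set; }
    public string DisplayText { get; set; }
}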
- ask the user for input
- respond to user input
- adjust to context or events
- invoke dynamic business rules or 3rd party web services
- …and much more!
Modelling conversational logic in this way can also help reduce the number of code changes or deployments you need to make. What might this data look like? Read on!
Sample Records
Staying with the credit card user story and the data model we've just looked at, the conversation configuration records can be stored like this (table names in red):
This configuration setup would support the instantiation of a conversation from the database that:
- identifies whether the user has entered the text “pay” or “pay bill” as an intent
- tells the bot to ask the user to supply their credit card number
- instructs the bot to thank the user for their input
- tells the bot to present 3 options to the user and ask what the user would like to do next
- stores the user's selected choice in a variable
Breaking it down a bit, you can see the responsibility of each table here:
AdaptiveTrigger
- Captures the intent / trigger “pay / pay bill” by using a regular expression.
AdaptiveTriggerAction
- The bot asks the user to supply their credit card number (Action ID 1). Note: the human-readable text displayed to the user is read from the respective section of the Language (“.lg”) file. The section annotated with [Prompt_ForCreditCardNumber] is used and injected at runtime.
- The bot sends a message thanking the user. (Action ID 2).
AdaptiveInput & AdaptiveChoiceItem
- The bot then sends the user a Choice Input (Action ID 3) and asks what they want to do next. The following options are displayed:
- Pay my credit card (ChoiceID: 1)
- Check my balance (ChoiceID: 2)
- Query transaction (ChoiceID: 4)
- When the user selects a Choice Item, its ID is stored in the variable user.nextAction.
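To make that concrete, here's roughly what the Choice Input built from those records might look like once it has been hydrated. Note that the type of the Choices property has changed between preview releases (a plain List<Choice> in earlier drops, a ChoiceSet later), so treat this as a sketch rather than gospel:

// Ask the user what they want to do next and store the answer in user.nextAction
var nextActionInput = new ChoiceInput()
{
    Property = "user.nextAction",
    Prompt = new ActivityTemplate("What would you like to do next?"),
    Choices = new ChoiceSet(new List<Choice>()
    {
        new Choice("Pay my credit card"),
        new Choice("Check my balance"),
        new Choice("Query transaction")
    })
};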
Granted, this is a simple example, but by implementing some of the other Action types, such as Http Request or logical expression types (If, Switch etc.), you can start to build more complex conversations.
For example, you could extend this credit card example with an additional Action that introduces a conditional expression to validate that the credit card number has 16 digits. You could then read the output of this evaluation into another Action which would either let the bot proceed, or re-prompt the user to supply a valid credit card number.
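A hedged sketch of that conditional Action is below; the expression syntax follows the preview samples, and the memory property name is my own:

// Check the card number captured earlier and branch accordingly
var validateCardLength = new IfCondition()
{
    Condition = "length(user.creditCardNumber) == 16",
    Actions = new List<Dialog>()
    {
        new SendActivity("Thanks, that card number looks valid.")
    },
    ElseActions = new List<Dialog>()
    {
        // Re-prompt the user for a valid card number
        new TextInput()
        {
            Property = "user.creditCardNumber",
            AlwaysPrompt = true,
            Prompt = new ActivityTemplate("That doesn't look like a 16 digit card number - please try again.")
        }
    }
};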
Business Logic – Mapping from Custom Types to Adaptive Dialogue Types
Storing chatbot conversational logic in the database is one part of the puzzle. The next thing you need to do is hydrate these records into objects you can work with in your application. After you've done this, you can map these objects to their respective Adaptive Dialogue counterparts.
The main components involved in this are:
- Custom Dialogue Manager and related data access class
- Bot Framework SDK
- Adaptive Dialogue
The Process Flow
In the UML diagram below, CustomDialogueManager is a class which is responsible for orchestrating the necessary calls to:
- Get the dialogue configuration
- Map these to Bot Framework and Adaptive Dialogue specific types
The class CustomDialogueData is a simple CRUD class which returns DTOs from SQL Server.
The process consists of the following steps:
- Bot loads for the first time. The OnTurn method fires as the user joins the conversation
- CustomDialogueManager is instantiated which makes a call to CustomDialogueData
- CustomDialogueData loads the Root Dialog configuration DTOs which contain all Trigger, Action and Input settings for the entire conversation.
- These are then returned to the CustomDialogueManager.
- The CustomDialogueManager maps the DTOs to the Bot Framework Adaptive Dialogue specific datatypes.
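To illustrate the mapping step, here's a simplified sketch of what CustomDialogueManager might do with the DTOs. It only handles two Action types, the GetRootDialogueConfiguration method is hypothetical, and it leans on the DTO shapes sketched earlier, so it's a starting point rather than a full implementation:

public class CustomDialogueManager
{
    private readonly CustomDialogueData _data;

    public CustomDialogueManager(CustomDialogueData data)
    {
        _data = data;
    }

    public AdaptiveDialog BuildRootDialog()
    {
        // Load the full conversation configuration (Triggers, Actions, Inputs)
        AdaptiveDialogueDto config = _data.GetRootDialogueConfiguration();

        var rootDialog = new AdaptiveDialog(config.Name)
        {
            // Build the RegEx recognizer from the trigger configuration
            Recognizer = new RegexRecognizer()
            {
                Intents = config.Triggers
                    .Select(t => new IntentPattern() { Intent = t.IntentName, Pattern = t.RegExPattern })
                    .ToList()
            }
        };

        // Map each Trigger DTO and its Actions to Adaptive Dialogue types
        foreach (var triggerDto in config.Triggers)
        {
            var trigger = new OnIntent(triggerDto.IntentName) { Actions = new List<Dialog>() };

            foreach (var actionDto in triggerDto.Actions)
            {
                switch (actionDto.ActionType)
                {
                    case "SendActivity":
                        // The text itself lives in the ".lg" file and is referenced by template name
                        trigger.Actions.Add(new SendActivity($"@{{{actionDto.LanguageTemplateName}()}}"));
                        break;
                    case "TextInput":
                        var inputDto = actionDto.Inputs.First();
                        trigger.Actions.Add(new TextInput()
                        {
                            Property = inputDto.MemoryProperty,
                            Prompt = new ActivityTemplate($"@{{{actionDto.LanguageTemplateName}()}}")
                        });
                        break;
                    // ...other Action types (HttpRequest, ChoiceInput, IfCondition) map in the same way
                }
            }

            rootDialog.Triggers.Add(trigger);
        }

        return rootDialog;
    }
}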
With the DTOs mapped to Adaptive Dialogue types, the Bot Framework SDK can then go about its regular business and serve the conversation to the user. I've experimented with this approach and have had reasonable success.
Closing Thoughts and Further Ideas
We've covered quite a bit so far. To recap, we've:
- Introduced Adaptive Dialogue
- Looked at a data model that can persist some basic dialogue configuration data
- Seen how this can be used to hydrate a conversation from settings stored in the database to Adaptive Dialogues
What's been outlined is by no means an end-to-end solution, but it's enough to get up and running with a skeleton process or API that lets you implement database-driven chatbot conversations.
Some other ideas to consider:
Child Dialogues – This example only deals with one root dialogue. Introduce child dialogues, or create nested Actions with their respective Inputs, to handle additional branches of logic.
Expressions – Implement Actions with conditions coupled with expressions to help you implement business rules such as evaluating the contents of in-memory variables. Use that to then direct the flow of the conversation. Store these in the database to help add further levels of configuration to your chatbot!
Making calls to external systems – Use the Http Request Action to tap into logic in web services to further augment your chatbot's capabilities. There's no sense in reinventing the wheel if a service out there already does most of the heavy lifting for your particular use case. Simply configure an Http Request Action and parse the response after it completes.
Visual Interface – Create a visual designer that lets you drag and drop widgets onto an ASP.NET canvas to model the conversations you need to create. Granted, this will take a little time, but being able to drag and drop these components on a canvas of sorts would be a great user experience!
These are just some ideas to get you going and I’m sure you’ve probably got more of your own.
~~~
If there is further interest in what's been outlined in this blog, I can code up what's been discussed and share it on GitHub.
Let me know in the comments below!
Rohan Rao
Hi.. Good post. Can you please write this using C# and post an article on it?
jamie_maguire
Hi Rohan,
Thanks for reading the article and I’m glad you got some value out of it. If there is enough demand I will finish the C# code for this article.
Are there any others that got value from the article?
Marc
+1 for C#
jamie_maguire
Ok. That’s another for C#!
If we get to over 10 readers asking for the C# code to be written, I can set some time aside to write the associate code. 🙂
Rahul
Excellent post. I would vote for C#
jamie_maguire
Glad you like it, Rahul. Your response is noted!
Vikram Sharma
Excellent post. I would vote for C#
jamie_maguire
Hi Vikram. I’m glad you got some value in the article and your response is noted!
Vasu
Good post. Very well explained.
jamie_maguire
Hi Vasu.
Thanks, I’m glad you found it useful!
Kangmin
Great post! Can you also include example Http Request Action in C#? Thank you!
Kangmin
I figured it out, see here: https://stackoverflow.com/questions/62861153/how-to-convert-from-xml-to-json-within-adaptive-dialog-httprequest
jamie_maguire
You beat me to it. Glad to hear you figured it out.
Jasper
Another +1 for C# code. I’ve been looking into how to add dynamic dialog steps where the options are selected from a third party database and need to be queried based on previous answers given by the user. It’s not that trivial so far so if you’ve got things actually working some sample code would help a lot.
Justin Roby
What are your thoughts on .lg and .lu storage, as these can get rather extensive the larger the bot coverage? Maybe Azure Files to hydrate from? I would also give a +1 for C# code in GitHub for this concept. I want to build a much larger conceptual language bot that handles many different utterances and intents, so the idea of storing that in a backend database is a great idea. Do you think the database approach is beneficial above and beyond declarative assets? I am just curious if the DB approach is the best long-term scaled-out solution for a much larger conversational bot that will handle IM, TTS, STT, voice, etc.
Pari
I also vote for c#
Titus
Tx Jamie… I’m struggling a bit trying to implement this… Did you make the C#?…
Harsiddh Dave
One more vote for C#!
Anjaneya Ponnapati
Thank You for good article. one more vote for C#.
Hemanth Babu
Excellent article. One more vote for C#.