Extending Chatbots with Azure Cognitive Services

Microsoft Bot Framework is an open source SDK and set of tools for developing chatbots. One of the advantages of building chatbots with the Bot Framework is that you can easily integrate your bot service with the powerful AI algorithms available through Azure Cognitive Services. This is a quick and easy way to give your chatbot superpowers when you need them.

Azure Cognitive Services is an ever-growing collection of algorithms developed by experts in the fields of computer vision, speech, natural-language processing, decision assistance, and web search. The services simplify a variety of common AI tasks and make them easy to consume through web APIs. The APIs are also constantly being improved, and some can even teach themselves to get smarter based on the information you feed them.

Here is a quick highlight reel of some of the current Cognitive Services available to chatbot creators:

Language

People have a natural ability to say the same thing in many ways. Intelligent bots need to be just as flexible in understanding what human beings want. The Cognitive Service Language APIs provide language models to determine intent, so your bots can respond with the appropriate action.

The Language Understanding Service (LUIS) easily integrates with Azure Bot Service to provide natural language capabilities for your chatbot. Using LUIS, you can classify a speaker’s intents and perform entity extraction. For instance, if someone tells your bot that they want to buy tickets to Amsterdam, LUIS can help identify that the speaker intends to book a flight and that Amsterdam is a location entity for this utterance.

While LUIS offers prebuilt language models to help with natural language understanding, you can also customize these models for the particular language domains that are pertinent to your needs. LUIS also supports active learning, allowing your models to get progressively better as more people communicate with your bot.
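To make the ticket-booking example concrete, here is a minimal sketch of how a Bot Framework bot might query LUIS for the intent and entities behind an utterance. It assumes a hypothetical LUIS app that defines a BookFlight intent, uses the Microsoft.Bot.Builder.AI.Luis package, and treats the app ID, endpoint key, and endpoint as placeholders.

```csharp
// Minimal sketch: classify intent and extract entities with LUIS.
// Assumes the Microsoft.Bot.Builder.AI.Luis NuGet package and a hypothetical
// LUIS app that defines a "BookFlight" intent; credentials are placeholders.
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Bot.Builder;
using Microsoft.Bot.Builder.AI.Luis;
using Microsoft.Bot.Schema;

public class FlightBot : ActivityHandler
{
    private readonly LuisRecognizer _recognizer;

    public FlightBot()
    {
        // Placeholder credentials; use your own LUIS app ID, key, and endpoint.
        var app = new LuisApplication(
            "<app-id>", "<endpoint-key>", "https://<region>.api.cognitive.microsoft.com");
        _recognizer = new LuisRecognizer(new LuisRecognizerOptionsV3(app));
    }

    protected override async Task OnMessageActivityAsync(
        ITurnContext<IMessageActivity> turnContext, CancellationToken cancellationToken)
    {
        // "I want to buy tickets to Amsterdam" -> intent: BookFlight, entity: Amsterdam.
        var result = await _recognizer.RecognizeAsync(turnContext, cancellationToken);
        var (intent, score) = result.GetTopScoringIntent();

        await turnContext.SendActivityAsync(
            $"Intent: {intent} ({score:P0}). Entities: {result.Entities}",
            cancellationToken: cancellationToken);
    }
}
```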

Decision assist services

Cognitive Services has knowledge APIs that extend your bot’s ability to make judgments. Where the Language Understanding service helps your chatbot determine a speaker’s intention, the decision services help your chatbot figure out the best way to respond. Personalizer, currently in preview, uses machine learning to provide the best results for your users. For instance, Personalizer can make recommendations or rank a chatbot’s candidate responses to select the best one. Additionally, the Content Moderator service helps identify offensive language, images, and video, filtering profanity and adult content.
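As a rough sketch of how the decision piece can work, the snippet below asks Personalizer to rank two candidate bot responses and then reports back how well the chosen one worked. It assumes the Microsoft.Azure.CognitiveServices.Personalizer client library; the endpoint, key, action IDs, and feature values are placeholders I made up for illustration.

```csharp
// Minimal sketch: let Personalizer choose among candidate bot responses.
// Assumes the Microsoft.Azure.CognitiveServices.Personalizer NuGet package;
// the endpoint, key, and feature values are placeholders.
using System;
using System.Collections.Generic;
using Microsoft.Azure.CognitiveServices.Personalizer;
using Microsoft.Azure.CognitiveServices.Personalizer.Models;

class PersonalizerDemo
{
    static void Main()
    {
        var client = new PersonalizerClient(
            new ApiKeyServiceClientCredentials("<personalizer-key>"))
        {
            Endpoint = "https://<your-resource>.cognitiveservices.azure.com/"
        };

        // Candidate responses the chatbot could send, each described by features.
        var actions = new List<RankableAction>
        {
            new RankableAction { Id = "short-answer", Features = new List<object> { new { length = "short", tone = "direct" } } },
            new RankableAction { Id = "detailed-answer", Features = new List<object> { new { length = "long", tone = "helpful" } } },
        };

        // Context features describing the current user and conversation.
        var context = new List<object> { new { timeOfDay = "morning", channel = "webchat" } };

        // Rank returns the action Personalizer thinks will work best right now.
        var response = client.Rank(new RankRequest(actions, context));
        Console.WriteLine($"Personalizer chose: {response.RewardActionId}");

        // Later, report how well the chosen response actually worked (0.0 to 1.0).
        client.Reward(response.EventId, new RewardRequest(1.0));
    }
}
```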

Speech recognition and conversion

The Speech APIs in Cognitive Services can give your bot advanced speech skills that leverage industry-leading algorithms for speech-to-text and text-to-speech conversion, as well as Speaker Recognition, a service that lets people use their voice for verification. The Speech APIs use built-in language models that cover a wide range of scenarios with high accuracy.

For applications that require further customization, you can use the Custom Recognition Intelligent Service (CRIS). This allows you to calibrate the language and acoustic models of the speech recognizer, tailoring it to the vocabulary of the application and to the speaking style of your bot’s users. This service lets your chatbot overcome common communication challenges such as dialects, slang, and even background noise. If you’ve ever wondered how to create a bot that understands the latest lingo, CRIS is the bot enhancement you’ve been looking for.
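For illustration, here is a minimal speech-to-text (and text-to-speech) sketch using the Speech SDK from the Microsoft.CognitiveServices.Speech NuGet package; the subscription key and region are placeholders.

```csharp
// Minimal sketch: one-shot speech-to-text from the default microphone,
// plus text-to-speech through the default speaker.
// Assumes the Microsoft.CognitiveServices.Speech NuGet package;
// the subscription key and region are placeholders.
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

class SpeechDemo
{
    static async Task Main()
    {
        var config = SpeechConfig.FromSubscription("<speech-key>", "<region>");

        using var recognizer = new SpeechRecognizer(config);
        var result = await recognizer.RecognizeOnceAsync();

        if (result.Reason == ResultReason.RecognizedSpeech)
        {
            Console.WriteLine($"Recognized: {result.Text}");
        }

        // The same config also drives text-to-speech via SpeechSynthesizer.
        using var synthesizer = new SpeechSynthesizer(config);
        await synthesizer.SpeakTextAsync("Hello from your bot!");
    }
}
```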

Web search

The Bing Search APIs add intelligent web search capabilities to your chatbots, effectively putting the internet’s vast knowledge at your bot’s fingertips (a minimal search request is sketched after the list below). Your bot can access billions of:

· webpages
· images
· videos
· news articles
· local businesses
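Here is the promised sketch of a Bing Web Search request. It calls the REST endpoint directly over HTTP; the subscription key is a placeholder, and the exact endpoint can vary depending on how your Bing Search resource was provisioned.

```csharp
// Minimal sketch: query the Bing Web Search API over HTTP.
// The subscription key is a placeholder; the endpoint may differ
// depending on how your Bing Search resource was provisioned.
using System;
using System.Net.Http;
using System.Threading.Tasks;

class BingSearchDemo
{
    static async Task Main()
    {
        using var client = new HttpClient();
        client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "<bing-search-key>");

        var query = Uri.EscapeDataString("flights to Amsterdam");
        var url = $"https://api.cognitive.microsoft.com/bing/v7.0/search?q={query}&count=5";

        // The response is JSON containing webPages, images, videos, and news results.
        var json = await client.GetStringAsync(url);
        Console.WriteLine(json);
    }
}
```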

Image and video understanding

The Vision APIs bring advanced computer vision algorithms for both images and video to your bots. For example, you can use them to recognize objects and people’s faces, or even to estimate age, gender, and emotion.

The Vision APIs support a variety of image-understanding features. They can categorize the content of images, determining whether the setting is a beach or a wedding. They can perform optical character recognition on a photo, picking out road signs and other text. The Vision APIs also support several image- and video-processing capabilities, such as intelligently generating image or video thumbnails and stabilizing shaky video for you.
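Here is a minimal image-analysis sketch using the Computer Vision client library (the Microsoft.Azure.CognitiveServices.Vision.ComputerVision NuGet package); the key, endpoint, and image URL are placeholders.

```csharp
// Minimal sketch: describe and categorize an image with the Computer Vision API.
// Assumes the Microsoft.Azure.CognitiveServices.Vision.ComputerVision NuGet package;
// the key, endpoint, and image URL are placeholders.
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;

class VisionDemo
{
    static async Task Main()
    {
        var client = new ComputerVisionClient(new ApiKeyServiceClientCredentials("<vision-key>"))
        {
            Endpoint = "https://<your-resource>.cognitiveservices.azure.com/"
        };

        // Ask for a natural-language description, categories, and any faces.
        var features = new List<VisualFeatureTypes?>
        {
            VisualFeatureTypes.Description,
            VisualFeatureTypes.Categories,
            VisualFeatureTypes.Faces
        };

        var analysis = await client.AnalyzeImageAsync("https://example.com/photo.jpg", features);

        Console.WriteLine(analysis.Description.Captions[0].Text);
        foreach (var category in analysis.Categories)
        {
            Console.WriteLine($"{category.Name}: {category.Score:P0}");
        }
    }
}
```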

Summary

While chatbots are already an amazing way to help people interact with complex data in a human-centric way, extending them with web-based AI is a clear opportunity to make them even better assistants for people. Easy-to-use AI algorithms like the ones in Azure Cognitive Services remove language friction and give your chatbots superpowers.

Creating a Chatbot with Microsoft Azure QnA Maker and Alexa

QnA Maker is Microsoft’s easy-to-use, cloud-based API for turning a public-facing FAQ page, product manuals, and support documents into a natural-language bot service. Because it takes in pre-vetted data to use as its “smarts,” it’s one of the easiest ways to build a powerful bot for your company.

Alexa, of course, is the world’s most pervasive host for conversational bots. It’s found in homes, corporate boardrooms, and anywhere else people want easy access to web-based information.

In this article, I will show you how to attach the plumbing that pushes the Q&A information your company wants users to know onto the conversational devices they use most often.

Part 1: Creating a bot service with QnA Maker

To get started, I first created a free Azure account to play with. I then went to the QnA Maker portal page and clicked the Create a knowledge base tab at the top to set up the knowledge base for my bot. I then clicked the blue Create a QnA service button to make a new QnA service with my free Azure account.

I followed the prompts throughout the process, which made it easy to figure out what I needed to do at each step.

In step 2, I selected my Azure tenant, Azure subscription name, and Azure resource name associated with the QnA Maker service. I also chose the Azure QnA Maker service I’d just created in the previous step to host the knowledge base.

I then entered a name for my knowledge base and the URL of my company’s FAQ to use as the brains for my knowledge base. If you just want to test this part out, you can even use the FAQ for QnA Maker itself.

QnA Maker has an optional feature called Chit-chat that let me give my bot service a personality. I decided to go with “The Professional” for this, but definitely would like to try out “The Comic” at some point to see what that’s like.

The next step was just clicking the Create your KB button and waiting patiently for my data to be extracted and my knowledge base to be created.

Once that was done, I opened the Publish page in the QnA Maker portal, published my knowledge base, and hit the Create Bot button.

After filling out additional configuration information for Azure that was specific to my account, I had a bot deployed with zero coding on Microsoft Bot Framework v4. I could even chat with it using the built-in “Test in Web Chat” feature. You can find more details in this cognitive services tutorial.
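Besides the built-in web chat test, the Publish page also shows a sample HTTP request you can use to query the knowledge base directly. Here is a rough C# sketch of that call; the host, knowledge base ID, and endpoint key are placeholders you would copy from your own Publish page.

```csharp
// Minimal sketch: ask a published QnA Maker knowledge base a question.
// The host, knowledge base ID, and endpoint key are placeholders taken from
// the values shown on your own QnA Maker Publish page.
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

class QnaDemo
{
    static async Task Main()
    {
        var host = "https://<your-qna-service>.azurewebsites.net/qnamaker";
        var kbId = "<knowledge-base-id>";
        var endpointKey = "<endpoint-key>";

        using var client = new HttpClient();
        var request = new HttpRequestMessage(
            HttpMethod.Post, $"{host}/knowledgebases/{kbId}/generateAnswer")
        {
            Content = new StringContent(
                "{\"question\": \"How do I create a knowledge base?\"}",
                Encoding.UTF8, "application/json")
        };
        request.Headers.Authorization = new AuthenticationHeaderValue("EndpointKey", endpointKey);

        // The response JSON contains the best-matching answers and their confidence scores.
        var response = await client.SendAsync(request);
        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}
```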

Part 2: Making your bot service work on Alexa

To get the bot service I created above working with Alexa, I had to use an open-source middleware adapter created by the Bot Builder community. Fortunately, the Alexa Middleware Adapter was available as a NuGet package for Visual Studio.

I went to the Azure portal and selected the bot I created in the previous section. This gave me the option to “Download Bot source code.” I downloaded my bot source code as a zip file, extracted it into a working directory, and opened it up in Visual Studio 2017.

When the bot is automatically generated, it’s created with references to the Microsoft.AspNetCore.App NuGet package and the Microsoft.AspNetCore.App SDK. Unfortunately, these had compatibility issues with the middleware package. To fix this, I right-clicked the Microsoft.AspNetCore.App NuGet package in the Solution Explorer window and removed it, which also automatically removed the equivalent SDK. To get back all the DLLs I needed, I used NuGet Package Manager to install the Microsoft.AspNetCore.All (2.0.9) package instead. Be sure to install this specific version of the package to ensure compatibility.

After making those adjustments to the solution, I went to the Visual Studio menu bar and selected Tools -> NuGet Package Manager -> Manage NuGet Packages for Solution. I searched for Adapters.Alexa and installed the Bot.Builder.Community.Adapters.Alexa package.

If your downloaded app is missing its Program.cs or Startup.cs file, you will need to create these for your project in order to build and publish. In my case, I created a new Microsoft Bot Builder v4 project and copied these two files from there. In the constructor of the Startup class, I created a ConfigurationBuilder to gather my app settings.

Then, in the ConfigureServices and Configure methods, I added calls to services.AddAlexaBot and app.UseAlexa to enable the Alexa middleware and set up a dedicated endpoint for calls from Alexa (sketched below).
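Here is a rough outline of what my Startup.cs ended up looking like. The bot class name (QnABot) is a placeholder, and the exact namespaces and options exposed by AddAlexaBot vary across versions of the Bot.Builder.Community.Adapters.Alexa package, so treat this as a sketch rather than copy-paste code.

```csharp
// Sketch of Startup.cs after adding the Alexa middleware.
// QnABot is a placeholder bot class name; the adapter namespace and the
// options available on AddAlexaBot depend on the package version you install.
using Bot.Builder.Community.Adapters.Alexa; // adjust to the namespace your package version exposes
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public Startup(IHostingEnvironment env)
    {
        // Gather app settings (bot credentials, QnA Maker keys, and so on).
        var builder = new ConfigurationBuilder()
            .SetBasePath(env.ContentRootPath)
            .AddJsonFile("appsettings.json", optional: true, reloadOnChange: true)
            .AddEnvironmentVariables();
        Configuration = builder.Build();
    }

    public IConfiguration Configuration { get; }

    public void ConfigureServices(IServiceCollection services)
    {
        // Registers the bot and the Alexa middleware.
        services.AddAlexaBot<QnABot>(options =>
        {
            // Alexa-specific options (request validation, session behavior, etc.) go here.
        });
    }

    public void Configure(IApplicationBuilder app)
    {
        // Exposes the /api/skillrequests endpoint that Alexa calls.
        app.UseAlexa();
    }
}
```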

Following these code changes, I published the Web App Bot back to my Azure account. The original QnA Bot Service now has an additional channel endpoint for Alexa. The Alexa address is the original Web App Bot root address with /api/skillrequests added to the end.

At this point, I was ready to go to my Amazon account and create a new Alexa skill. I went to https://developer.amazon.com/alexa and signed in. (If you don’t already have a developer account, you will need to enter your information and agree to the developer EULA.) Next, I tapped the Alexa menu item at the top of the developer page and selected Alexa Skills Kit. This took me to https://developer.amazon.com/alexa/console/ask, where I clicked the Create Skill button.

I wrote in a unique name for my skill, selected Custom for the model, and clicked Create skill. On the following screen, I selected Start from Scratch for my template.

I selected JSON Editor.

Next, I opened another web browser, went to this source code, and copied the example JSON found in the README.md file.

I returned to the web browser that had the Amazon Alexa portal open and pasted the JSON into the box. I changed the invocationName to the name of my skill, clicked Save Model, and finally clicked Build Model.

After waiting patiently for the build to complete, I selected Endpoint in the left navigation window and clicked HTTPS. I then entered the address of the Azure App Service URL and added /api/skillrequests to the end.

To distribute my Alexa skill so people can use it on their own Amazon devices, I clicked the Distribution link in the Alexa developer console and followed the instructions from there.

And before I knew it, I was able to have a conversation with my company’s FAQ page, using the QnA Maker’s professional chit-chat personality, from my living room.

Microsoft’s convergence of chatbots and mixed reality

One of the biggest trends in mixed reality this year is the arrival of chatbots on platforms like HoloLens. Speech commands are a common input for many XR devices. Adding conversational AI to extend these native speech recognition capabilities is a natural next step toward a future in which personalized virtual assistants backed by powerful AI accompany us in hologram form. They may be relegated to providing us with shopping suggestions, but perhaps, instead, they’ll become powerful custom tools that help make us sharper, give honest feedback, and assist in achieving our personal goals.

If you have followed the development of sci-fi artificial intelligence in television and movies over the years, the move from voice to full holograms will seem natural. In early sci-fi, such as HAL from the movie 2001: A Space Odyssey or the computer from the original Star Trek, computer intelligence was generally represented as a disembodied voice. In more recent incarnations, such as Star Trek: Voyager and Blade Runner 2049, these voices are finally personified as full holograms: the Emergency Medical Hologram and Joi.

In a similar way, Cortana, Alexa, and Siri are slowly moving from our smartphones, Echos, and Invoke devices to our holographic headsets. These are still early days, but the technology is already in place and the future incarnation of our virtual assistants is relatively clear.

The rise of the chatbot

For Microsoft’s personal digital assistant Cortana, who started her life as a hologram in the Halo video games for Xbox, the move to holographic headsets is a bit of a homecoming. It seems natural, then, that when Microsoft HoloLens was first released in 2016, Cortana was already built into the onboard holographic operating system.

Then, in a 2017 article on the Windows Apps Team blog, Building the Terminator Vision HUD in HoloLens, Microsoft showed people how to integrate Azure Cognitive Services into their holographic head-mounted display in order to provide smart object recognition and even translation services as a Terminator-like HUD overlay.

The only thing left to do to get to a smart virtual assistant was to tie together the HoloLens’s built-in Cortana speech capabilities with some AI to create an interactive experience. Not surprisingly, Microsoft was able to fill this gap with the Bot Framework.

Virtual assistants and Microsoft Bot Framework

Microsoft Bot Framework combines AI backed by Azure Cognitive Services with natural-language capabilities. It includes a set of open source SDKs and tools that enable developers to build, test, and connect bots that interact naturally with users. With the Microsoft Bot Framework, it is easy to create a bot that can speak, listen, understand, and even learn from your users over time with Azure Cognitive Services. This chatbot technology is sometimes referred to as conversational AI.

There are several chatbot tools available. I am most familiar with the Bot Framework, so I will be talking about that. Right now, chatbots built with the Bot Framework can be adapted for speech interactions or for text interactions like the UPS virtual assistant example discussed later. They are relatively easy to build and customize using prepared templates and web-based dialogs.

One of my favorite ways to build a chatbot is by using QnA Maker, which lets you simply point to an online FAQ page or upload product documentation to use as the knowledge base for your bot service. QnA Maker then walks you through applying a chatbot personality to your knowledge base and deploying it, usually with no custom coding. What I love about this is that you can get a sophisticated chatbot rolled out in about half a day.

Using the Microsoft Bot Framework, you also have the ability to take full control of the creation process and customize your bot in code; a minimal example follows below. Bot apps can be created in C#, JavaScript, Python, or Java. You can extend the capabilities of the Bot Framework with middleware that you either create yourself or bring into your code from third parties. There are even advanced capabilities available for managing complex conversation flows with branches and loops.
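As a small taste of the code-first route, here is a minimal C# bot built on the Bot Framework SDK’s ActivityHandler; it simply echoes whatever the user says, and from there you could add middleware, dialogs, or Cognitive Services calls.

```csharp
// Minimal sketch: a code-first Bot Framework v4 bot in C#.
// Assumes the Microsoft.Bot.Builder NuGet package.
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Bot.Builder;
using Microsoft.Bot.Schema;

public class EchoBot : ActivityHandler
{
    protected override async Task OnMessageActivityAsync(
        ITurnContext<IMessageActivity> turnContext, CancellationToken cancellationToken)
    {
        // Echo the user's message back; a real bot would branch into dialogs here.
        var reply = MessageFactory.Text($"You said: {turnContext.Activity.Text}");
        await turnContext.SendActivityAsync(reply, cancellationToken);
    }
}
```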

Ethical chatbots

Having introduced the idea above of building a Terminator HUD using Cognitive Services, it’s important to also raise awareness about fostering an environment of ethical AI and ethical thinking around AI. To borrow from the book The Future Computed, AI systems should be fair, reliable and safe, private and secure, inclusive, transparent, and accountable. As we build all forms of chatbots and virtual assistants, we should always consider what we intend our intelligent systems to do, as well as concern ourselves with what they might do unintentionally.

The ultimate convergence of AI and mixed reality

Today, chatbots are geared toward integrating skills for commerce like finding directions, locating restaurants, and providing help with a company’s products through virtual assistants. One of the chief research goals driving better chatbots is to personalize the chatbot experience. Achieving a high level of personalization will require extending current chatbots with more AI capabilities. Fortunately, this isn’t a far-future thing. As shown in the Terminator HUD tutorial above, adding Cognitive Services to your chatbots and devices is easy to do.

Because holographic headsets have many external sensors, AI will also be useful for analyzing all this visual and location data and turning it into useful information through the chatbot and Cognitive Services. For instance, cameras can be used to help translate street signs if you are in a foreign city or to identify products when you are shopping and provide helpful reviews.

Finally, AI will be needed to create realistic 3D model representations of your chatbot and overcome the uncanny valley that is currently holding back VR, AR, and MR. When all three elements are in place to augment your chatbot — personalization, computer vision, and humanized 3D modeling — we’ll be that much closer to what we’ve always hoped for — personalized AI that looks out for us as individuals.


Increasing Business Reach with Azure Bot Service Channels

Where do bots live? It’s a common misconception that bots live on your Echo Dot, on Twitter, or on Facebook. To the extent bots call anywhere their home, it’s the cloud. Objects and apps like your iPhone and Skype are the “channels” through which people communicate with your bot.

Azure Bot Service Channels

Out of the box, Azure Bot Service supports the following channels (though the list is always growing):

  • Cortana
  • Email
  • Facebook
  • GroupMe
  • Kik
  • LINE
  • Microsoft Teams
  • Skype
  • Skype for Business
  • Slack
  • Telegram

Through middleware created by the Bot Builder Community, your business’s bots can reach additional channels like Alexa and Google Assistant.

With Direct Line, your developers can also establish communication between your bots and your business’s custom apps on the web and on devices (a minimal sketch follows below).
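For example, a custom web or device app might talk to your bot through the Direct Line client library (the Microsoft.Bot.Connector.DirectLine NuGet package); the Direct Line secret and the question text below are placeholders.

```csharp
// Minimal sketch: send a message to a bot over Direct Line from a custom app.
// Assumes the Microsoft.Bot.Connector.DirectLine NuGet package; the secret is a placeholder.
using System;
using System.Threading.Tasks;
using Microsoft.Bot.Connector.DirectLine;

class DirectLineDemo
{
    static async Task Main()
    {
        var client = new DirectLineClient("<direct-line-secret>");

        // Start a conversation and post an activity as the user.
        var conversation = await client.Conversations.StartConversationAsync();
        var activity = new Activity
        {
            Type = ActivityTypes.Message,
            From = new ChannelAccount("user1", "User"),
            Text = "Where is my package?"
        };
        await client.Conversations.PostActivityAsync(conversation.ConversationId, activity);

        // Read back the bot's replies.
        var activitySet = await client.Conversations.GetActivitiesAsync(conversation.ConversationId);
        foreach (var reply in activitySet.Activities)
        {
            Console.WriteLine($"{reply.From.Name}: {reply.Text}");
        }
    }
}
```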

Companies like Dixons Carphone, BMW, Vodafone, UEI, LaLiga, and UPS are already using Microsoft Azure Bot Service support for multiple channels to extend their bots’ reach.

UPS Chatbot, for instance, delivers shipping information and answers customer questions through voice and text on Skype and Facebook Messenger. UPS, which invests more than $1 billion a year in technology, developed its chatbot in-house and plans to continue to update its functionality, including integration with the UPS My Choice® platform using Direct Line. In just the first eight months, UPS Bot has already had more than 200,000 conversations over its various channels.

LaLiga, the Spanish football league, is also reaching its huge and devoted fan base through multiple channels with Azure Bot Service. It is estimated that LaLiga touches 1.6 billion fans worldwide on social media.

Using an architecture that combines Azure Bot Service, Microsoft Bot Framework, and multiple Azure Cognitive Services such as LUIS and Text Analytics, LaLiga maintains bots on Skype, Alexa, and Google Assistant that use natural language processing. NLP allows their chatbots to understand both English and Spanish, their regional dialects, and even the soccer slang particular to each dialect. They are even able to use a tool called Azure Monitor anomaly detection to identify new player nicknames created by fans and then match them to the correct person. In this and similar ways, LaLiga’s chatbots are always learning and adapting over time. LaLiga plans to deploy its chatbots to almost a dozen additional channels in the near future.

Conclusion

Because social media endpoints are always changing, developing for a single delivery platform is simply not cost-effective. Channels provide businesses with a way to develop a bot once but deploy it to new social media platforms as they appear on the market and gain influence. At the same time, your core bot features can constantly be improved, and these improvements will automatically benefit the pre-existing channels people use to communicate with you.