May 22, 2018
Amazon Alexa, the popular voice assistant that helps with everyday tasks, has become a best friend to many households in the US and elsewhere. The list of features available for Alexa development, released by both Amazon and third parties, keeps growing steadily, yet according to reports from January 2018, the total number of skills (apps) available in the US was only 25,784. More vendors are looking to integrate their products with the voice assistant. Currently, you can link it to your Gmail account, your calendar, your phone, or even your home lighting.
Ordering groceries or a cab? Alexa can do that for you. Some even claim that the technology may become the industry standard for personal assistants everywhere. A truly fascinating prospect.
Given the above, I decided to do some quick research on Alexa and learn how to extend its features by writing apps, called “Skills”, ultimately putting together a proof of concept in the form of a simple app. If we ignore the fact that the debugging part still needs some work, that testing with text chat doesn’t work, and that you’ll probably feel you’re going a little insane after talking to your computer for a couple of days, then the conclusion is that developing skills for Alexa is quite easy.
In the following tutorial, I’ll show you how to deploy a sample Alexa Skill called “Conf Room Manager” and explain how it works. Afterwards, you should be able to write your own Skills and use them in your project.
Most of the examples available online don’t use alexa-sdk (an npm package), but I went with it because it provides a nice DSL that allows you to write less code. I will also use the default data storage for that SDK (DynamoDB) and AWS Lambda as a backend (Alexa can use any compatible backend over HTTPS).
My example won’t include any real integration with a calendar because that’s beyond the scope of this piece. There are, however, many resources out there that will help you implement genuine Google Calendar integration. Additionally, I’m using DynamoDB only to save results, but you can read data in a similar manner and create more complex skills. One last thing: I haven’t used any JS transpiler, but it’s definitely a better idea to upload one compiled JS file than to zip the entire directory ;-)
Getting Ready for Alexa Development
During setup, you will probably run into a lot of AWS permission errors, missing groups or users, and so on. This is because each feature (DynamoDB, Lambda, etc.) requires permissions that are not granted by default. But don’t worry, just send the error message to your AWS admin and they’ll know how to help.
An Alexa Skill comprises two parts: skill configuration in the skill editor and the endpoint (code on AWS Lambda or on your server).
In the skill editor, you define:
Let’s start with the FreeConfRooms intent.
It doesn’t have any variables. We just need to specify sample utterances in the skill editor that will trigger that intent, e.g. “Find a free conf room” or “Are there any free conf rooms now?” That’s it. Now, saying “Alexa, ask conf room manager if there is a free conf room” should trigger this intent, and we just need to handle it in a Lambda function.
The second intent, RoomBooking, is a little bit more complex.
There are two variables (intent slots): period and room. Now, when we define sample utterances, we can use those variables, e.g. “Book the {room} conf room for {period}.” But we don’t have to use all variables in each utterance. Actually, we shouldn’t. We should also define utterances with only some of them, e.g. “Book the {room} conf room,” or even without any variables at all, such as “Book a conf room.” Then, in Intent Slots -> Edit we can define questions and sample answers for each slot, e.g. “For how long?” “For {period}.” Those questions will be asked when you don’t provide the necessary values.
Now we need to write handlers for each intent in the AWS Lambda function (see handlers in index.js). Let’s start with the simplest one:
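Behind the skill editor, this configuration is stored as a JSON interaction model. The sketch below writes a model like that as a JavaScript object and checks which slots each sample utterance mentions. The invocation name, intent names and utterances come from the text. The rest is assumed for illustration: the custom slot type name ROOM with its values and IDs, the use of the built-in AMAZON.DURATION type for period (the article does not name the type), and the helper slotsInUtterance.

```javascript
// Sketch of the interaction model for the two intents described in the text.
// Assumed for illustration: the "ROOM" slot type, its values and IDs, and
// AMAZON.DURATION as the type of the period slot.
const interactionModel = {
  languageModel: {
    invocationName: 'conf room manager',
    intents: [
      {
        name: 'FreeConfRooms',
        slots: [],
        samples: ['find a free conf room', 'are there any free conf rooms now'],
      },
      {
        name: 'RoomBooking',
        slots: [
          { name: 'room', type: 'ROOM' },
          { name: 'period', type: 'AMAZON.DURATION' },
        ],
        samples: [
          'book the {room} conf room for {period}',
          'book the {room} conf room',
          'book a conf room',
        ],
      },
    ],
    types: [
      {
        name: 'ROOM',
        values: [
          { id: 'blue', name: { value: 'blue' } },
          { id: 'green', name: { value: 'green' } },
        ],
      },
    ],
  },
};

// Hypothetical helper: list the slot names referenced by one sample utterance.
function slotsInUtterance(utterance) {
  const matches = utterance.match(/\{(\w+)\}/g) || [];
  return matches.map((m) => m.slice(1, -1));
}

const booking = interactionModel.languageModel.intents[1];
for (const sample of booking.samples) {
  console.log(sample, '->', slotsInUtterance(sample));
}
```

Utterances that mention only some slots, or none, are what give Alexa a reason to ask the follow-up questions defined for each slot.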
'FreeConfRooms': function() {
  this.emit(':tell', "Maybe there are...");
},
So now when you ask the conf room manager about free conf rooms, it will just answer “Maybe there are…”
Of course, in a real-life example, you’d have to perform some requests to the calendar API and then call this.emit(':tell', ...) in a promise. This is quite straightforward.
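A minimal sketch of that promise-based variant follows. findFreeRooms is a hypothetical stand-in for a real calendar API call and returns canned data, and fakeContext only mimics the emit method that alexa-sdk provides as this.emit.

```javascript
// Hypothetical stand-in for a calendar API request; returns canned data.
function findFreeRooms() {
  return Promise.resolve(['blue', 'green']);
}

const handlers = {
  'FreeConfRooms': function () {
    // The arrow functions keep `this` bound to the handler context,
    // so this.emit is still reachable once the promise settles.
    return findFreeRooms()
      .then((rooms) => {
        const speech = rooms.length
          ? `Free rooms right now: ${rooms.join(', ')}`
          : 'Sorry, all rooms are taken';
        this.emit(':tell', speech);
      })
      .catch(() => this.emit(':tell', 'Sorry, I could not reach the calendar'));
  },
};

// Local stand-in for the context alexa-sdk would normally supply.
const fakeContext = {
  emitted: [],
  emit(...args) { this.emitted.push(args); },
};

const done = handlers.FreeConfRooms.call(fakeContext);
done.then(() => console.log(fakeContext.emitted));
```

A regular function expression inside then() would lose the handler context, which is an easy mistake to make here.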
The handler for RoomBooking, however, is a little bit more complex and definitely not as straightforward.
'RoomBooking': function() {
  const intent = this.event.request.intent;
  // period arrives as an ISO 8601 duration string, e.g. "PT30M"
  const period = _.get(intent.slots, "period.value");
  const room = getRoomId(intent);
  if (period && room) {
    const periodMinutes = moment.duration(period).asMinutes();
    saveBooking({period, room}, this);
    this.emit(':tell', `Ok, I will book ${room} room for ${periodMinutes} minutes`);
  } else {
    // Hand the dialog back to Alexa so it asks for the missing slots
    this.emit(':delegate', intent);
  }
}
First, we try to get the slot values from this.event.request.intent.slots. The period value is a duration encoded in ISO 8601 format. Using getRoomId(), we read the room ID hidden deep inside the object.
Then, if both values are present, the handler converts the duration to minutes, saves the data to DynamoDB, and tells us that we’ve successfully booked the room. If not, we need to call this.emit(':delegate', intent) to instruct Alexa to ask you the additional questions defined for the slot, such as “For how long?”
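To make the period format concrete, the sketch below converts an ISO 8601 duration to minutes by hand. The handler itself relies on moment.duration for this. durationToMinutes is a hypothetical helper that only understands the day, hour, minute and second designators.

```javascript
// Hypothetical helper: minutes in an ISO 8601 duration such as "PT1H30M".
// Only days, hours, minutes and seconds are supported; years, months and
// weeks are rejected because a room booking should not need them.
function durationToMinutes(iso) {
  const match = /^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$/.exec(iso);
  if (!match || iso === 'P' || iso.endsWith('T')) {
    throw new Error(`Unsupported duration: ${iso}`);
  }
  const [days, hours, minutes, seconds] = match.slice(1).map((n) => Number(n || 0));
  return days * 24 * 60 + hours * 60 + minutes + seconds / 60;
}

console.log(durationToMinutes('PT30M'));   // 30
console.log(durationToMinutes('PT1H30M')); // 90
```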
Alexa won’t do this automatically, which allows you to react in a custom way when some slot values are missing, for example by setting default values. Additionally, getRoomId() removes the room value from the slots if it doesn’t have an ID, so that :delegate asks about the room again.
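The article does not show getRoomId, so the following is only a guess at its shape. It assumes the room slot uses a custom slot type with IDs, in which case Alexa reports the match under resolutions.resolutionsPerAuthority in the request. If no ID is found, the sketch clears what was heard so that a later :delegate prompts for the room again.

```javascript
// Hypothetical reconstruction of getRoomId; the real one is not shown.
function getRoomId(intent) {
  const slot = intent.slots && intent.slots.room;
  if (!slot) return undefined;

  const authorities = (slot.resolutions && slot.resolutions.resolutionsPerAuthority) || [];
  const hit = authorities.find(
    (a) => a.status && a.status.code === 'ER_SUCCESS_MATCH' && a.values && a.values.length > 0
  );
  if (hit) return hit.values[0].value.id;

  // No known room matched: drop what was heard so Alexa asks again.
  delete slot.value;
  delete slot.resolutions;
  return undefined;
}

// Example intents, trimmed to the fields the sketch reads.
const matched = {
  name: 'RoomBooking',
  slots: {
    room: {
      name: 'room',
      value: 'the blue one',
      resolutions: {
        resolutionsPerAuthority: [
          { status: { code: 'ER_SUCCESS_MATCH' }, values: [{ value: { name: 'blue', id: 'blue' } }] },
        ],
      },
    },
  },
};
const unmatched = {
  name: 'RoomBooking',
  slots: {
    room: {
      name: 'room',
      value: 'the moon',
      resolutions: { resolutionsPerAuthority: [{ status: { code: 'ER_SUCCESS_NO_MATCH' } }] },
    },
  },
};

console.log(getRoomId(matched));
console.log(getRoomId(unmatched), unmatched.slots.room);
```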
The last thing we need to do is implement handlers for some common intents:
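The handlers the original skill registers are not listed here. A minimal sketch could look like the following, assuming the standard built-in intents AMAZON.HelpIntent, AMAZON.CancelIntent and AMAZON.StopIntent plus the LaunchRequest and Unhandled handlers from alexa-sdk. The spoken texts are placeholders.

```javascript
// Assumed set of common handlers; the wording of each reply is a placeholder.
const HELP = 'You can ask me to find a free conf room or to book one.';

const commonHandlers = {
  'LaunchRequest': function () {
    this.emit(':ask', `Welcome to conf room manager. ${HELP}`, HELP);
  },
  'AMAZON.HelpIntent': function () {
    // ':ask' keeps the session open and waits for the user's reply.
    this.emit(':ask', HELP, HELP);
  },
  'AMAZON.CancelIntent': function () {
    this.emit(':tell', 'Cancelled.');
  },
  'AMAZON.StopIntent': function () {
    this.emit(':tell', 'Goodbye.');
  },
  'Unhandled': function () {
    this.emit(':ask', `Sorry, I did not get that. ${HELP}`, HELP);
  },
};

// Quick local check with a stand-in for the alexa-sdk handler context.
const spoken = [];
const ctx = { emit: (...args) => spoken.push(args) };
commonHandlers['AMAZON.StopIntent'].call(ctx);
console.log(spoken);
```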
By using a storage adapter for alexa-sdk, we can easily save data by setting values in the this.attributes object. It will be saved in DynamoDB under the ID of a given user, and when that user uses our skill again, we’ll have access to these attributes.
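saveBooking is not shown in the article, so the version below is an assumption about how it might use this.attributes. The comment block shows the usual alexa-sdk wiring that enables the DynamoDB adapter. The table name and the createdAt field are made up.

```javascript
// In index.js the SDK is typically wired up like this (kept as a comment so
// the sketch runs without alexa-sdk installed; the table name is made up):
//
//   const Alexa = require('alexa-sdk');
//   exports.handler = function (event, context, callback) {
//     const alexa = Alexa.handler(event, context, callback);
//     alexa.dynamoDBTableName = 'ConfRoomManager';
//     alexa.registerHandlers(handlers);
//     alexa.execute();
//   };

// Hypothetical saveBooking: append the booking to the user's attributes.
// With dynamoDBTableName set, the SDK is expected to persist
// this.attributes per user once the response is sent.
function saveBooking(booking, handlerContext) {
  const attributes = handlerContext.attributes;
  attributes.bookings = attributes.bookings || [];
  attributes.bookings.push({ ...booking, createdAt: new Date().toISOString() });
  return attributes.bookings.length;
}

// Local stand-in for the handler context.
const handlerContext = { attributes: {} };
saveBooking({ period: 'PT30M', room: 'blue' }, handlerContext);
saveBooking({ period: 'PT1H', room: 'green' }, handlerContext);
console.log(handlerContext.attributes.bookings.map((b) => b.room));
```

Reading works the same way: on the user's next session, earlier bookings would be available under this.attributes.bookings.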
Unfortunately, all your console.log output will go to Amazon CloudWatch Logs, so debugging is quite a pain. Additionally, if you are an introvert like me, you’ll probably feel exhausted after a couple of hours of talking to Alexa, because the text chat debugging tool doesn’t work.
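One way to cut down on both CloudWatch round trips and talking to the device is to call handlers locally with a hand-built request. The sketch below is not an alexa-sdk feature. buildIntentRequest and runLocally are hypothetical helpers, and the event only contains the fields the handlers in this article read.

```javascript
// Hypothetical helper: build a trimmed-down IntentRequest event.
function buildIntentRequest(intentName, slotValues = {}) {
  const slots = {};
  for (const [name, value] of Object.entries(slotValues)) {
    slots[name] = { name, value };
  }
  return {
    version: '1.0',
    request: { type: 'IntentRequest', locale: 'en-US', intent: { name: intentName, slots } },
  };
}

// Hypothetical helper: run one handler against an event and capture
// everything it emits instead of sending it to Alexa.
function runLocally(handlers, event) {
  const emitted = [];
  const context = { event, attributes: {}, emit: (...args) => emitted.push(args) };
  const name = event.request.intent.name;
  const handler = handlers[name] || handlers.Unhandled;
  handler.call(context);
  // Pretty-printed JSON is easier to scan in a log than [object Object].
  console.log(JSON.stringify({ intent: name, emitted }, null, 2));
  return emitted;
}

const handlers = {
  'FreeConfRooms': function () {
    this.emit(':tell', 'Maybe there are...');
  },
  'Unhandled': function () {
    this.emit(':tell', 'Sorry, I did not get that.');
  },
};

runLocally(handlers, buildIntentRequest('FreeConfRooms'));
```

This only exercises handler logic. The utterance matching and slot prompts still have to be tested against the real service.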
It seems that Amazon's Alexa features will become more ubiquitous in the near future. Before the end of 2018, Amazon is planning to release at least 8 new Alexa-powered devices, including a microwave oven, a receiver, a subwoofer, and car gadgets, making voice assistants a part of everyday life.
As Jeff Bezos, the CEO of Amazon, stated in July:
We want customers to be able to use Alexa wherever they are. There are now tens of thousands of developers across more than 150 countries building new devices using the Alexa Voice Service, and the number of Alexa-enabled devices has more than tripled in the past year.
You can set up your Alexa-enabled devices, listen to music, order groceries or manage your lights. Source: AppStore
So it's good to see that writing voice apps might be easier than you’d think. The biggest issue right now? During my development efforts, I was dreaming about the possibility of debugging my code in a manner similar to the browser developer console, where I can easily inspect any object during code execution.
Have you tried coding your own Alexa skills yet? We’d love it if you shared your experiences with us.