A Beginner's Guide to the New Alexa Skills Kit SDK for Python

Alexa, the popular virtual assistant by Amazon, isn't just for checking the weather or playing music. Thanks to Alexa skills, developers can extend Alexa's capabilities to create custom voice-powered experiences.

The new Alexa Skills Kit (ASK) SDK for Python makes building these skills easier than ever. In this guide, we'll walk through everything you need to know to start creating your own skills with Python.

What is an Alexa Skill?

An Alexa skill is like an app for Alexa. Just as mobile apps extend the functionality of your phone, skills add new capabilities to Alexa-enabled devices like the Echo.

Skills define how Alexa responds to voice commands. For example, a nutrition tracking skill could allow Alexa to answer questions like "How many calories are in an apple?" or "Add milk to my food log."

Under the hood, skills consist of:

  1. An interaction model defining the voice commands the skill accepts
  2. A hosted service that receives these commands and tells Alexa how to respond (e.g. an AWS Lambda function)

Anatomy of an Alexa Skill with Python

Here's a high-level look at the components of a Python Alexa skill:

  1. Skill Invocation: The user speaks a phrase like "Alexa, open Nutrition Tracker"
  2. Interaction Model: Alexa maps this utterance to an intent defined in the skill's interaction model
  3. Python Handler: Alexa sends a JSON request containing the intent to the skill's backend Lambda function, which uses the Python SDK to handle the request
  4. Response: The Lambda handler tells Alexa how to respond by returning JSON. Alexa speaks this response and/or updates the companion app.

The Python SDK provides a clean, abstracted way to write Lambda handlers for your skill. It takes care of the low-level JSON parsing and generates responses, allowing you to focus on your skill's logic.
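
To get a feel for what the SDK saves you from, here is a minimal sketch of a response built by hand, without the SDK. The helper name build_plain_text_response is our own invention for illustration, and the envelope shows only the handful of fields needed for a spoken reply; real responses can carry more (cards, reprompts, directives).

```python
import json


def build_plain_text_response(text, end_session=True):
    """Build a minimal Alexa response envelope by hand (no SDK involved).

    Hypothetical helper for illustration only; the SDK's response_builder
    produces this kind of structure for you.
    """
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


if __name__ == "__main__":
    # Print the JSON that a hand-written Lambda handler would return.
    print(json.dumps(build_plain_text_response("Hello from Fun Facts!"), indent=2))
```

With the SDK, the equivalent of this dictionary is assembled by chaining calls on the response builder, as we'll see in the skill code later on.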

Setting Up Your Environment

Before we start coding, let's get your development environment ready. You'll need:

  1. An Amazon Developer account for creating your skill
  2. An AWS account for hosting your skill's backend in Lambda
  3. Python 3.6 or newer
  4. The ASK SDK for Python (pip install ask-sdk)
  5. A code editor (e.g. VS Code, PyCharm)

Once you've installed Python and the ASK SDK library, you're good to go! We'll walk through configuring your skill in the Amazon Developer Console and AWS as we build it.
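
If you'd like to double-check the local setup before moving on, a small script along these lines can help. It is only a sketch: environment_ready is a name we made up, and it assumes the SDK's core package is importable as ask_sdk_core (the module name used by the imports in the skill code later in this guide).

```python
import importlib.util
import sys


def environment_ready(min_version=(3, 6), module_name="ask_sdk_core"):
    """Return (python_ok, sdk_installed) as a quick sanity check.

    Hypothetical helper: it only checks the interpreter version and
    whether the named module can be found, nothing more.
    """
    python_ok = sys.version_info >= min_version
    sdk_installed = importlib.util.find_spec(module_name) is not None
    return python_ok, sdk_installed


if __name__ == "__main__":
    python_ok, sdk_installed = environment_ready()
    print("Python version OK:", python_ok)
    print("ask_sdk_core importable:", sdk_installed)
```

If the second line prints False, rerunning pip install ask-sdk in the same environment is the first thing to try.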

Building a Fact Skill

As our first skill, we'll create a simple fact skill that tells the user a random fact when they ask. Here's what we'll do:

  1. Create a new skill in the Alexa Developer Console
  2. Define our interaction model
  3. Create a Lambda function for our skill backend
  4. Write our skill code with the Python SDK
  5. Test it out!

Let's get started.

1. Creating Your Skill

Head over to the Alexa Developer Console and click "Create Skill". Give your skill a name like "Fun Facts" and choose the "Custom" model.

For the template, select "Start from scratch". This will create an empty skill for us to fill in.

2. Defining the Interaction Model

Next, we need to define our skill's interaction model. This specifies the voice commands our skill will accept.

Replace the contents of the JSON Editor with the following:

{
  "interactionModel": {
    "languageModel": {
      "invocationName": "fun facts",
      "intents": [
        {
          "name": "AMAZON.CancelIntent",
          "samples": []
        },
        {
          "name": "AMAZON.HelpIntent",
          "samples": []
        },
        {
          "name": "AMAZON.StopIntent",
          "samples": []
        },
        {
          "name": "GetNewFactIntent",
          "slots": [],
          "samples": [
            "a fact",
            "a fun fact",
            "tell me a fact",
            "tell me a fun fact",
            "give me a fun fact",
            "tell me trivia",
            "tell me something interesting"
          ]
        },
        {
          "name": "AMAZON.NavigateHomeIntent",
          "samples": []
        }
      ],
      "types": []
    }
  }
}

This defines a few key things:

  • Invocation name: How users launch your skill (e.g. "Alexa, open fun facts")
  • Built-in intents: Standard commands like help, stop, and cancel
  • Custom intents: For our skill, we define a GetNewFactIntent to handle requests for new facts

Save and build your interaction model before moving on.
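
Before pasting a model into the console, it can be handy to sanity-check it locally. The sketch below parses a shortened copy of the model above, separates built-in intents from custom ones, and flags sample utterances that appear more than once. The function names are hypothetical, and the duplicate check is our own convenience, not a rule taken from the Alexa documentation.

```python
import json

# A shortened copy of the model above, kept inline so the sketch is self-contained.
MODEL_JSON = """
{
  "interactionModel": {
    "languageModel": {
      "invocationName": "fun facts",
      "intents": [
        {"name": "AMAZON.HelpIntent", "samples": []},
        {"name": "AMAZON.StopIntent", "samples": []},
        {"name": "GetNewFactIntent", "slots": [],
         "samples": ["a fact", "tell me a fact", "tell me trivia"]}
      ],
      "types": []
    }
  }
}
"""


def summarize_model(model):
    """Split intents into built-in and custom, and collect all sample utterances."""
    language_model = model["interactionModel"]["languageModel"]
    summary = {
        "invocation_name": language_model["invocationName"],
        "builtin_intents": [],
        "custom_intents": [],
        "samples": [],
    }
    for intent in language_model["intents"]:
        # Built-in intents are the ones whose names start with "AMAZON."
        key = "builtin_intents" if intent["name"].startswith("AMAZON.") else "custom_intents"
        summary[key].append(intent["name"])
        summary["samples"].extend(intent.get("samples", []))
    return summary


def find_duplicate_samples(samples):
    """Return samples that repeat an earlier one (ignoring case and outer spaces)."""
    seen, duplicates = set(), []
    for sample in samples:
        normalized = sample.strip().lower()
        if normalized in seen:
            duplicates.append(sample)
        seen.add(normalized)
    return duplicates


summary = summarize_model(json.loads(MODEL_JSON))

if __name__ == "__main__":
    print("Invocation name:", summary["invocation_name"])
    print("Custom intents:", summary["custom_intents"])
    print("Duplicate samples:", find_duplicate_samples(summary["samples"]))
```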

3. Creating the Lambda Function

Now let's create the Lambda function for our skill's backend.

In your AWS console, navigate to Lambda and create a new function. Choose "Author from scratch" and name it something like alexa-fun-facts-skill. Under "Runtime", select Python 3.6 or higher (Lambda has since retired its older Python 3 runtimes, so pick the newest Python 3 version offered).

For permissions, create a new role from templates and search for "simple microservice permissions". This will give your function permission to log to CloudWatch.

Click "Create function" and you should see a bare Python function scaffold. We'll add our skill code here in a moment.

Before we write our code, we need to hook our Lambda up to our Alexa skill.

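Back in the Alexa Developer Console, go to "Endpoint" and select "AWS Lambda ARN". Paste in your Lambda's ARN (found in the top right of the Lambda screen). On the Lambda side, add an "Alexa Skills Kit" trigger to the function so that Alexa is allowed to invoke it.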

Make sure to save your endpoints before moving on! Our skill and Lambda are now wired together.

4. Writing the Skill Code

It's time for the fun part: let's bring our skill to life with code!

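Back in your Lambda function, replace the default code with the following. Note that the Lambda Python runtime does not ship with the ASK SDK, so the SDK package has to be uploaded with your code (as part of a deployment package or a Lambda layer) for the imports to resolve: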

import random

from ask_sdk_core.skill_builder import SkillBuilder
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.utils import is_request_type, is_intent_name
from ask_sdk_model import Response

# Define our list of facts
FACTS = [    
    "Banging your head against a wall burns 150 calories an hour.",
    "In the UK, it is illegal to eat mince pies on Christmas Day!",
    "Pteronophobia is the fear of being tickled by feathers!",
    "A flock of crows is known as a murder.",
    "It is physically impossible for pigs to look up into the sky."
]

# Handler for launch requests
class LaunchRequestHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        return is_request_type("LaunchRequest")(handler_input)

    def handle(self, handler_input):
        speech = "Welcome to Fun Facts! You can say 'tell me a fun fact' to hear an interesting tidbit."
        handler_input.response_builder.speak(speech).ask(speech)
        return handler_input.response_builder.response

# Handler for GetNewFact intent 
class GetNewFactHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        return is_intent_name("GetNewFactIntent")(handler_input)

    def handle(self, handler_input):
        # Pick one fact at random for each request
        speech = random.choice(FACTS)
        handler_input.response_builder.speak(speech)
        return handler_input.response_builder.response

# Handler for Help intent 
class HelpIntentHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        return is_intent_name("AMAZON.HelpIntent")(handler_input)

    def handle(self, handler_input):
        speech = "You can say 'tell me a fun fact' to hear an interesting fact. What would you like to do?"
        handler_input.response_builder.speak(speech).ask(speech)
        return handler_input.response_builder.response

# Handler for Stop and Cancel intents 
class CancelOrStopIntentHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        return (is_intent_name("AMAZON.CancelIntent")(handler_input) or
                is_intent_name("AMAZON.StopIntent")(handler_input))

    def handle(self, handler_input):
        speech = "Okay, see you next time!"        
        handler_input.response_builder.speak(speech)
        return handler_input.response_builder.response

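# Catch-all handler for any request the handlers above did not accept.
# Because can_handle() always returns True, it must be registered last.
class FallbackIntentHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        return True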

    def handle(self, handler_input):
        speech = "Hmm, I'm not sure about that. You can say 'tell me a fun fact' to hear an interesting fact."
        handler_input.response_builder.speak(speech)
        return handler_input.response_builder.response

# Register all handlers and make the Lambda function in one step
lambda_handler = SkillBuilder().add_request_handler(
        LaunchRequestHandler()
    ).add_request_handler(
        GetNewFactHandler()
    ).add_request_handler(
        HelpIntentHandler()
    ).add_request_handler(
        CancelOrStopIntentHandler()
    ).add_request_handler(
        FallbackIntentHandler()
    ).lambda_handler()

Let's break this down:

First, we import the necessary modules from the Python SDK. The key classes are:

  • SkillBuilder for registering our request handlers and creating the Lambda handler entry point
  • AbstractRequestHandler which we subclass to create handlers for our intents
  • is_request_type and is_intent_name to map requests to the proper handlers

Next, we define a simple list of sample facts. In a real skill you'd likely load these from an external file or database.
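
As a rough sketch of the "external file" option, the facts could live in a JSON file that holds a plain list of strings. Everything here is an assumption for illustration: the helper names parse_facts and load_facts, the file name facts.json, and the choice to fall back to a built-in default when the file is missing or malformed.

```python
import json
from pathlib import Path

# Fallback used when the file is missing or unusable (illustrative default).
DEFAULT_FACTS = ["A flock of crows is known as a murder."]


def parse_facts(text, default=DEFAULT_FACTS):
    """Parse a JSON array of strings, falling back to defaults on bad input."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return list(default)
    if not isinstance(data, list):
        return list(default)
    # Keep only non-empty strings so one bad entry cannot break the skill.
    facts = [item.strip() for item in data if isinstance(item, str) and item.strip()]
    return facts or list(default)


def load_facts(path, default=DEFAULT_FACTS):
    """Read facts from a JSON file; return the defaults if it cannot be read."""
    try:
        text = Path(path).read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError):
        return list(default)
    return parse_facts(text, default)


if __name__ == "__main__":
    # "facts.json" is a hypothetical file name placed next to the skill code.
    print(load_facts("facts.json"))
```

Loading the list once at module level, rather than inside handle(), would avoid rereading the file on every request for as long as the Lambda container stays warm.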

The bulk of the skill is in defining request handlers for our intents. For each intent we create a class overriding can_handle() and handle().

can_handle() is where we decide if this handler can service the current request. We use is_request_type() and is_intent_name() to check against the request JSON.

If can_handle() returns True, the handler's handle() method is invoked. Here's where we define Alexa's response.
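
The can_handle() / handle() pairing is easier to picture with a toy dispatcher. The sketch below is a simplified model written in plain Python, not the SDK's actual implementation: the Handler class, the dispatch function, and the dictionary-shaped requests are all stand-ins. It assumes handlers are tried in registration order and the first one that accepts the request wins, which is why a catch-all handler belongs at the end.

```python
class Handler:
    """Tiny stand-in for a request handler (not the SDK class)."""

    def __init__(self, name, predicate):
        self.name = name
        self.predicate = predicate

    def can_handle(self, request):
        return self.predicate(request)

    def handle(self, request):
        # A real handler would build a response; here we just report who ran.
        return self.name


def dispatch(handlers, request):
    """Return the result of the first handler whose can_handle() is True."""
    for handler in handlers:
        if handler.can_handle(request):
            return handler.handle(request)
    raise LookupError("no handler accepted the request")


handlers = [
    Handler("launch", lambda r: r["type"] == "LaunchRequest"),
    Handler("fact", lambda r: r.get("intent") == "GetNewFactIntent"),
    Handler("fallback", lambda r: True),  # catch-all, so it goes last
]

if __name__ == "__main__":
    print(dispatch(handlers, {"type": "LaunchRequest"}))
    print(dispatch(handlers, {"type": "IntentRequest", "intent": "GetNewFactIntent"}))
    print(dispatch(handlers, {"type": "IntentRequest", "intent": "SomethingElse"}))
```

If the catch-all were moved to the front of the list, it would swallow every request and the more specific handlers would never run.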

We get access to a response_builder that makes it easy to compose Alexa's reply. We can:

  • speak() text for Alexa to say
  • ask() to set a reprompt, which keeps the session open while Alexa awaits the user's reply
  • set_card() to show text in the Alexa companion app
  • Add audio, images, and video to enrich the response

Finally, we register our handlers and create our Lambda function using the SkillBuilder.

5. Testing Your Skill

With our code in place, let's see our skill in action!

First, we need to deploy the code. In Lambda, make sure to "Save" then click "Deploy". Now our backend is live.

In the Alexa Developer Console, go to the "Test" tab and enable testing for your skill. This lets you chat with your skill right from the browser.

Type or say "open fun facts" and you should see Alexa respond with our opening message. Then try asking for a fact and confirm that Alexa reads one back. You can also test the help and stop intents.
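
For testing outside the browser, it helps to know roughly what the request sent to your Lambda looks like. The sketch below builds a trimmed-down IntentRequest event in plain Python. The helper name make_intent_event is hypothetical, every ID is a made-up placeholder, and real requests from Alexa carry more fields than shown here.

```python
import json
from datetime import datetime, timezone


def make_intent_event(intent_name, locale="en-US"):
    """Build a minimal IntentRequest-style event with placeholder IDs.

    Hypothetical helper for local experiments; real events sent by Alexa
    include additional session and device details.
    """
    application = {"applicationId": "amzn1.ask.skill.example"}
    user = {"userId": "amzn1.ask.account.example"}
    return {
        "version": "1.0",
        "session": {
            "new": True,
            "sessionId": "amzn1.echo-api.session.example",
            "application": application,
            "user": user,
        },
        "context": {"System": {"application": application, "user": user}},
        "request": {
            "type": "IntentRequest",
            "requestId": "amzn1.echo-api.request.example",
            "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "locale": locale,
            "intent": {"name": intent_name, "confirmationStatus": "NONE"},
        },
    }


if __name__ == "__main__":
    # Pretty-print an event asking for a new fact.
    print(json.dumps(make_intent_event("GetNewFactIntent"), indent=2))
```

An event like this could be pasted into a Lambda test event, or passed to lambda_handler(event, None) on a machine where the SDK is installed. Whether the SDK accepts an envelope this sparse is an assumption worth verifying before relying on it.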

Congratulations, you just built your first Alexa skill with Python! From here you can customize the responses, add additional intents, include cards, audio, and visuals, and much more.

Next Steps

In this guide, we covered the fundamentals of building Alexa skills with the Python SDK. You learned how to:

  • Create a skill in the Alexa Developer Console
  • Define an interaction model for your skill
  • Set up a Lambda backend for your skill
  • Use the Python SDK to define request handlers
  • Test your skill in the Alexa simulator

You now have a solid foundation for bringing your own voice experience ideas to life. Some natural next steps:

  • Flesh out a complete voice interaction model for a unique skill idea
  • Enhance your responses with SSML for richer speech
  • Use the Alexa Skills Kit to add visuals, audio, and account linking
  • Persist session attributes in a database like DynamoDB
  • Apply dialog management for multi-turn conversations
  • Explore the Alexa Developer docs to see what else is possible
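
As a small taste of the SSML idea from the list above, here is a sketch that wraps a fact in SSML with a short pause after an intro line. The function name fact_to_ssml and its parameters are our own, and only the speak and break tags are used. How the resulting string should be handed to the SDK (for example, whether speak() adds its own speak wrapper) is something to confirm in the SDK documentation.

```python
from xml.sax.saxutils import escape


def fact_to_ssml(fact, intro="Here's your fact.", pause_ms=400):
    """Wrap an intro and a fact in SSML, separated by a short pause.

    Hypothetical helper: text is XML-escaped so characters such as '&'
    or '<' inside a fact cannot break the markup.
    """
    return (
        "<speak>"
        f"{escape(intro)}"
        f'<break time="{int(pause_ms)}ms"/>'
        f"{escape(fact)}"
        "</speak>"
    )


if __name__ == "__main__":
    print(fact_to_ssml("A flock of crows is known as a murder."))
```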

The Alexa Skills Kit Python SDK makes skill-building approachable to Python developers of all levels. As you dive deeper, remember to design with voice-first principles, prioritize the user experience, and have fun!
