Custom Rasa Policies

Karen Farbman

One of our favorite chatbot frameworks is Rasa. Because it is open source, you can customize any aspect of your chatbot. Rasa provides excellent tutorials on creating custom NLP components and custom actions. The other major aspect of a chatbot interaction is determining what the next action should be. In Rasa, this is the job of policies, and you can write your own. Custom policies are rarer than custom actions or NLP components, but we see two main situations where they’re warranted:

– You want to experiment with a novel method for determining the next steps, in a research capacity.

– You want the decision of what the chatbot should do next to be based on some sort of external information.

Today, we will use the example of a bot that is not a night owl: it refuses to talk to people between midnight and 8am, in the local time where it’s deployed. This is a bit of an extreme example, but you could imagine a policy that checks the health of external resources needed in the conversation and stalls the user with small talk while they come back up or scale. In both of these examples, you could implement the same functionality by creating a custom NLP component that sets a specific entity, and then writing a rule that is triggered when the corresponding slot is set. However, this mixes the logic of what to do next into the NLU pipeline and should be avoided. As a rule of thumb, if the process method of your custom component does not access the message parameter in its body, a custom policy is most likely more appropriate. Now, on to the example!

This example is built on top of one of the Rasa starter kits, retail-demo. If you want to follow along, you can clone the repo using:

A custom policy inherits from the Policy class and must implement the following four methods:

– train
– predict_action_probabilities
– _metadata_filename
– _metadata

First, we will set up our class:

rasa-demo/sleep_policy.py

Besides importing some necessary modules, this sets up the constructor. The constructor reads the policy arguments from the config.yaml file used for the bot. Notice that we inherited from Policy.
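As a rough sketch of what such a constructor might look like, the class below stores a priority and the two hours that bound the sleep window. It is written without Rasa imports so that it runs on its own. In the bot, the class would inherit from Policy and hand the priority to the parent constructor. The names SleepPolicySketch, sleep_hour, and wake_hour, as well as the default values, are assumptions made for illustration and are not taken from the original file.

```python
class SleepPolicySketch:
    """Stand-in for a Policy subclass; it only holds the configuration."""

    def __init__(self, priority: int = 6, sleep_hour: int = 0, wake_hour: int = 8) -> None:
        # Hypothetical parameter names: in config.yaml they would appear as
        # keys listed next to the policy's name.
        for label, hour in (("sleep_hour", sleep_hour), ("wake_hour", wake_hour)):
            if not 0 <= hour <= 23:
                raise ValueError(f"{label} must be between 0 and 23, got {hour}")
        self.priority = priority
        self.sleep_hour = sleep_hour
        self.wake_hour = wake_hour


policy = SleepPolicySketch(priority=6, sleep_hour=0, wake_hour=8)
print(policy.priority, policy.sleep_hour, policy.wake_hour)
```

Validating the hours in the constructor means a typo in the configuration fails when the policy is built rather than in the middle of a conversation.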

The first method we are required to have is train. In this case, there is no training to be done, so we will simply pass:

The major part of our policy is predict_action_probabilities. For each policy in the configuration, Rasa will call this method, and then choose the action whose probability is the highest across all policies. To force an action to happen, we can set its probability to 1.

The tracker object holds the entire history of our conversation, but we only care about the last action taken, hence the call to last_action_name. If the last action was utter_go_away, we predict action_listen so the bot waits for the next message. If we don’t include this step, our policy will run the utter_go_away action over and over again in an infinite loop. Luckily, Rasa has a circuit breaker mechanism to prevent this from being truly infinite, but it doesn’t make for a good conversation! So make sure to predict the next action as listen when it’s appropriate. The rest of the code checks whether the time on the Rasa server, regardless of where the user is, falls between the sleep and wake hours. If it does, it sets the probability for utter_go_away to 1 and the probability of every other action to 0.
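The decision logic described above can be sketched with plain Python, independent of Rasa. This is a minimal illustration, not the original method body: the function names is_asleep and sleep_probabilities are hypothetical, the list of action names stands in for what the domain would provide, and returning all zeros while the bot is awake (so that other policies decide) is an assumption.

```python
from datetime import datetime
from typing import List, Optional


def is_asleep(hour: int, sleep_hour: int = 0, wake_hour: int = 8) -> bool:
    """Return True when `hour` falls in the window [sleep_hour, wake_hour)."""
    if sleep_hour <= wake_hour:
        return sleep_hour <= hour < wake_hour
    # The window wraps past midnight, e.g. sleep at 22 and wake at 6.
    return hour >= sleep_hour or hour < wake_hour


def sleep_probabilities(
    action_names: List[str],
    last_action: Optional[str],
    hour: Optional[int] = None,
    sleep_hour: int = 0,
    wake_hour: int = 8,
) -> List[float]:
    """Return one probability per action, in the order of `action_names`."""
    if hour is None:
        hour = datetime.now().hour  # local time on the server, not the user's
    probabilities = [0.0] * len(action_names)
    if last_action == "utter_go_away":
        # We just told the user to go away: listen instead of repeating it.
        probabilities[action_names.index("action_listen")] = 1.0
    elif is_asleep(hour, sleep_hour, wake_hour):
        probabilities[action_names.index("utter_go_away")] = 1.0
    # Otherwise everything stays at 0.0 and this policy has no opinion.
    return probabilities


actions = ["action_listen", "utter_greet", "utter_go_away"]
print(sleep_probabilities(actions, "action_listen", hour=3))  # [0.0, 0.0, 1.0]
print(sleep_probabilities(actions, "utter_go_away", hour=3))  # [1.0, 0.0, 0.0]
print(sleep_probabilities(actions, "action_listen", hour=9))  # [0.0, 0.0, 0.0]
```

Passing the hour in as a parameter, rather than always reading the clock, makes the sleep window easy to test at any time of day.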

The other two methods we need to define are _metadata_filename and _metadata. The load and persist methods of Policy will use these so we don’t have to redefine them. 
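To illustrate how these two methods might be used, the sketch below pairs them with simplified persist and load helpers. It assumes that the metadata dictionary is written as JSON to the file named by _metadata_filename and then passed back to the constructor on load. The helpers are stand-ins written for this example, not Rasa's implementation, and the class name, parameter names, and the file name sleep_policy.json are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


class SleepSettings:
    """Stand-in for the policy; only the metadata-related parts are shown."""

    def __init__(self, priority: int = 6, sleep_hour: int = 0, wake_hour: int = 8) -> None:
        self.priority = priority
        self.sleep_hour = sleep_hour
        self.wake_hour = wake_hour

    def _metadata(self) -> Dict[str, Any]:
        # Everything needed to rebuild the policy through its constructor.
        return {
            "priority": self.priority,
            "sleep_hour": self.sleep_hour,
            "wake_hour": self.wake_hour,
        }

    @classmethod
    def _metadata_filename(cls) -> str:
        return "sleep_policy.json"  # hypothetical file name


def persist(policy: SleepSettings, directory: Path) -> Path:
    """Simplified stand-in: write the metadata dictionary as JSON."""
    target = directory / policy._metadata_filename()
    target.write_text(json.dumps(policy._metadata()))
    return target


def load(directory: Path) -> SleepSettings:
    """Simplified stand-in: read the JSON back and rebuild the object."""
    data = json.loads((directory / SleepSettings._metadata_filename()).read_text())
    return SleepSettings(**data)


with tempfile.TemporaryDirectory() as tmp:
    persist(SleepSettings(priority=6, sleep_hour=23, wake_hour=7), Path(tmp))
    restored = load(Path(tmp))

print(restored._metadata())
```

Because the keys of the dictionary match the constructor's parameter names, the loaded data can be unpacked straight into the constructor.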

This is all that is required to make a custom policy in Python! Place this in a file named sleep_policy.py at the top level in the bot. To make the bot use the new policy, we need to modify two files.

In config.yaml, change the policies section to look like the following:

Notice that we specified the priority of our new policy as 6, which is the highest priority a policy can have in Rasa. We have also decreased RulePolicy’s priority from its default of 6 down to 5. We need our policy’s priority to be higher than RulePolicy’s; otherwise, whenever both policies are equally confident, RulePolicy’s prediction wins and ours is never used. To learn more about policy priorities in Rasa, check out the documentation.
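A toy model can show why the priority matters. The sketch below assumes that the most confident prediction wins and that priority only breaks ties; Rasa's actual ensemble takes more factors into account, so treat this as a simplification. The Prediction type, the pick_winner function, and the policy name SleepPolicy are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Prediction(NamedTuple):
    policy_name: str
    priority: int
    probabilities: List[float]  # one entry per action


def pick_winner(predictions: List[Prediction]) -> Tuple[str, int]:
    """Return (policy name, action index) of the winning prediction.

    The highest confidence wins; the higher priority breaks ties.
    """
    best = max(predictions, key=lambda p: (max(p.probabilities), p.priority))
    return best.policy_name, best.probabilities.index(max(best.probabilities))


actions = ["action_listen", "utter_greet", "utter_go_away"]

# Both policies are fully confident, so the priority decides.
name, index = pick_winner([
    Prediction("RulePolicy", 5, [0.0, 1.0, 0.0]),
    Prediction("SleepPolicy", 6, [0.0, 0.0, 1.0]),
])
print(name, actions[index])  # SleepPolicy utter_go_away
```

In this model, swapping the two priorities would hand the turn to RulePolicy, and the user would be greeted even during the sleep window.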

Finally, in domain.yaml, specify something to say for utter_go_away in the responses section.

Now, using a custom policy, you have a bot that will tell you to go away whenever you message between the defined hours! If you message it right before and after the wake time, you should see the following.

As you can see, these policies allow you to customize the bot even more than just custom actions or components. They may allow you to do things with the bot that you otherwise wouldn’t have been able to. For example, you can enable the inclusion of external information, or simplify the way your custom bot works. Custom policies aren’t covered in as much detail as custom components and actions, so we hope this example can give some more insight into how they work.