Build Slack Bot CLI using Context Free Grammar and Pyparsing

Technology - 29 Mar 2017
Harish Thyagarajan

At HashedIn, we use Slack for our internal communication. We wanted to build a Slack Bot to automate follow-ups. Here are some example commands we wanted.

Why Not NLP?

We actually started by trying to write a command line like the ones we are used to on a Linux terminal, and evaluated several libraries for that: click, docopt, argparse, and optparse. However, that style turned out to be too hard for non-technical users. Our HR team wouldn’t have used it, and we wanted them to!

We then went to the other extreme and ran a trial of pure natural language parsing using wit.ai and api.ai. It was too verbose and ambiguous. Our audience would use the bot many times a day and could learn a specific syntax instead of having to explain themselves to an NLP engine.

The solution lay somewhere between pure NLP and a Linux-shell-like command line.

Designing Context Free Grammar

Our first proofs of concept started with regular expressions. Those soon became too complicated. It was time to write our own small DSL and a Context Free Grammar to parse it.

To define a Context Free Grammar (CFG), you need a set of generative rules that can produce every valid sentence in your language. We used the excellent pyparsing library to write the grammar, which lives in grammar.py.

Quick highlights on how the grammar was written:

  1. Define the lowest-level tokens as Group elements, e.g. USER, CHANNEL, DAYS, RELATIVE_DAY, INTERVAL_DAYS, etc.
  2. Define combinations of one or more tokens, e.g. TIME_OF_DAY, TIME_ETA, TIME_INTERVAL, USERS, CHANNELS, etc.
  3. Define sentences: COMMAND_ASK_DEADLINE, COMMAND_ASK_REPEAT, COMMAND_LINE.
  4. Define parse actions. These are methods that parse a token and return a Python value, e.g. tomorrow is parsed to a datetime, 2d to a timedelta, and so on.
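
As a minimal sketch of these four steps (not the actual grammar.py), the block below assumes a made-up command syntax such as "ask @alice @bob by 2d" and "ask #general every 3d". The token names follow the list above, but the keywords, the @ and # prefixes, and the result names (command, users, channels, eta, interval) are illustrative assumptions.

```python
from datetime import timedelta

from pyparsing import (
    CaselessKeyword, Group, OneOrMore, Regex, Suppress, Word, alphanums,
)

# Step 1: lowest-level tokens (the concrete syntax here is assumed).
USER = Suppress("@") + Word(alphanums + "._-")
CHANNEL = Suppress("#") + Word(alphanums + "_-")
INTERVAL_DAYS = Regex(r"\d+d")  # e.g. "2d"


# Step 4: a parse action turns the matched text into a Python value.
def to_timedelta(tokens):
    return timedelta(days=int(tokens[0][:-1]))


INTERVAL_DAYS.setParseAction(to_timedelta)

# Step 2: combinations of one or more tokens.
USERS = Group(OneOrMore(USER))("users")
CHANNELS = Group(OneOrMore(CHANNEL))("channels")
TIME_ETA = Suppress(CaselessKeyword("by")) + INTERVAL_DAYS("eta")
TIME_INTERVAL = Suppress(CaselessKeyword("every")) + INTERVAL_DAYS("interval")

# Step 3: sentences, and the top-level command line that accepts any of them.
ASK = CaselessKeyword("ask")("command")
COMMAND_ASK_DEADLINE = ASK + USERS + TIME_ETA
COMMAND_ASK_REPEAT = ASK + CHANNELS + TIME_INTERVAL
COMMAND_LINE = COMMAND_ASK_DEADLINE | COMMAND_ASK_REPEAT

deadline = COMMAND_LINE.parseString("ask @alice @bob by 2d", parseAll=True)
repeat = COMMAND_LINE.parseString("ask #general every 3d", parseAll=True)
```

Here deadline["users"] holds the parsed user names and deadline["eta"] is already a timedelta, because the parse action ran during parsing.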

That completes the grammar, which can generate every sentence of the language. Pyparsing's magic is that it can also use the grammar to parse a string. Not really magic if you remember Compilers 101, but here it is anyway.

That’s our command with all its arguments! We can write a simple Python method to execute it.
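
One way such an executor could look is a small dispatch table keyed by command name. This is only a sketch: the handler names, their arguments, and the shape of the parsed dictionary (a command key plus named arguments) are assumptions, and the handlers just return a string instead of talking to Slack.

```python
from datetime import timedelta


def ask_deadline(users, eta):
    """Hypothetical handler: ask users for an update due after `eta`."""
    return "asking %s, due in %d day(s)" % (", ".join(users), eta.days)


def ask_repeat(channels, interval):
    """Hypothetical handler: ask channels for an update every `interval`."""
    return "asking %s every %d day(s)" % (", ".join(channels), interval.days)


# Maps a parsed command name to the method that implements it.
HANDLERS = {"ask_deadline": ask_deadline, "ask_repeat": ask_repeat}


def execute(parsed):
    """Look up the handler for parsed["command"] and pass the rest as keywords."""
    arguments = dict(parsed)
    name = arguments.pop("command")
    try:
        handler = HANDLERS[name]
    except KeyError:
        raise ValueError("unknown command: %s" % name)
    return handler(**arguments)


message = execute(
    {"command": "ask_deadline", "users": ["alice", "bob"], "eta": timedelta(days=2)}
)
```

Keeping the executor ignorant of the grammar means new sentences only need a new handler and a new entry in the table.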

That’s it. Our command DSL and executor are ready!

Error handling

When parsing fails, users want to know exactly why and where. This can be done as follows.

Error Stops: These are the - operators used in place of + in the grammar. Error stops prevent backtracking at those points. If everything before the error stop matches, the parser treats that match as final and won’t try another alternative, even if the rest of the string fails to match. That helps because you know which sentence failed, instead of getting an ambiguous "couldn’t match" error. That said, error stops make the grammar more restrictive, so use them with care.
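
To make the difference concrete, here is a small sketch with two toy grammars that differ only in one operator. The "ask NAME by Nd" syntax is an assumption for illustration. With +, a bad deadline makes the optional clause quietly fail to match and the error is reported where the parser gave up on the rest of the input; with -, pyparsing raises a ParseSyntaxException at the token that actually failed.

```python
from pyparsing import (
    CaselessKeyword, Optional, ParseBaseException, ParseSyntaxException,
    Regex, Word, alphas,
)

ASK = CaselessKeyword("ask")
BY = CaselessKeyword("by")
NAME = Word(alphas)
DAYS = Regex(r"\d+d")

# Plain "+": if DAYS fails, the whole optional clause is silently skipped.
soft = ASK + NAME + Optional(BY + DAYS)

# Error stop "-": once "by" has matched, DAYS must follow or parsing aborts.
strict = ASK + NAME + Optional(BY - DAYS)


def first_error(grammar, text):
    """Return the exception raised while parsing text, or None on success."""
    try:
        grammar.parseString(text, parseAll=True)
    except ParseBaseException as err:
        return err
    return None


bad_command = "ask alice by tomorrow"
soft_error = first_error(soft, bad_command)
strict_error = first_error(strict, bad_command)

# The strict grammar points further into the string, at the bad deadline.
print(type(soft_error).__name__, soft_error.loc)
print(type(strict_error).__name__, strict_error.loc)
```

Both grammars still accept valid input such as "ask alice" or "ask alice by 2d"; the error stop only changes what happens after "by" has been seen.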

Token Parsing: This part is easier. Just make sure parse actions raise the right exception with a message that can be shown to the user.
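
A sketch of what that could look like, assuming a hypothetical rule that intervals must be at least one day. The parse action raises pyparsing's ParseFatalException, which stops parsing immediately; raising ParseException instead would let the parser backtrack and try other alternatives, which tends to hide the message. The helper parse_interval and the message text are made up for this example.

```python
from datetime import timedelta

from pyparsing import ParseFatalException, Regex


def to_interval(text, loc, tokens):
    """Parse action: convert "3d" to a timedelta, rejecting values below 1 day."""
    days = int(tokens[0][:-1])
    if days < 1:
        # The message is written for the end user, not for the developer.
        raise ParseFatalException(text, loc, "interval must be at least 1 day")
    return timedelta(days=days)


INTERVAL_DAYS = Regex(r"\d+d").setParseAction(to_interval)


def parse_interval(text):
    """Return (value, None) on success or (None, user-facing message) on failure."""
    try:
        return INTERVAL_DAYS.parseString(text, parseAll=True)[0], None
    except ParseFatalException as err:
        return None, err.msg


ok_value, ok_message = parse_interval("3d")
bad_value, bad_message = parse_interval("0d")
```

The bot can then send bad_message back to the user as-is.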

Generic Tokens: Tokens should be as generic as possible. In the example grammar, we defined time of day as a fixed set of words (morning, evening, etc.). Instead, we could have defined it as a generic string and handled errors in the parse action. That would allow misspellings, e.g. morn instead of morning, to be corrected or reported by the parser.
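
As an illustration of the idea, the function below accepts any word and resolves it against a list of known values using difflib from the standard library. It is a sketch under assumptions: the word list, the 0.6 similarity cutoff, and the function name are all hypothetical, and in a real grammar this logic would be called from the parse action of a generic word token.

```python
import difflib

# Hypothetical list of accepted values for the time-of-day token.
TIMES_OF_DAY = ["morning", "afternoon", "evening", "night"]


def resolve_time_of_day(word):
    """Return the canonical time of day for word, tolerating small misspellings."""
    word = word.lower()
    if word in TIMES_OF_DAY:
        return word
    # Pick the single closest known value, if any is similar enough.
    close = difflib.get_close_matches(word, TIMES_OF_DAY, n=1, cutoff=0.6)
    if close:
        return close[0]
    raise ValueError(
        "unknown time of day %r, expected one of: %s"
        % (word, ", ".join(TIMES_OF_DAY))
    )


corrected = resolve_time_of_day("morn")
```

Whether to silently correct a misspelling or report it back to the user is a product decision; the same lookup supports both.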

Summary

That’s it. CFGs and pyparsing are a powerful way to quickly build expressive DSLs. You define a generative grammar and use pyparsing to parse command strings into method names and arguments. With all the excitement around bots, I expect to use this approach more often (at least until NLP becomes what it’s supposed to be).
