29 Mar 2017
Build Slack Bot CLI using Context Free Grammar and Pyparsing
Harish Thyagarajan
#Integration | 6 min read

At HashedIn, we use Slack for our internal communication. We wanted to build a Slack Bot to automate follow-ups. Here are some examples of the commands we wanted to support.

ask @user1 @user2 every day evening
Please fill timesheets for today by evening?
ask #general every week friday
What was one thing you learned last week?
ask #general every month first
Submit your expense report by end of day.
ask #team by 18h
Confirm once you are done with your changes for code-freeze
ask #marketing by friday
Read this book - link-to-book
ask @user1 @user2 by today
Please review this blog I wrote - link-to-blog

Why Not NLP?

We started by trying to write the command line the way we are used to on a Linux terminal, and evaluated several libraries for that: click, docopt, argparse and optparse. However, that style turned out to be too hard for non-technical users. Our HR team wouldn’t have used it, and we wanted them to!


We then went to the other extreme and ran a trial of pure natural language parsing. It was too verbose and ambiguous. Our audience would use the bot many times a day and could learn a specific syntax instead of having to explain themselves to an NLP engine.

The solution lay somewhere between pure NLP and a Linux-shell-like command line.


Designing Context Free Grammar

Our first proofs of concept started with regular expressions. That soon became too complicated. It was time to write our own small DSL, with a Context Free Grammar for parsing.


To define a Context Free Grammar (CFG), you need a set of generative rules that can produce every valid sentence in your language. We used the excellent PyParsing library to write the grammar. Here is our grammar:

from pyparsing import (Word, Group, ZeroOrMore, nums, alphanums, OneOrMore,
                       CaselessLiteral, FollowedBy, Regex, Optional)
USER = Group(Regex(r"@[A-Z0-9a-z_]+")).setParseAction(parse_username)
CHANNEL = Group(Regex(r"#[A-Z0-9a-z_-]+")).setParseAction(parse_user_group)
DAYS = Group(CaselessLiteral('monday') | 'mon'
| 'tuesday' | 'tue'
| 'wednesday' | 'wed'
| 'thursday' | 'thur'
| 'friday' | 'fri'
| 'saturday' | 'sat'
| 'sunday' | 'sun'
).setParseAction(lambda tokens: date_parse(tokens[0][0]))
RELATIVE_DAY = Group(CaselessLiteral('today')
| 'tomorrow'
| 'yesterday'
| 'day-after-tomorrow'
).setParseAction(lambda tokens: date_parse(tokens[0][0]))
RELATIVE_TIME = Group(Regex(r"\d+[mdh]")
).setParseAction(lambda tokens: interval_parse(str(tokens[0][0])))
SPECIAL_CHARACTERS = "~`!@#$%^&*()'?"
MULTILINE_TEXT = Group(Regex(r".*")).setParseAction(lambda tokens: str(tokens[0][0]))
TIME_OF_DAY = (Word(nums, max=2) + FollowedBy(' ')
| Word(nums, max=2) + ':' + Word(nums, exact=2))
TIME_OF_DAY_LABEL = (CaselessLiteral('morning')
                     | 'evening'
                     | 'afternoon'
                     | 'night')
INTERVAL_RELATIVE_TIME = Group(Regex(r"\d+[mdh]")).setParseAction(
lambda tokens: repeated_interval_parse(str(tokens[0][0])))
INTERVAL_DAYS = Group(CaselessLiteral('monday') | 'mon'
| 'tuesday' | 'tue'
| 'wednesday' | 'wed'
| 'thursday' | 'thur'
| 'friday' | 'fri'
| 'saturday' | 'sat'
| 'sunday' | 'sun'
).setParseAction(lambda tokens: repeated_date_parse(tokens[0][0]))
INTERVAL_DAY = Group(CaselessLiteral('day') | 'week').setParseAction(lambda tokens:
USERS = Group(OneOrMore(USER))
# Ask command formats:
#   ask by tomorrow @user1 @user2 What are you up to?
#   ask every day evening @user1 Please update the time sheet and let me know.
COMMAND_ASK_DEADLINE = (CaselessLiteral('ask')('command') +
                        ZeroOrMore(USERS('users')) +
                        ZeroOrMore(CHANNELS('channels')) +
                        CaselessLiteral('by')('type') -
                        TIME_ETA('eta') +
                        ZeroOrMore(TIME_OF_DAY)('time_of_day') +
                        # assumption: the question text closes the command
                        # (the source listing is cut off after the '+')
                        MULTILINE_TEXT('question'))
COMMAND_ASK_REPEAT = (CaselessLiteral('ask')('command') +
                      ZeroOrMore(USERS('users')) +
                      ZeroOrMore(CHANNELS('channels')) +
                      CaselessLiteral('every')('type') -
                      TIME_INTERVAL('interval') +
                      Optional(TIME_OF_DAY)('time') +
                      # assumption: same trailing question text as above
                      MULTILINE_TEXT('question'))

Quick highlights on how the grammar was written:

  1. Define the lowest-level tokens as Group, e.g. USER, CHANNEL, DAYS, RELATIVE_DAY, INTERVAL_DAYS etc.
  2. Provide combinations of one or more tokens, e.g. TIME_OF_DAY, TIME_ETA, TIME_INTERVAL, USERS, CHANNELS etc.
  3. State the parse actions. These are methods that parse a token and return a Python value, e.g. tomorrow is parsed to a datetime, 2d to a timedelta and so on.
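
To make these highlights concrete, here is a minimal, self-contained sketch of a cut-down "ask ... by" grammar. It is not the bot's actual grammar: date_parse and interval_parse are hypothetical stand-ins for helpers the article does not show, "m" is assumed to mean minutes, and the Group wrapper on the leaf tokens is left out so the result is easier to inspect.

```python
from datetime import date, timedelta
from pyparsing import CaselessLiteral, Group, OneOrMore, Regex

# Hypothetical stand-ins for the article's helpers, which are not shown.
UNITS = {"m": "minutes", "h": "hours", "d": "days"}  # assumption: "m" = minutes

def interval_parse(text):
    """Turn '2d' or '18h' into a timedelta."""
    return timedelta(**{UNITS[text[-1]]: int(text[:-1])})

def date_parse(word):
    """Turn 'today' or 'tomorrow' into a date."""
    return date.today() + timedelta(days={"today": 0, "tomorrow": 1}[word])

# Lowest-level tokens, each with a parse action that returns a Python value.
USER = Regex(r"@[A-Za-z0-9_]+").setParseAction(lambda tokens: tokens[0][1:])
RELATIVE_DAY = (CaselessLiteral("today") | CaselessLiteral("tomorrow")
                ).setParseAction(lambda tokens: date_parse(tokens[0]))
RELATIVE_TIME = Regex(r"\d+[mdh]").setParseAction(
    lambda tokens: interval_parse(tokens[0]))
QUESTION = Regex(r".+")

# Combinations of tokens.
USERS = Group(OneOrMore(USER))
TIME_ETA = RELATIVE_DAY | RELATIVE_TIME

# A command; the names in parentheses become keys of the parse result.
COMMAND_ASK = (CaselessLiteral("ask")("command")
               + USERS("users")
               + CaselessLiteral("by")("type")
               + TIME_ETA("eta")
               + QUESTION("question"))

result = COMMAND_ASK.parseString("ask @user1 @user2 by 2d Please review the draft")
```

Here result["eta"] is a timedelta of two days and result["question"] holds the free text. Calling result.asDict() gives the same values as a plain dictionary.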


That completes the grammar, which can generate every sentence of the language. Pyparsing can also use it to parse a string. It is not really magic if you remember Compilers 101, but here it is anyway.


>>> command_str = "ask by tomorrow @user_id Question text goes here"
>>> print(json.dumps(COMMAND_LINE.parseString(command_str).asDict()))
{
 "command": "ask",
 "users": ["user_id"],
 "type": "by",
 "eta": datetime(2016, 1, 1),  # next day’s date
 "question": "Question text goes here"
}


That’s our command with all its arguments! We can write a simple Python function to execute it.

def ask(type, question, users=None, channels=None):
    # set up a reminder for the user(s) at the right times
    ...

That’s it. Our command DSL and executor are ready!
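
One way to wire the parsed dictionary to a function is a small dispatch table keyed by the command name. The sketch below is an assumption about how that could look, not the bot's actual executor: COMMANDS, execute and the body of ask are hypothetical, and ask only returns a description instead of scheduling anything.

```python
def ask(type, question, users=None, channels=None, eta=None):
    """Hypothetical handler: describe the reminder instead of scheduling it."""
    targets = list(users or []) + list(channels or [])
    return "ask %s %s %s: %s" % (", ".join(targets), type, eta, question)

# Hypothetical dispatch table: command name -> handler function.
COMMANDS = {"ask": ask}

def execute(parsed):
    """Find the handler named by 'command' and pass the rest as keyword arguments."""
    args = dict(parsed)                      # copy, so the caller's dict is untouched
    handler = COMMANDS[args.pop("command")]
    return handler(**args)

parsed = {"command": "ask", "users": ["user_id"], "type": "by",
          "eta": "tomorrow", "question": "Question text goes here"}
message = execute(parsed)
```

Because the result names in the grammar match the keyword arguments of the handler, adding a new command only needs a new grammar rule and a new entry in the table.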


Error handling

When parsing fails, users want to know exactly why and where. A few techniques help:


Error Stops: These are the - signs that appear in place of + signs in the grammar above. Error stops prevent backtracking at those points. If everything before the error stop matches, the parser treats that match as final and won’t try another alternative, even if the rest of the string fails to match. That helps because you know which sentence failed, instead of getting an ambiguous “couldn’t match” error. That said, error stops make the grammar more restrictive, so use them with care.
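
A small sketch of the effect, using a cut-down two-command grammar. The token names, the setName labels and the explain helper are hypothetical and only serve the illustration. In pyparsing, a failure after an error stop surfaces as ParseSyntaxException, while an ordinary failure is a ParseException.

```python
from pyparsing import (CaselessLiteral, ParseException,
                       ParseSyntaxException, Regex)

ETA = Regex(r"\d+[mdh]").setName("a time such as 2d or 18h")
INTERVAL = (CaselessLiteral("day") | CaselessLiteral("week")).setName("day or week")

# The '-' is the error stop: once 'by' or 'every' has matched, the parser
# commits to that command and will not fall back to the other alternative.
ASK_DEADLINE = CaselessLiteral("ask") + CaselessLiteral("by") - ETA
ASK_REPEAT = CaselessLiteral("ask") + CaselessLiteral("every") - INTERVAL
COMMAND = ASK_DEADLINE | ASK_REPEAT

def explain(command):
    """Return (kind, message) describing how parsing went."""
    try:
        COMMAND.parseString(command)
        return ("ok", "")
    except ParseSyntaxException as err:
        # Failure after an error stop: the message names the expected token.
        return ("error stop", str(err))
    except ParseException as err:
        # Ordinary failure: no alternative got past its keyword.
        return ("no match", str(err))
```

With this grammar, explain("ask by friday") reports an "error stop" failure tied to the deadline form, whereas explain("remind me") only yields the generic "no match" case.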


Token Parsing: This part is easy. Just make sure parse actions raise the right exception, with a message that can be shown to the user.
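
As a sketch of what that can look like, the parse action below validates a value and raises ParseFatalException with a user-facing message. The HOUR token and the parse_hour function are hypothetical and not part of the grammar above. ParseFatalException stops parsing immediately, so the message is not lost to backtracking.

```python
from pyparsing import ParseFatalException, Regex

def parse_hour(s, loc, tokens):
    """Convert '18h' to the integer 18, rejecting impossible hours."""
    hour = int(tokens[0][:-1])
    if hour > 23:
        # s and loc let pyparsing report where in the input the problem is.
        raise ParseFatalException(
            s, loc, "hour must be between 0 and 23, got %d" % hour)
    return hour

# Hypothetical token: an hour of the day written as digits followed by 'h'.
HOUR = Regex(r"\d+h").setParseAction(parse_hour)

hour = HOUR.parseString("18h")[0]
```

The text passed to the exception is available as err.msg on the caught exception, which makes it easy to send back to the Slack user.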


Generic Tokens: Tokens should be as generic as possible. In the example grammar, we defined time of day as a fixed set of words (morning, evening etc.). Instead, we could have defined it as a plain string and handled errors in the parse action. That would allow misspellings to be corrected or reported by the parser, e.g. morn instead of morning.
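
A possible shape for such a generic token is sketched below. It accepts any word and lets the parse action either correct it to the closest known label or report it. The use of difflib from the standard library is an assumption made for this illustration, as are the names LABELS and parse_label.

```python
import difflib
from pyparsing import ParseFatalException, Word, alphas

LABELS = ["morning", "afternoon", "evening", "night"]

def parse_label(s, loc, tokens):
    """Map a word to a known time-of-day label, tolerating small misspellings."""
    word = tokens[0].lower()
    if word in LABELS:
        return word
    close = difflib.get_close_matches(word, LABELS, n=1)
    if close:
        return close[0]          # corrected, e.g. 'morn' becomes 'morning'
    raise ParseFatalException(
        s, loc, "unknown time of day '%s', expected one of: %s"
        % (word, ", ".join(LABELS)))

# Generic token: any alphabetic word; validation happens in the parse action.
TIME_OF_DAY_LABEL = Word(alphas).setParseAction(parse_label)

corrected = TIME_OF_DAY_LABEL.parseString("morn")[0]
```

Whether to correct silently or to ask the user for confirmation is a design choice. The same parse action could raise with a "did you mean" hint instead of returning the corrected value.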


That’s it. CFGs and pyparsing are a powerful way to quickly build expressive DSLs. You define a generative grammar and use pyparsing to parse command strings into method names and arguments. With all the excitement around bots, I expect to use this approach more often (at least until NLP becomes what it’s supposed to be).
