OK, so a few days ago I decided that the current IMAP servers out there (namely Courier) were a pain in the ass, so being a maniac, I decided to write my own. I figured if I implemented it in Python, did a minimum approach to security, made the server read-only, and a few other shortcuts, I could get the server done rather quickly, and I could read my e-mail with any client I wish.
Well, the protocol is complicated enough that it is quite hard to do any serious processing with simple regular expressions. I first tried incrementally building regular expressions (like the following):
reloginargs = re.compile(_loginargs) _loginargs = r"(?:%s %s)" % (_userid, _password) _userid = r"(?:%s)" % (_astring) _astring = r"(?:%s|%s)" % (_atom, _string) _atom = r"(?:%s+)" % (_ATOM_CHAR) _ATOM_CHAR = r"(?:%s)" % (_atom_specials) _atom_specials = r"(?:[a-zA-Z0-9/])" # HACK ...
This worked reasonably well, except for extracting the data from the matched string was very difficult. I finally decided to bite the bullet and try to use a full parser generator. I initially tried to get reaccustomed to lex/yacc, just to see if it was possible, but I couldn’t quite get my brain around them. Eventually I found a LL(1) parser generator Yapps (Yet Another Python Parser System), which is written in python and produces python code. After looking at the examples for a little bit, along with some experimentation, I was finally able to start on the IMAP grammar file. I spent a few hours, learnt some LL(1) tricks, and I actually have a grammar which understands a good portion of the IMAP protocol (as much or more as the regular expression-based one, and certainly more RFC compliant). It is especially cool because the grammar closely mirrors the EBNF in the “Formal Syntax” portion of the IMAP RFC.
Since I use various clients (Outlook Express and mutt primarily) to test the current state of the IMAP server (and also to see what commands they use, to minimize the commands I must implement), and I’ve pretty much exhausted simply parsing the commands I know, I must move on to actually acting on a parsed command. This should be fairly simple once I think about the right way to do it.
Well, even if this is a big waste of time, I think I’ve learned quite a bit.
Recent Comments