[ragel-users] Parsing PokerHand-History file (kind of log file with actions)
ragel-user at jgoettgens.de
Sat Jul 23 16:00:19 UTC 2011
Jens!
It looks as if the “poker” language mainly consists of a simple list of
key-value pairs, so a plain tokenizer might suffice. Compared to tools like
flex, or the scanner part of ANTLR, Ragel's focus is more on states and
transitions than on entire tokens. When matching arbitrary text you probably
also need to have a look at the longest-match Kleene star operator. With
Ragel you can do many things “on the fly”, but if you are just transforming
a list into a different format, you may not need this power. Likewise you
wouldn't need the power of an LR or LL(*) parser (though LL(*) grammars are
very easy to code and the speed penalty might be acceptable). You could use
the tokenizer of the C runtime library and subsequently match keywords using
the output from gperf. Simple and not slow.
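
To make the tokenizer-plus-keyword-lookup idea concrete, here is a minimal
sketch in C. It assumes lines of the form "key: value" and uses strtok() as
the C runtime tokenizer. The keyword names and the enum are invented for
illustration, and a sorted table searched with bsearch() stands in for the
perfect-hash lookup function that gperf would generate.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical keyword ids; a real hand-history format will differ. */
enum keyword { KW_UNKNOWN = 0, KW_BLIND, KW_GAME, KW_PLAYER, KW_TABLE };

struct kw_entry { const char *name; enum keyword id; };

/* Sorted by name so bsearch() works. gperf would replace this table
   and the lookup below with a generated perfect-hash function. */
static const struct kw_entry kw_table[] = {
    { "blind",  KW_BLIND  },
    { "game",   KW_GAME   },
    { "player", KW_PLAYER },
    { "table",  KW_TABLE  },
};

static int kw_cmp(const void *key, const void *entry)
{
    return strcmp((const char *)key,
                  ((const struct kw_entry *)entry)->name);
}

static enum keyword lookup_keyword(const char *word)
{
    const struct kw_entry *e = bsearch(word, kw_table,
        sizeof kw_table / sizeof kw_table[0], sizeof kw_table[0], kw_cmp);
    return e ? e->id : KW_UNKNOWN;
}

/* Splits "key: value" in place (the buffer is modified by strtok).
   Returns the keyword id and stores a pointer to the value in *value,
   or NULL if the line has no value. */
static enum keyword parse_pair(char *line, const char **value)
{
    assert(line != NULL && value != NULL);

    char *key = strtok(line, ": \t\r\n");
    char *val = key ? strtok(NULL, "\r\n") : NULL;

    if (val)
        val += strspn(val, ": \t");   /* skip separator and blanks */
    *value = val;
    return key ? lookup_keyword(key) : KW_UNKNOWN;
}
```

Calling parse_pair() on a writable buffer holding "player: Jens" yields
KW_PLAYER with the value pointing at "Jens". Unknown keys come back as
KW_UNKNOWN, which is where a well-behaved source makes life easy and a badly
behaved one starts to hurt.
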
I am using both techniques to remotely control an Asterisk PBX server
(telephony system) using the AMI protocol
(http://www.voip-info.org/wiki/view/Asterisk+manager+API). The AMI protocol
shares a lot with your "poker" language. The main difference is that I am
dealing with a real-time system (asynchronous communication, timing issues,
net problems, etc.) and I know that a valid input stream always has
terminating characters (or I insert a "timeout" token at the socket level,
so no need for expr**). Unfortunately not all Asterisk modules follow the
AMI protocol exactly (rather than a violation, you can view this as
extending the protocol) and there are a couple of exceptions that make the
handwritten code now very ugly. This is where Ragel starts to shine.
There are also various text transformation tools out there. I think it could
be possible to transform your key-value lists into SQL code without writing
a single line of source code (if you don't count the transformation
specification as code).
If you have a well-behaved source, the system-supplied tokenizer (+ gperf)
is probably preferable, otherwise Ragel. Ragel is more fun, though. There's
a Graphviz installer for your Windows machine and starting from your simple
example you could add some output for all available actions to see what's
going on during execution time. It won't take long before you get a feeling
for how things work and where you must be careful.
Of course, if Adrian had a lot of spare time left over, he could add an
"instrumentation" option to Ragel by adding diagnostic code to all states
and transitions (essentially adding any allowable action with some code). In
the simplest form there would be just some console output. A better solution
would be to fire events through a socket and, with a little extra work to
control the input, one could write a nice graphics tool to visualize the FSM
and its transitions. This would be helpful for beginners, and more generally
would be useful as a teaching tool for a college-level CS class.
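
Just to illustrate the idea, here is a rough sketch in C of what such
instrumentation could look like. This is a tiny hand-written state machine,
not Ragel output and not an existing Ragel option. All names (the states,
trace_fn, run_fsm) are made up for the example. Every input character
triggers a hook that receives the old state, the new state and the
character; one hook prints to the console, another just counts state
changes. A socket-based hook would have the same signature.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical states for recognizing a single "key:value\n" line. */
enum state { ST_KEY, ST_VALUE, ST_DONE, ST_ERROR };

/* Hook called for every input character, including self-loops. */
typedef void (*trace_fn)(enum state from, enum state to, char ch, void *ctx);

/* Simplest form of instrumentation: console output. */
static void trace_console(enum state from, enum state to, char ch, void *ctx)
{
    static const char *names[] = { "KEY", "VALUE", "DONE", "ERROR" };
    (void)ctx;
    fprintf(stderr, "%s -> %s on 0x%02x\n",
            names[from], names[to], (unsigned char)ch);
}

/* Alternative hook: count real state changes; ctx must point to an int. */
static void trace_count(enum state from, enum state to, char ch, void *ctx)
{
    (void)ch;
    if (from != to)
        ++*(int *)ctx;
}

/* Transition function of the toy machine. */
static enum state step(enum state s, char ch)
{
    switch (s) {
    case ST_KEY:
        return ch == ':' ? ST_VALUE : ch == '\n' ? ST_ERROR : ST_KEY;
    case ST_VALUE:
        return ch == '\n' ? ST_DONE : ST_VALUE;
    default:
        return s;               /* DONE and ERROR absorb further input */
    }
}

/* Runs the machine over input, reporting each transition to the hook
   (pass NULL to run without instrumentation). Returns the final state. */
static enum state run_fsm(const char *input, trace_fn trace, void *ctx)
{
    assert(input != NULL);

    enum state s = ST_KEY;
    for (const char *p = input; *p != '\0'; ++p) {
        enum state next = step(s, *p);
        if (trace)
            trace(s, next, *p, ctx);
        s = next;
    }
    return s;
}
```

Running run_fsm("a:b\n", trace_console, NULL) writes one line per input
character to stderr. Swapping in a different hook changes where the events
go without touching the machine itself.
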
Happy tokenizing,
jg