Grammar testing proposal
Colin Fleming
colin.flem... at coreproc.com
Sat Sep 16 15:53:01 UTC 2006
Hehe, TXL does actually look really interesting - and potentially
appropriate! I'd be worried about raising the barrier to entry for
Ragel, though; it looks like TXL might take some time to wrap your
head around.
BTW, do you have a grammar spec for Ragel?
On 9/15/06, Adrian Thurston <thurs... at cs.queensu.ca> wrote:
>
> Colin, great idea. One issue might be specifying language-independent
> actions. This could get tough if in the future we support non-C-like
> languages. For example, there was mention of supporting Ruby.
>
> Perhaps TXL (http://www.txl.ca/) might be useful. It could be used to define
> a mini toy language and to write transformations to the host languages.
> Though I'm connected to that project so I'm biased in regard to it being
> appropriate :)
>
> -Adrian
>
> Colin Fleming wrote:
> > Hi all,
> >
> > I've been thinking about various ways to test Ragel and the generated
> > grammars. Here's what I've come up with. I'm really interested in any
> > feedback. I'm currently developing a couple of grammars that I'm
> > primarily interested in using with Java. The Java generation is still
> > a bit experimental, so I'd like to be able to use acceptance tests
> > that confirm that a) the grammar works as expected, b) the results are
> > consistent across Java/C++/whatever, and c) that the results are also
> > consistent across different code generation strategies.
> >
> > This last one is probably currently more useful to Adrian than anyone,
> > but I'm probably going to reimplement rlcodegen in Java shortly, so it
> > will be great for testing that as well as testing code generation
> > implementations for any new languages, or new code generation
> > strategies.
> >
> > So, I propose a parser class generator that will take a raw Ragel
> > grammar and generate an rl file for whichever of the supported
> > languages the user requests. This rl file will generate a basic
> > parsing class, with the standard methods: init, execute, finish. The
> > Ragel syntax would be slightly extended to specify features of the
> > generated class, and these extensions stripped out when the rl file is
> > written. This would probably be generally useful too, since a lot of
> > people just want a support class that they can integrate into a
> > larger project, I imagine.
> >
> > The whole point of this thing is testing, so unit test data and
> > expected values would be encoded in the source file. Either a test
> > class or just the parser could be generated, or both.
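
One way the embedded test data could be represented once it is extracted from the source file, as a rough Java sketch. Everything here (GrammarTestCase, Outcome, passes) is a hypothetical name chosen for illustration, and the parser is passed in as a plain function so the sketch does not assume anything about the generated class.

```java
import java.util.function.Function;

// Hypothetical holder for one embedded test: input, expected output,
// and whether the parse is expected to fail.
class GrammarTestCase {
    // What a parser run reports back: success flag plus textual output.
    static class Outcome {
        final boolean ok;
        final String output;
        Outcome(boolean ok, String output) { this.ok = ok; this.output = output; }
    }

    final String input;
    final String expected;
    final boolean expectFailure;

    GrammarTestCase(String input, String expected, boolean expectFailure) {
        this.input = input;
        this.expected = expected;
        this.expectFailure = expectFailure;
    }

    // The test passes when the failure flag and the output both match.
    boolean passes(Function<String, Outcome> parser) {
        Outcome o = parser.apply(input);
        return o.ok != expectFailure && expected.equals(o.output);
    }
}
```

The same list of cases could then be run against each target language or code generation strategy, which is the consistency check described above.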
> >
> > An example is worth a thousand words, so here goes:
> >
> > %%{
> > # Variables for the generated class, initialised in init() method
> > # public vars generate getters
> > public int val = 0;
> > private boolean neg = false;
> >
> > action see_neg {
> > neg = true;
> > }
> >
> > action add_digit {
> > val = val * 10 + (fc - '0');
> > }
> >
> > main :=
> > ( '-'@see_neg | '+' )? ( digit @add_digit )+
> > '\n' @{ fbreak; };
> >
> > test {
> > input "1\n";
> > output "1";
> > }
> >
> > test {
> > input "213 3213\n";
> > output "unexpected char ' ' in input";
> > failure;
> > }
> > }%%
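
For illustration, here is roughly the shape of Java class such a spec might turn into. This is a hand-written sketch, not real Ragel output: the class name, the state constants, result() and the "unexpected end of input" message are all assumptions, and applying the sign in getVal() is also an assumption, since the actions above only record it.

```java
// Hand-written sketch of the proposed generated class; NOT real Ragel output.
// Class name, state constants and result() are assumptions for illustration.
class IntParser {
    private static final int START = 0, SIGNED = 1, DIGITS = 2, DONE = 3, ERROR = -1;

    private int cs;        // current state
    private int val;       // "public int val" in the spec
    private boolean neg;   // "private boolean neg" in the spec
    private String error;  // message recorded at the first bad character

    // Reset the machine and the user variables to their initial values.
    void init() {
        cs = START; val = 0; neg = false; error = null;
    }

    // Feed a block of input; stops at the first error or after the newline.
    void execute(char[] data) {
        for (int p = 0; p < data.length && cs != DONE && cs != ERROR; p++) {
            char fc = data[p];
            boolean digit = fc >= '0' && fc <= '9';
            if (cs == START && fc == '-') { neg = true; cs = SIGNED; }       // see_neg
            else if (cs == START && fc == '+') { cs = SIGNED; }
            else if (digit) { val = val * 10 + (fc - '0'); cs = DIGITS; }    // add_digit
            else if (cs == DIGITS && fc == '\n') { cs = DONE; }              // fbreak
            else { error = "unexpected char '" + fc + "' in input"; cs = ERROR; }
        }
    }

    // True only if a complete number followed by a newline was seen.
    boolean finish() { return cs == DONE; }

    // Getter generated for the public variable (sign applied: an assumption).
    int getVal() { return neg ? -val : val; }

    // Text that a test block's "output" line would be compared against.
    String result() {
        if (finish()) return String.valueOf(getVal());
        return error != null ? error : "unexpected end of input";
    }

    public static void main(String[] args) {
        IntParser p = new IntParser();
        p.init();
        p.execute("-42\n".toCharArray());
        System.out.println(p.finish() + " " + p.result());
    }
}
```

A generated test class would then only need to call init(), execute() and finish() for each test block and compare result() with the expected output.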
> >
> > Obviously one concern here is overloading the Ragel syntax. Maybe a
> > prefix would be good to highlight the new keywords as preprocessor
> > directives.
> >
> > A few more thoughts:
> >
> > It would be good to be able to specify variables of the alphabet type:
> > public alphtype character;
> >
> > It would also be interesting to track the states the machine moves
> > through on each run. The traces could then be compared to ensure that
> > the different strategies behave identically.
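
A small Java sketch of the state-trace idea, assuming each strategy can report the state it is in after every character. The two hand-written runners below (one switch-based, one table-driven) only stand in for different code generation strategies; the state numbering and all names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Two hand-written runners for the signed-integer machine, standing in for
// two code generation strategies. States: 0 start, 1 after sign, 2 in digits,
// 3 accepted, -1 error. All names and numbers here are illustrative.
class TraceCompare {
    // Strategy 1: transitions written out as a switch.
    static List<Integer> switchTrace(String input) {
        List<Integer> trace = new ArrayList<>();
        int cs = 0;
        trace.add(cs);
        for (int p = 0; p < input.length() && cs != 3 && cs != -1; p++) {
            char c = input.charAt(p);
            boolean digit = c >= '0' && c <= '9';
            switch (cs) {
                case 0: cs = (c == '-' || c == '+') ? 1 : digit ? 2 : -1; break;
                case 1: cs = digit ? 2 : -1; break;
                case 2: cs = digit ? 2 : c == '\n' ? 3 : -1; break;
            }
            trace.add(cs);
        }
        return trace;
    }

    // Strategy 2: the same machine as a table.
    // Rows are states 0..2; columns are sign, digit, newline, other.
    private static final int[][] TABLE = {
        { 1, 2, -1, -1 },
        { -1, 2, -1, -1 },
        { -1, 2, 3, -1 },
    };

    private static int classify(char c) {
        if (c == '-' || c == '+') return 0;
        if (c >= '0' && c <= '9') return 1;
        if (c == '\n') return 2;
        return 3;
    }

    static List<Integer> tableTrace(String input) {
        List<Integer> trace = new ArrayList<>();
        int cs = 0;
        trace.add(cs);
        for (int p = 0; p < input.length() && cs != 3 && cs != -1; p++) {
            cs = TABLE[cs][classify(input.charAt(p))];
            trace.add(cs);
        }
        return trace;
    }

    // The strategies agree on an input if they visit the same states in order.
    static boolean sameTrace(String input) {
        return switchTrace(input).equals(tableTrace(input));
    }
}
```

Comparing whole traces rather than just the final result would show where two strategies diverge, not only that they do.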
> >
> > I'm also not sure about having the test code in with the actual
> > grammar, but I guess an include directive would make that easier.
> >
> > Any thoughts or ideas?
> >
> > Cheers,
> > Colin