Grammar testing proposal

Colin Fleming colin.flem... at coreproc.com
Sat Sep 16 15:53:01 UTC 2006


Hehe, TXL does actually look really interesting, and potentially
appropriate! I'd be worried about raising the barrier to entry for Ragel
though; it looks like TXL might take some time to wrap your head around.

BTW do you have a grammar spec for Ragel?

On 9/15/06, Adrian Thurston <thurs... at cs.queensu.ca> wrote:
>
> Colin, great idea. One issue might be specifying language-independent
> actions. This could get tough if in the future we support non-C-like
> languages. For example, there was mention of supporting Ruby.
>
> Perhaps TXL (http://www.txl.ca/) might be useful. It could be used to define
> a mini toy language and to write transformations to the host languages.
> Though I'm connected to that project so I'm biased in regard to it being
> appropriate :)
>
> -Adrian
>
> Colin Fleming wrote:
> > Hi all,
> >
> > I've been thinking about various ways to test Ragel and the generated
> > grammars; here's what I've come up with. I'm really interested in any
> > feedback. I'm currently developing a couple of grammars that I'm
> > primarily interested in using with Java. The Java generation is still
> > a bit experimental, so I'd like to be able to use acceptance tests
> > that confirm that a) the grammar works as expected, b) the results are
> > consistent across Java/C++/whatever, and c) the results are also
> > consistent across different code generation strategies.
> >
> > This last one is probably currently more useful to Adrian than anyone,
> > but I'm probably going to reimplement rlcodegen in Java shortly, so it
> > will be great for testing that as well as testing code generation
> > implementations for any new languages, or new code generation
> > strategies.
> >
> > So, I propose a parser class generator that will take a raw Ragel
> > grammar and generate an rl file for whichever of the supported
> > languages the user requests. This rl file will generate a basic
> > parsing class, with the standard methods: init, execute, finish. The
> > Ragel syntax would be slightly extended to specify features of the
> > generated class, and these extensions stripped out when the rl file is
> > written. This would probably be generally useful too, since a lot
> > of people just want a support class that they can integrate into
> > a larger project, I imagine.
> >
> > The whole point of this thing is testing, so unit test data and
> > expected values would be encoded in the source file. Either a test
> > class or just the parser could be generated, or both.
> >
> > An example is worth a thousand words, so here goes:
> >
> > %%{
> >   # Variables for the generated class, initialised in init() method
> >   # public vars generate getters
> >   public int val = 0;
> >   private boolean neg = false;
> >
> >   action see_neg {
> >     neg = true;
> >   }
> >
> >   action add_digit {
> >     val = val * 10 + (fc - '0');
> >   }
> >
> >   main :=
> >     ( '-'@see_neg | '+' )? ( digit @add_digit )+
> >     '\n' @{ fbreak; };
> >
> >   test {
> >     input "1\n";
> >     output "1";
> >   }
> >
> >   test {
> >     input "213 3213\n";
> >     output "unexpected char ' ' in input";
> >     failure;
> >   }
> > }%%
> >
> > Obviously one concern here is overloading the Ragel syntax; maybe a
> > prefix would be good to highlight the new keywords as preprocessor
> > directives.
> >
> > A few more thoughts:
> >
> > It would be good to be able to specify variables of the alphabet type:
> > public alphtype character;
> >
> > It would also be interesting to track the states the machine moves
> > through on each run; the traces could be compared to ensure that the
> > different strategies behave identically.
> >
> > I'm also not sure about having the test code in with the actual
> > grammar, but I guess an include directive would make that easier.
> >
> > Any thoughts or ideas?
> >
> > Cheers,
> > Colin
> >
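As a rough illustration of the proposal quoted above, this is the sort of
Java class the generator might emit for the integer grammar. It is written
by hand, not produced by Ragel. The class name IntParser, the getVal() and
getError() accessors, and the explicit state numbers are all assumptions
made for this sketch; the proposal itself only fixes the init/execute/finish
methods and the rule that public variables get getters.

```java
// Hand-written sketch, not Ragel output. All names except
// init/execute/finish are assumptions made for illustration.
class IntParser {
    private int val;       // declared "public" in the grammar, so it gets a getter
    private boolean neg;   // declared "private" in the grammar: no getter
    private int state;     // 0 start, 1 after sign, 2 in digits, 3 finished
    private String error;  // null until an unexpected character is seen

    public void init() {
        val = 0;
        neg = false;       // starts false so that see_neg has an effect
        state = 0;
        error = null;
    }

    public void execute(char[] data) {
        for (char fc : data) {
            if (error != null || state == 3) {
                return;                          // stop after fbreak or an error
            }
            boolean digit = fc >= '0' && fc <= '9';
            if (state == 0 && (fc == '-' || fc == '+')) {
                if (fc == '-') {
                    neg = true;                  // action see_neg
                }
                state = 1;
            } else if (digit) {
                val = val * 10 + (fc - '0');     // action add_digit
                state = 2;
            } else if (state == 2 && fc == '\n') {
                state = 3;                       // '\n' @{ fbreak; }
            } else {
                error = "unexpected char '" + fc + "' in input";
            }
        }
    }

    // True only if the whole "main" pattern was matched without error.
    public boolean finish() {
        return error == null && state == 3;
    }

    // As in the quoted grammar, neg is recorded but never applied to val.
    public int getVal() {
        return val;
    }

    public String getError() {
        return error;
    }

    // Replays the two test cases embedded in the example grammar.
    public static void main(String[] args) {
        for (String input : new String[] {"1\n", "213 3213\n"}) {
            IntParser p = new IntParser();
            p.init();
            p.execute(input.toCharArray());
            System.out.println(p.finish() ? String.valueOf(p.getVal()) : p.getError());
        }
    }
}
```

Run as is, main prints the two "output" strings from the test blocks: 1 for
the first input and the unexpected-char message for the second. A generated
test class would presumably compare these strings rather than print them.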
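The state-tracking idea quoted above could be checked along these lines.
Two hand-written recognisers for the integer grammar stand in for two code
generation strategies, one table-driven and one switch-based. Each records
the states it passes through, and the two traces are then compared. The
state numbers, the ERR marker, and the class and method names are invented
for this sketch and do not correspond to Ragel's own state numbering.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: compares state traces from two implementations
// of the same machine. States: 0 start, 1 after sign, 2 in digits, 3 done.
class TraceCompare {
    static final int ERR = -1;

    // Character classes: 0 = sign, 1 = digit, 2 = newline, 3 = anything else.
    static int classify(char c) {
        if (c == '-' || c == '+') return 0;
        if (c >= '0' && c <= '9') return 1;
        if (c == '\n') return 2;
        return 3;
    }

    // TABLE[state][class] gives the next state.
    static final int[][] TABLE = {
        {1,   2, ERR, ERR},   // state 0
        {ERR, 2, ERR, ERR},   // state 1
        {ERR, 2, 3,   ERR},   // state 2
    };

    // Table-driven strategy.
    static List<Integer> runTable(String input) {
        List<Integer> trace = new ArrayList<>();
        int state = 0;
        trace.add(state);
        for (char c : input.toCharArray()) {
            state = TABLE[state][classify(c)];
            trace.add(state);
            if (state == ERR || state == 3) break;
        }
        return trace;
    }

    // Switch-based strategy for the same machine.
    static List<Integer> runSwitch(String input) {
        List<Integer> trace = new ArrayList<>();
        int state = 0;
        trace.add(state);
        for (char c : input.toCharArray()) {
            boolean digit = c >= '0' && c <= '9';
            switch (state) {
                case 0:
                    state = (c == '-' || c == '+') ? 1 : digit ? 2 : ERR;
                    break;
                case 1:
                    state = digit ? 2 : ERR;
                    break;
                default: // state 2
                    state = digit ? 2 : (c == '\n') ? 3 : ERR;
                    break;
            }
            trace.add(state);
            if (state == ERR || state == 3) break;
        }
        return trace;
    }

    // True when both strategies visit the same states in the same order.
    static boolean sameTrace(String input) {
        return runTable(input).equals(runSwitch(input));
    }

    public static void main(String[] args) {
        for (String input : new String[] {"1\n", "213 3213\n", "-7\n"}) {
            System.out.println(runTable(input) + " same=" + sameTrace(input));
        }
    }
}
```

Comparing whole traces rather than only the final result would catch a
strategy that reaches the right answer through a wrong transition, which is
the kind of divergence acceptance tests on output alone can miss.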


