[ragel-users] How do I act on eof in state charts
Adrian Thurston
thurston at complang.org
Fri Oct 23 02:04:24 UTC 2009
If you are unioning tokens together and then applying a Kleene star, then
you could do this:
action lit {}
action space {}
string_literal = ( "'" ( [^'] | "''" )* "'" ) %lit;
ws = ' ' @space;
main := ( ws | string_literal )**;
If you're not, and you want a self-contained string literal that is safe
to use regardless of which operation it is used in next, then do the
following (you seem to have something like this already):
action lit {}
action space {}
string_literal = ( "'" ( [^'] | "''" $1 )* "'" ) %0 %lit;
ws = ' ' @space;
main := ws string_literal ( ws | string_literal )*;
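
To make the intended behaviour concrete, here is a rough hand-written Python sketch of what the first grammar above (the one using **) should do at run time. It is not Ragel output; the function name scan and the state names are made up for illustration, and the order of "lit" before "space" on a shared transition is just how this sketch does it. It shows that a quote followed by another quote stays inside the literal, and that the leaving action still fires when the literal ends at eof.

```python
def scan(data):
    """Return the names of the actions fired, in order, for `data`.

    Hand-rolled model of: main := ( ws | string_literal )**;
    (illustrative only, not generated by Ragel)
    """
    START, IN_STR, SEEN_QUOTE = range(3)
    state = START
    fired = []

    for ch in data:
        if state == IN_STR:
            # Inside the literal: a quote may close it or start a '' escape.
            state = SEEN_QUOTE if ch == "'" else IN_STR
            continue

        if state == SEEN_QUOTE:
            if ch == "'":
                # Longest match: treat '' as an escaped quote, stay inside.
                state = IN_STR
                continue
            # Any other character leaves the literal: fire the leaving action,
            # then handle the character as the start of the next token.
            fired.append("lit")
            state = START

        # state == START here
        if ch == " ":
            fired.append("space")
        elif ch == "'":
            state = IN_STR
        else:
            raise ValueError("unexpected character: %r" % ch)

    # End of input: a literal that was just closed is left via eof.
    if state == SEEN_QUOTE:
        fired.append("lit")
    elif state == IN_STR:
        raise ValueError("unterminated string literal")
    return fired
```

For example, scan("'a''b'") fires a single "lit", while scan("'a' 'b'") fires "lit", "space", "lit".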
-Adrian
Antony Blakey wrote:
> On 22/10/2009, at 12:10 AM, Antony Blakey wrote:
>
>> string_literal_body =
>> start: (
>> "'" -> seen_quote |
>> [^'] -> start
>> ),
>> seen_quote: (
>> "'" -> start |
>> [^'] @{ fhold; } -> final
>> );
>> string_literal = "'" string_literal_body %{ puts "string_literal" } ;
>>
>> The problem occurs when a string literal ends at eof. How do I
>> specify the eof 'match' in the seen_quote state such that all the
>> leaving-transition actions that are in place above the
>> string_literal are executed, such as the 'puts' on the
>> 'string_literal' machine. I don't want to manually duplicate the
>> parent code because multiple machines reference string_literal, with
>> different leaving-transition actions.
>>
>> I couldn't get it to work using priorities - the terminator needs
>> lookahead to disambiguate; the following doesn't work:
>>
>> string_literal = "'" [^']* ( "''" [^']* )* "'"
>
> I ended up doing this:
>
> string_literal_unqoted = "'" [^']* "'" ;
> string_literal = string_literal_unqoted+ $(longest, 1) %(longest, 0) %{ puts "string_literal" };
>
> which works. I would have thought that this:
>
> string_literal = string_literal_unqoted string_literal_unqoted** %{ puts "string_literal" };
>
> would work, but '**' doesn't work for me - if I use a main like this:
>
> main := space* ( string_literal space* )* ;
>
> then I get two string_literals from 'a''b' rather than one, while the
> explicit priorities do give me one string_literal.
>
> Of course this works as well:
>
> main := space* ( string_literal <: space* )* ;
>
> but given the pervasiveness of whitespace handling in my grammar (a
> full Smalltalk parser) it's a real PITA because everything you want
> greedy consumption for has to be thus annotated wherever it is used.
>
> Still, I'm interested in how you specify EOF transitions in an
> explicit state machine. I also had success appending a 0 onto the
> input for use as an EOF marker that can be matched in the state
> machine, but I'm not sure yet if/how that will interfere with true EOF
> functioning.
>
> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> In anything at all, perfection is finally attained not when there is
> no longer anything to add, but when there is no longer anything to
> take away.
> -- Antoine de Saint-Exupery