[ragel-users] simple parser for #include statements
Adrian Thurston
thurston at colm.net
Wed Apr 25 18:18:08 UTC 2018
Oh, I see. In that case you could use dnl as the default rule, but be
sure to add it to the end of the pattern. That would guarantee an anchor
on the beginning of the line. A question then arises, though: do you want
to allow comments ahead of include statements?
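
To make the anchoring question concrete, here is a small sketch in Python
rather than Ragel. It reuses the regex from the quoted message below; the
function name and the sample input are made up for illustration. It only
shows what "anchored on the beginning of the line" accepts and rejects, not
how the generated scanner works.

```python
import re

# Regex taken from the quoted message; re.MULTILINE makes ^ and $ match
# at every line boundary, which is what "anchored" means here.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"(.*?)".*$', re.MULTILINE)

def find_includes(text):
    """Return the quoted file names of all line-anchored #include lines."""
    return INCLUDE_RE.findall(text)

# Hypothetical sample input, not from the thread.
sample = (
    '#include "a.H"\n'
    '  #  include "b.H" // trailing text is ignored\n'
    '/* c */ #include "c.H"\n'
    'int x; #include "d.H"\n'
)
print(find_includes(sample))  # ['a.H', 'b.H']
```

The third sample line shows the consequence of strict anchoring: a comment
ahead of the directive prevents the match. A regex on its own also cannot
tell whether a line sits inside a multi-line comment, which is what the
comment rules in the scanner are for.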
On 2018-04-25 11:37, Mark Olesen wrote:
> Hi Adrian,
>
> Your explanation starts to make some sense. Using the 'any' machine
> instead of my 'dnl' machine should give a similar speed (the position of
> looping and testing for '\n' has just shifted about a bit).
>
> However, if I rewrite it:
>
> %%{
>     main := |*
>         space*;
>
>         white* '#' white* 'include' white*
>             (dquot dqarg >buffer %process dquot) dnl;
>
>         '//' dnl;               # 1-line comment
>         '/*' any* :>> '*/';     # Multi-line comment
>
>         any;                    # Discard
>     *|;
> }%%
>
> How do I ensure that the '#include' is properly anchored? This is what
> I was attempting with the 'dnl' machine: an attempt to enforce
> line-based processing, but combined with swallowing multi-line
> comments.
>
> As a regex, I'd specify my match like this
>
> /^\s*#\s*include\s+"(.*?)".*$/
>
> For my ragel machine, should I be doing something different such as
> having a begin-of-line state that I initialize into and reset every
> time I cross a newline?
> With vague hand waving:
>
> %%{
>     main := |*
>
>         '#' white* 'include' white*
>             (dquot dqarg >buffer %process dquot) dnl;
>
>         '//' dnl;               # 1-line comment
>         '/*' any* :>> '*/';     # Multi-line comment
>
>         (space %isbol | any %notbol);   # Discard
>     *|;
> }%%
>
> Not that I really understand what I'd do next with this.
>
> Cheers,
> /mark
>
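
The begin-of-line state idea floated above can be sketched outside Ragel as
a flag that is set on every newline and cleared by anything other than
blanks. The Python below is one interpretation, not the grammar from the
thread: the pattern names, the function name, and the
allow_leading_comments switch are all assumptions made for illustration.

```python
import re

# Hypothetical patterns; the names are illustrative, not from the thread.
INCLUDE = re.compile(r'#[ \t]*include[ \t]*"([^"\n]*)"[^\n]*')
LINE_COMMENT = re.compile(r'//[^\n]*')
BLOCK_COMMENT = re.compile(r'/\*.*?\*/', re.DOTALL)

def scan_includes(text, allow_leading_comments=False):
    """Collect include names, accepting '#include' only at beginning of line."""
    found = []
    at_bol = True  # start of input counts as beginning of line
    pos = 0
    while pos < len(text):
        if at_bol:
            m = INCLUDE.match(text, pos)
            if m:
                found.append(m.group(1))
                pos = m.end()    # stops just before the newline
                at_bol = False   # the newline itself sets the flag again
                continue
        m = LINE_COMMENT.match(text, pos) or BLOCK_COMMENT.match(text, pos)
        if m:
            pos = m.end()
            if not allow_leading_comments:
                at_bol = False   # a comment counts as content on the line
            continue
        ch = text[pos]
        if ch == '\n':
            at_bol = True        # crossing a newline resets the state
        elif ch not in ' \t':
            at_bol = False       # blanks keep the state, anything else clears it
        pos += 1
    return found

# Hypothetical sample input, not from the thread.
sample = (
    '#include "a.H"\n'
    '/* c */ #include "c.H"\n'
    'int x; #include "d.H"\n'
    '/*\n#include "e.H"\n*/\n'
)
print(scan_includes(sample))
print(scan_includes(sample, allow_leading_comments=True))
```

With the default setting only a.H is reported; with
allow_leading_comments=True the include behind the /* c */ comment is
reported as well. In both cases the include inside the multi-line comment
is swallowed together with the comment.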
>
> On 04/25/18 15:45, Adrian Thurston wrote:
>> Hi Mark,
>>
>> So the thing to remember here is that a scanner will always try for
>> the longest match possible, and only in the case of matches of equal
>> length will it choose the pattern that appears ahead of the others. So
>> in this case the dnl at the end is taking precedence over the comment
>> rules. It doesn't interfere with the include matching rule because it
>> also has a dnl at the end.
>>
>> For the catch all you want to use just the any machine. It will go one
>> char at a time and this may seem less efficient, but ragel does its
>> best to optimize this.
>>
>> With regard to the slightly tighter machine that you mentioned, it
>> would be interesting to see the before and after grammars in full to see
>> what's going on. On their own they produce the same machine, but in
>> the context of something larger there might be something preventing
>> it, or it could be a missed opportunity for optimization.
>>
>> -Adrian
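
The longest-match rule described in the message above can be illustrated
with a toy scanner in Python. This is a sketch of the selection rule only,
not of the code Ragel generates. The rule names are made up, and the dnl
pattern is assumed to mean "discard up to and including the newline", since
its definition is not shown in the thread.

```python
import re

def scan(rules, text):
    """At each position take the longest match; on a tie the earlier rule wins."""
    pos, out = 0, []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = pattern.match(text, pos)
            # Strict '>' keeps the earlier rule when lengths are equal.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at offset {pos}")
        out.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return out

LINE_COMMENT = re.compile(r'//[^\n]*\n')
BLOCK_COMMENT = re.compile(r'/\*.*?\*/', re.DOTALL)  # stops at the first '*/'
DNL = re.compile(r'[^\n]*\n')                        # assumed meaning of dnl
ANY = re.compile(r'.', re.DOTALL)

with_dnl = [("line_comment", LINE_COMMENT), ("comment", BLOCK_COMMENT),
            ("discard", DNL)]
with_any = [("line_comment", LINE_COMMENT), ("comment", BLOCK_COMMENT),
            ("discard", ANY)]

text = "/* c */ y\n"
print(scan(with_dnl, text))  # [('discard', '/* c */ y\n')]
print(scan(with_any, text))
```

With dnl as the catch-all, the whole line is longer than the comment, so
the discard rule wins and the comment rule never fires. With a
one-character catch-all the comment rule wins at offset 0 and the rest of
the line is discarded one character at a time. For the input "// x\n" the
line-comment rule and dnl match the same length, so the rule listed first
is chosen.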