[ragel-users] simple parser for #include statements
Mark Olesen
Mark.Olesen at esi-group.com
Wed Apr 25 15:37:07 UTC 2018
Hi Adrian,
Your explanation starts to make some sense. Using the 'any' machine instead
of my 'dnl' machine should run at a similar speed (the position of looping
and testing for '\n' has just shifted about a bit).
However, if I rewrite it:
%%{
    main := |*
        space*;
        white* '#' white* 'include' white*
            (dquot dqarg >buffer %process dquot) dnl;
        '//' dnl;               # 1-line comment
        '/*' any* :>> '*/';     # Multi-line comment
        any                     # Discard
    *|;
}%%
How do I ensure that the '#include' is properly anchored? This is what I
was attempting with the 'dnl' machine: an attempt to enforce line-based
processing, but combined with swallowing multi-line comments.
As a regex, I'd specify my match like this:
/^\s*#\s*include\s+"(.*?)".*$/
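To make the intended anchoring concrete, here is a minimal sketch of that regex in plain Python rather than Ragel. The names `INCLUDE_RE` and `find_includes` are made up for illustration, and `re.MULTILINE` is assumed so that `^` and `$` apply per line. It does not handle comments at all; it only shows which lines the pattern is meant to accept.

```python
import re

# The regex from above; MULTILINE makes ^ and $ match at line boundaries.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"(.*?)".*$', re.MULTILINE)

def find_includes(text):
    """Return the quoted names of includes that start a line."""
    return [m.group(1) for m in INCLUDE_RE.finditer(text)]

sample = (
    '#include "a.h"\n'
    '  #  include "b.h" // trailing\n'
    'int x; #include "c.h"\n'
)
print(find_includes(sample))
```

The third line is rejected because something other than whitespace precedes the '#', which is the behaviour I want from the scanner too.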
For my ragel machine, should I be doing something different such as
having a begin-of-line state that I initialize into and reset every time
I cross a newline?
With vague hand waving:
%%{
    main := |*
        '#' white* 'include' white*
            (dquot dqarg >buffer %process dquot) dnl;
        '//' dnl;               # 1-line comment
        '/*' any* :>> '*/';     # Multi-line comment
        (space %isbol | any %notbol)    # Discard
    *|;
}%%
Not that I really understand what I'd do next with this.
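For what it's worth, the begin-of-line idea can be sketched by hand in plain Python (not Ragel). Everything here is hypothetical: the `at_bol` flag stands in for what the isbol/notbol actions would maintain, and I am assuming that a comment counts as whitespace and so leaves the flag unchanged. String literals containing comment markers are not handled.

```python
def scan_includes(text):
    """Collect include names, accepting '#' only at the start of a line."""
    found = []
    i, n = 0, len(text)
    at_bol = True  # only whitespace/comments seen since the last newline
    while i < n:
        c = text[i]
        if text.startswith('/*', i):        # multi-line comment: skip it
            end = text.find('*/', i + 2)
            i = n if end < 0 else end + 2
        elif text.startswith('//', i):      # 1-line comment: skip to newline
            end = text.find('\n', i)
            i = n if end < 0 else end
        elif c == '\n':                     # crossing a newline resets the flag
            at_bol = True
            i += 1
        elif c in ' \t':                    # blanks do not change the flag
            i += 1
        elif c == '#' and at_bol:
            j = i + 1
            while j < n and text[j] in ' \t':
                j += 1
            if text.startswith('include', j):
                j += len('include')
                while j < n and text[j] in ' \t':
                    j += 1
                line_end = text.find('\n', j)
                if line_end < 0:
                    line_end = n
                close = text.find('"', j + 1)
                if j < n and text[j] == '"' and 0 <= close < line_end:
                    found.append(text[j + 1:close])
                    i = line_end            # discard the rest of the line
                    continue
            at_bol = False
            i += 1
        else:                               # anything else: no longer at BOL
            at_bol = False
            i += 1
    return found
```

Given an input with an include after `int y;`, one inside a `/* ... */` block, and one behind `//`, only the includes that really start a line are collected.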
Cheers,
/mark
On 04/25/18 15:45, Adrian Thurston wrote:
> Hi Mark,
>
> So the thing to remember here is that a scanner will always try for the
> longest match possible, and only in the case of matches of equal length
> will it choose the pattern that appears ahead of the others. So in this
> case the dnl at the end is taking precedence over the comment rules. It
> doesn't interfere with the include matching rule because it also has a
> dnl at the end.
>
> For the catch all you want to use just the any machine. It will go one
> char at a time and this may seem less efficient, but ragel does its best
> to optimize this.
>
> With regard to the slightly tighter machine that you mentioned, it would
> be interesting to see before and after grammars in full to see what's
> going on. On their own they produce the same machine, but in the context
> of something larger there might be something preventing it, or it could
> be a missed opportunity for optimization.
>
> -Adrian
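
The longest-match rule described in the quoted reply can be mimicked in a few lines of plain Python (not Ragel). The rule table and its names are hypothetical, and the regexes only approximate a comment rule, a dnl-style catch-all, and 'any'. A strictly longer match wins; on equal length the earlier rule is kept.

```python
import re

# Hypothetical rule table, in priority order (order only breaks ties).
RULES = [
    ("line_comment", re.compile(r'//[^\n]*')),
    ("discard_line", re.compile(r'[^\n]*\n')),    # dnl-style catch-all
    ("any", re.compile(r'.', re.DOTALL)),
]

def next_token(text, pos):
    """Return (rule name, end offset) of the longest match at pos."""
    best_name, best_end = None, pos
    for name, rx in RULES:
        m = rx.match(text, pos)
        if m and m.end() > best_end:   # strictly longer wins; ties keep earlier
            best_name, best_end = name, m.end()
    return best_name, best_end

print(next_token('// hi\nx', 0))
print(next_token('//x', 0))
```

In the first call the dnl-style rule also swallows the newline, so it is one character longer than the comment rule and wins even though it is listed later. In the second call there is no newline, so the comment rule is the longest match.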