Stack handling for fcall and fret
Steve Horne
stephenhorne... at aol.com
Sun Jul 22 23:07:24 UTC 2007
I was recently toying with the idea of using Ragel to create a
recursive descent parser. I figured that for simple DSLs, this would
be convenient - allowing the scanner and parser to be developed using
the same language and tools. While thinking about this, I noticed a
wider issue with Ragels stack handling.
The stack consists of the user-defined variables 'stack' and 'top',
with presumably 'top' being a stack pointer subscript into 'stack'.
Fine - but Ragel doesn't appear to know how big the stack is, and
therefore cannot detect when it is full.
This is a problem for recursive descent parsing as there is no way to
resize the stack when needed, but then I'm not sure Ragel can handle
recursive descent parsing of expressions etc anyway. But potential
stack overflow is also a wider problem.
I understand that Ragel is being used to generate protocol handlers,
where security is potentially a major concern. A stack overflow is
basically a buffer overflow, which is of course the classic security
flaw.
Most likely this is a phantom issue - such code probably doesn't use
fcall/fret and doesn't have a stack at all. Even if it does, in many
cases the stack will have a fixed maximum possible size (though you
are still dependent on the user working this out accurately).
Even so, having a mechanism to allow the push and pop code to be
modified seems like a good idea. Maybe something like...
%%stack push { stk.push ( $ ); }
%%stack pop { $ = stk.pop (); }
In the above, the content of the braces is like action code, except
that the '$' is replaced by the source/target expression by Ragel.
Also, could you add a small section to the "Interface to Host" section
of the manual, that lists all the variables that the user may need to
declare (cs, p, pe etc) with short descriptions and links to
appropriate sections of the manual.
I find that I tend to work out all the regular expression stuff first,
and then come back to add actions and code later. It's so easy to
forget that, for instance, since I used a |* ... *| construct I need
the 'act' variable - tokstart and tokend I always remember, but act
loves to drive me mad!
More information about the ragel-users
mailing list