Minimisation question
Colin Fleming
colin.flem... at coreproc.com
Wed Sep 13 23:48:49 UTC 2006
Hi Adrian,
Ok, great that there's no downside to it. Here's some stats (I've
attached the grammar in case you're interested, it's not finished but
it's close):
The part that gives the problem is the doctype. The smallest part I
could get to compile was doctypedecl:
No minimisation:
time ragel asciixml.rl -s -n -M doctypedecl | rlcodegen -V > test.dot
fsm name : AsciiXml
num states: 1361
real 1m58.401s
user 1m56.850s
sys 0m1.481s
Memory use peaks at about 765MB. If I try and use the next largest
production it allocates up to 1.8GB and then malloc fails.
With minimisation:
time ragel asciixml.rl -s -M doctypedecl | rlcodegen -V > test.dot
fsm name : AsciiXml
num states: 269
real 1m58.358s
user 1m56.792s
sys 0m1.453s
More or less the same time and memory usage.
However with incremental minimisation:
time ragel asciixml.rl -s -e -M doctypedecl | rlcodegen -V > test.dot
fsm name : AsciiXml
num states: 269
real 0m0.076s
user 0m0.069s
sys 0m0.010s
It's practically instantaneous and works a charm. It also easily
allows me to compile the whole grammar, which is significantly more
complex:
time ragel asciixml.rl -s -e | rlcodegen -V > test.dot
fsm name : AsciiXml
num states: 445
real 0m0.124s
user 0m0.119s
sys 0m0.010s
This is a cut-down grammar that only uses ASCII characters, the full
XML spec requires Unicode, this makes the machine much more complex
because all the character ranges are treated properly (i.e. the same
number of states but a lot more transitions). Using incremental
minimisation allows that machine to be generated in 2.199s.
Assuming it's reliable, I can't see a reason not to use it, or to have
it turned on by default.
Cheers,
Colin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: asciixml.rl
Type: application/octet-stream
Size: 6017 bytes
Desc: not available
URL: <http://www.colm.net/pipermail/ragel-users/attachments/20060913/e9e98fee/attachment-0001.obj>
More information about the ragel-users
mailing list