parsing a netstring
Chuck Remes
cremes.devl... at mac.com
Sun Oct 7 18:16:27 UTC 2007
I'm suddenly finding all sorts of uses for ragel!
I want to write a parser for netstrings. The definition of a
netstring is pretty simple. It comes in the following format:
<size_in_decimal> ':' <string array size_in_decimal bytes long> ','
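For illustration, here is a minimal Ruby sketch of that format. The helper name netstring_encode is made up for this example and is not part of Ragel or of any library.

```ruby
# Hypothetical helper: encode a string as a netstring,
# i.e. "<byte length in decimal>:<bytes>,".
def netstring_encode(payload)
  "#{payload.bytesize}:#{payload},"
end

puts netstring_encode("hello")   # prints 5:hello,
puts netstring_encode("")        # prints 0:,
```

Note that the length counts bytes rather than characters, which is why the sketch uses bytesize.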
I wrote a machine to parse through this and capture every byte, but
I'm unclear how to terminate my get_string machine. Right now I have
it call the action store_string as a finishing action for each byte
processed. The action stores the byte and increments a counter
variable. When the counter variable reaches the number of bytes to be
processed, I want to advance out of that machine and move to the next
machine to confirm the byte array was terminated properly.
I'm not sure I'm doing this correctly. From the docs (section 6.5) it
appears using a 'semantic condition' would make sense here, but that
part of the documentation is unclear to me so I'm using this
alternate methodology. Am I on the right track? Also, is there a way
to skip 'N' bytes forward instead of copying them one by one into a
new array (super slow!)? I'm thinking I can directly modify the 'p'
variable but I'm not sure this is the right way.
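To make the "skip N bytes" idea concrete, here is a rough sketch in plain Ruby, outside of Ragel entirely. It is not a statement about how Ragel handles changes to 'p'; it only shows the payload being taken with one slice and the position advanced by the declared size. The method name parse_netstring and the variable pos (playing the role of 'p') are assumptions made for this example.

```ruby
# Hypothetical hand-written parser: read one netstring starting at byte
# offset pos and return [payload, offset_of_next_netstring].
def parse_netstring(data, pos = 0)
  data = data.b                                  # work on raw bytes
  colon = data.index(":", pos)
  raise ArgumentError, "missing ':' delimiter" if colon.nil?

  digits = data.byteslice(pos, colon - pos)
  unless digits =~ /\A(0|[1-9][0-9]*)\z/         # no empty size, no leading zeros
    raise ArgumentError, "invalid size field"
  end

  size = digits.to_i
  start = colon + 1
  payload = data.byteslice(start, size)          # one slice, no per-byte copy
  raise ArgumentError, "payload truncated" if payload.bytesize < size

  pos = start + size                             # skip the payload in one step
  raise ArgumentError, "missing ',' terminator" unless data.byteslice(pos, 1) == ","

  [payload, pos + 1]
end

data = "5:hello,3:abc,"
first, nxt = parse_netstring(data)
second, _ = parse_netstring(data, nxt)
puts first    # prints hello
puts second   # prints abc
```

The only byte-by-byte work left is reading the size digits; the payload itself is never walked.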
Secondly, I'm not sure how to capture errors. I'm already using the
form '@action' to do some work in a machine. Can I specify an error
action using the same operator in the same machine? E.g.,
get_size = ( digit @store_size @err(size_error) )+;
Thanks for any input. My sample machine is listed below.
%%{
  machine parse_netstring;

  # snipped out some actions for the sake of brevity

  action store_size {
    # fc is the character code of the digit, so subtract 48 (ASCII '0')
    size = ( size * 10 ) + ( fc - 48 ); # accumulate string length
  };

  action alloc_buffer {
    buffer = Array.new(size);
    i = 0;
  };

  action store_string {
    buffer[i] = fc;
    i = i + 1;
    # i already counts this byte, so i == size means the payload is complete
    fnext get_netstring_terminator if i >= size;
  };

  get_size = ( digit >validate_not_zero ) . ( digit @store_size )*;
  get_delimeter = ( ':' @alloc_buffer );
  get_string = ( any @store_string )*;
  get_netstring_terminator = ',' @finalize;

  main := get_size . get_delimeter . get_string;
}%%