[ragel-users] Re: parsing a netstring
Adrian Thurston
thurs... at cs.queensu.ca
Wed Oct 10 16:19:11 UTC 2007
Chuck,
That's right. I should just point out a subtlety: with error actions, the
set of "inputs not handled" is computed when the final machine is complete.
If get_size is just on its own in main then this will be non-digits. If
get_size were unioned with something else then this would be non-digits
and non-something else (for the start state at least).
If you use local error actions instead, @lerr(...), then the "inputs not
handled" are computed when the get_size machine is constructed. In this
case that is any non-digit.
So global error actions are executed when the whole machine fails and
local error actions are executed when the current definition fails.
-Adrian
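
The difference described above can be pictured with a toy model in plain Ruby. This is only a sketch of the set arithmetic, not Ragel itself and not its API: ALPHABET, LETTERS and unhandled are hypothetical names, and LETTERS stands in for the "something else" that get_size might be unioned with.

```ruby
require 'set'

# Assumption: a small ASCII alphabet is enough to illustrate the idea.
ALPHABET = (0..127).map(&:chr).to_set
DIGITS   = ('0'..'9').to_set
LETTERS  = ('a'..'z').to_set   # hypothetical stand-in for "something else"

# Inputs with no transition out of a state: everything the state does not handle.
def unhandled(handled)
  ALPHABET - handled
end

# Local error action: computed when get_size alone is constructed.
local_errors = unhandled(DIGITS)

# Global error action: computed on the final machine. If get_size were
# unioned with a machine accepting letters, the start state handles both.
global_errors = unhandled(DIGITS | LETTERS)

puts local_errors.include?('a')    # a letter is an error for get_size by itself
puts global_errors.include?('a')   # but not for the start state of the union
```

In this model a letter triggers the local error action but not the global one, while an input such as ':' triggers both.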
Chuck Remes wrote:
> Adrian,
>
> thanks for the reply. It's encouraging to have my guesses confirmed.
>
> I'm still a bit fuzzy on the last piece regarding error actions.
>
> Let's use the example I contrived from the original email:
>
>
>> get_size = ( digit @store_size @err(size_error) )+;
>>
>
> I'm interpreting your comment to mean that @err(size_error) will
> *only* get called if the get_size machine receives an input it isn't
> configured to handle. For example, if it receives an alphabetic
> character then action size_error will be called to handle it.
>
> cr
>
> On Oct 9, 2007, at 4:10 PM, Adrian Thurston wrote:
>
>> Hi Chuck,
>>
>> Yes, using fnext to call out of the string consuming machine is one way
>> to do it. The code looks good to me.
>>
>> As you said you can use conditions as well. I think one of the examples
>> in the manual deals with variable length fields. So there is that route.
>>
>> And also yes, you can modify p to jump ahead of the area. Just be
>> mindful of jumping past pe. If you have all the data at once this isn't
>> a problem, but if you get your data in blocks then you have to watch out
>> and hack in some solution.
>>
>> With error actions you have to keep in mind that the operators have
>> slightly different meanings because they select states as opposed to
>> transitions. The error action embedding operators let you handle the
>> case of 'no transition' in the states they select.
>>
>> Adrian
>>
>> Chuck Remes wrote:
>>> I'm suddenly finding all sorts of uses for ragel!
>>>
>>> I want to write a parser for netstrings. The definition of a
>>> netstring is pretty simple. It comes in the following format:
>>>
>>> size_in_decimal':''string array size_in_decimal bytes long'','
>>>
>>> I wrote a machine to parse through this and capture every byte, but
>>> I'm unclear how to terminate my get_string machine. Right now I have
>>> it call the action store_string as a finishing action for each byte
>>> processed. The action stores the byte and increments a counter
>>> variable. When the counter variable exceeds the number of bytes to be
>>> processed, I want to advance out of that machine and move to the next
>>> machine to confirm the byte array was terminated properly.
>>>
>>> I'm not sure I'm doing this correctly. From the docs (section 6.5) it
>>> appears using a 'semantic condition' would make sense here, but that
>>> part of the documentation is unclear to me so I'm using this
>>> alternate methodology. Am I on the right track? Also, is there a way
>>> to skip 'N' bytes forward instead of copying them one by one into a
>>> new array (super slow!)? I'm thinking I can directly modify the 'p'
>>> variable but I'm not sure this is the right way.
>>>
>>> Secondly, I'm not sure how to capture errors. I'm already using the
>>> form '@action' to do some work in a machine. Can I specify an error
>>> action using the same operator in the same machine? E.g - get_size =
>>> ( digit @store_size @err(size_error) )+;
>>>
>>> Thanks for any input. My sample machine is listed below.
>>>
>>> %%{
>>> machine parse_netstring;
>>>
>>> # snipped out some actions for the sake of brevity
>>>
>>> action store_size {
>>> size = ( size * 10 ) + ( fc - 48 ); # accumulate string length; fc is a byte value and 48 is '0'
>>> };
>>>
>>> action alloc_buffer {
>>> buffer = Array.new(size);
>>> i = 0;
>>> };
>>>
>>> action store_string {
>>> buffer[i] = fc;
>>> i = i + 1;
>>> fnext get_netstring_terminator if i >= size;
>>> };
>>>
>>> get_size = ( digit >validate_not_zero ) . ( digit @store_size )*;
>>>
>>> get_delimeter = ( ':' @alloc_buffer );
>>>
>>> get_string = ( any @store_string )*;
>>>
>>> get_netstring_terminator = ',' @finalize;
>>>
>>> main := get_size . get_delimeter . get_string;
>>> }%%
>>>
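
The skip-ahead idea discussed in the thread (advance the position by the declared size instead of copying byte by byte, and never move past the end of the available data) can be sketched in plain Ruby without Ragel. This is a hand-written illustration under stated assumptions: parse_netstring is a hypothetical helper, pos and pe only play the roles of Ragel's p and pe, the whole netstring is assumed to be in one buffer, and leading zeros in the size are not rejected.

```ruby
# Hypothetical helper, not generated by Ragel: parse one netstring of the form
# size_in_decimal ':' payload ',' and return [payload, remaining_bytes].
def parse_netstring(data)
  pe = data.bytesize   # one past the last available byte, like Ragel's pe
  pos = 0              # current position, plays the role of Ragel's p
  size = 0
  digits = 0

  # Accumulate the decimal size; 48..57 are the byte values of '0'..'9'.
  while pos < pe && data.getbyte(pos).between?(48, 57)
    size = size * 10 + (data.getbyte(pos) - 48)
    digits += 1
    pos += 1
  end
  raise ArgumentError, 'expected a decimal size' if digits.zero?
  raise ArgumentError, "expected ':' after the size" unless pos < pe && data.getbyte(pos) == 58
  pos += 1

  # Jump over the payload in one step, but refuse to move past pe.
  # The terminator must also fit, hence >= rather than >.
  raise ArgumentError, 'payload extends past the available data' if pos + size >= pe
  payload = data.byteslice(pos, size)
  pos += size

  raise ArgumentError, "expected ',' terminator" unless data.getbyte(pos) == 44
  [payload, data.byteslice(pos + 1, pe - pos - 1)]
end

payload, rest = parse_netstring('5:hello,rest')
puts payload   # hello
puts rest      # rest
```

If the data arrives in blocks, the bounds check is the place where a real parser would have to remember how many payload bytes are still outstanding rather than raise.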