[ragel-users] Re: syntax improvement, new operators
Erich Ocean
er... at atlasocean.com
Fri Feb 9 04:40:11 UTC 2007
Well, take the first User Action example in the Ragel manual on page
28: 3.1.1 Entering Action:
action A {}
main := ( lower* >A ) . ' ';
Let's modify it to add a Pending Out (Leaving) Action, and then make
that machine optional:
action ENTER_TRANSITION {} # Entering Action
action LEAVE_TRANSITION {} # Pending Out (Leaving) Action
main := ( lower* >ENTER_TRANSITION %LEAVE_TRANSITION )? . ' ';
If the first character recognized by main happens to be a space, then
LEAVE_TRANSITION will be executed, but ENTER_TRANSITION won't.
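To make that concrete, here is a rough sketch in Python that models the
machine above by hand. It is not code generated by Ragel; the function name
run_optional and the state names are made up for illustration, and it assumes
that lower means 'a'..'z' and that the start state of lower* is already final.

```python
def run_optional(text):
    """Hand-written model of: ( lower* >ENTER_TRANSITION %LEAVE_TRANSITION )? . ' '

    Returns (accepted, trace), where trace lists the actions in firing order.
    State names are illustrative, not taken from Ragel's generated code.
    """
    trace = []
    state = "start"              # start state; already final for lower*
    for ch in text:
        if state in ("start", "inside") and "a" <= ch <= "z":
            if state == "start":
                trace.append("ENTER_TRANSITION")   # only on the first lower
            state = "inside"
        elif state in ("start", "inside") and ch == " ":
            trace.append("LEAVE_TRANSITION")       # fires on the way out
            state = "done"
        else:
            return False, trace  # no transition for this character
    return state == "done", trace


print(run_optional(" "))    # (True, ['LEAVE_TRANSITION'])
print(run_optional("ab "))  # (True, ['ENTER_TRANSITION', 'LEAVE_TRANSITION'])
```

With a lone space as input, the trace contains the leaving action and no
entering action.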
I think it's confusing to a user that a machine will execute its
Leaving action (to use your terminology) without first executing its
Entering action.
The confusion goes away if you've learned that the > action will only
be executed on the first character.
The % action isn't a character action, it's a machine action (to use
my terminology). So a user would naturally reason that it could be
executed even though no character was recognized, as in this case:
action FIRST_CHAR {} # Executed on recognition of the first character
action MACHINE_ACCEPT {} # Executed when the machine accepts a match
main := ( lower* >FIRST_CHAR %MACHINE_ACCEPT )? . ' ';
I use the Match/Accept terminology because any given machine can make
a whole bunch of matches while it's recognizing characters, and the @
action is executed every single time the machine recognizes a match.
The % action, on the other hand, is only executed when the machine
finally accepts one of those matches. The @ action (Match) is a
character action because it is always and only triggered upon the
recognition of a character, whereas the Accept action is a machine
action because it is only ever executed once, when the machine accepts a
match, regardless of whether or not a character has been recognized.
It's character-independent.
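Here is a rough Python sketch of the Match/Accept distinction. It models a
hypothetical variant of the machine above, ( lower* @MATCH %ACCEPT )? . ' ',
by hand; it is not Ragel output. The names run_match_accept, MATCH and ACCEPT
are made up for illustration, and it assumes that every lower transition
lands in a final state of lower*.

```python
def run_match_accept(text):
    """Hand-written model of: ( lower* @MATCH %ACCEPT )? . ' '

    Returns (accepted, trace). MATCH and ACCEPT are hypothetical action names.
    """
    trace = []
    in_word = True               # still inside lower*, whose only state is final
    for ch in text:
        if in_word and "a" <= ch <= "z":
            trace.append("MATCH")    # every lower lands in a final state
        elif in_word and ch == " ":
            trace.append("ACCEPT")   # fires once, when the machine is left
            in_word = False
        else:
            return False, trace      # no transition for this character
    return (not in_word), trace


print(run_match_accept("abc "))  # (True, ['MATCH', 'MATCH', 'MATCH', 'ACCEPT'])
print(run_match_accept(" "))     # (True, ['ACCEPT'])
```

MATCH shows up once per recognized character, while ACCEPT shows up exactly
once, even when no character of lower* was recognized.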
Hope this explains some of the reasoning behind the categorization
and new terminology.
Best, Erich
On Feb 8, 2007, at 7:48 PM, Adrian Thurston wrote:
>
> Hi Erich,
>
> I'm glad to see you are still working with Ragel! By the way, I've
> updated your name in the CREDITS file and elsewhere.
>
>> Character Actions
>> =============
>>
>> > aka First -- This action will be executed on the first character
>> the machine recognizes.
>> $ aka Each -- This action will be executed on each character the
>> machine recognizes.
>> @ aka Match -- This action will be executed on characters the machine
>> recognizes that put the machine into a match state.
>> < aka Continue -- (New) This action will be executed on the next
>> character the machine recognizes when the machine is in a match
>> state.
>
> So it seems that you prefer to express these operators in terms of the
> characters of the input string that is processed. This is distinct from
> my approach, where I talk about the transitions of a regular
> expression's corresponding state machine.
>
> I prefer to express the operators in terms of transitions because I find
> it to be very precise. For example, with "entering transition actions"
> you can go and look at the Graphviz drawing and find the transitions
> which take you into the machine. That's me though, and I would very much
> like to hear what others think. Is it better to talk about the
> transitions that the actions are put into, or is it better to talk about
> the characters that are moved over when the actions are executed?
>
> The < operator you have given I find interesting. As I understand it,
> this would embed the action on the transitions which leave final states
> (but stay in the machine). Could you give an example of when it is
> useful?
>
>
>> Machine Actions
>> ============
>>
>> % aka Accept -- This action will only be executed when the machine
>> accepts a match.
>
> The word "accept" I find to be somewhat ambiguous. It doesn't strike me
> that it means only one of "on the last character" or "on the next
> character." It seems to me that it could easily be interpreted as either
> of those. I chose the word "leaving" for this operator because it's
> clear to me that it means on the next character.
>
>> %\ aka Fail -- (New) This action will only be executed when the
>> machine fails to either: (a) recognize a character, or (b) accept a
>> match.
>
> I'm not quite sure what you mean with (b). I would assume you mean the
> same as above, what is currently known as the leaving (or pending out)
> operator. But then I believe this new operator would be the same as the
> $! operator. Could you clarify?
>
>> %? aka Skip -- (New) This action will be executed instead of Fail when
>> either the Optional operator or the Kleene Star operator is applied to
>> the machine.
>
> I'm not sure I understand this operator. If you write:
>
> ( expr %? skip_act )?
>
> Is it the same as writing the following?
>
> ( expr | "" %skip_act )
>
> Could you give us an example of the kind of problem that motivated these
> operators? Especially the part about setting and clearing external state
> flags to do proper resource acquisition and release. An example would
> really help me to understand the issue.
>
> Regards,
> Adrian
>
>