<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="Generator" CONTENT="MS Exchange Server version 6.5.7654.12">
<TITLE>RE: [ragel-users] Parsing a template language</TITLE>
</HEAD>
<BODY>
<!-- Converted from text/plain format -->
<P><FONT SIZE=2>I'm actually working on a similar-sounding task.<BR>
<BR>
Try the strong difference operator (--).<BR>
Untested:<BR>
<BR>
main := |*<BR>
'[[' lower+ ']]' => action;<BR>
( any* -- '[[' ) => action;<BR>
*|;<BR>
<BR>
<BR>
( any* -- '[[' ) will match the longest possible string that doesn't have '[[' as a substring.<BR>
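<BR>
For comparison, here is a rough plain-Ruby sketch (not Ragel, and not generated by it) of the token stream I'd expect a scanner like the one above to produce.<BR>
The names tokenize and TAG are made up for illustration; StringScanner comes from Ruby's standard strscan library.<BR>
Static text is taken as a run that stops just before the next '[[', so a lone '[' directly in front of a tag stays static. That edge case is worth checking against the Ragel version too.<BR>

```ruby
require "strscan"

# A tag is '[[' followed by one or more lowercase letters and ']]'.
TAG = /\[\[[a-z]+\]\]/

# Split input into [:tag, text] and [:static, text] pairs.
def tokenize(input)
  scanner = StringScanner.new(input)
  results = []
  until scanner.eos?
    tag = scanner.scan(TAG)
    if tag
      results.push([:tag, tag])
      next
    end
    # Not at a tag: take one character so we always make progress,
    # then extend the run up to (but not including) the next '[['.
    text = scanner.getch
    more = scanner.scan_until(/(?=\[\[)/)
    if more.nil?
      # No further '[[' in the input: the rest is static text.
      more = scanner.rest
      scanner.terminate
    end
    results.push([:static, text + more])
  end
  results
end

p tokenize("Hello [[name]], welcome")
```

The single getch before the run is only there so that an opening '[[' that never turns into a valid tag cannot stall the loop.<BR>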
<BR>
-----Original Message-----<BR>
From: ragel-users-bounces@complang.org on behalf of Tobias Lütke<BR>
Sent: Tue 7/27/2010 6:54 PM<BR>
To: ragel-users@complang.org<BR>
Subject: Re: [ragel-users] Parsing a template language<BR>
<BR>
Depends on the answers in this thread I suppose :-)<BR>
<BR>
<BR>
<BR>
On Tue, Jul 27, 2010 at 3:42 AM, Magnus Holm &lt;judofyr@gmail.com&gt; wrote:<BR>
> (A little off-topic, but whatever:<BR>
><BR>
> So Liquid will finally get a proper parser? :-))<BR>
><BR>
> // Magnus Holm<BR>
><BR>
><BR>
><BR>
> On Tue, Jul 27, 2010 at 03:15, Tobias Lütke &lt;tobi@leetsoft.com&gt; wrote:<BR>
>> I've been working on a parser for a simple template language. I'm using Ragel.<BR>
>><BR>
>> The requirements are modest. I'm trying to find [[tags]] that can be<BR>
>> embedded anywhere in the input string.<BR>
>><BR>
>> I'm trying to parse a simple template language, something that can<BR>
>> have tags such as {{foo}} embedded within HTML. I tried several<BR>
>> approaches to parse this but had to resort to using a Ragel scanner<BR>
>> and use the inefficient approach of only matching a single character<BR>
>> as a "catch all". I feel this is the wrong way to go about this. I'm<BR>
>> essentially abusing the longest-match bias of the scanner to implement<BR>
>> my default rule ( it can only be 1 char long, so it should always be<BR>
>> the last resort ).<BR>
>><BR>
>> %%{<BR>
>><BR>
>> machine parser;<BR>
>><BR>
>> action start { tokstart = p; }<BR>
>> action on_tag { results &lt;&lt; [:tag, data[tokstart..p]] }<BR>
>> action on_static { results &lt;&lt; [:static, data[p..p]] }<BR>
>><BR>
>> tag = ('[[' lower+ ']]') >start @on_tag;<BR>
>><BR>
>> main := |*<BR>
>> tag;<BR>
>> any => on_static;<BR>
>> *|;<BR>
>><BR>
>> }%%<BR>
>><BR>
>> ( actions written in ruby, but should be easy to understand ).<BR>
>><BR>
>> How would you go about writing a parser for such a simple language? Is<BR>
>> Ragel maybe not the right tool? It seems you have to fight Ragel tooth<BR>
>> and nail if the syntax is as unpredictable as this.<BR>
>><BR>
>><BR>
>> Regards<BR>
>> -- tobi<BR>
>><BR>
>> _______________________________________________<BR>
>> ragel-users mailing list<BR>
>> ragel-users@complang.org<BR>
>> <A HREF="http://www.complang.org/mailman/listinfo/ragel-users">http://www.complang.org/mailman/listinfo/ragel-users</A><BR>
>><BR>
><BR>
<BR>
<BR>
</FONT>
</P>
</BODY>
</HTML>