<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">

<HTML>

<HEAD>

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">

<META NAME="Generator" CONTENT="MS Exchange Server version 6.5.7654.12">

<TITLE>RE: [ragel-users] Parsing a template language</TITLE>

</HEAD>

<BODY>

<!-- Converted from text/plain format -->


<P><FONT SIZE=2>I'm actually working on a similar-sounding task.<BR>

<BR>

Try the strong subtraction operator (--).<BR>

Untested:<BR>

<BR>

main := |*<BR>

  '[[' lower+ ']]' => on_tag;<BR>

  ( any* -- '[[' ) => on_static;<BR>

*|;<BR>

<BR>

<BR>

( any* -- '[[' ) will match the longest possible string that doesn't have '[[' as a substring.<BR>
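<BR>
One caveat to check when testing: the longest-match rule can let the catch-all swallow the first '[' of a tag. In 'abc[[foo]]' the prefix 'abc[' still has no '[[' in it, so it wins as a token and the tag is never seen.<BR>
<BR>
Below is a small Ruby sketch that emulates the two-rule scanner. It is not Ragel output, and scan, STRONG_SUBTRACTION and RUN_OR_BRACKET are names made up for illustration. It compares ( any* -- '[[' ) with an assumed alternative catch-all: a run of non-'[' characters, or a single '['.<BR>
<BR>

```ruby
# Emulates a longest-match scanner with two rules: the tag rule and a
# catch-all. Illustration only; this is not what Ragel generates.
TAG = /\A\[\[[a-z]+\]\]/

# Length of the longest prefix with no '[[' inside: ( any* -- '[[' ).
# Note that the prefix may still end in a single '['.
STRONG_SUBTRACTION = lambda do |rest|
  first = rest.index('[[')
  first ? first + 1 : rest.length
end

# Assumed alternative: a run of non-'[' characters, else a lone '['.
RUN_OR_BRACKET = lambda do |rest|
  run = rest[/\A[^\[]+/]
  run ? run.length : 1
end

def scan(data, catch_all)
  tokens = []
  pos = 0
  until pos >= data.length
    rest = data[pos..-1]
    tag = rest[TAG]
    text_len = catch_all.call(rest)
    # Longest match wins; on a tie the rule listed first (the tag) wins.
    if tag && tag.length >= text_len
      tokens.push([:tag, tag])
      pos += tag.length
    else
      tokens.push([:static, rest[0, text_len]])
      pos += text_len
    end
  end
  tokens
end

p scan('abc[[foo]]def', STRONG_SUBTRACTION)
# => [[:static, "abc["], [:static, "[foo]]def"]]
p scan('abc[[foo]]def', RUN_OR_BRACKET)
# => [[:static, "abc"], [:tag, "[[foo]]"], [:static, "def"]]
```

<BR>
If the real scanner behaves like the first call, the Ragel equivalent of the second would be two catch-all rules, ( any - '[' )+ and '[', in place of the strong subtraction. That is also untested.<BR>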

<BR>

-----Original Message-----<BR>

From: ragel-users-bounces@complang.org on behalf of Tobias Lütke<BR>

Sent: Tue 7/27/2010 6:54 PM<BR>

To: ragel-users@complang.org<BR>

Subject: Re: [ragel-users] Parsing a template language<BR>

<BR>

Depends on the answers in this thread I suppose :-)<BR>

<BR>

<BR>

<BR>

On Tue, Jul 27, 2010 at 3:42 AM, Magnus Holm &lt;judofyr@gmail.com&gt; wrote:<BR>

> (A little off-topic, but whatever:<BR>

><BR>

> So Liquid will finally get a proper parser? :-))<BR>

><BR>

> // Magnus Holm<BR>

><BR>

><BR>

><BR>

> On Tue, Jul 27, 2010 at 03:15, Tobias Lütke &lt;tobi@leetsoft.com&gt; wrote:<BR>

>> I've been working on a parser for a simple template language. I'm using Ragel.<BR>

>><BR>

>> The requirements are modest. I'm trying to find [[tags]] that can be<BR>

>> embedded anywhere in the input string.<BR>

>><BR>

>> I'm trying to parse a simple template language, something that can<BR>

>> have tags such as {{foo}} embedded within HTML. I tried several<BR>

>> approaches to parse this but had to resort to using a Ragel scanner<BR>

>> and use the inefficient approach of only matching a single character<BR>

>> as a "catch all". I feel this is the wrong way to go about this. I'm<BR>

>> essentially abusing the longest-match bias of the scanner to implement<BR>

>> my default rule ( it can only be 1 char long, so it should always be<BR>

>> the last resort ).<BR>

>><BR>

>> %%{<BR>

>><BR>

>>  machine parser;<BR>

>><BR>

>>  action start      { tokstart = p; }<BR>

>>  action on_tag      { results &lt;&lt; [:tag, data[tokstart..p]] }<BR>

>>  action on_static  { results &lt;&lt; [:static, data[p..p]] }<BR>

>><BR>

>>  tag  = ('[[' lower+ ']]') >start @on_tag;<BR>

>><BR>

>>  main := |*<BR>

>>    tag;<BR>

>>    any      => on_static;<BR>

>>  *|;<BR>

>><BR>

>> }%%<BR>

>><BR>

>> ( actions written in Ruby, but should be easy to understand ).<BR>

>><BR>

>> How would you go about writing a parser for such a simple language? Is<BR>

>> Ragel maybe not the right tool? It seems you have to fight Ragel tooth<BR>

>> and nail if the syntax is unpredictable such as this.<BR>

>><BR>

>><BR>

>> Regards<BR>

>> -- tobi<BR>

>><BR>

>> _______________________________________________<BR>

>> ragel-users mailing list<BR>

>> ragel-users@complang.org<BR>

>> <A HREF="http://www.complang.org/mailman/listinfo/ragel-users">http://www.complang.org/mailman/listinfo/ragel-users</A><BR>

>><BR>


</FONT>

</P>


</BODY>

</HTML>