[ragel-users] Is this the right way to do it ?

Adrian Thurston thurs... at cs.queensu.ca
Wed Oct 31 15:35:14 UTC 2007


Hi Gaspard,

The other way to capture token text is to set pointers that mark the start and end of each token. It is faster, but it requires that you be careful about buffer boundaries.
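
A minimal sketch of the pointer-marking idea, written as plain C++ rather than as generated Ragel code. The names TokenMarker, mark, emit and scan_identifiers are made up for illustration. In a Ragel machine, the bodies of mark and emit would roughly correspond to an entering action and a leaving action on the token expression, with the hand-written loop replaced by the generated one.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not part of Ragel): remembers where the current
// token starts and copies it out once, when the token ends.
struct TokenMarker {
  const char *ts = nullptr;            // start of current token, or nullptr
  std::vector<std::string> tokens;

  // Would be the body of an entering action: remember the start.
  void mark(const char *p) { ts = p; }

  // Would be the body of a leaving action: copy [ts, p) in one step.
  void emit(const char *p) {
    assert(ts != nullptr && ts <= p);
    tokens.emplace_back(ts, static_cast<std::size_t>(p - ts));
    ts = nullptr;
  }
};

// Collects runs of [A-Za-z0-9_] from a NUL-terminated buffer.
std::vector<std::string> scan_identifiers(const char *str) {
  TokenMarker m;
  const char *p = str;
  const char *pe = str + std::strlen(str);   // past end
  for (; p != pe; ++p) {
    bool in_word = std::isalnum(static_cast<unsigned char>(*p)) || *p == '_';
    if (in_word && m.ts == nullptr) {
      m.mark(p);
    } else if (!in_word && m.ts != nullptr) {
      m.emit(p);
    }
  }
  if (m.ts != nullptr) m.emit(pe);           // token that ends at end of input
  return m.tokens;
}
```

Calling scan_identifiers("m1 = Metro(120)") yields the tokens m1, Metro and 120 without any per-character append. The sketch assumes the whole input sits in one buffer. If the input arrives in chunks, a token that is still open when a chunk ends has to be preserved before the buffer is refilled, which is the buffer-boundary care mentioned above.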

In my opinion, this is a valid way to parse, and the motivation is speed. However, if speed is not a requirement and you're dealing with a token stream, I would suggest that you use the more traditional lexer+parser approach.
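
To make the lexer+parser split concrete, here is a small sketch in plain C++. Both stages are hand-written stand-ins: lex plays the role a generated scanner would play, and parse_instance plays the role of a grammar-driven parser. All names (Kind, Token, Instance, lex, parse_instance) are hypothetical, and the parser only handles the single form variable = Class(number?).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum class Kind { Identifier, Number, Symbol };

struct Token {
  Kind kind;
  std::string text;
};

static bool word_char(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Stage 1 (lexer): split the input into tokens, skipping whitespace.
std::vector<Token> lex(const std::string &in) {
  std::vector<Token> out;
  std::size_t i = 0;
  while (i < in.size()) {
    unsigned char c = static_cast<unsigned char>(in[i]);
    if (std::isspace(c)) { ++i; continue; }
    std::size_t start = i;
    Kind kind = Kind::Symbol;
    if (std::isalpha(c)) {
      kind = Kind::Identifier;
      while (i < in.size() && word_char(in[i])) ++i;
    } else if (std::isdigit(c)) {
      kind = Kind::Number;
      while (i < in.size() &&
             (std::isdigit(static_cast<unsigned char>(in[i])) || in[i] == '.')) ++i;
    } else {
      ++i;                                   // single-character symbol
    }
    assert(i > start);                       // every branch consumes input
    out.push_back({kind, in.substr(start, i - start)});
  }
  return out;
}

struct Instance {
  std::string variable, klass, value;
};

// Stage 2 (parser): identifier '=' identifier '(' number? ')'
bool parse_instance(const std::vector<Token> &t, Instance &result) {
  std::size_t i = 0;
  auto is_kind = [&](Kind k) { return i < t.size() && t[i].kind == k; };
  auto is_symbol = [&](char c) {
    return is_kind(Kind::Symbol) && t[i].text[0] == c;
  };

  if (!is_kind(Kind::Identifier)) return false;
  result.variable = t[i++].text;
  if (!is_symbol('=')) return false;
  ++i;
  if (!is_kind(Kind::Identifier)) return false;
  result.klass = t[i++].text;
  if (!is_symbol('(')) return false;
  ++i;
  result.value.clear();
  if (is_kind(Kind::Number)) result.value = t[i++].text;
  if (!is_symbol(')')) return false;
  ++i;
  return i == t.size();                      // no trailing tokens allowed
}
```

With this split, parse_instance(lex("m1 = Metro(120)"), inst) fills inst with m1, Metro and 120. The parser never looks at individual characters, so the token-capturing question only concerns the lexer.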

Adrian

-----Original Message-----
From: Gaspard Bucher <gaspard at teti.ch>

Date: Wed, 31 Oct 2007 07:58:21 
To:ragel-users <ragel-users at googlegroups.com>
Subject: [ragel-users] Is this the right way to do it ?



I am implementing a parser to read commands from the user (interactive) or
from a stored file. The idea is to build the objects and their
relations inside rubyk (http://rubyk.org). Some examples of the syntax:

create a metronome object:  m1 = Metro(120)
create a metronome object:  m1 = Metro(metro:120) # same as above
create a note out object:   n  = NoteOut(velocity:80 port:"funk")
create a script object:     cooking = Script(".... Lua code ....")
create links:               m1.1 => 1.cooking, cooking.1 => 1.n

Here is a rough prototype that implements the parsing using ragel (I have
been using flex/lemon).

Am I doing this right? More precisely:
1. Is there a better way to extract token values (instead of by
repeated @a appends)?
2. Would it be simpler to use ragel only for building the tokens and
let lemon handle the actions?

Thanks for your answers.

Gaspard

=================== prototype.rl ========
#include <iostream>
#include <cstdio>
#include <cstdlib>  // atoi
#include <cstring>  // strlen
#include <string>
#define MAX_BUFFER_SIZE 2048

%%{
  machine foo;
  write data noerror;
}%%

class Command
{
public:
  void parse(const char * str)  // const: main() passes a string literal
  {
    const char *p = str; // data pointer
    const char *pe = str + strlen(str); // past end
    int cs;        // machine state
    int len = 0;
    char token[MAX_BUFFER_SIZE + 1];

    %%{
      action a {
        if (len >= MAX_BUFFER_SIZE) {
          std::cerr << "Buffer overflow !" << std::endl;
          // stop parsing
          return;
        }
        token[len] = fc; /* append */
        len++;
      }

      action set_var {
        token[len] = '\0';
        mVariable = token;
        len = 0;
      }

      action key {
        token[len] = '\0';
        std::cout << "[key   :" << token << "]" << std::endl;
        len = 0;
      }

      action set_klass {
        token[len] = '\0';
        mClass = token;
        len = 0;
      }

      action space {
        printf(" ");
      }

      action ret {
        printf("\n");
      }

      action set_string {
        token[len] = '\0';
        mValue = token;
        len = 0;
      }

      action set_float {
        token[len] = '\0';
        mValue = token;
        len = 0;
      }

      action set_integer {
        token[len] = '\0';
        mValue = token;
        len = 0;
      }

      action set_from {
        mFromPort = atoi(mValue.c_str());
        mFrom = mVariable;
      }

      action create_instance {
        std::cout << "NEW  (" << mVariable << "=" << mClass << "()"
                  << ")" << std::endl;
      }

      action create_link {
        mToPort = atoi(mValue.c_str());
        mTo   = mVariable;
        std::cout << "LINK (" << mFrom << "." << mFromPort << "=>"
                  << mToPort << "." << mTo << ")" << std::endl;
      }

      ws     = (' ' | '\n' | '\t')+;

      identifier = 'a'..'z' @a (digit | alpha | '_')* @a;

      var    = identifier %set_var;

      klass  = 'A'..'Z' @a (digit | alpha | '_')* @a %set_klass;

      string  = '"' ([^"\\] | '\n' | ( '\\' (any | '\n') ))* @a %set_string '"';
      float   = ('1'..'9' @a digit* @a '.' @a digit+ @a) %set_float;
      integer = ('1'..'9' @a digit* @a) %set_integer;

      value  = (string | float | integer);

      key    = identifier %key;

      param  = (key ':' ws* value);

      parameters = value | (param ws*)+;

      create_instance = var ws* '=' ws* klass '(' parameters? ')' @create_instance;

      create_link = var '.' integer @set_from ws* '=>' ws* integer '.' var @create_link;

      main := ((create_instance | create_link) ws*)+  ;

      write init;
      write exec;
    }%%

    printf("\n");
  }
private:
  std::string mVariable, mFrom, mTo, mClass, mValue;
  int         mFromPort,     mToPort;
};

int main()
{
  Command cmd;
  cmd.parse("a=Value() b=Super(23.3)c=This(hey:\"mosdffasl\" come: 3)\n"
            "a.1=>1.b a.2=>2.b");
}
===========================





