[ragel-users] Detect keywords with a ragel scanner
Alec Tica
alexandru.tica at gmail.com
Thu Jul 14 21:20:42 UTC 2011
Hi,
I'm new to Ragel and I'm trying to solve what appears to be a very
simple problem. Let's say I have the following text:
"select 1 from dual;select 2 from dual;/*comment*/select 3 from dual;select"
I want to detect all "select" keywords using a scanner, taking word
boundaries into account. "select" is a keyword only if:
1. it starts at the very beginning of the text, or is preceded by
whitespace, a comment, or a statement separator (;)
2. it ends at the very end of the text, or is followed by whitespace, a
comment, or a statement separator (;)
3. it is not within quotes
4. it is not part of a comment
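To make the four rules concrete, here is a minimal sketch in plain Ruby, independent of Ragel. It is only an illustration under stated assumptions: the helper name find_select_keywords is hypothetical, it relies on StringScanner from Ruby's standard library, comments are assumed to be /* ... */ or -- to end of line, quoted strings are assumed to contain no escape sequences, and offsets are byte offsets (so ASCII input is assumed).

```ruby
require 'strscan'

# Hypothetical helper (not part of Ragel): returns [start, end) byte offsets
# of every standalone "select" keyword in +text+, following rules 1-4 above.
def find_select_keywords(text)
  scanner = StringScanner.new(text)
  hits = []
  boundary_before = true # rule 1: the start of the text counts as a boundary

  until scanner.eos?
    if scanner.scan(/'[^']*'|"[^"]*"/)
      # rule 3: skip quoted strings; a closing quote is not a word boundary
      boundary_before = false
    elsif scanner.scan(%r{/\*.*?\*/}m) || scanner.scan(/--[^\n]*/)
      # rule 4: skip comments; a comment does count as a boundary
      boundary_before = true
    elsif scanner.scan(/[\s;]+/)
      # whitespace and statement separators are boundaries
      boundary_before = true
    elsif boundary_before && scanner.scan(%r{select(?=\s|;|/\*|--|\z)})
      # rules 1 and 2: boundary before, and the lookahead checks the boundary after
      hits << [scanner.pos - scanner.matched_size, scanner.pos]
      boundary_before = false
    else
      # any other character: consume it and remember we are inside a word
      scanner.getch
      boundary_before = false
    end
  end
  hits
end
```

For the string 'unselect 1 from dual;select 2 from dual;/*comment*/select 3 from dual;select', this sketch returns [[21, 27], [51, 57], [70, 76]]; the "select" inside "unselect" is skipped because no boundary precedes it.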
So far I have:
<code>
%%{
  machine example;

  action is_eof {
    true if p == eof - 1
  }

  # eof
  EOF = zlen when is_eof;

  # strings
  squoted_string = ['] ( (any - [''])** ) ['];
  dquoted_string = '"' ( any )* :>> '"';

  # comments
  ml_comment = '/*' ( any )* :>> '*/';
  sl_comment = '--' ( any )* :>> ('\n' | EOF);
  comment = ml_comment | sl_comment;

  tail = space | comment | ';' | EOF;

  # keyword
  select = 'select' . tail;

  main := |*
    squoted_string;
    dquoted_string;
    comment;
    select => { puts "found at #{ts}-#{te}" };
    any;
  *|;
}%%
%% write data;
data = 'unselect 1 from dual;select 2 from dual;/*comment*/select 3 from dual;select'
# convert the provided string into a stream of chars
stream_data = data.unpack("c*") if(data.is_a?(String))
eof = stream_data.length
%% write init;
%% write exec;
</code>
Of course, the above scanner incorrectly matches the "select" inside
"unselect" in the data. In any case, I feel that I'm not on the right
track, so I'd like to ask for your advice.
Many thanks in advance!
--
talek