<div> </div><div>All, help. I've R'd TFM all week trying to figure this out, but I'm still confused (so please pardon the potential n00bness). </div><div> </div><div>I have to parse a config file for an app I'm working on. The file looks basically like this:</div>
<div>group MyGroup {</div><div> tcpclient( host: foo, port: 49152 );</div><div> udp( host: bar, port: 49152 ) > tcpserver( port: 11111 );</div><div> udp:foo:49152.nonblocking = true;</div><div>}</div><div> </div><div>
From what I've read on the Intertubes, it seems that the SOP for processing this is to define a main := that matches a particular line of the text and then, upon matching, calls another machine to "scan" that text. However, I'm not sure how to do that, because regardless of whether I define main as a matcher or a scanner, the parser always seems to consume the text as it matches.</div><div> </div><div>For instance, when I parse the group definition, I can simply match on the word "group", pass the rest of the line (up to the {) into the scanner, and get 'MyGroup' out relatively easily. However, when I try to parse the first encapsulated line, I don't know whether I'm dealing with a statement of the first-line form or the third-line form (or whether the command is "chained" as in the second line) until I've done a Kleene star match of the entire line (up to the ;). By that point the parser has already consumed the entire line, so when I pass it into a scanner the pointers are already at the next line.</div><div> </div><div>Do I need to store the starting pointer before the first main scan (and if so, how?), and how would I then tell the downstream scanner where to start? I thought of making a number of nested C++ "parser objects", but that just seems inherently wrong.</div>
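<div> </div><div>To make the question concrete, here is a hand-written C++ sketch (no Ragel involved) of the behaviour I'm after: remember where a statement starts, run forward to the ;, and only then hand the saved span to a second pass that decides which of the three forms it is. All names (StmtKind, Statement, classify, split_statements) are made up for illustration, and it assumes a ; never appears inside a quoted string.</div>

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical names throughout; none of this is Ragel API.
enum class StmtKind { Instantiation, Chain, Assignment };

struct Statement {
    StmtKind kind;
    std::string text;  // the statement's characters, without the trailing ';'
};

// Second pass: inspects only the saved span [begin, end).
// A '>' or '=' outside parentheses decides the form.
StmtKind classify(const char* begin, const char* end) {
    assert(begin <= end);
    int depth = 0;  // current parenthesis nesting level
    for (const char* q = begin; q != end; ++q) {
        if (*q == '(') {
            ++depth;
        } else if (*q == ')') {
            --depth;
        } else if (depth == 0 && *q == '>') {
            return StmtKind::Chain;
        } else if (depth == 0 && *q == '=') {
            return StmtKind::Assignment;
        }
    }
    return StmtKind::Instantiation;
}

// First pass: walks the group body once, saving the start pointer of each
// statement so the whole span can be re-examined after the ';' is seen.
// Assumption: ';' does not occur inside quoted strings.
std::vector<Statement> split_statements(const std::string& body) {
    std::vector<Statement> out;
    const char* p = body.data();
    const char* pe = p + body.size();
    const char* stmt_start = nullptr;  // saved start of the current statement

    for (; p != pe; ++p) {
        if (stmt_start == nullptr) {
            if (std::isspace(static_cast<unsigned char>(*p))) {
                continue;  // skip whitespace between statements
            }
            stmt_start = p;  // remember where this statement began
        }
        if (*p == ';') {
            // "Rewind": the second pass sees the statement from its start.
            out.push_back({classify(stmt_start, p),
                           std::string(stmt_start, p)});
            stmt_start = nullptr;
        }
    }
    return out;
}
```

<div>Feeding the three lines from the group body above through split_statements yields an Instantiation, a Chain, and an Assignment, in that order. What I can't work out is how to get the equivalent "go back to the saved start" step between two Ragel machines.</div>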
<div> </div><div>Below is what I've written so far--just enough to hopefully pass the first two cases. Again, I don't know if I'm only a character or so off or if my mindset is completely off. Any help would be appreciated.</div>
<div> </div><div>--<br clear="all">Rob Harris<br> Technological Pragmatist<br> rob period harris shift-2 gmail decimal-point com<br> "The universe tends towards maximum irony." --Jamie Zawinski<br>
</div><div> </div><div><font face="courier new,monospace"> %%{ <br> machine sas_scanner;</font></div><div><font face="courier new,monospace"> ml_comment = '/*' ( any )* :>> '*/';<br> sl_comment = '//' [^\n]* '\n';<br>
 comment = ml_comment | sl_comment;<br> wspace = comment | space+ ;</font></div><div><font face="courier new,monospace"> integer = [0-9]+;<br> float = [0-9]+ '.' [0-9]+;<br> identifier = [a-zA-Z][a-zA-Z0-9]*;<br>
fqsm = [a-zA-Z] ( [a-zA-Z0-9:][a-zA-Z0-9_] )*; <br> sqstring = '\'' [^\n]* :>> '\'';<br> dqstring = '\"' [^\n]* :>> '\"';<br> strvalue = ( integer | float | identifier | sqstring | dqstring );</font></div>
<div><font face="courier new,monospace"> action DEBUG { fprintf( stderr, "state: %4d, char: %c\n", cs, *p ); }</font></div><div><font face="courier new,monospace"> action RESET { reset(); }<br> action CRLF { std::cout << std::endl << std::endl; }</font></div>
<div><font face="courier new,monospace"> action NAME { m_name.append( 1, fc ); }<br> action KEY { m_key.append( 1, fc ); }<br> action VAL { m_val.append( 1, fc ); }<br> action QKV <br> { <br> printf( "[%s]=>[%s]\n", m_key.c_str(), m_val.c_str());<br>
m_kvMap[ m_key ] = m_val;<br> m_key.clear();<br> m_val.clear();<br> } <br> action SNAME { printf( "NAME: [%s]\n", m_name.c_str() ); }</font></div><div><font face="courier new,monospace"> kvpair = ( identifier space* ':' space* strvalue );</font></div>
<div><font face="courier new,monospace"> kvlist = ( space+ | kvpair | ',' space+ kvpair );</font></div><div><font face="courier new,monospace"> instantiation = ( identifier '(' kvlist* ')' );</font></div>
<div><font face="courier new,monospace"></font> </div><div><font face="courier new,monospace"> instantiation_chain = ( <br> instantiation $NAME ( space* '>' space* instantiation )*<br> ) $NAME >RESET ';' @SNAME;</font></div>
<div><font face="courier new,monospace"></font> </div><div><font face="courier new,monospace"> inst_chain_scanner :=<br> |* <br> space+;<br> identifier => { diff(); };<br> strvalue => { diff(); };<br>
*|; </font></div><div><font face="courier new,monospace"></font> </div><div><font face="courier new,monospace"> group_name = ( 'g' 'r' 'o' 'u' 'p' );</font></div><div><font face="courier new,monospace"> group_id = ( identifier - group_name ) @NAME;</font></div>
<div><font face="courier new,monospace"> group_line = ( group_name space+ group_id :>> space* '{' );</font></div><div><font face="Courier New"></font> </div><div><font face="courier new,monospace"> group_scanner :=<br>
|* <br> space+ => { m_name.clear(); };<br> group_name;<br> group_id => { printf( ">> %s\n", m_name.c_str() ); };<br> '{' => { fret; };<br> *|; </font></div><div>
<font face="Courier New"></font> </div><div><font face="courier new,monospace"> main :=<br> |* <br> wspace+;<br> group_name => { fcall group_scanner; };<br> instantiation_chain => { fcall inst_chain_scanner; };<br>
*|; <br></font></div>