<html><body><div style="color:#000; background-color:#fff; font-family:arial, helvetica, sans-serif;font-size:12pt"><div style="font-family:arial, helvetica, sans-serif;font-size:16px;">I really
like Ragel. I have a question about the best way to implement a parser
for the following format, which can be repeated within a given file:<br></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><br></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;">ZCZC</div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><br></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;">&lt;well formatted fixed width text&gt; = meta_data machine<br></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><br></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;">ZEM</div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><br></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;">ID:</div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"> &lt;any <span class="yiv1447488596tab">punct. etc</span>&gt;</div><div style="font-family:arial, helvetica,
sans-serif;font-size:16px;"><br></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;">IDXXX:</div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><br></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><span class="yiv1447488596tab"> &lt;any, characters, punct. etc&gt;</span></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><br><span class="yiv1447488596tab"></span></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><span class="yiv1447488596tab">ID WITH SPACE:</span></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><br><span class="yiv1447488596tab"></span></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><span class="yiv1447488596tab"> &lt;any-text&gt;<br></span></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><br></div><div
style="font-family:arial, helvetica, sans-serif;font-size:16px;">NNNN<br></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><br></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;">A very simplified version of my machine is:</div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;">report = ('ZCZC' meta_data 'ZEM' body :&gt;&gt; 'NNNN');</div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;">main := report*;</div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><br></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;">For
the body machine, I am struggling to define a machine that captures the
identifier (always at the start of a line and followed by a ':' character) and reads
until the next occurrence of any identifier. Any given identifier may or may
not be present; for example, one file may have ID and IDXXX, and the next may
have ID and ID WITH SPACE. Really, I'm just looking for text at the
beginning of a line that ends with a ':' character. It's challenging because there is
no way to tell when I'm done reading "any" and am starting a new ID
&lt;any&gt; block.</div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><br></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;">Thanks
for any help or insight you may provide. I was thinking that a scanner
may be the only way to handle this type of input, where I scan for the tokens
and read in my host language until the next token, but this seems sorta
"hack-tackulous" <br></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><br></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;">My
scanner would be the following (which I have tested and which seems to work), and then I
would use smaller machines to further process down the input</div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><br></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;">action token {</div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><span class="yiv1447488596Apple-tab-span" style="white-space:pre;"> </span>// read line by line until line that starts w/ '\n' [A-Z ] ':' is reached</div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><span class="yiv1447488596Apple-tab-span" style="white-space:pre;"> </span>// insert each line in buffer/some key value pair map etc.</div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;">}</div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><br></div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;">body := |*</div><div style="font-family:arial, helvetica,
sans-serif;font-size:16px;"><span class="yiv1447488596Apple-tab-span" style="white-space:pre;"> </span>'\n' [A-Z ]{3,10} ':' @token; # the {3,10} is because tokens at the beginning of a line are no longer than 10 characters</div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;">*|;</div><div style="font-family:arial, helvetica, sans-serif;font-size:16px;"><br></div>Dan</div></body></html>