[ragel-users] ragel and encodings
Adrian Thurston
thurston at complang.org
Tue May 26 02:47:14 UTC 2009
Some people express multibyte sequences directly in ragel with a char or
unsigned char alphtype. There is a contributed script in examples called
unicode2ragel.rb that generates ragel definitions for ranges of Unicode
code points in UTF-8 or UCS-4.
As a side note, it should probably be in contrib. I'm going to move that
now for anyone following the SVN directly.
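
To give a rough idea of what expressing multibyte sequences directly looks
like, here is a minimal sketch in Python that turns a range of code points
into a ragel definition made of UTF-8 byte ranges. This is not
unicode2ragel.rb and makes no claim to match its output; the names
utf8_ranges and to_ragel are made up for illustration, and the generated
definition assumes an unsigned char alphtype.

```python
def utf8_ranges(lo, hi):
    """Yield lists of (low_byte, high_byte) pairs covering code points lo..hi."""
    if lo > hi:
        return
    # Surrogates (U+D800..U+DFFF) have no UTF-8 encoding, so step around them.
    if lo <= 0xDFFF and hi >= 0xD800:
        yield from utf8_ranges(lo, 0xD7FF)
        yield from utf8_ranges(0xE000, hi)
        return
    # Split wherever the encoded length changes (1, 2, 3 or 4 bytes).
    for last in (0x7F, 0x7FF, 0xFFFF):
        if lo <= last < hi:
            yield from utf8_ranges(lo, last)
            yield from utf8_ranges(last + 1, hi)
            return
    # Split until each continuation byte covers one contiguous span.
    n = len(chr(lo).encode("utf-8"))
    for i in range(1, n):
        m = (1 << (6 * i)) - 1  # mask for the low i continuation bytes
        if (lo & ~m) != (hi & ~m):
            if (lo & m) != 0:
                yield from utf8_ranges(lo, lo | m)
                yield from utf8_ranges((lo | m) + 1, hi)
                return
            if (hi & m) != m:
                yield from utf8_ranges(lo, (hi & ~m) - 1)
                yield from utf8_ranges(hi & ~m, hi)
                return
    # Now the range is a plain product of per-byte ranges.
    yield list(zip(chr(lo).encode("utf-8"), chr(hi).encode("utf-8")))


def to_ragel(name, lo, hi):
    """Format the byte ranges as one ragel definition (hypothetical helper)."""
    def item(a, b):
        return f"0x{a:02X}" if a == b else f"0x{a:02X}..0x{b:02X}"
    alts = [" ".join(item(a, b) for a, b in seq) for seq in utf8_ranges(lo, hi)]
    return f"{name} = " + "\n    | ".join(alts) + ";"


if __name__ == "__main__":
    # Greek and Coptic block, U+0370..U+03FF.
    print(to_ragel("greek", 0x0370, 0x03FF))
```

The printed definition can be pasted into a machine that declares
alphtype unsigned char. With the default char alphtype the bytes above 0x7F
would need to be written as signed values instead.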
-Adrian
Robert Lemmen wrote:
> On Thu, May 21, 2009 at 11:34:35AM -0400, Wil Macaulay wrote:
>> Depends on your platform, but my approach to this problem (on the Mac)
>> was to detect the encoding, and convert to UTF-8 before parsing. I also
>> converted line-endings (\r\n -> \n) and ensured a newline at the end of
>> the data at the same time.
>
> how do you handle utf-8 in your ragel code? do you use a single-byte
> alphtype and then handle the utf-8 sequences manually?
>
> cu robert
> ------------------------------------------------------------------------
>
> _______________________________________________
> ragel-users mailing list
> ragel-users at complang.org
> http://www.complang.org/mailman/listinfo/ragel-users