[DFDL-WG] ICU Parsing Behaviour
Richard Schofield
richard_schofield at uk.ibm.com
Tue Sep 11 10:14:30 EDT 2012
Below is a brief summary of ICU's number parsing behaviour (data
collected using ICU4C 49.1.2).
The following data will be parsed successfully regardless of the pattern
or strict/lenient parsing mode:
  - Digits 0-9
  - Decimal separator
  - Exponent separator
Grouping separators are permitted in the data only if specified in the
pattern.
When in STRICT mode the following will cause a parse failure (using a
pattern of "#,##0.#"):
  - Leading or doubled grouping separators:
    ',123' and '1,,234' fail.
  - Groups of incorrect length when grouping is used:
    '1,23' and '1234,567' fail, but '1234' passes.
  - Grouping separators used in numbers followed by exponents:
    '1,234E5' fails, but '1234E5' and '1,234E' pass ('E' is not treated
    as an exponent when it is not followed by a number).
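As a rough illustration of the strict-mode grouping checks above, here is a small Python model. It is not ICU code and does not call ICU. The function name strict_grouping_ok and its parameters are made up for this sketch, and the rule that the leading group may hold at most three digits is inferred from the '1234,567' example rather than taken from ICU documentation. It only looks at grouping, so input without grouping separators is accepted without further checks.

```python
import re


def strict_grouping_ok(text, group=",", decimal=".", exponent="E", size=3):
    """Toy model of the strict-mode grouping rules listed above (not ICU code)."""
    mantissa, sep, tail = text.partition(exponent)
    # 'E' counts as an exponent only when a number follows it.
    has_exponent = bool(sep) and re.fullmatch(r"[+-]?\d+", tail) is not None

    integer_part = mantissa.partition(decimal)[0]
    if group not in integer_part:
        return True  # grouping not used, so no grouping rule applies
    if has_exponent:
        return False  # grouping separators combined with an exponent

    groups = integer_part.split(group)
    first, rest = groups[0], groups[1:]
    # An empty first group is a leading separator; an empty later group
    # is a doubled separator. Both are rejected by the length checks.
    # Assumption: the first group may hold 1..size digits.
    if not (first.isdigit() and 1 <= len(first) <= size):
        return False
    return all(g.isdigit() and len(g) == size for g in rest)


if __name__ == "__main__":
    for sample in [",123", "1,,234", "1,23", "1234,567", "1234",
                   "1,234E5", "1234E5", "1,234E"]:
        print(repr(sample), "pass" if strict_grouping_ok(sample) else "fail")
```

Running the script prints pass or fail for each of the examples quoted above, which makes it easy to compare the model against real ICU output if you have a build to hand.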
'Other' characters in the pattern must be present in the data. For
example, with a pattern of 'z'##, z12 will parse successfully, whereas
12 will parse successfully only in lenient mode.
Regards
Richard