[DFDL-WG] Action 313: Plus '+' sign and lax textNumberCheckPolicy

Mike Beckerle mbeckerle.dfdl at gmail.com
Thu Aug 29 14:12:59 EDT 2019


Looks like ICU changed behavior.

From: Steve Lawrence <slawrence at apache.org>
Sent: Thursday, August 29, 2019 1:30 PM
To: users at daffodil.apache.org
Subject: Re: Plus '+' sign and lax textNumberCheckPolicy - was: Re: How to
model a fixed-length integer that may be padded with space on the left?

I think this is a difference in ICU version?

After a little grepping through the ICU source, I found a change [1] to
their number parsing logic from Dec 2017:

+        if (!isStrict) {
+            parser.addMatcher(WhitespaceMatcher.getInstance());
+            parser.addMatcher(new PlusSignMatcher());
+        }

That looks to me like a change that makes plus signs always match in
lax/lenient mode, regardless of the pattern (Daffodil's current
behavior). A couple of minor changes have been made to that section
since, but nothing that lets you turn it off when lenient is on.
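
To make the effect of that change concrete, here is a rough toy model in
plain Java. It is not ICU's code and does not use ICU's API: the class
LaxPlusSketch, its methods, and the patternHasPlus flag are all made up
for illustration, and trimming whitespace is only a crude stand-in for
WhitespaceMatcher. It just mirrors the idea that a '+' matcher is added
whenever the parser is not strict, whether or not the pattern mentions a
plus sign.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalLong;

// Toy model only: hypothetical names, not ICU4J classes or behavior in full.
public class LaxPlusSketch {

    // Sign characters the (hypothetical) parser accepts before the digits.
    public static List<Character> signMatchers(boolean patternHasPlus, boolean strict) {
        List<Character> matchers = new ArrayList<>();
        matchers.add('-');
        if (patternHasPlus) {
            matchers.add('+');
        }
        // Mirrors the quoted change: in lax mode '+' is added regardless of the pattern.
        if (!strict && !matchers.contains('+')) {
            matchers.add('+');
        }
        return matchers;
    }

    // Parses an optionally signed run of ASCII digits; empty result means "no parse".
    public static OptionalLong parse(String text, boolean patternHasPlus, boolean strict) {
        // Assumption: lax mode skips surrounding whitespace, strict mode does not.
        String s = strict ? text : text.trim();
        if (s.isEmpty()) {
            return OptionalLong.empty();
        }
        boolean negative = false;
        char first = s.charAt(0);
        if (first == '+' || first == '-') {
            if (!signMatchers(patternHasPlus, strict).contains(first)) {
                return OptionalLong.empty();
            }
            negative = (first == '-');
            s = s.substring(1);
        }
        if (s.isEmpty() || !s.chars().allMatch(c -> c >= '0' && c <= '9')) {
            return OptionalLong.empty();
        }
        try {
            long value = Long.parseLong(s);
            return OptionalLong.of(negative ? -value : value);
        } catch (NumberFormatException e) {
            // Too many digits to fit in a long.
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        // Pattern has no '+': lax mode accepts "+123", strict mode rejects it.
        System.out.println(parse("+123", false, false));
        System.out.println(parse("+123", false, true));
    }
}
```

In this model the only way to reject a leading '+' is to parse strictly,
which matches what I read in the ICU source: there is no separate switch
for it once lenient is on.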

It's hard to tell in the git history what release that was in, but it
looks like around version 61, which is relatively new (Daffodil is on
version 62).

Also, the latest version of DecimalFormatProperties.java (it looks to be
an internal implementation class, so there are no online javadocs) has a
javadoc comment stating that plus signs are always allowed in
lenient/lax mode [2].

I think this is a change in ICU behavior in newer versions.

- Steve

[1]
https://github.com/unicode-org/icu/commit/68340c8464bd988477d6c88f46f9dfe4562a6d02#diff-565b07c255337881b4e06f766691667cR119-R122
[2]
https://github.com/unicode-org/icu/blob/master/icu4j/main/classes/core/src/com/ibm/icu/impl/number/DecimalFormatProperties.java#L53-L54