[DFDL-WG] Clarification needed: textNumberFormat quoting and prefix/suffix

Mike Beckerle mbeckerle at apache.org
Wed Jan 18 12:10:29 PST 2023


Yes, all Daffodil releases through 3.4.0 know nothing about the P or V
characters, and so do not require them to be quoted.
(We need to release note this, as our textNumberPattern property is
actually changing in a non-backward compatible way: people used to be
able to put P/V in without quoting them.)

I hope that as of Daffodil 3.5.0 we'll have P and V fully implemented. From
then on, regardless of whether a pattern actually uses P or V, the pattern
must still be scanned and any P/V outside of the numeric part must be quoted.
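
To illustrate the scanning step, here is a minimal sketch in Scala. It is
not Daffodil's actual validator: the object name PatternScan, the function
unquotedSpecials, and the choice of E, P, and V as the special set are all
assumptions made for illustration. It only reports special characters that
sit outside quotes; a real validator would additionally have to accept them
where they legitimately appear in the numeric part.

```scala
// Hypothetical helper for illustration only; not part of Daffodil or ICU.
object PatternScan {
  // Assumed special set: E from the ICU pattern language, plus the
  // P and V characters discussed in this thread.
  val specials: Set[Char] = Set('E', 'P', 'V')

  /** Returns (index, char) for each special character found outside quotes. */
  def unquotedSpecials(pattern: String): Vector[(Int, Char)] = {
    val found = Vector.newBuilder[(Int, Char)]
    var inQuote = false
    var i = 0
    while (i < pattern.length) {
      val c = pattern.charAt(i)
      if (c == '\'') {
        if (i + 1 < pattern.length && pattern.charAt(i + 1) == '\'') {
          // A doubled quote is a literal quote; the quoting state is unchanged.
          i += 1
        } else {
          inQuote = !inQuote
        }
      } else if (!inQuote && specials.contains(c)) {
        found += ((i, c))
      }
      i += 1
    }
    found.result()
  }
}
```

With this sketch, PatternScan.unquotedSpecials("'P'#0.0#;'N'#0.0#") returns
an empty Vector, while the same pattern with the quotes removed reports the
P at index 0.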

We will need to figure out the spec changes for this clarification, as the
language in the spec suggests single chars only.

And for Zoned, I assume we are allowing only the +/- signs to mark the
position of the overpunched sign, and no quoted strings or anything similar.

On Wed, Jan 18, 2023 at 11:52 AM Steve Hanson <smh at uk.ibm.com> wrote:

> Hi Mike
>
> Using IBM DFDL I can enter a pattern like this as a prefix, and it
> validates and parses. Suffix works the same. Any slight deviation, such as
> failing to quote a P, E, or V, gives a validation error.
>
> 'POSITIVE'##0.##;N'E'GATI'VE'##0.0##
>
> I can even use a pair of quotes inside a quoted string, to indicate a
> literal single quote.
>
> My conclusion is that this is allowed by ICU, and that DFDL spec should
> allow it as well.
>
> Btw I think there is a bug in Daffodil's pattern validator, as P and V are
> also special characters and therefore need escaping.
>
> Regards
>
> Steve Hanson
>
> IBM Integration, Hursley, UK
> Architect, IBM DFDL
> Co-Chair, OGF DFDL Working Group
> smh at uk.ibm.com
> tel:+44-7717-378890
> Note: I work Tuesday to Friday
>
> -----Original Message-----
> *From*: Mike Beckerle <mbeckerle at apache.org>
> *Reply-To*: mbeckerle at apache.org
> *To*: DFDL-WG <dfdl-wg at ogf.org>
> *Subject*: [EXTERNAL] [DFDL-WG] Clarification needed: textNumberFormat
> quoting and prefix/suffix
> *Date*: Wed, 11 Jan 2023 17:44:22 -0500
>
> I created an ICU DecimalFormat string where the positive sign is the word
> POSITIVE, and the negative sign is the word NEGATIVE.
>
> Per unit test below, this works in ICU, so long as they are escaped,
> because "E" is a special character in the ICU pattern language.
>
> But...  I can find no example of this online for ICU nor DFDL.
>
> All examples are of a sign being just a single character, or a pair of
> single characters before and after for negative values.
>
> So my question: Do we intend DFDL to support this full generality, or only
> single characters for the prefix/suffix regions of the textNumberPattern?
>
> Right now I think Daffodil requires the sign to be a single character,
> which can be one of the special characters only by quoting it individually.
> E.g., if I wanted the positive sign to be P and negative sign to be N, then
> I'd use textNumberPattern:
>
> 'P'#0.0#;'N'#0.0#
>
> I don't have to quote the N, but I did just for consistency.
>
> If we want the full generality, I'd appreciate a realistic motivating
> example of it.
>
> /**
>  * This test shows that quotes in ICU text number patterns, and hence in
>  * DFDL text number patterns, can enclose strings, not just individual
>  * characters.
>  *
>  * I found no examples of this anywhere. Everything shows patterns where
>  * quotes surround only single characters, or are doubled up for self quoting as in
>  * the o''clock example.
>  */
> @Test def testICUDecimalFormatQuoting_01(): Unit = {
>   {
>     // Note that E is a pattern special character
>     val pattern = "'POSITIVE' #.0###;'NEGATIVE' #.0###"
>     val df = new DecimalFormat(pattern)
>     val actual = df.format(-6.847)
>     assertEquals("NEGATIVE 6.847", actual)
>
>   }
>
>   {
>     // here we quote only the E, not the other characters.
>     val pattern = "POSITIV'E' #.0###;NEGATIV'E' #.0###"
>     val df = new DecimalFormat(pattern)
>     val actual = df.format(6.847)
>     assertEquals("POSITIVE 6.847", actual)
>   }
> }
>
>
> Mike Beckerle
> Apache Daffodil PMC | daffodil.apache.org
> OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
> Owl Cyber Defense | www.owlcyberdefense.com