[DFDL-WG] Clarification needed: textNumberFormat quoting and prefix/suffix

Steve Hanson smh at uk.ibm.com
Wed Jan 18 08:52:07 PST 2023


Hi Mike

Using IBM DFDL, I can enter a pattern like the one below, and it both validates and parses. A suffix works the same way. Any deviation, such as leaving a P, E, or V unquoted, gives a validation error.

'POSITIVE'##0.##;N'E'GATI'VE'##0.0##

I can even use a pair of quotes inside a quoted string, to indicate a literal single quote.

My conclusion is that this is allowed by ICU, and that the DFDL spec should allow it as well.

Btw I think there is a bug in Daffodil's pattern validator, as P and V are also special characters and therefore need escaping.
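
To make the quoting rule concrete, here is a minimal sketch (not IBM DFDL's or Daffodil's actual validator) of a quote-aware scan over a prefix or suffix. It assumes the special letters are E, P, and V, as discussed in this thread, and that a doubled single quote stands for a literal quote. The names AffixQuoting, specials, and unquotedSpecials are hypothetical.

```scala
object AffixQuoting {
  // Assumption: letters treated as special in a DFDL textNumberPattern affix.
  // E is the ICU exponent character; P and V are the ones mentioned in this thread.
  val specials: Set[Char] = Set('E', 'P', 'V')

  /** Returns (index, char) for each special letter found outside quotes.
    * A doubled quote ('') is a literal single quote and does not change
    * the quoting state.
    */
  def unquotedSpecials(affix: String): List[(Int, Char)] = {
    val found = List.newBuilder[(Int, Char)]
    var inQuote = false
    var i = 0
    while (i < affix.length) {
      val c = affix.charAt(i)
      if (c == '\'') {
        if (i + 1 < affix.length && affix.charAt(i + 1) == '\'') {
          i += 1 // skip the second quote of the pair
        } else {
          inQuote = !inQuote // an unpaired quote opens or closes a literal run
        }
      } else if (!inQuote && specials.contains(c)) {
        found += ((i, c))
      }
      i += 1
    }
    found.result()
  }
}
```

Under these assumptions, AffixQuoting.unquotedSpecials("N'E'GATI'VE'") returns an empty list, while AffixQuoting.unquotedSpecials("POSITIV'E'") returns List((0,P), (6,V)), which is the case where only the E is quoted.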

Regards

Steve Hanson

IBM Integration, Hursley, UK
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
smh at uk.ibm.com
tel:+44-7717-378890
Note: I work Tuesday to Friday

-----Original Message-----
From: Mike Beckerle <mbeckerle at apache.org>
Reply-To: mbeckerle at apache.org
To: DFDL-WG <dfdl-wg at ogf.org>
Subject: [EXTERNAL] [DFDL-WG] Clarification needed: textNumberFormat quoting and prefix/suffix
Date: Wed, 11 Jan 2023 17:44:22 -0500

I created an ICU DecimalFormat string where the positive sign is the word POSITIVE, and the negative sign is the word NEGATIVE.

Per unit test below, this works in ICU, so long as they are escaped, because "E" is a special character in the ICU pattern language.

But I can find no example of this online for either ICU or DFDL.

All examples are of a sign being just a single character, or a pair of single characters before and after for negative values.

So my question: Do we intend DFDL to support this full generality, or only single characters for the prefix/suffix regions of the textNumberPattern?

Right now I think Daffodil requires the sign to be a single character, which can be one of the special characters only by quoting it individually. E.g., if I wanted the positive sign to be P and negative sign to be N, then I'd use textNumberPattern:

'P'#0.0#;'N'#0.0#

I don't have to quote the N, but I did just for consistency.
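
As a rough illustration of the single-character case, the sketch below formats and parses with that pattern. It uses the JDK's java.text.DecimalFormat as a stand-in, on the assumption that its quoting of prefix characters behaves like ICU's for this simple pattern; it is not ICU, IBM DFDL, or Daffodil. The object name SingleCharSigns and its methods are hypothetical.

```scala
import java.text.{DecimalFormat, DecimalFormatSymbols}
import java.util.Locale

object SingleCharSigns {
  // Fixed symbols so the decimal separator is '.' whatever the default locale is.
  private val symbols = DecimalFormatSymbols.getInstance(Locale.ROOT)

  // Stand-in for ICU: JDK DecimalFormat with P as positive sign, N as negative sign.
  private val df = new DecimalFormat("'P'#0.0#;'N'#0.0#", symbols)

  def render(d: Double): String = df.format(d)

  def read(s: String): Double = df.parse(s).doubleValue()
}
```

With this stand-in, SingleCharSigns.render(-6.5) gives "N6.5" and SingleCharSigns.read("P6.5") gives 6.5.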

If we want the full generality, I'd appreciate a realistic motivating example of it.

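// Imports assumed for this snippet: ICU4J's DecimalFormat and JUnit 4.
import com.ibm.icu.text.DecimalFormat
import org.junit.Assert.assertEquals
import org.junit.Test

/**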
 * This test shows that quotes in ICU text number patterns, and hence in
 * DFDL text number patterns, can enclose whole strings, not just individual
 * characters.
 *
 * I found no examples of this anywhere. Everything shows patterns where
 * quotes surround only single characters, or are doubled up for self quoting as in
 * the o''clock example.
 */
@Test def testICUDecimalFormatQuoting_01(): Unit = {
  {
    // Note that E is a pattern special character
    val pattern = "'POSITIVE' #.0###;'NEGATIVE' #.0###"
    val df = new DecimalFormat(pattern)
    val actual = df.format(-6.847)
    assertEquals("NEGATIVE 6.847", actual)
  }

  {
    // here we quote only the E, not the other characters.
    val pattern = "POSITIV'E' #.0###;NEGATIV'E' #.0###"
    val df = new DecimalFormat(pattern)
    val actual = df.format(6.847)
    assertEquals("POSITIVE 6.847", actual)
  }
}


Mike Beckerle
Apache Daffodil PMC | daffodil.apache.org
OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
Owl Cyber Defense | www.owlcyberdefense.com



--
  dfdl-wg mailing list
  dfdl-wg at lists.ogf.org
  https://lists.ogf.org/mailman/listinfo/dfdl-wg

Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU