[DFDL-WG] Clarify/Simplify - description of textNumberPattern with V, P

Steve Hanson smh at uk.ibm.com
Thu Jan 5 09:52:02 PST 2023


Erratum issue https://github.com/OpenGridForum/DFDL/issues/38 created - please review.



Regards

Steve Hanson

IBM Integration, Hursley, UK
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
smh at uk.ibm.com
tel:+44-7717-378890
Note: I work Tuesday to Friday

-----Original Message-----
From: Steve Hanson <smh at uk.ibm.com>
To: mbeckerle at apache.org, DFDL-WG <dfdl-wg at ogf.org>
Subject: Re: [EXTERNAL] [DFDL-WG] Clarify/Simplify - description of textNumberPattern with V, P
Date: Wed, 04 Jan 2023 15:40:13 +0000

Hi Mike

I've read through the section and I agree that there are some improvements that can be made. I think we can go further than you suggest too.

Here's the syntax from spec Fig 4 ...

vpinteger := pinteger | (vinteger exponent?)
pinteger := ('P'* integer) | (integer 'P'* )
vinteger := ('V'? integer) |
            ('#'* 'V'? integer)|
            ('#'* '0'* 'V'? '0'* '0')|
            (integer 'V'?)
integer := '#'* '0'* '0'

Here are the bullets ...

• A pattern with a V symbol must not have # symbols to the right of the V symbol.
• A pattern with P symbols at the left end must not have # symbols.
• A pattern with P symbols at the right end can have # symbols.
• A pattern with a V symbol must not have @ or * symbols.
• A pattern with P symbols must not have @ or E or * symbols.
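The mismatch between the Fig 4 productions and the bullets can be checked mechanically. The following is a minimal sketch, assuming the productions quoted above translate directly into regular expressions; the names INTEGER, PINTEGER, VINTEGER and accepts are illustrative only and do not come from the spec or from any DFDL implementation. The exponent term is left out because its definition is not quoted here.

```python
import re

# Literal transcription of the Fig 4 productions quoted above (assumed to map
# one-to-one onto regular expressions; exponent is deliberately omitted).
INTEGER = r"#*0*0"
PINTEGER = rf"(?:P*{INTEGER}|{INTEGER}P*)"
VINTEGER = rf"(?:V?{INTEGER}|#*V?{INTEGER}|#*0*V?0*0|{INTEGER}V?)"


def accepts(production: str, pattern: str) -> bool:
    """Return True if the whole pattern string is derivable from the production."""
    return re.fullmatch(production, pattern) is not None


# '#' to the right of V is accepted by the syntax, although bullet #1 forbids it.
print(accepts(VINTEGER, "#V#0"))   # True
# Leading P followed by '#' is accepted, although bullet #2 forbids it.
print(accepts(PINTEGER, "PP#0"))   # True
# A pattern with no V and no P at all is still a vinteger and a pinteger.
print(accepts(VINTEGER, "#00"), accepts(PINTEGER, "#00"))   # True True
```

In other words, as written the syntax alone does not enforce bullets #1 and #2, and it lets V and P be omitted altogether.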

1) I would question why we use 'V'? and 'P'* instead of 'V' and 'P' 'P'*. As written, the syntax allows the V or P to be omitted entirely, but omission is already catered for by the number term (as '.' fraction is optional).

2) If 'V'? becomes 'V', then clause ('#'* 'V'? integer) is invalid: integer allows '#', so the clause permits '#' to the right of the V, which contradicts bullet #1.

3) Clause ('V'? integer) is invalid for the same reason, but it's redundant anyway as it is implied by the clause in 2).

4) Clause ('P'* integer) can be rewritten as ('P' 'P'* '0'* '0'), which removes the need for bullet #2.

5) As you observe, bullets #3, #4 & #5 add nothing.

I think we can rewrite the syntax as below and remove all the bullets.

vpinteger := pinteger | (vinteger exponent?)
pinteger := ('P' 'P'* '0'* '0') | (integer 'P'* 'P' )
vinteger := ('#'* '0'* 'V' '0'* '0') | (integer 'V')
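To see what the rewritten productions accept and reject, here is a minimal sketch, assuming they translate directly into regular expressions. The names INTEGER, PINTEGER_NEW, VINTEGER_NEW and accepts are illustrative only; integer is taken unchanged from Fig 4, and the exponent term is not included.

```python
import re

# integer is unchanged from Fig 4; the other two are the rewritten productions.
INTEGER = r"#*0*0"
PINTEGER_NEW = rf"(?:PP*0*0|{INTEGER}P*P)"
VINTEGER_NEW = rf"(?:#*0*V0*0|{INTEGER}V)"


def accepts(production: str, pattern: str) -> bool:
    """Return True if the whole pattern string is derivable from the production."""
    return re.fullmatch(production, pattern) is not None


# Patterns the bullets were written to exclude are now rejected by syntax alone.
print(accepts(VINTEGER_NEW, "#V#0"))   # False: '#' to the right of V
print(accepts(PINTEGER_NEW, "PP#0"))   # False: leading P followed by '#'
# V and P are no longer optional.
print(accepts(VINTEGER_NEW, "#00"), accepts(PINTEGER_NEW, "#00"))   # False False
# Ordinary V and P patterns are still accepted.
print(accepts(VINTEGER_NEW, "#0V00"), accepts(VINTEGER_NEW, "##0V"))   # True True
print(accepts(PINTEGER_NEW, "PP00"), accepts(PINTEGER_NEW, "#00PP"))   # True True
```

Under these assumptions, the cases covered by bullets #1 and #2 fall out of the syntax itself, which is the point of removing the bullets.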

Note that I have avoided the obvious and more convenient use of 'X'+ because the + qualifier is not used elsewhere in Fig 4, I suspect to avoid confusion with the '+' symbol.

I would also suggest that the exponent term is incorrect. Surely you need at least an 'E' or a '+'.

exponent := ('E' | '+' | 'E+') '0'* '0'
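A minimal sketch of the suggested exponent production as a regular expression, to show that a bare run of zeros no longer qualifies as an exponent. The names EXPONENT and is_exponent are illustrative only and are not taken from the spec.

```python
import re

# Suggested production: exponent := ('E' | '+' | 'E+') '0'* '0'
# 'E+' is listed first only for readability; fullmatch backtracks as needed.
EXPONENT = r"(?:E\+|E|\+)0*0"


def is_exponent(text: str) -> bool:
    """Return True if the whole string is derivable from the suggested production."""
    return re.fullmatch(EXPONENT, text) is not None


print(is_exponent("E0"), is_exponent("+00"), is_exponent("E+0"))   # True True True
print(is_exponent("0"), is_exponent("E"), is_exponent(""))         # False False False
```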

And what does the text " - special Characters" and " - quote" signify in Fig 4?

Regards

Steve Hanson


-----Original Message-----
From: Mike Beckerle <mbeckerle at apache.org>
Reply-To: mbeckerle at apache.org
To: DFDL-WG <dfdl-wg at ogf.org>
Subject: [EXTERNAL] [DFDL-WG] Clarify/Simplify - description of textNumberPattern with V, P
Date: Thu, 22 Dec 2022 21:55:12 -0500

I am implementing dfdl:textNumberPattern characters V and E for Apache Daffodil.

The DFDL Spec says (Section 13.6.1.1)

It is a Schema Definition Error if any symbols other than "0", "1" through "9" or # are used in the vpinteger region of the pattern.

Note that the vpinteger region of the pattern excludes the leading and/or trailing sign indicator characters, which are the prefix and suffix.

The prefix and suffix can contain any character except the special characters of the pattern language.

Later in the section, the spec has a number of detailed statements that I believe are subsumed by the above.


  *   A pattern with a V symbol must not have @ or * symbols.
  *   A pattern with P symbols must not have @ or E or * symbols.

So those latter 2 statements could be omitted: @ and * are "special characters", so they aren't allowed in the prefix/suffix, and per the above stipulation they can't be in the vpinteger region either.

Can't we simplify and just say a pattern with V or P cannot have any special characters other than 0-9 and # ?

Mike Beckerle
Apache Daffodil PMC | daffodil.apache.org
OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
Owl Cyber Defense | www.owlcyberdefense.com


