[DFDL-WG] Spec correction ? - Section 9.3.2.1 - second list missing "empty" representation

Steve Hanson smh at uk.ibm.com
Thu Aug 2 07:16:30 EDT 2018


As a general point, section 9 was constructed so that the definitions of 
the representations are in 9.2 and then 9.3 can simply refer to them. We 
shouldn't be re-stating detail from 9.2 in 9.3 as it over-complicates the 
text and makes it hard to read. 

Section 9.3.2.2 makes it clear that a complex type must be traversed if it 
is not nil rep.

Delimited scanning. All you can do when scanning is look for a list of 
in-scope delimiters. If you don't find a terminator when it is needed for 
normal rep but not for nil/empty rep, then you can deduce this because the 
delimiter you found was not the terminator. If you can't reliably deduce 
this then the format is ambiguous and not parse-able. I don't believe that 
IBM DFDL does anything more than that, and it works for us with the 
formats we encounter.  If you think that this is not obvious and that more 
words are needed in the spec to convey this, then we can add them, but we 
must be careful to maintain the readability of section 9.3.
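
To make the delimited-scanning argument above concrete, here is a minimal sketch in Python. It is not IBM DFDL's implementation; the function names, the result labels, and the longest-match tie-break are assumptions made for illustration. It scans for the first in-scope delimiter and reports whether that delimiter was the element's own terminator or something else, which is the deduction described above.

```python
def find_first_delimiter(data, start, delimiters):
    """Return (position, delimiter) for the earliest delimiter found at or
    after `start`, or None if none is found. Delimiters are assumed to be
    non-empty strings. Ties at one position go to the longest delimiter
    (an assumption of this sketch)."""
    best = None
    for delim in delimiters:
        pos = data.find(delim, start)
        if pos == -1:
            continue
        if (best is None or pos < best[0]
                or (pos == best[0] and len(delim) > len(best[1]))):
            best = (pos, delim)
    return best


def scan_element(data, start, terminator, enclosing_delimiters):
    """Hypothetical helper: scan for all in-scope delimiters at once and
    report which kind was found, together with the content before it."""
    if terminator in enclosing_delimiters:
        # The scan cannot tell the terminator from an enclosing delimiter.
        raise ValueError("ambiguous format: terminator is also an enclosing delimiter")
    hit = find_first_delimiter(data, start, [terminator, *enclosing_delimiters])
    if hit is None:
        return ("no-delimiter", data[start:])
    pos, delim = hit
    kind = "terminator" if delim == terminator else "other-delimiter"
    return (kind, data[start:pos])
```

With a terminator of ";" and an enclosing separator of ",", scanning ",abc;" from offset 0 finds the separator first, so the element cannot be a normal representation that requires its terminator; the nil or empty representation is the remaining candidate.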

Regards
 
Steve Hanson
IBM Hybrid Integration, Hursley, UK
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
smh at uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890
Note: I work Tuesday to Friday 



From:   Mike Beckerle <mbeckerle.dfdl at gmail.com>
To:     Steve Hanson <smh at uk.ibm.com>
Cc:     dfdl-wg at ogf.org
Date:   01/08/2018 19:18
Subject:        Re: [DFDL-WG] Spec correction ? - Section 9.3.2.1 - second 
list missing "empty" representation



Ah,

So 9.3.2 talks only about the content region being zero length. I get that 
now. I was not reading the word "content" specifically enough. 

I also get that the second numbered list under 9.3.2.1 can't discuss empty 
representation, because the content region has been identified as non-zero 
length. 

I think there are still some problems here.

In 9.3.2, I believe its point about delimited lengthKind is problematic. 
You don't even know whether you are scanning for a terminator or not 
unless you entertain whether nillable, empty/defaultable, or normal - 
because of nilValueDelimiterPolicy, emptyValueDelimiterPolicy, and 
initiator/terminator. It's not correct to just look for a separator (if 
there even is one) here. I think if the lengthKind is 'delimited', then 
you must consider the nil representation with framing, empty 
representation with framing, and normal representation with framing (for 
string and hexBinary), in order to even find, positively, the content 
region in order to say that it is zero length. This is particularly true 
if dfdl:initiator and dfdl:terminator are the same string. 

Delimiter "immediately found" isn't even a valid concept for a 
non-nillable complexType, as one must recurse into the type. So I think we 
need to be clear that a non-nillable complex type is simply not ever 
considered "trivially zero length". 

I think the clearest way to fix 9.3.2 for 'delimited' is to provide 
the case analysis:

'delimited' =>
    if nillable - is content zero length after initiator and before 
        terminator framing as required by nilValueDelimiterPolicy
    if simpleType - is content zero length after initiator and before 
        terminator framing as required by emptyValueDelimiterPolicy
    if simpleType xs:string or xs:hexBinary - is content zero length 
        after initiator and before terminator framing (if such framing 
        is defined)
    otherwise - not trivially zero length.
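
The case analysis above can be sketched in Python as follows. This is an illustration under stated assumptions, not spec text: the `Element` model, the function names, and the `enclosing` parameter (the delimiters of enclosing constructs) are all hypothetical. The policy values 'none', 'initiator', 'terminator' and 'both' are the ones the two delimiter-policy properties take. The sketch only lists the representations whose content region would be zero length once the required framing is accounted for; choosing among them is left to the ordering in 9.3.2.1.

```python
from dataclasses import dataclass


@dataclass
class Element:
    # Hypothetical, simplified model of an element declaration.
    nillable: bool
    is_simple: bool
    type_name: str = ""
    initiator: str = ""
    terminator: str = ""
    nil_policy: str = "none"    # stands in for dfdl:nilValueDelimiterPolicy
    empty_policy: str = "none"  # stands in for dfdl:emptyValueDelimiterPolicy


def required_framing(elem, policy):
    """Return the (initiator, terminator) that `policy` requires."""
    init = elem.initiator if policy in ("initiator", "both") else ""
    term = elem.terminator if policy in ("terminator", "both") else ""
    return init, term


def content_is_zero_length(data, pos, init, term, enclosing):
    """True if `init` is at `pos` and nothing lies between it and `term`.
    When no terminator is required, the content must instead be followed
    at once by an enclosing delimiter or by the end of the data."""
    if not data.startswith(init, pos):
        return False
    after = pos + len(init)
    if term:
        return data.startswith(term, after)
    return after == len(data) or any(data.startswith(d, after) for d in enclosing)


def zero_length_candidates(elem, data, pos, enclosing=()):
    """List the representations whose content region is zero length."""
    found = []
    if elem.nillable:
        init, term = required_framing(elem, elem.nil_policy)
        if content_is_zero_length(data, pos, init, term, enclosing):
            found.append("nil")
    if elem.is_simple:
        init, term = required_framing(elem, elem.empty_policy)
        if content_is_zero_length(data, pos, init, term, enclosing):
            found.append("empty")
        if elem.type_name in ("xs:string", "xs:hexBinary"):
            # Normal representation always carries whatever framing is defined.
            init, term = required_framing(elem, "both")
            if content_is_zero_length(data, pos, init, term, enclosing):
                found.append("normal")
    # A non-nillable complex type never reaches any branch above, so it is
    # never reported as trivially zero length.
    return found
```

For example, a nillable xs:string with initiator "[", terminator "]", a nil policy of 'both' and an empty policy of 'none' yields the candidates nil and normal for the data "[],x", while a non-nillable complex element always yields no candidates.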

I think 9.3.2.1 needs to make explicit the implied point that it is about 
the content only, and that the corresponding required framing must also be 
present. E.g., the first sentence could change to:

     "If the content is length zero as described above, the representation 
is then established by checking, in order for:"

After the numbered lists, a concluding sentence can say "In all cases 
above establishing a representation includes that it must be surrounded by 
its corresponding initiator/terminator if required for that 
representation's framing based on dfdl:nilValueDelimiterPolicy, 
dfdl:emptyValueDelimiterPolicy, dfdl:initiator, and dfdl:terminator."

9.3.2.2 similarly needs to specify that "Establishing nil representation 
includes that it must be surrounded by its corresponding 
initiator/terminator if required based on dfdl:nilValueDelimiterPolicy, 
dfdl:initiator, and dfdl:terminator."




Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | 
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are 
subject to the OGF Intellectual Property Policy


On Wed, Aug 1, 2018 at 7:45 AM, Steve Hanson <smh at uk.ibm.com> wrote:
Mike 

9.3.2.1 

First numbered list. It's implicit, as is the non-reference to 
nilValueDelimiterPolicy for nil rep. 
Second numbered list. Should not mention empty - can't be empty as not 
zero length. 

Don't understand what you mean by empty rep is non-zero length. We are 
talking about the content region only here. 

Regards
 
Steve Hanson 



From:        Mike Beckerle <mbeckerle.dfdl at gmail.com> 
To:        dfdl-wg at ogf.org 
Date:        17/07/2018 15:11 
Subject:        [DFDL-WG] Spec correction ? - Section 9.3.2.1 - second 
list missing        "empty" representation 
Sent by:        "dfdl-wg" <dfdl-wg-bounces at ogf.org> 




In section 9.3.2.1, there are two numbered lists. 

The first numbered list should qualify its mention of "empty 
representation" with "(when the emptyValueDelimiterPolicy and initiator 
and terminator policies are defined such that zero-length is allowed as 
the empty representation.)" 

The second numbered list should mention nil (literal), nil (logical), 
empty, and normal representations. 

It is missing discussion of non-zero-length "empty" representations (due 
to emptyValueDelimiterPolicy not none, and initiator and terminator 
defined such that the empty representation is non-zero length). 
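
As a small illustration of the distinction at issue here, the following Python sketch builds the bytes of an empty representation from its framing. The function name and the example delimiters are hypothetical. The content region is always zero length, but the representation as a whole is non-zero length whenever the policy keeps an initiator or terminator.

```python
def empty_representation(initiator, terminator, empty_policy):
    """Return the text of an empty representation: zero-length content
    surrounded by whatever framing the policy keeps. `empty_policy` stands
    in for dfdl:emptyValueDelimiterPolicy."""
    parts = []
    if empty_policy in ("initiator", "both"):
        parts.append(initiator)
    # The zero-length content region sits here and contributes nothing.
    if empty_policy in ("terminator", "both"):
        parts.append(terminator)
    return "".join(parts)
```

With initiator "[" and terminator "]", a policy of 'both' gives "[]" (length 2) and a policy of 'none' gives the empty string (length 0).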

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | 
www.tresys.com 
--
 dfdl-wg mailing list
 dfdl-wg at ogf.org
 https://www.ogf.org/mailman/listinfo/dfdl-wg 

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU

