[DFDL-WG] Consolidated Notes including both DFDL WG Call 2018-08-07 along with other recent clarification email threads

Steve Hanson smh at uk.ibm.com
Wed Aug 15 10:12:00 EDT 2018


Mike, thanks for writing this up.

I am in the process of incorporating it into the minutes of the WG call. 
There is one paragraph that on re-reading and comparing with what is in 
the spec already, I am not so sure about.

Section 9.2.2
    The sentence:  "The empty representation is special in DFDL, because 
when parsing it is this condition that can trigger the creation of a 
default value for an element occurrence." replace with: "The empty 
representation is special in DFDL because when parsing it is used to 
determine when default values are created in the Infoset, and when 
optional recurring elements are omitted from the Infoset. The empty 
representation can require initiators or terminators be present so as to 
enable data formats to explicitly distinguish empty-string/hexBinary 
values (which might cause default values to be used) from emptiness 
meaning the absence of any representation."
     (This is to clarify an error of omission - prior language suggested 
that EVDP is only relevant when the element has a default value, because 
only that need was mentioned.)

What is the significance of 'optional recurring elements'? I would have 
thought that is just 'optional occurrences' as it applies to (0,1) 
elements too.

Actually I'm not convinced that clause is needed at all. The point of this 
paragraph is to call out defaulting. Not adding occurrences to the infoset 
happens for absent and missing occurrences too, as described in later 
paragraphs.

I would prefer:

"The empty representation is special in DFDL because when parsing it is 
used to determine when default values are created in the Infoset. The 
empty representation can require initiators or terminators be present so 
as to enable data formats to explicitly distinguish occurrences with empty 
string/hexBinary values from occurrences that are missing or are absent."

Regards
 
Steve Hanson
IBM Hybrid Integration, Hursley, UK
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
smh at uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890
Note: I work Tuesday to Friday 



From:   Mike Beckerle <mbeckerle.dfdl at gmail.com>
To:     DFDL-WG <dfdl-wg at ogf.org>
Date:   07/08/2018 18:46
Subject:        [DFDL-WG] Consolidated Notes including both DFDL WG Call 
2018-08-07 along with other recent clarification email threads
Sent by:        "dfdl-wg" <dfdl-wg-bounces at ogf.org>



This message includes two parts. 

Part 1 is the things we discussed on the DFDL WG Call of 2018-08-07. 
Part 2 is the other recent emails where the conclusion from the email 
thread is repeated here for finalization/refinement, and for consolidation 
so all these changes can be considered together. 

================================

Part 1 - Discussed on the call.  
Re: [DFDL-WG] Clarification discussion points for next call(s) - DFDL Spec 
issues around nil, empty, normal, absent, defaulting, and separator 
suppression.
Dated August 2nd

Conclusions:

Section 9.3.1.1
   Delete phrase "...this of course implies that....". Note that there is 
already a correction to create numbered bullets of 3 sentences. The 
sentence containing this phrase will be bullet #3 of that list.

Section 9.4
   Item 2 under "For elements and element refs:" Change to: "dfdl:element 
following property scoping rules, which includes establishing 
representation as described in Section 9.3.2 and conversion to element 
type for simple types."

Section 9.3.2
  The phrase "The first step is to see if the content is trivially of 
length zero." Change to: "The first step is to see if the SimpleContent or 
ComplexContent region is of length zero as a first approximation."
  The bullet "delimited => length is zero (delimiter is immediately 
encountered)" Insert "in scope" after the open parenthesis. 
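
As a rough illustration of the "delimited" bullet above, the first-approximation 
zero-length test could be sketched as follows. The function name, the use of a 
plain string for the data, and the list of in-scope delimiters are all 
assumptions made for this sketch; it ignores escape schemes, longest-match 
rules, and end-of-data handling.

```python
def delimited_content_is_zero_length(data, pos, in_scope_delimiters):
    """Hypothetical helper: first-approximation test for delimited content.

    Returns True when an in-scope delimiter is found immediately at `pos`,
    meaning the content region starting there has length zero.
    Empty-string delimiters are skipped because they can never be "found".
    """
    return any(d != "" and data.startswith(d, pos) for d in in_scope_delimiters)


# In "a,,b" with "," in scope, the second field (starting at offset 2) is
# zero-length, while the first field (offset 0) is not.
second_field_empty = delimited_content_is_zero_length("a,,b", 2, [","])
first_field_empty = delimited_content_is_zero_length("a,,b", 0, [","])
```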

Section 9.4.2.3.
  We agreed that the paragraphs beginning with "For both required and 
optional..." need to be better tied to the material above. Wording TBD - 
pending Steve Hanson doing some tests on IBM DFDL.

============================

Part 2 - Below are conclusions from the prior email threads. This is for 
email review in lieu of discussion on this week's call. 
Re: [DFDL-WG] clarification: on suppressed ZL string/hexBinary - do we 
keep variable assignments?
Of Aug 6 (last date in the thread)

These corrections apply:

Sections 9.4.2.2 and 9.4.2.3
   The phrase "Optional occurrence: If dfdl:emptyValueDelimiterPolicy is 
not 'none'[12],"  Change to "Optional occurrence: if 
dfdl:emptyValueDelimiterPolicy is applicable and is not 'none',...." 
(retaining the footnote)
 
Section 9.4.2
   Before the final phrase "There are three main cases to consider:" 
Insert this sentence: "The sections below indicate when an item is added 
to the infoset, and whether it has a default or other value. If there is 
no processing error then regardless of whether an item is added to the 
infoset or not, any side-effects due to dfdl:discriminator statements 
evaluating to true, or dfdl:setVariable statements, are retained."

Section 12.2
   For property emptyValueDelimiterPolicy, before the phrase "It is a 
schema definition error if...", insert this sentence: "The value of 
dfdl:emptyValueDelimiterPolicy  should only be checked if there is a 
dfdl:initiator or dfdl:terminator in scope. If so, and 
dfdl:emptyValueDelimiterPolicy is not set, it is a schema definition 
error. If dfdl:initiator is not "" and dfdl:terminator is "" and 
dfdl:emptyValueDelimiterPolicy is 'terminator' it is a schema definition 
error. If dfdl:terminator is not "" and dfdl:initiator is "" and 
dfdl:emptyValueDelimiterPolicy  is 'initiator' it is a schema definition 
error."

Section 13.16
    For property nilValueDelimiterPolicy, before the phrase "It is a 
schema definition error if...", insert this sentence: "The value of 
dfdl:nilValueDelimiterPolicy  should only be checked if there is a 
dfdl:initiator or dfdl:terminator in scope. If so, and 
dfdl:nilValueDelimiterPolicy is not set, it is a schema definition error. 
If dfdl:initiator is not "" and dfdl:terminator is "" and 
dfdl:nilValueDelimiterPolicy is 'terminator' it is a schema definition 
error. If dfdl:terminator is not "" and dfdl:initiator is "" and 
dfdl:nilValueDelimiterPolicy  is 'initiator' it is a schema definition 
error."
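
The checks proposed above for dfdl:emptyValueDelimiterPolicy and 
dfdl:nilValueDelimiterPolicy have the same shape, so they can be sketched as 
one routine. This is only an illustration of the proposed wording: the 
function and exception names are hypothetical, a property that is not set is 
modeled as None, and "in scope" is assumed here to mean that the initiator or 
terminator has a value other than "".

```python
class SchemaDefinitionError(Exception):
    """Hypothetical exception standing in for a DFDL schema definition error."""


def check_delimiter_policy(policy_name, policy, initiator, terminator):
    """Apply the proposed checks to one *ValueDelimiterPolicy property.

    policy_name: property name, used only in error messages
    policy:      'none', 'initiator', 'terminator', 'both', or None if not set
    initiator, terminator: delimiter values, "" when there is none
    """
    # Assumption: with no initiator and no terminator the policy is not checked.
    if initiator == "" and terminator == "":
        return
    if policy is None:
        raise SchemaDefinitionError(policy_name + " is not set")
    if initiator != "" and terminator == "" and policy == "terminator":
        raise SchemaDefinitionError(
            policy_name + " is 'terminator' but there is no terminator")
    if terminator != "" and initiator == "" and policy == "initiator":
        raise SchemaDefinitionError(
            policy_name + " is 'initiator' but there is no initiator")


# No delimiters: the policy is not examined, so an unset value is accepted.
check_delimiter_policy("dfdl:emptyValueDelimiterPolicy", None, "", "")
```

Passing the property name as a parameter keeps the two property descriptions 
in step, since the proposed text for Sections 12.2 and 13.16 differs only in 
that name.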

Section 9.2.2
    The phrase  "the occurrence's content in the data..." replace with 
"the occurrence's SimpleContent or ComplexContent region in the data..."
    The sentence:  "The empty representation is special in DFDL, because 
when parsing it is this condition that can trigger the creation of a 
default value for an element occurrence." replace with: "The empty 
representation is special in DFDL because when parsing it is used to 
determine when default values are created in the Infoset, and when 
optional recurring elements are omitted from the Infoset. The empty 
representation can require initiators or terminators be present so as to 
enable data formats to explicitly distinguish empty-string/hexBinary 
values (which might cause default values to be used) from emptiness 
meaning the absence of any representation."
     (This is to clarify an error of omission - prior language suggested 
that EVDP is only relevant when the element has a default value, because 
only that need was mentioned.)
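
To make the distinction in the replacement text concrete, here is a minimal 
sketch of how a zero-length occurrence might be classified. It is a 
simplification under stated assumptions, not spec text: the function name and 
return labels are invented for illustration, `data` is taken to be the whole 
slice of the data stream belonging to the occurrence (delimiters included), 
and nil representations, escape schemes, and normal (non-zero-length) content 
are lumped together as "other".

```python
def classify_occurrence(data, initiator, terminator, evdp):
    """Hypothetical classifier for an occurrence with zero-length content.

    evdp is the dfdl:emptyValueDelimiterPolicy value ('none', 'initiator',
    'terminator', 'both'), or None when no initiator/terminator is defined.
    Returns "empty", "absent", or "other".
    """
    # Work out what the empty representation is assumed to look like.
    if evdp in (None, "none"):
        expected_empty = ""
    elif evdp == "initiator":
        expected_empty = initiator
    elif evdp == "terminator":
        expected_empty = terminator
    elif evdp == "both":
        expected_empty = initiator + terminator
    else:
        raise ValueError("unknown policy: " + repr(evdp))

    if data == expected_empty:
        return "empty"    # may lead to a default value being used
    if data == "":
        return "absent"   # nothing at all where delimiters were required
    return "other"        # normal or nil representation, out of scope here


with_delims = classify_occurrence("[]", "[", "]", "both")
without_delims = classify_occurrence("", "[", "]", "both")
```

With a policy of 'both', the bare delimiters "[]" classify as empty while a 
completely zero-length slice classifies as absent, which is the distinction 
the initiators and terminators are there to make.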
    
Re: [DFDL-WG] Clarification needed: separator for empty sequence
Of Aug 2

Section 14.2
   For property dfdl:separator. The sentence: "Separators occur in the 
data either before, between or after all occurrences of the elements or 
groups that are the children of the sequence." replaced with "Separators 
occur in the data either before, between or after all occurrences of 
represented elements (that is, elements without the dfdl:inputValueCalc 
property) or model groups that are the children of the sequence. Elements 
with dfdl:inputValueCalc have no representation in the data stream, and so 
never have separators. Children of a sequence that are model groups are 
always separated, even if they are empty (meaning have no children of 
their own - which is allowed for sequence groups), or both the model group 
child and its contained children occupy zero-length in the data stream." 
   (note: Some of the above is redundant with stipulations in the 
dfdl:inputValueCalc property description, but I believe it is wise to have 
this little redundancy.)
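
The separator rule above can be illustrated with a small sketch. It assumes an 
infix separator, exactly one occurrence of each child, and no separator 
suppression; the Child class and function names are hypothetical and exist 
only for this example.

```python
from dataclasses import dataclass


@dataclass
class Child:
    name: str
    kind: str                       # "element" or "group"
    data: str = ""                  # representation in the data stream
    input_value_calc: bool = False  # True if dfdl:inputValueCalc is present


def is_separated(child):
    """Model groups are always separated; elements only if represented."""
    return child.kind == "group" or not child.input_value_calc


def unparse_infix(children, separator):
    """Join the representations of the separated children (infix position)."""
    return separator.join(c.data for c in children if is_separated(c))


children = [
    Child("a", "element", "1"),
    Child("total", "element", input_value_calc=True),  # no representation
    Child("g", "group", ""),                           # empty sequence group
    Child("b", "element", "2"),
]
result = unparse_infix(children, ",")
```

Here the computed element contributes no separator, while the empty model 
group still occupies a separated position, so `result` is "1,,2".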

======================

These email threads are mentioned here to indicate that they are resolved 
by one or another of the above corrections:

Re: [DFDL-WG] clarification needed - ambiguity about empty string and 
optional element
Of Aug 2
Re: [DFDL-WG] Spec correction ? - Section 9.3.2.1 - second list missing 
"empty" representation
Of Aug 2

-----------

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | 
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are 
subject to the OGF Intellectual Property Policy
