[DFDL-WG] Omitted array occurrences

Tim Kimber KIMBERT at uk.ibm.com
Thu Nov 19 06:06:47 CST 2009


What should the DFDL unparser do when some or all of the elements of an 
array are missing?

I have found the following statements in v0.36 which seem relevant:

Section 5.2.1
The minOccurs value is used:
·       to determine if an element declaration or reference is scalar or 
array
·       to determine the required minimum number of occurrences of an 
array both when parsing and unparsing

Section 16.13 
( Note : this definition of 'required' is a repeat of the definition in 
section 3 )
Definition: 'required'
We define the term 'required' as follows: 
·       A scalar element is required. 
·       An element of a fixed-occurrence array is required. 
·       An element of a variable-occurrence array is required if its index 
is less than or equal to the value of minOccurs. 
All other elements are not required.
...
On unparsing, if an element is required, and is not part of the logical 
data and the element has a default value specified then it is used, 
otherwise it is a processing error. 
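
To make sure I am reading the quoted definition correctly, here is a minimal Python sketch of it. This is not taken from the specification or from any DFDL implementation. The function names, the `ProcessingError` class, the use of `None` for maxOccurs="unbounded", the 1-based index, and the assumption that "fixed-occurrence" means minOccurs equals maxOccurs are all mine.

```python
from typing import Optional


class ProcessingError(Exception):
    """Hypothetical stand-in for a DFDL processing error."""


def is_required(index: int, min_occurs: int,
                max_occurs: Optional[int], is_array: bool) -> bool:
    """Return True if the occurrence at 1-based `index` is 'required'.

    max_occurs=None is an assumed encoding of maxOccurs="unbounded".
    """
    if not is_array:
        return True  # a scalar element is required
    if max_occurs is not None and min_occurs == max_occurs:
        return True  # assumed meaning of 'fixed-occurrence array'
    # variable-occurrence array: required up to and including minOccurs
    return index <= min_occurs


def value_to_unparse(in_infoset: bool, value: Optional[str],
                     required: bool, default: Optional[str]) -> Optional[str]:
    """Pick the value to unparse for one element occurrence.

    Returns None when a non-required element is absent.
    """
    if in_infoset:
        return value
    if required:
        if default is not None:
            return default  # required, absent, default available
        raise ProcessingError("required element missing and no default")
    return None  # not required and absent: nothing to output
```

For example, with minOccurs=2 and maxOccurs=5, occurrences 1 and 2 are required and occurrence 3 is not. What the sketch does not answer is what happens to the separator of an absent, non-required occurrence, which is the subject of the rest of this note.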

Section 17.3.1 : Sequence groups and separators

re: the combination of separatorPolicy="suppressAtEnd" and 
sequenceKind="ordered":
All separators must be found in the data except that when the sequence has 
trailing optional items, the separators are suppressed for any final 
missing items. Note suppressAtEnd can only be used when there is no clash 
with delimiters from the containing structure.

My interpretation of the specification is:
a) if separatorPolicy="require" then the unparser should output a 
separator for all missing required elements ( whether array members or not 
)
b) if separatorPolicy="suppressAtEnd" then the unparser should output a 
separator for all non-trailing missing required elements
c) separators for missing elements must be output regardless of whether 
the element is required/optional, simple/complex, does/does not have a 
default value etc. I assume this because the term 'missing' is used rather 
than the very clearly-defined term 'required'.
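
Interpretations a) to c) can be sketched in Python as follows. This only illustrates my reading, not the specification. The function name `unparse_sequence`, the use of `None` for a missing item, and the assumption of an infix separator are all hypothetical.

```python
from typing import List, Optional


def unparse_sequence(slots: List[Optional[str]], separator: str,
                     separator_policy: str) -> str:
    """Join the items of an ordered sequence, one slot per item.

    A slot holding None is a missing item. Every missing item still
    gets its separator, except trailing ones under "suppressAtEnd".
    """
    if separator_policy == "suppressAtEnd":
        end = len(slots)
        # drop the trailing run of missing items, keep the rest
        while end > 0 and slots[end - 1] is None:
            end -= 1
        slots = slots[:end]
    elif separator_policy != "require":
        raise ValueError("unsupported separatorPolicy: " + separator_policy)
    # a missing item contributes an empty string between separators
    return separator.join("" if s is None else s for s in slots)
```

With items `["a", None, "c", None, None]` and separator `,`, this gives `a,,c,,` under "require" and `a,,c` under "suppressAtEnd". Note that the function never asks whether a missing item is required or optional, which is point c).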

Reading between the lines, I also infer the following rules:
d) if an array has maxOccurs="unbounded" and it is missing from the 
infoset then the unparser will not output any separators for the array
e) if an array has maxOccurs!="unbounded" and it is missing from the 
infoset then the unparser will output a separator for each missing 
occurrence ( so it will output maxOccurs separators ).
f) if an element contains a child group, and none of the group members are 
present in the infoset, then the group is 'missing' and the unparser will 
output a separator for it. 
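
Rules d) and e) amount to deciding how many slots an array contributes to its parent sequence. A Python sketch of that, again only as an illustration of what I have inferred: the name `array_slots` and the use of `None` for both "unbounded" and a missing occurrence are hypothetical, and applying the same padding to a partly populated array is my own extrapolation from the two rules.

```python
from typing import List, Optional


def array_slots(occurrences: List[str],
                max_occurs: Optional[int]) -> List[Optional[str]]:
    """Return the slots an array contributes to its parent sequence.

    max_occurs=None stands for maxOccurs="unbounded" (assumed encoding).
    A None entry in the result is a missing occurrence that still
    occupies a separator position.
    """
    if max_occurs is None:
        # rule d): unbounded, so nothing is output for absent occurrences
        return list(occurrences)
    if len(occurrences) > max_occurs:
        raise ValueError("more occurrences than maxOccurs")
    # rule e): bounded, so pad with missing occurrences up to maxOccurs
    padding: List[Optional[str]] = [None] * (max_occurs - len(occurrences))
    return list(occurrences) + padding
```

So an entirely missing array with maxOccurs=3 contributes three missing slots, while an entirely missing unbounded array contributes none. Rule f) would treat a group with no members present as a single missing slot in the same way.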

Suggested changes to the specification:
- As a minimum, I think it would be useful for the specification to 
include a definition of 'missing'. 
- DFDL does not allow min/maxOccurs on groups, so they implicitly have 
cardinality 1:1. The specification should define the behaviour of the 
unparser when none of a group's members are present in the infoset.
- The wording in 17.3.1 could be more accurate. I don't think the word 
'optional' should be there ( if validation is off then the unparser will 
tolerate missing required elements ). I think the words 'trailing' and 
'final' are intended to mean the same - we should standardize on 
'trailing'.

regards,

Tim Kimber, Common Transformation Team,
Hursley, UK
Internet:  kimbert at uk.ibm.com
Tel. 01962-816742 
Internal tel. 246742










