[DFDL-WG] Behaviour for lengthKind 'endOfParent' is still not fully specified
Steve Hanson
smh at uk.ibm.com
Tue Sep 4 09:08:05 EDT 2012
Noted when I reviewed the latest spec: the combination of endOfParent and
unparsing is not fully thought through.
The spec today says that I can use endOfParent with binary data. There is
a restriction in section 12.3.8, but it only applies when an element is
endOfParent and its parent is lengthKind delimited.
There are a couple of cases to consider:
1) Binary data of restricted length (see list in other email "proposed
clarification/narrowing - delimited binary data should decimal"). I don't
think it makes sense to allow these. We don't allow these binary reps for
delimited.
2) Text data of variable length when unparsing. Box scenario. If the data
in the infoset is shorter than the space in the box, what do we do? I think
we should pad to box length with the appropriate padChar, according to
justification, as that is effectively a 'specified length'. It is an error
if textPadKind is 'none'. Use the parent's lengthUnits.
3) HexBinary data of variable length when unparsing. Box scenario. If the
data in the infoset is shorter than the space in the box, what do we do? I
think we should right-pad to box length with the fill byte, as that is
effectively a 'specified length'.
4) Packed/BCD binary data of variable length when unparsing. Box scenario.
If the data in the infoset is shorter than the space in the box, what do we
do? I think we should pad to box length with zero bytes, according to
justification, as that is effectively a 'specified length'. (The padding
must be zero bytes and not the fill byte, as it must be numeric in order to
be parsed.)
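
A minimal sketch of the padding proposed in 2 - 4, in Python. This is an
illustration of the proposal only, not spec text or any processor's
implementation. The function names, parameter names and defaults are
hypothetical. It assumes lengths are counted in characters for text and in
bytes for binary data, handles only 'left' and 'right' justification, and
treats textPadKind 'none' as an error only when padding is actually needed.
The default of 'right' justification for packed/BCD data is also an
assumption.

```python
def pad_text(value: str, box_len: int, pad_char: str = " ",
             justification: str = "left", pad_kind: str = "padChar") -> str:
    """Case 2: pad text to the box length according to justification."""
    if len(pad_char) != 1:
        raise ValueError("pad_char must be a single character")
    if len(value) > box_len:
        raise ValueError("value is longer than the box")
    if len(value) == box_len:
        return value  # nothing to pad
    if pad_kind == "none":
        # Proposed rule: shorter data with textPadKind 'none' is an error.
        raise ValueError("textPadKind is 'none' but padding is required")
    if justification == "left":
        return value.ljust(box_len, pad_char)   # pad on the right
    if justification == "right":
        return value.rjust(box_len, pad_char)   # pad on the left
    raise ValueError("unsupported justification in this sketch")


def pad_hex_binary(data: bytes, box_len: int, fill_byte: int = 0x00) -> bytes:
    """Case 3: right-pad hexBinary data to the box length with the fill byte."""
    if len(data) > box_len:
        raise ValueError("data is longer than the box")
    return data + bytes([fill_byte]) * (box_len - len(data))


def pad_packed(data: bytes, box_len: int, justification: str = "right") -> bytes:
    """Case 4: pad packed/BCD data with zero bytes, never the fill byte."""
    if len(data) > box_len:
        raise ValueError("data is longer than the box")
    zeros = bytes(box_len - len(data))  # bytes(n) yields n zero bytes
    if justification == "right":
        return zeros + data
    if justification == "left":
        return data + zeros
    raise ValueError("unsupported justification in this sketch")
```

For example, under these assumptions pad_text("abc", 5) pads on the right
with spaces, while pad_packed(b"\x12\x3c", 4) prepends two zero bytes.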
In relation to 2 - 4, note that lengthKind 'endOfParent' can only be used
with a parent lengthKind of 'explicit', 'pattern', 'prefixed' or
'endOfParent', or with a choice with choiceLengthKind 'explicit'. The box
scenario when unparsing therefore occurs only when the parent's lengthKind
is 'explicit' or its choiceLengthKind is 'explicit', as these are the cases
where the length is known. Also note that when there are nested
'endOfParent' elements (which is allowed), all padding must be done on the
simple element (i.e., the innermost element), to ensure that what is output
can be parsed.
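
The two rules above can be sketched as simple checks, again in Python. The
function names and the representation of ancestors as a list of strings
are hypothetical. Walking outward through nested 'endOfParent' ancestors
to find the one that fixes the length is my reading of the nesting note,
not wording from the spec.

```python
from typing import Optional, Sequence

# Parent lengthKinds under which a child may use lengthKind 'endOfParent'.
ALLOWED_PARENT_LENGTH_KINDS = frozenset(
    {"explicit", "pattern", "prefixed", "endOfParent"}
)


def parent_allows_end_of_parent(length_kind: Optional[str] = None,
                                choice_length_kind: Optional[str] = None) -> bool:
    """True if the parent permits an 'endOfParent' child.

    Pass choice_length_kind when the parent is a choice, otherwise pass
    the parent element's length_kind.
    """
    if choice_length_kind is not None:
        return choice_length_kind == "explicit"
    return length_kind in ALLOWED_PARENT_LENGTH_KINDS


def box_applies_when_unparsing(ancestor_kinds: Sequence[str]) -> bool:
    """True if the box length is known when unparsing.

    ancestor_kinds lists the lengthKind (or choiceLengthKind) of each
    ancestor, nearest parent first. Nested 'endOfParent' ancestors are
    skipped until one that determines the length is found.
    """
    for kind in ancestor_kinds:
        if kind == "endOfParent":
            continue  # length is inherited from further out
        return kind == "explicit"
    return False  # no ancestor fixes the length
```

In this sketch a simple element nested as ["endOfParent", "explicit"] is in
the box scenario, so it is the element that would be padded, while
["prefixed"] is not.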
Regards
Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848