[DFDL-WG] Spec section 12.3.7.2 - binary reps and zero length
Steve Hanson
smh at uk.ibm.com
Thu Jul 10 04:29:56 EDT 2014
Actually I think my interpretation was not correct and there is no
problem here.
Zero-length processing (i.e., checking for nil, checking for default) is
independent of type. It takes place as soon as the rep is extracted from
the data. Any rep may be extracted with zero length, and could potentially
be nilled in the infoset, defaulted in the infoset if it is a required
occurrence with a default, or left out of the infoset if it is an optional
occurrence. Only when none of those apply do we check whether length 0 is
ok for the logical type. For all Numbers and Calendars, length 0 is not ok
and is a processing error, because you can't turn the rep into a logical
value.
So please ignore my original email. Both the first and second DFDL
examples that I included are fine.
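As a rough illustration of the ordering described above, here is a small
Python sketch. It is a simplified reading of this email only, not the
spec's algorithm; the Element class, its field names, and the returned
strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical type groups for which a zero-length rep cannot be
# converted to a logical value (per the email: Numbers and Calendars).
NO_ZERO_LENGTH_TYPES = {"number", "calendar"}


@dataclass
class Element:
    type_group: str                  # e.g. "number", "calendar", "hexBinary"
    nil_on_zero_length: bool = False  # assumed flag: zero-length rep means nil
    default: Optional[str] = None    # default value, if any
    required: bool = True            # required vs optional occurrence


def handle_zero_length(elem: Element) -> str:
    """Decide what happens to a rep that was extracted with length zero."""
    # Steps 1-3 do not look at the logical type at all.
    if elem.nil_on_zero_length:
        return "nil"
    if elem.required and elem.default is not None:
        return "default"
    if not elem.required:
        return "absent"
    # Only now does the logical type matter.
    if elem.type_group in NO_ZERO_LENGTH_TYPES:
        return "processing error"
    return "zero-length value"
```

For example, a required number with no default gives "processing error",
while the same element declared optional gives "absent".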
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848
From: Mike Beckerle <mbeckerle.dfdl at gmail.com>
To: Steve Hanson/UK/IBM at IBMGB,
Cc: "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>
Date: 08/07/2014 17:37
Subject: Re: [DFDL-WG] Spec section 12.3.7.2 - binary reps and zero
length
This is one more among a number of problematic issues in the way the DFDL
Spec describes length.
I did a bunch of work on this a while back, but we decided that the
required changes to improve on what we have would be too extensive for
this draft of the spec.
I do believe we need to specify where length of zero is allowed or not. In
your analysis below you suggest that currently we are allowing packed
numbers of length zero. Since we allow prefixed and delimited for packed
numbers I think allowing length of zero is ok, and probably needed to
avoid all sorts of corner case checking.
It doesn't bother me that type/property combinations that disallow length
0 implicitly disallow use of default values. It does suggest a need for a
schema-definition error, however, if a default value is specified but the
types and properties are such that it cannot be defaulted when parsing.
Can the default value end up being used when unparsing? I would think that
in the case of an array with minOccurs > 0 one might use it when unparsing.
The intention of Table 22 is that it gives the minimum length in bits, but
if length units is bytes, then one must divide by 8 and round up. So if
length units is 'bytes' then the minimum length for type xs:short is 1,
and the maximum length is 2.
To fix this, probably the simplest thing is to add two more columns to the
table giving minimum length when length units is bytes, and maximum length
when length units is bytes. That will be clearer than whatever prose we
try to come up with to explain this.
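The bits-to-bytes conversion described above (divide by 8 and round up)
can be sketched in Python as follows. The function name is hypothetical,
and the minimum of 1 bit used in the example is illustrative only: any
minimum from 1 to 8 bits rounds up to 1 byte. The 16-bit maximum for
xs:short is the usual width of that type.

```python
def bits_to_bytes(bits: int) -> int:
    """Convert a length in bits to whole bytes, rounding up."""
    return (bits + 7) // 8


# xs:short: maximum of 16 bits, illustrative minimum of 1 bit.
short_min_bytes = bits_to_bytes(1)   # rounds up to 1 byte
short_max_bytes = bits_to_bytes(16)  # exactly 2 bytes
```

This matches the xs:short figures given above: minimum 1 byte, maximum 2
bytes when length units is 'bytes'.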
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
On Tue, Jul 8, 2014 at 10:55 AM, Steve Hanson <smh at uk.ibm.com> wrote:
Section 9 of the specification talks about the empty (and related)
representations, where the meanings of a length of zero are defined.
However, section 12.3.7.2 has sub-sections for certain binary reps where
it is illegal to have a length of zero. Is this deliberate? The cases
where 0 is allowed are also those which can have lengthKind 'delimited',
so that might be the reason. Note that disallowing a length of zero
prevents use of a default value.
12.3.7.2.1: Binary numbers. Table 22 has minimum length in bits => 0 not
allowed.
12.3.7.2.2: Floating point numbers: Says length must be exactly 4 bytes or
8 bytes => 0 not allowed.
12.3.7.2.3: Packed numbers. No minimum length stated => 0 allowed.
12.3.7.2.4: Binary booleans. Refers to 12.3.7.2.1 => 0 not allowed.
12.3.7.2.5: Binary calendars. Says length must be exactly 4 bytes or 8
bytes => 0 not allowed.
12.3.7.2.6: Packed calendars. No minimum length stated => 0 allowed.
12.3.7.2.7: Opaque binary. No minimum length stated => 0 allowed.
Noticed this when modelling an example of a header with user-defined
information at the end with the length given by a binary count.
<xs:complexType name="Type_Header">
  <xs:sequence>
    <xs:element dfdl:length="75" name="Filler" type="xs:hexBinary"/>
    <xs:element dfdl:length="2" name="HeaderLength"
                type="xs:nonNegativeInteger"/>
    <xs:element dfdl:length="30" name="Filler2" type="xs:hexBinary"/>
    <xs:element dfdl:length="{../HeaderLength - 108}" name="UserInfo"
                type="xs:hexBinary"
                minOccurs="0" dfdl:occursCountKind="implicit"/>
  </xs:sequence>
</xs:complexType>
This could equally be modelled using occursCountKind 'expression' but that
seems unnecessary if the length is 0:
<xs:complexType name="Type_Header">
  <xs:sequence>
    <xs:element dfdl:length="75" name="Filler" type="xs:hexBinary"/>
    <xs:element dfdl:length="2" name="HeaderLength"
                type="xs:nonNegativeInteger"/>
    <xs:element dfdl:length="30" name="Filler2" type="xs:hexBinary"/>
    <xs:element dfdl:length="{../HeaderLength - 108}" name="UserInfo"
                type="xs:hexBinary"
                minOccurs="0" dfdl:occursCountKind="expression"
                dfdl:occursCount="if (../HeaderLength - 108 eq 0) then 0 else 1"/>
  </xs:sequence>
</xs:complexType>
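To make the two expressions in the second model concrete, here is a small
Python sketch of how they evaluate for a given HeaderLength. The function
names are hypothetical; the constant 108 is taken directly from the
expressions above, and no DFDL processor behaviour is modelled.

```python
FIXED_PART = 108  # constant subtracted in both DFDL expressions


def user_info_length(header_length: int) -> int:
    """Mirror of dfdl:length="{../HeaderLength - 108}"."""
    return header_length - FIXED_PART


def user_info_occurs(header_length: int) -> int:
    """Mirror of the dfdl:occursCount expression: 0 if the length is 0, else 1."""
    return 0 if user_info_length(header_length) == 0 else 1
```

With HeaderLength 108 the occurrence count is 0 and UserInfo is never
parsed; with HeaderLength 120 the count is 1 and the length is 12.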
Also, what do the minimum lengths in Table 22 mean when lengthUnits is
bytes?
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU