[DFDL-WG] Action 278: Unparser maxOccurs issue

Mike Beckerle mbeckerle.dfdl at gmail.com
Mon Mar 23 14:36:41 EDT 2015


I wanted to follow up on this issue, as I am currently writing the unparser
implementation for arrays in Daffodil.

Specifically, suppose I have an array with dfdl:occursCountKind="implicit"
and maxOccurs="2", the infoset contains three occurrences, and no element
declaration following the array could account for an additional element
after the second. Is this a processing error or a validation error?

<element name="root">
  <complexType>
    <sequence>
      <element name="a" type="string" minOccurs="0" maxOccurs="2"/>
    </sequence>
  </complexType>
</element>

If the infoset contains <root><a>1</a><a>2</a><a>3</a></root>

is this always a processing error, irrespective of whether there is any
following sibling element also named "a"? Or is it a validation error,
because there is no other option for "a" except the array?
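
To make the question concrete, here is a minimal Python sketch of the
"move on when maxOccurs is reached" rule from the proposal quoted below.
It is not Daffodil code: Decl and assign are hypothetical names, a
sequence is modelled as a flat list of element declarations, and
minOccurs is deliberately not checked. It only shows how counting
occurrences assigns each infoset element to a declaration, and what is
left over in the case I am asking about.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Decl:
    """One element declaration in a sequence (hypothetical model)."""
    name: str
    max_occurs: int


def assign(decls: List[Decl], names: List[str]) -> Tuple[List[Tuple[str, int]], List[str]]:
    """Assign infoset element names, in order, to declaration indexes.

    When the current declaration has consumed max_occurs elements, or the
    name does not match it, move on to the next declaration instead of
    failing. Returns (assignments, leftover), where leftover holds the
    infoset elements that no declaration could account for.
    """
    assignments: List[Tuple[str, int]] = []
    d = 0      # index of the current declaration
    count = 0  # occurrences consumed by the current declaration
    i = 0      # index of the next infoset element
    while i < len(names) and d < len(decls):
        decl = decls[d]
        if names[i] == decl.name and count < decl.max_occurs:
            assignments.append((names[i], d))
            count += 1
            i += 1
        else:
            # Name mismatch or maxOccurs reached: move on.
            d += 1
            count = 0
    return assignments, names[i:]


# data (fixed, 2), optional stuff, data (fixed, 2); 'stuff' absent.
steve = [Decl("data", 2), Decl("stuff", 1), Decl("data", 2)]
steve_result = assign(steve, ["data", "data", "data", "data"])

# The single-array case from this message: three 'a' with maxOccurs="2".
mike = [Decl("a", 2)]
mike_result = assign(mike, ["a", "a", "a"])
```

In the first case every element is assigned and nothing is left over. In
the second, the third "a" is left over with no declaration to attach it
to; whether that leftover should be reported as a processing error or a
validation error is exactly the open question, and the sketch does not
decide it.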



Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
<http://www.ogf.org/About/abt_policies.php>


On Thu, Jan 29, 2015 at 4:41 PM, Mike Beckerle <mbeckerle.dfdl at gmail.com>
wrote:

> I read this over. I agree that the current language is not symmetric
> between unparsing and parsing, and in this case it should be. If the
> parser is going to stop looking for more instances when maxOccurs is
> reached, then the unparser should stop outputting those instances when
> maxOccurs is reached.
>
> There is a fair amount of complexity to assigning the proper schema
> component for use during unparsing, given an infoset, but counting numbers
> of occurrences is certainly not the most complex such thing, and parsing
> has to do this counting. So unparsing should as well.
>
>
> On Mon, Jan 19, 2015 at 1:01 PM, Steve Hanson <smh at uk.ibm.com> wrote:
>
>> Please have a position on the below proposal from IBM for this week's WG
>> call.
>>
>> Regards
>>
>> Steve Hanson
>> Architect, *IBM DFDL*
>> <http://www.ibm.com/developerworks/library/se-dfdl/index.html>
>> Co-Chair, *OGF DFDL Working Group* <http://www.ogf.org/dfdl/>
>> IBM SWG, Hursley, UK
>> *smh at uk.ibm.com* <smh at uk.ibm.com>
>> tel:+44-1962-815848
>>
>>
>>
>> From:        Steve Hanson/UK/IBM
>> To:        DFDL-WG <dfdl-wg at ogf.org>
>> Date:        12/01/2015 16:56
>> Subject:        Unparser maxOccurs issue
>> ------------------------------
>>
>>
>> A possible change to the spec is needed where it describes what happens
>> when maxOccurs is exceeded during unparsing for occursCountKind 'fixed'
>> and 'implicit' (and by implication scalar elements).
>>
>> It currently says it is a processing error. I think it is better to say
>> that the unparser moves on when maxOccurs is reached. This makes the
>> behaviour analogous to parsing, when it does not try to parse beyond
>> maxOccurs and moves on.  The current unparser wording is based on the
>> assumption that any next occurrence of the element in the infoset must be
>> an error, but this is not true - the next occurrence could be an occurrence
>> of a same named element later in the schema.
>>
>> An obvious example is:
>>
>> <xs:element name="data" minOccurs="2" maxOccurs="2"
>> dfdl:occursCountKind="fixed" ... />
>> <xs:element name="stuff" minOccurs="0" dfdl:occursCountKind="implicit"
>> ... />
>> <xs:element name="data" minOccurs="2" maxOccurs="2"
>> dfdl:occursCountKind="fixed" ... />
>>
>> with an infoset where 'stuff' is missing:
>>
>> message_data
>>   data - xx1
>>   data - xx2
>>   data - yy1
>>   data - yy2
>>
>> A more interesting example is this, taken from MIL-STD-2045 schema (my
>> bold comments added):
>>
>> <xsd:sequence dfdl:separator="">
>> *<!-- Element Value1 -->*
>>         <xsd:choice>
>>           <xsd:sequence dfdl:separator="">
>>             <xsd:group ref="FPI_true"/>
>>             <xsd:element dfdl:length="4" dfdl:lengthKind="explicit"
>> name="Value1" type="xsd:string"/>
>>           </xsd:sequence>
>>           <xsd:sequence dfdl:separator="">
>>             <xsd:group ref="FPI_false"/>
>>           </xsd:sequence>
>>         </xsd:choice>
>> *<!-- Element Value2 -->*
>>         <xsd:choice>
>>           <xsd:sequence dfdl:separator="">
>>             <xsd:group ref="FPI_true"/>
>>             <xsd:element dfdl:length="4" dfdl:lengthKind="explicit"
>> name="Value2" type="xsd:string"/>
>>           </xsd:sequence>
>>           <xsd:sequence dfdl:separator="">
>>             <xsd:group ref="FPI_false"/>
>>           </xsd:sequence>
>>         </xsd:choice>
>> *<!-- Element Value3 -->*
>>         <xsd:choice>
>>           <xsd:sequence dfdl:separator="">
>>             <xsd:group ref="FPI_true"/>
>>             <xsd:element dfdl:length="4" dfdl:lengthKind="explicit"
>> name="Value3" type="xsd:string"/>
>>           </xsd:sequence>
>>           <xsd:sequence dfdl:separator="">
>>             <xsd:group ref="FPI_false"/>
>>           </xsd:sequence>
>>         </xsd:choice>
>> </xsd:sequence>
>>
>> ...where the FPI_true and FPI_false elements are defined in their own
>> global groups.
>>
>>   <xsd:group name="FPI_true">
>>     <xsd:sequence dfdl:separator="">
>>       <xsd:element default="true" dfdl:length="1"
>> dfdl:lengthKind="explicit" dfdl:textBooleanFalseRep="0"
>> dfdl:textBooleanTrueRep="1" name="FPI_true" type="xsd:boolean">
>>         <xsd:annotation>
>>           <xsd:appinfo source="http://www.ogf.org/dfdl/">
>>             <dfdl:discriminator>{. eq fn:true()}</dfdl:discriminator>
>>           </xsd:appinfo>
>>         </xsd:annotation>
>>       </xsd:element>
>>     </xsd:sequence>
>>   </xsd:group>
>>
>>   <xsd:group name="FPI_false">
>>     <xsd:sequence dfdl:separator="">
>>       <xsd:element default="false" dfdl:length="1"
>> dfdl:lengthKind="explicit" dfdl:textBooleanFalseRep="0"
>> dfdl:textBooleanTrueRep="1" name="FPI_false" type="xsd:boolean">
>>       </xsd:element>
>>     </xsd:sequence>
>>   </xsd:group>
>>
>> If the infoset looked like the following, an error would be given, even
>> though the infoset is valid: the second FPI_false belongs to the choice
>> for Value2.
>>
>> message_prefixedOccurs
>>   FPI_false - false
>>   FPI_false - false
>>   FPI_true - true
>>   Value3 - 9999
>>
>> Regards
>>
>> Steve Hanson
>> Unless stated otherwise above:
>> IBM United Kingdom Limited - Registered in England and Wales with number
>> 741598.
>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>>
>>
>> --
>>   dfdl-wg mailing list
>>   dfdl-wg at ogf.org
>>   https://www.ogf.org/mailman/listinfo/dfdl-wg
>>
>
>