[DFDL-WG] Action 033 : Assertions and discriminators

Tim Kimber KIMBERT at uk.ibm.com
Wed Apr 29 07:41:21 CDT 2009


Hi all,

I have this action:

033
04/03: Assert/Discriminator semantics. AP to document. TK to check uses of 
discriminator besides choice.

I believe the rules should be:

1. A point of uncertainty is any of
* an element which has minOccurs != maxOccurs
* a choice
* a sequence with sequenceKind="unordered"

2. Nested within the scope of a point of uncertainty, there might be other 
points of uncertainty.

3. A discriminator which evaluates to true resolves the nearest in-scope 
point of uncertainty. 

4. An assertion which evaluates to false causes a processing error.

5. Any processing error ( from an assertion failure or otherwise ) will 
cause the parser to backtrack to the nearest unresolved point of 
uncertainty and try the next available branch, if any. If there are no 
more branches available, the parser will backtrack to the next nearest 
unresolved point of uncertainty.

6. A processing error which reaches the root tag is reported to the host 
application.

7. Assertions and discriminators are allowed on any point of uncertainty ( 
not only on the branches of a choice ).
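
As a rough illustration of rules 3 to 5, here is a minimal Python sketch. It 
is not taken from the specification: PointOfUncertainty, check, int_branch 
and string_branch are hypothetical names, and the "I:" tag is an invented 
example. An exception stands in for a processing error, so an error that is 
not handled by one point of uncertainty moves out to the enclosing one.

```python
class ProcessingError(Exception):
    """A failed assertion or any other parse failure (rules 4 and 5)."""


def check(condition, message):
    # Rule 4: an assertion that evaluates to false causes a processing error.
    if not condition:
        raise ProcessingError(message)


class PointOfUncertainty:
    """Hypothetical stand-in for a choice: an ordered list of branches."""

    def __init__(self, name, branches):
        self.name = name
        self.branches = branches
        self.resolved = False

    def discriminate(self, condition):
        # Rule 3: a discriminator that evaluates to true resolves this point.
        if condition:
            self.resolved = True

    def parse(self, text):
        self.resolved = False
        last_error = ProcessingError(f"No branches for '{self.name}'")
        for branch in self.branches:
            try:
                return branch(self, text)
            except ProcessingError as err:
                if self.resolved:
                    # Resolved: no sibling is tried; the error moves outward.
                    raise
                last_error = err  # Rule 5: backtrack, try the next branch.
        # Rule 5, last sentence: no branches left, so the error moves outward.
        raise last_error


def int_branch(point, text):
    # Invented convention for this sketch: integers are tagged with "I:".
    point.discriminate(text.startswith("I:"))
    check(text[2:].isdigit(), f"Invalid integer '{text[2:]}'")
    return ("int", int(text[2:]))


def string_branch(point, text):
    return ("string", text)


value = PointOfUncertainty("value", [int_branch, string_branch])
```

With these definitions, value.parse("hello") backtracks and falls through to 
the string branch, whereas value.parse("I:4x") raises "Invalid integer '4x'" 
because the discriminator had already resolved the point of uncertainty.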


Rationale:

If we only allow a discriminator on a choice branch, then it will be 
difficult to model this common style of message:

Tagged header, minOccurs="1", maxOccurs="1"
Untagged body, maxOccurs="unbounded"
Tagged trailer, minOccurs="1", maxOccurs="1"

An example with 3 occurrences of the body would be:

HE,headerfield1,headerfield2,headerfield3
John Smith, 100, bodyfield3
John Brown, 200, bodyfield3
Elton John, 30Z, bodyfield3
TR,trailerfield1,trailerfield2,trailerfield3

And the DFDL schema would look something like this ( excuse the almost 
inevitable errors, this is just for completeness ):

...
<xs:element name="message">
  <xs:complexType dfdl:lengthKind="implicit">
     <xs:sequence dfdl:separator="\r\n">
        <xs:element name="header" dfdl:initiator="HE,">
          <xs:complexType>
            <xs:sequence dfdl:separator=",">
              <xs:element name="header1" type="xs:string"/>
              <xs:element name="header2" type="xs:string"/>
              <xs:element name="header3" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>

        <xs:element name="body" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence dfdl:separator=",">
              <xs:element name="body1" type="xs:string"/>
              <xs:element name="body2" type="xs:int"/>
              <xs:element name="body3" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>

        <xs:element name="trailer" dfdl:initiator="TR,">
          <xs:complexType>
            <xs:sequence dfdl:separator=",">
              <xs:element name="trailer1" type="xs:string"/>
              <xs:element name="trailer2" type="xs:string"/>
              <xs:element name="trailer3" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
     </xs:sequence>
  </xs:complexType>
</xs:element>


The '30Z' value for the final occurrence of element 'body2' is incorrect. It 
is not a valid integer, and will trigger a processing error.

Without a discriminator, this failure will cause the parser to backtrack 
to the nearest point of uncertainty ( the repeating 'body' element ), give 
up on that occurrence and try the next element ( the 'trailer' element ). 
The initiator will not be found, and the reported error will be "Initiator 
'TR,' not found for element 'trailer'". The user would almost certainly 
prefer "Invalid value '30Z' for element 'body2'. Value could not be 
converted to simple type 'xs:int'".

For this example, the discriminator would need to detect unambiguously 
that it really was dealing with a 'body' element and not a 'trailer' element. 
Due to the message style ( which is quite common ) the only way to do this 
is to detect that it is *not* a Trailer. I cannot think of an elegant way 
to do that using the facilities in v0.33 of the specification. I have 
raised this with Alan and Steve.
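
To make the difference between the two error messages concrete, here is a 
small Python simulation of the example message. It is only a sketch under 
stated assumptions: the function names are hypothetical, and the 
"discriminator" is simply the negative test described above ( the line does 
not start with 'TR,' ), not a facility taken from the specification.

```python
class ProcessingError(Exception):
    def __init__(self, message, resolved=False):
        super().__init__(message)
        # True if a discriminator had already confirmed a body occurrence.
        self.resolved = resolved


MESSAGE = [
    "HE,headerfield1,headerfield2,headerfield3",
    "John Smith, 100, bodyfield3",
    "John Brown, 200, bodyfield3",
    "Elton John, 30Z, bodyfield3",
    "TR,trailerfield1,trailerfield2,trailerfield3",
]


def parse_tagged(line, name, initiator):
    if not line.startswith(initiator):
        raise ProcessingError(
            f"Initiator '{initiator}' not found for element '{name}'")
    return line[len(initiator):].split(",")


def parse_body(line, use_discriminator):
    # Assumed discriminator: "this line is not a trailer".
    resolved = use_discriminator and not line.startswith("TR,")
    fields = [field.strip() for field in line.split(",")]
    if len(fields) != 3:
        raise ProcessingError("Expected 3 fields in element 'body'", resolved)
    try:
        body2 = int(fields[1])
    except ValueError:
        raise ProcessingError(
            f"Invalid value '{fields[1]}' for element 'body2'. "
            "Value could not be converted to simple type 'xs:int'",
            resolved)
    return [fields[0], body2, fields[2]]


def parse_message(lines, use_discriminator):
    header = parse_tagged(lines[0], "header", "HE,")
    pos, bodies = 1, []
    while pos < len(lines):
        try:
            bodies.append(parse_body(lines[pos], use_discriminator))
            pos += 1
        except ProcessingError as err:
            if err.resolved:
                raise  # Point of uncertainty resolved: report this error.
            break      # Backtrack: stop repeating body, try the trailer.
    line = lines[pos] if pos < len(lines) else ""
    trailer = parse_tagged(line, "trailer", "TR,")
    return header, bodies, trailer


def error_for(lines, use_discriminator):
    # Return the reported error message, or None if the parse succeeds.
    try:
        parse_message(lines, use_discriminator)
    except ProcessingError as err:
        return str(err)
    return None
```

Here error_for(MESSAGE, False) returns the unhelpful "Initiator 'TR,' not 
found for element 'trailer'", while error_for(MESSAGE, True) returns the 
'body2' conversion error that the user would prefer.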

regards,

Tim Kimber, Common Transformation Team,
Hursley, UK
Internet:  kimbert at uk.ibm.com
Tel. 01962-816742 
Internal tel. 246742










