[DFDL-WG] Action 033 : Assertions and discriminators

Mike Beckerle mbeckerle.dfdl at gmail.com
Tue May 5 07:49:02 CDT 2009


This sounds good to me.
 
I have an idea to contribute. 
 
If I have a choice enclosing a variable-occurs array, then I might make the
mistake of writing a discriminator intended for the choice, without realizing
that it actually resolves the variable-occurs array, because the
discriminator sits inside that array.
 
What if we added a label to discriminators to say what they are
discriminating?
 
In the unambiguous cases this would be redundant. In the more subtle cases
it would allow a DFDL processor to issue a superior diagnostic message
without having to be as clever.
 
I would suggest this syntax:
 
<dfdl:discriminator kind="choice" ..../>
 
or 
 
<dfdl:discriminator kind="occurs".../>
 
or
 
<dfdl:discriminator kind="unorderedSequence"/>
 
Other syntax that achieves this same separation would be fine, e.g., we
could have <dfdl:choiceDiscriminator .../> and analogous other element names
for the kinds of discriminators. 
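
To make the diagnostic idea concrete, here is a minimal sketch in Python of how a processor might compare the declared kind against the nearest enclosing point of uncertainty. The names PointOfUncertainty and check_discriminator, and the element names in the example, are made up for illustration; nothing here is taken from the specification.

```python
from dataclasses import dataclass


@dataclass
class PointOfUncertainty:
    kind: str  # "choice", "occurs" or "unorderedSequence"
    name: str  # schema component name, used only in the diagnostic


def check_discriminator(declared_kind, enclosing):
    """Return None if the label matches, else a diagnostic string.

    `enclosing` lists the in-scope points of uncertainty, outermost first,
    so the last entry is the one the discriminator would actually resolve.
    """
    if not enclosing:
        return (f"discriminator kind='{declared_kind}' has no enclosing "
                "point of uncertainty")
    nearest = enclosing[-1]
    if nearest.kind == declared_kind:
        return None
    return (f"discriminator is labelled kind='{declared_kind}' but would "
            f"resolve the {nearest.kind} '{nearest.name}'")


# Hypothetical scenario: a choice enclosing a variable-occurs array, with
# the discriminator written inside the array.
scopes = [PointOfUncertainty("choice", "recordChoice"),
          PointOfUncertainty("occurs", "item")]
print(check_discriminator("choice", scopes))  # label mismatch is reported
print(check_discriminator("occurs", scopes))  # label matches
```

With the label present, the mistake is caught by a simple comparison rather than by any deeper analysis of what the schema author probably meant.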
 
I am not sure about the unordered sequence flavor. There's another
discussion about that going on simultaneously. To me that needs to be
resolved by way of initiators, so the discriminator is always implied;
however, if the semantics there were generalized to allow for more complex
speculation, then we would need analogous discriminator constructs.
 
...mike
 

Mike Beckerle | OGF DFDL WG Co-Chair | CTO | Oco, Inc.
Tel:  781-810-2125  | 100 Fifth Ave., 4th Floor, Waltham MA 02451 |
mbeckerle.dfdl at gmail.com

 

  _____  

From: dfdl-wg-bounces at ogf.org [mailto:dfdl-wg-bounces at ogf.org] On Behalf Of
Tim Kimber
Sent: Wednesday, April 29, 2009 8:41 AM
To: Alan Powell; mbeckerle at oco-inc.com; Suman Kalia
Cc: dfdl-wg at ogf.org
Subject: [DFDL-WG] Action 033 : Assertions and discriminators



Hi all, 

I have this action: 


033
04/03: Assert/Discriminator semantics. AP to document. TK to check uses of
discriminator besides choice.


I believe the rules should be: 

1. A point of uncertainty is any of 
* an element which has minOccurs != maxOccurs 
* a choice 
* a sequence with sequenceKind="unordered" 

2. Nested within the scope of a point of uncertainty, there might be other
points of uncertainty. 

3. A discriminator which evaluates to true resolves the nearest in-scope
point of uncertainty. 

4. An assertion which evaluates to false causes a processing error. 

5. Any processing error ( from an assertion failure or otherwise ) will
cause the parser to backtrack to the nearest unresolved point of uncertainty
and try the next available branch, if any. If there are no more branches
available, the parser will backtrack to the next nearest unresolved point of
uncertainty. 

6. A processing error which reaches the root tag is reported to the host
application. 

7. Assertions and discriminators are allowed on any point of uncertainty (
not only on the branches of a choice ) 
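
Rules 3 to 6 can be sketched as a small Python model. This is only an illustration under simplifying assumptions: the class and method names are invented, branches are plain strings, and a true discriminator is taken to resolve the innermost in-scope point of uncertainty.

```python
class ProcessingError(Exception):
    """A parse failure, e.g. from a failed assertion (rule 4)."""


class PointOfUncertainty:
    def __init__(self, name, branches):
        self.name = name
        self.untried = list(branches)  # branches not yet attempted
        self.resolved = False


class Backtracker:
    def __init__(self):
        self.in_scope = []  # points of uncertainty, outermost first (rule 2)

    def enter(self, point):
        """Enter a point of uncertainty and return the first branch to try."""
        self.in_scope.append(point)
        return point.untried.pop(0)

    def discriminator(self, value):
        # Rule 3: a true discriminator resolves the nearest in-scope point.
        if value and self.in_scope:
            self.in_scope[-1].resolved = True

    def assertion(self, value, message):
        # Rule 4: a false assertion causes a processing error.
        if not value:
            raise ProcessingError(message)

    def on_error(self, error):
        # Rule 5: back up to the nearest unresolved point that still has an
        # untried branch, discarding resolved or exhausted points on the way.
        while self.in_scope:
            point = self.in_scope[-1]
            if not point.resolved and point.untried:
                return point.name, point.untried.pop(0)
            self.in_scope.pop()
        # Rule 6: nothing left to try, so report to the host application.
        raise error


bt = Backtracker()
first = bt.enter(PointOfUncertainty("outer choice", ["A", "B"]))
bt.enter(PointOfUncertainty("inner array", ["another occurrence", "end of array"]))
bt.discriminator(True)  # resolves the inner array only
result = bt.on_error(ProcessingError("bad value"))
print(result)  # the resolved inner array is skipped; branch B of the choice is next
```

A further error after branch B has been tried leaves no untried branches anywhere, so on_error re-raises it, which corresponds to rule 6.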


Rationale: 

If we only allow a discriminator on a choice branch, then it will be
difficult to model this common style of message 

Tagged header, minOccurs="1", maxOccurs="1" 
Untagged body, maxOccurs="unbounded" 
Tagged trailer, minOccurs="1", maxOccurs="1" 

An example with 3 occurrences of the body would be: 

HE,headerfield1,headerfield2,headerfield3 
John Smith, 100, bodyfield3 
John Brown, 200, bodyfield3 
Elton John, 30Z, bodyfield3 
TR,trailerfield1,trailerfield2,trailerfield3 

And the DFDL schema would look something like this ( excuse the almost
inevitable errors, this is just for completeness ): 

... 
<xs:element name="message"> 
  <xs:complexType dfdl:lengthKind="implicit"> 
     <xs:sequence dfdl:separator="\r\n"> 
        <xs:element name="header" dfdl:initiator="HE,"> 
          <xs:complexType> 
            <xs:sequence dfdl:separator=","> 
              <xs:element name="header1" type="xs:string"/> 
              <xs:element name="header2" type="xs:string"/> 
              <xs:element name="header3" type="xs:string"/> 
            </xs:sequence> 
          </xs:complexType> 
        </xs:element> 

        <xs:element name="body" maxOccurs="unbounded"> 
          <xs:complexType> 
            <xs:sequence dfdl:separator=","> 
              <xs:element name="body1" type="xs:string"/> 
              <xs:element name="body2" type="xs:int"/> 
              <xs:element name="body3" type="xs:string"/> 
            </xs:sequence> 
          </xs:complexType> 
        </xs:element> 

        <xs:element name="trailer" dfdl:initiator="TR,"> 
          <xs:complexType> 
            <xs:sequence dfdl:separator=","> 
              <xs:element name="trailer1" type="xs:string"/> 
              <xs:element name="trailer2" type="xs:string"/> 
              <xs:element name="trailer3" type="xs:string"/> 
            </xs:sequence> 
          </xs:complexType> 
        </xs:element> 
     </xs:sequence> 
  </xs:complexType> 
</xs:element> 


The '30Z' value for element body2 in the final occurrence of the body is
incorrect. It is not a valid integer, and will trigger a processing error. 

Without a discriminator, this failure will cause the parser to backtrack to
the nearest point of uncertainty ( the variable-occurs body element ), end
the array there and try the next element ( the trailer element ). The
initiator will not be found, and the reported error will be "Initiator 'TR,'
not found for element 'trailer'". The user would almost certainly prefer
"Invalid value '30Z' for element 'body2'. Value could not be converted to
simple type 'xs:int'". 

For this example, the discriminator would need to detect unambiguously that
it really was dealing with a Body element and not a Trailer element. Due to
the message style ( which is quite common ) the only way to do this is to
detect that it is *not* a Trailer. I cannot think of an elegant way to do
that using the facilities in v0.33 of the specification. I have raised this
with Alan and Steve. 
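
The difference between the two diagnostics can be reproduced with a small Python sketch of the header/body/trailer example. The "line does not start with 'TR,'" test below is only a hypothetical stand-in for a discriminator that detects "not a Trailer"; it is not a facility of v0.33, and all function names are invented for illustration.

```python
class ProcessingError(Exception):
    pass


MESSAGE = [
    "HE,headerfield1,headerfield2,headerfield3",
    "John Smith, 100, bodyfield3",
    "John Brown, 200, bodyfield3",
    "Elton John, 30Z, bodyfield3",
    "TR,trailerfield1,trailerfield2,trailerfield3",
]


def parse_tagged(line, name, initiator):
    if not line.startswith(initiator):
        raise ProcessingError(
            f"Initiator '{initiator}' not found for element '{name}'")
    return line[len(initiator):].split(",")


def parse_body(line):
    fields = [field.strip() for field in line.split(",")]
    if len(fields) != 3:
        raise ProcessingError(
            f"Expected 3 fields for element 'body', found {len(fields)}")
    try:
        number = int(fields[1])
    except ValueError:
        raise ProcessingError(
            f"Invalid value '{fields[1]}' for element 'body2'. "
            "Value could not be converted to simple type 'xs:int'") from None
    return fields[0], number, fields[2]


def parse_message(lines, use_discriminator):
    header = parse_tagged(lines[0], "header", "HE,")
    bodies = []
    pos = 1
    while pos < len(lines):
        line = lines[pos]
        # Hypothetical discriminator: "this line is not a trailer".
        resolved = use_discriminator and not line.startswith("TR,")
        try:
            bodies.append(parse_body(line))
        except ProcessingError:
            if resolved:
                raise  # point of uncertainty resolved: report the body error
            break      # backtrack: end the array and try the trailer here
        pos += 1
    trailer_line = lines[pos] if pos < len(lines) else ""
    trailer = parse_tagged(trailer_line, "trailer", "TR,")
    return header, bodies, trailer


def diagnostic(lines, use_discriminator):
    """Return the reported error message, or None if parsing succeeds."""
    try:
        parse_message(lines, use_discriminator)
    except ProcessingError as error:
        return str(error)
    return None


print(diagnostic(MESSAGE, use_discriminator=False))
print(diagnostic(MESSAGE, use_discriminator=True))
```

Without the discriminator the sketch reports the missing 'TR,' initiator; with it, the '30Z' conversion failure is reported instead, because the array is already known to contain a body occurrence at that position.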

regards,

Tim Kimber, Common Transformation Team,
Hursley, UK
Internet:  kimbert at uk.ibm.com
Tel. 01962-816742  
Internal tel. 246742




