[DFDL-WG] discriminators near PoU or not

Steve Hanson smh at uk.ibm.com
Tue Dec 4 12:16:55 EST 2012


Agreed to defer to a future DFDL release.

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848



From:   Steve Hanson/UK/IBM
To:     Mike Beckerle <mbeckerle.dfdl at gmail.com>, 
Cc:     dfdl-wg at ogf.org
Date:   19/11/2012 19:24
Subject:        Re: [DFDL-WG] discriminators near PoU or not


Hi Mike, 

I hit a similar scenario today with an EDIFACT message. There was an 
optional array containing a bunch of initiated records. I used 
'initiatedContent' for the array's sequence, but needed to add an extra 
discriminator to stop an error inside one of the records from making the 
parser backtrack and conclude that the array occurrence was not there. 
It was not easy to know where to put it. 

One possibility is an 'all' attribute that says it resolves all points of 
uncertainty that are in scope?

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848




From:   Mike Beckerle <mbeckerle.dfdl at gmail.com>
To:     dfdl-wg at ogf.org, 
Date:   29/10/2012 01:25
Subject:        [DFDL-WG] discriminators near PoU or not
Sent by:        dfdl-wg-bounces at ogf.org



I thought this situation was worth discussing.

I have a bunch of messages. They all work like this:

<element name="messageName1">
<complexType>
<sequence>
   <element ref="boilerplateElt1" minOccurs="0"/>
   <element ref="boilerplateElt2" minOccurs="0"/>
   <sequence dfdl:initiator="MSG1">
       .... message specific elements go in here...
   </sequence>
   ... optional trailing boilerplates go here
</sequence>
</complexType>
</element>

There are hundreds of those.

These all are used in this way:

<element name="topLevel">
<complexType>
   <sequence>
     <element name="message" minOccurs="1" maxOccurs="unbounded">
       <complexType>
          <choice>
              <element ref="messageName1"/>
              <element ref="messageName2"/> 
              ....
          </choice>
        </complexType>
    </element>
  </sequence>
</complexType>
</element>

So, there are two points of uncertainty here: the choice, and the 
unbounded array surrounding it.

For each message, it is not until the sequence carrying the initiator, 
which is down inside the message structure, that we know for sure we've 
got this kind of message.

So currently each message element is designed to be used ONLY in the above 
context of two surrounding points of uncertainty, like so (same message 
format as above, but with TWO discriminators added):

<element name="messageName1">
<complexType>
<sequence>
   <element ref="boilerplateElt1" minOccurs="0"/>
   <element ref="boilerplateElt2" minOccurs="0"/>
   <sequence dfdl:initiator="MSG1">
      <annotation><appinfo...>
          <!-- discriminate enclosing choice -->
          <dfdl:discriminator>{ true() }</dfdl:discriminator>
          <!-- discriminate enclosing array -->
          <dfdl:discriminator>{ true() }</dfdl:discriminator>
      </appinfo></annotation>
       .... message specific elements go in here...
   </sequence>
   ... optional trailing boilerplates go here
</sequence>
</complexType>
</element>

I'd love to have a better solution to this issue. But in the absence of 
one, this works and achieves what I want: once the parser hits the 
initiator, we discriminate both which message we have and that we in 
fact have a message. 

Interesting that I'd really prefer to discriminate them in the opposite 
order of their nesting. I could discriminate the array at the very start 
of the message. But I have no way to do that because the discriminators 
apply in inward-out nesting order, and the nearest enclosing PoU is the 
choice. So I have to discriminate that one first, and I can't discriminate 
that one until I see the initiator, which is later into the message.
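
As a rough illustration of the inward-out order described above, here is a 
toy Python model. It is not a DFDL processor; the class and method names 
are hypothetical, and it assumes, as the message describes, that each 
successive discriminator resolves the innermost PoU that is still 
unresolved.

```python
# Toy model only: not a DFDL processor. All names here are hypothetical.

class PoUStack:
    """Tracks unresolved points of uncertainty (PoUs), outermost first."""

    def __init__(self):
        self.unresolved = []

    def enter(self, name):
        # Entering an array occurrence or a choice branch opens a PoU.
        self.unresolved.append(name)

    def discriminate(self):
        # Assumption: a successful discriminator resolves the innermost
        # PoU that is still unresolved.
        return self.unresolved.pop()


stack = PoUStack()
stack.enter("array occurrence of 'message'")  # outer PoU
stack.enter("choice of message types")        # inner PoU

# The two discriminators placed after the MSG1 initiator.
first = stack.discriminate()
second = stack.discriminate()
```

In this model `first` is the choice and `second` is the array occurrence, 
so the array cannot be resolved before the choice.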

Possible fixes/improvements: allow a label/id on each PoU, and allow 
referencing that label from a discriminator to say exactly which PoU you 
are discriminating.  This makes the outward reference implied by a 
discriminator explicit. You still have the context issue, but it's not 
implied anymore, it is explicit. (This makes the problem like expressions 
with "../../.." that reach up and out of a construct. They are reaching up 
and out, but at least you can see directly that they are doing so.) 
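
A minimal sketch of the labelling idea, again as a toy Python model rather 
than DFDL syntax. The labels "messageArray" and "messageChoice" and the 
`discriminate(label)` call are hypothetical, introduced only to show how an 
explicit reference would let the outer PoU be resolved first.

```python
# Toy model only: the labels and this API are hypothetical, not DFDL syntax.

class LabelledPoUs:
    """Unresolved points of uncertainty, addressed by label."""

    def __init__(self):
        self.unresolved = []  # labels, outermost first

    def enter(self, label):
        self.unresolved.append(label)

    def discriminate(self, label):
        # Resolve exactly the named PoU, regardless of nesting depth.
        if label not in self.unresolved:
            raise ValueError(f"no unresolved PoU labelled {label!r}")
        self.unresolved.remove(label)


pous = LabelledPoUs()
pous.enter("messageArray")   # outer PoU: the unbounded array
pous.enter("messageChoice")  # inner PoU: the choice of message types

# At the start of the message: resolve the array, leave the choice open.
pous.discriminate("messageArray")
remaining_after_start = list(pous.unresolved)

# Later, once the MSG1 initiator has been seen: resolve the choice.
pous.discriminate("messageChoice")
remaining_after_initiator = list(pous.unresolved)
```

Here the array is resolved while the choice is still open, which is the 
ordering the implicit nearest-enclosing rule does not allow.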

-- 
Mike Beckerle | OGF DFDL WG Co-Chair 
Tel:  781-330-0412
--
  dfdl-wg mailing list
  dfdl-wg at ogf.org
  https://www.ogf.org/mailman/listinfo/dfdl-wg

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
