[DFDL-WG] DFDL lengthKind on complex objects

Mike Beckerle mbeckerle.dfdl at gmail.com
Tue May 5 07:40:24 CDT 2009


I can easily support this proposal.
 
The only "downside" of restricting sequences from having lengthKind is that
you will have to introduce a named "tier" of an element in your logical DFDL
model when you need to model what is really only a physical implementation
concept: a box surrounding data.
 
But I've always been in favor of avoiding the slippery slope where a DFDL
schema is supposed to both describe the representation and exhibit a
logical model that someone likes. To me the DFDL schema is very
constrained by what is required by the physical model, and transformations
outside of the DFDL schema are going to "tidy up" the model and remove
artifacts of representation. The element tier you need in this case to
express the length of a "box" surrounding content is such a representation
artifact.
 
To me the slippery slope of allowing sequences to be almost like unnamed
elements is a slope we very much want to avoid, and as you point out, we're
already doing that by eliminating occurs behavior on them for DFDL. 
 
Thought: if you are going to need to transform the DFDL-described data
anyway, then introducing named tiers for these kinds of complex physical
features actually makes such transformation easier to express, because the
xpath expressions to address various parts of the DFDL-described data are
more obvious. 
 
...mike

Mike Beckerle | OGF DFDL WG Co-Chair | CTO | Oco, Inc.
Tel:  781-810-2125  | 100 Fifth Ave., 4th Floor, Waltham MA 02451 |
mbeckerle.dfdl at gmail.com

 

  _____  

From: dfdl-wg-bounces at ogf.org [mailto:dfdl-wg-bounces at ogf.org] On Behalf Of
Steve Hanson
Sent: Wednesday, April 29, 2009 6:51 AM
To: dfdl-wg at ogf.org
Subject: [DFDL-WG] DFDL lengthKind on complex objects



I have a long-standing concern about the usability of dfdl:lengthKind, which
others in IBM are encountering when modeling real life formats such as EDI. 

My main concern is below. For example, what are the semantics of setting
different values on the element and the sequence?

<xs:element name="container" dfdl:lengthKind="implicit">
  <xs:complexType>
    <xs:sequence dfdl:separator="@" dfdl:lengthKind="implicit">
      <xs:element name="one" type="xs:string" dfdl:lengthKind="delimited" />
      <xs:element name="two" type="xs:string" dfdl:lengthKind="delimited" />
      <xs:element name="three" type="xs:string" dfdl:lengthKind="delimited" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

It gets even more noticeable if I set a scoping dfdl:lengthKind on the
complex type. 

I propose that we limit dfdl:lengthKind to elements only. It means that the
length of a xs:sequence or xs:choice is always and implicitly given by its
children, and if you want to provide an explicit length or a length prefix
you must use a complex element to wrap the sequence or choice. We have
looked at the implications on dfdl:choiceKind for choices, and
dfdl:occursKind on arrays, and the proposal works happily in those
scenarios. 
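
A minimal sketch of the wrapping this proposal implies, for three delimited
children that sit inside a fixed-size box. The element name "box", the length
of 20, and the dfdl:length and dfdl:lengthUnits settings are illustrative
assumptions rather than agreed property values, and the xs and dfdl namespace
declarations are omitted.

```xml
<!-- Hypothetical: "box" is the named element tier that carries the length. -->
<xs:element name="box" dfdl:lengthKind="explicit"
            dfdl:length="20" dfdl:lengthUnits="characters">
  <xs:complexType>
    <!-- No dfdl:lengthKind here: the sequence's length is given by its children. -->
    <xs:sequence dfdl:separator="@">
      <xs:element name="one" type="xs:string" dfdl:lengthKind="delimited" />
      <xs:element name="two" type="xs:string" dfdl:lengthKind="delimited" />
      <xs:element name="three" type="xs:string" dfdl:lengthKind="delimited" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

In this sketch the 20-character length belongs to box alone; the sequence
carries only its separator.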

There's an analogy here with not allowing sequences and choices to repeat,
only elements. 

It also simplifies the grammar, in the sense that any excess fill characters
in a 'box' are always considered part of the element when parsing. 

I'd like to discuss this on today's call. 

Regards

Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh at uk.ibm.com
Phone (+44)/(0) 1962-815848 



