[DFDL-WG] Recursive use of DFDL for variable markup - use case
Steve Hanson
smh at uk.ibm.com
Wed Apr 15 07:47:08 CDT 2009
From last week's call:
7. Recursive use of DFDL for variable markup
Use of a DFDL annotated element/type to describe an initiator, length
prefix, terminator, separator, etc. Steve suggested the most important use
of "variable markup-like mechanism" in IBM's WTX product is to reference a
location earlier in the bit stream where a delimiter value is found. We
handle this already by use of a path expression. The additional variable
markup mechanism was to avoid proliferation of keywords for various corner
cases on initiator, terminator and separator. E.g., what if you want the
initiator to be "Name" or "name" only, not "NAME", "nAmE", etc.? A
case-insensitive match is not expressive enough for that. This can always
be modeled, just not as an initiator tag. The feeling was to leave out
variable markup (other than for prefix lengths) for v1.0, and to propose
the minimum set of extra properties that can address the common use cases,
but IBM needed to check whether this satisfied all WTX use cases.
(Post-call update: it doesn't. There is a use case from WTX; Steve will
mail this out before the next call.)
The use case is from EDI. EDI transactions consist of an initial header
segment which defines, among other things, the separator that is used by
the data segments that follow. The problem is that EDI transactions may be
processed in their entirety, or individual data segments may be processed
without the header segment. For the former case, DFDL supports this fine,
using an XPath expression to locate the separator, which is defined as an
element, the simple type of which enumerates the allowable values,
enabling validation. But for the latter case, the XPath expression won't
resolve, as there is no header. An explicit dfdl:separator property could
be used instead, set to a space-separated list of all the allowable values,
but that duplicates the enumeration values of the separator element,
creating a maintenance problem.
<xs:element name="header">
  <xs:complexType>
    <xs:sequence dfdl:lengthKind="implicit">
      <xs:element name="separator" dfdl:lengthKind="explicit"
                  dfdl:length="1" dfdl:representation="text">
        <xs:simpleType>
          <xs:restriction base="xs:string">
            <xs:enumeration value="xxx"/>
            <xs:enumeration value="yyy"/>
            <xs:enumeration value="aaa"/>
            <xs:enumeration value="bbb"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>
<xs:element name="850">
  <xs:complexType dfdl:lengthKind="delimited">
    <xs:sequence dfdl:lengthKind="implicit"
                 dfdl:separator="../../header/separator">
      <xs:element name="one" type="xs:string" />
      <xs:element name="two" type="xs:string" />
      <xs:element name="three" type="xs:string" />
      <xs:element name="four" type="xs:string" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
<xs:element name="transaction">
  <xs:complexType>
    <xs:sequence dfdl:lengthKind="implicit">
      <xs:element ref="header"/>
      <xs:element name="segment" maxOccurs="unbounded">
        <xs:complexType>
          <xs:choice>
            <xs:element ref="800" />
            <xs:element ref="810" />
            <xs:element ref="850" />
          </xs:choice>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>
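For comparison, the explicit alternative described earlier (a
dfdl:separator that lists every allowable value) might look like the sketch
below. It is only an illustration: the wrapping xs:schema element and the
dfdl namespace URI are assumptions not taken from this message, and the
separator values simply repeat the placeholder enumerations from the header.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the schema wrapper and the dfdl namespace URI are assumptions. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <!-- The name "850" is kept from the example above; XML Schema element names
       may not start with a digit, so a real schema would need another name. -->
  <xs:element name="850">
    <xs:complexType dfdl:lengthKind="delimited">
      <!-- Explicit space-separated list instead of a path into the header.
           It repeats the placeholder values enumerated on header/separator. -->
      <xs:sequence dfdl:lengthKind="implicit"
                   dfdl:separator="xxx yyy aaa bbb">
        <xs:element name="one" type="xs:string" />
        <xs:element name="two" type="xs:string" />
        <xs:element name="three" type="xs:string" />
        <xs:element name="four" type="xs:string" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A segment parsed on its own no longer depends on the header, but any change
to the allowable separators now has to be made in two places.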
Note that WTX solves this by pointing at a global separator element by
name, instead of to a separator element in the data by path. At runtime,
the infoset value of the global element is used, and if it is not set, the
enums are used to provide a list of possible values.
Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh at uk.ibm.com
Phone (+44)/(0) 1962-815848