[DFDL-WG] Fw: Hidden elements - summary of approaches
Alan Powell
alan_powell at uk.ibm.com
Wed Sep 15 08:48:21 CDT 2010
All
Proposed section on hidden sequences:
The current dfdl:hidden annotation section is removed
14.5 Hidden Sequence Groups
Some fields in the physical stream provide information about other fields
in the stream and are not really part of the data. For example, a field
could give the number of repeats in a following array. Such fields may
not be of interest to an application, so they can be removed from the
Infoset on parsing by marking them as hidden. A hidden sequence group
allows fields to be defined that are not added to the Infoset on parsing
and are not expected in the Infoset on unparsing.
<xs:element name="root">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstElement" type="xs:int" />
      <xs:sequence dfdl:hiddenGroupRef="tns:hiddenRepeatCount" />
      <xs:element name="arrayElement" type="xs:int"
                  minOccurs="0" maxOccurs="unbounded"
                  dfdl:occursCountKind="expression"
                  dfdl:occursCount="{./repeatCount}" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
<xs:group name="hiddenRepeatCount">
  <xs:sequence>
    <xs:element name="repeatCount" type="xs:int"
                dfdl:outputValueCalc="{count(./arrayElement)}"
                dfdl:representation="binary" dfdl:lengthKind="implicit" />
  </xs:sequence>
</xs:group>
Hidden elements within a hidden sequence can be referenced in path
expressions exactly as they would be if they were not hidden.
Hidden elements can (and typically will) carry the regular DFDL annotations
that define their physical properties and, on unparsing, set their value.
They are processed using the same behavior as non-hidden elements.
When the dfdl:hiddenGroupRef property is specified, all other DFDL
properties on the sequence are ignored. It is a schema definition error if
the sequence is not empty.
A hidden sequence may appear within another hidden sequence.
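As a rough illustration of the last point, the sketch below nests one hidden sequence inside a group that is itself referenced as hidden. The group name hiddenHeader and the element name version are hypothetical and chosen only for this example; tns:hiddenRepeatCount is the group defined above.

```xml
<!-- Sketch only: hiddenHeader and version are assumed names, not from the spec text -->
<xs:group name="hiddenHeader">
  <xs:sequence>
    <xs:element name="version" type="xs:int"
                dfdl:representation="binary" dfdl:lengthKind="implicit" />
    <!-- A hidden sequence nested inside a group that is used as hidden -->
    <xs:sequence dfdl:hiddenGroupRef="tns:hiddenRepeatCount" />
  </xs:sequence>
</xs:group>

<!-- In the parent, the whole header (including repeatCount) is hidden -->
<xs:sequence dfdl:hiddenGroupRef="tns:hiddenHeader" />
```

The inner reference lets hiddenRepeatCount be reused on its own elsewhere while still being pulled in as part of the larger hidden header.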
Property Name     Description
--------------    ---------------------------------------------------------
hiddenGroupRef    QName
                  Reference to a global model group definition that defines
                  the hidden element or elements.
                  The model group within the model group definition must be
                  a sequence.
                  Annotation: dfdl:sequence
Table 11 Hidden sequence properties
Regards
Alan Powell
Development - MQSeries, Message Broker, ESB
IBM Software Group, Application and Integration Middleware Software
-------------------------------------------------------------------------------------------------------------------------------------------
IBM
MP211, Hursley Park
Hursley, SO21 2JN
United Kingdom
Phone: +44-1962-815073
e-mail: alan_powell at uk.ibm.com
From: Steve Hanson/UK/IBM
To: remcgrat at illinois.edu
Cc: alejandr at ncsa.illinois.edu, Suman Kalia/Toronto/IBM at IBMCA, Alan
Powell/UK/IBM at IBMGB, Stephanie Fetzer/Charlotte/IBM at IBMUS, Tim
Kimber/UK/IBM at IBMGB, Sandy Gao/Toronto/IBM at IBMCA
Date: 08/09/2010 17:13
Subject: Fw: Hidden elements - summary of approaches
Hi Bob
Alejandro included two extensions to the DFDL hidden syntax.
Here's the current spec syntax for making a local element called
'repeatCount' hidden (exactly the same syntax applies for an element ref,
sequence, choice, or group ref):
<xs:element name="root">
  <xs:complexType>
    <xs:sequence>
      <xs:sequence>
        <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:hidden groupref="tns:hiddenRepeatCount" />
        </xs:appinfo></xs:annotation>
      </xs:sequence>
      <xs:element name="array" type="xs:string" maxOccurs="unbounded"
                  dfdl:occursCountKind="expression"
                  dfdl:occursCount="{./repeatCount}" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
<xs:group name="hiddenRepeatCount">
  <xs:sequence>
    <xs:element name="repeatCount" type="xs:int"
                dfdl:representation="binary" dfdl:lengthKind="implicit" />
  </xs:sequence>
</xs:group>
1) Hiding a local element
Here's what I think Alejandro has added in Daffodil, as an optimised
syntax for hiding a local element.
<xs:element name="root">
  <xs:complexType>
    <xs:sequence>
      <xs:sequence>
        <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:hidden elementref="tns:repeatCount" />
        </xs:appinfo></xs:annotation>
      </xs:sequence>
      <xs:element name="array" type="xs:string" maxOccurs="unbounded"
                  dfdl:occursCountKind="expression"
                  dfdl:occursCount="{./repeatCount}" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
<xs:element name="repeatCount" type="xs:int"
            dfdl:representation="binary" dfdl:lengthKind="implicit" />
There's a restriction though: it can only be used when
minOccurs=maxOccurs=1, because we have actually lost the XML Schema
particle (a global element declaration cannot carry minOccurs or
maxOccurs). There's no loss of DFDL semantics as far as I am aware, because
we do not have particle-specific properties.
Applying my proposed simplified syntax to Alejandro's optimisation gives
the syntax below.
<xs:element name="root">
  <xs:complexType>
    <xs:sequence>
      <xs:sequence dfdl:hiddenElementRef="tns:repeatCount" />
      <xs:element name="array" type="xs:string" maxOccurs="unbounded"
                  dfdl:occursCountKind="expression"
                  dfdl:occursCount="{./repeatCount}" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
<xs:element name="repeatCount" type="xs:int"
            dfdl:representation="binary" dfdl:lengthKind="implicit" />
This is quite compact, and does make the act of hiding easier, but it can
only be used under specific circumstances. It requires either a new
property dfdl:hiddenElementRef or (better) renaming dfdl:hiddenGroupRef to
dfdl:hiddenRef and allowing it to point at groups or elements.
It also allows an element reference with minOccurs=maxOccurs=1 and no
explicit DFDL properties to be hidden without creating a global group. (I
can already hide a group reference with no explicit DFDL properties in this
manner today.)
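To make the renaming option concrete, here is a minimal sketch of what a single dfdl:hiddenRef property might look like if it were allowed to point at either kind of global component. The property name is only the suggestion made in this note, and tns:hiddenHeaderGroup is a hypothetical group name introduced for the example.

```xml
<!-- Sketch only: dfdl:hiddenRef is a proposed name, not an existing property -->
<xs:sequence>
  <!-- Pointing at a global element; implies minOccurs=maxOccurs=1 -->
  <xs:sequence dfdl:hiddenRef="tns:repeatCount" />
  <!-- Pointing at a global group, as dfdl:hiddenGroupRef does
       (hiddenHeaderGroup is an assumed name) -->
  <xs:sequence dfdl:hiddenRef="tns:hiddenHeaderGroup" />
</xs:sequence>
```

A processor would have to resolve the QName against both global elements and global groups, which is the price of having one property instead of two.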
2) Hiding a local choice
If the object to be hidden is a choice, I think Alejandro is allowing the
xs:choice to be the content of the global group, instead of requiring the
sequence to wrap it.
<xs:group name="hiddenRepeatCount">
  <xs:choice>
    <xs:element name="repeatCount" type="int"
                dfdl:representation="binary" dfdl:lengthKind="implicit" />
    <xs:element name="repeatString" type="int"
                dfdl:representation="text" dfdl:lengthKind="explicit"
                dfdl:length="10" />
  </xs:choice>
</xs:group>
Makes sense - if I was hiding a local sequence, I wouldn't bother to wrap
the sequence in yet another sequence in the global group, so why do so
with a choice?
Thoughts welcome. Personally I like both.
Regards
Steve Hanson
Strategy, Common Transformation & DFDL
Co-Chair, OGF DFDL WG
IBM SWG, Hursley, UK,
smh at uk.ibm.com,
tel +44-(0)1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 08/09/2010 15:54 -----
From: Steve Hanson/UK/IBM
To: Sandy Gao/Toronto/IBM at IBMCA
Cc: Alan Powell/UK/IBM at IBMGB, Michael Hudson/Boca Raton/IBM at IBMUS, Richard
Schofield/UK/IBM at IBMGB, Stephanie Fetzer/Charlotte/IBM at IBMUS, Suman
Kalia/Toronto/IBM at IBMCA, Tim Kimber/UK/IBM at IBMGB, dfdl-wg at ogf.org
Date: 08/09/2010 10:59
Subject: Hidden elements - summary of approaches
Let's state the two options being considered, as I said I'd do this for
the wider DFDL WG for the call today:
1) Global group approach
Summary:
- Particle to hide can be a local element, element ref, local sequence,
  local choice or group ref
- Particle is removed from its parent into a dedicated global group of
  composition sequence and replaced in the parent by a new empty local
  sequence
- The new empty local sequence carries a dfdl:hidden annotation that has a
  property dfdl:groupRef; other DFDL properties are not allowed
- Alternatively, the new empty local sequence carries a dfdl:hiddenGroupRef
  property; other DFDL properties are not allowed
Pros:
- Removing all DFDL annotations and using the resultant pure XSD results
  in the same infoset
- Global group can be reused
Cons:
- Making something hidden is a refactor operation
- Global group sequence needs its DFDL properties set correctly
2) Hidden flag approach
Summary:
- Particle to hide can be a local element or element ref
- Particle takes a dfdl:hidden property
- xs:minOccurs MUST be 0
- A dfdl:minOccurs property takes the place of xs:minOccurs
Pros:
- Easy to make something hidden
Cons:
- Removing all DFDL annotations and using pure XSD does not guarantee the
  same infoset
- Breaks validation
- Duplication of minOccurs property
- Have to wrap a local sequence, choice or group ref in a complex element
  in order to hide it (they can't take minOccurs = 0)
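For comparison with the global group examples, a minimal sketch of how option 2 might look on the repeat-count case follows. Only the property names dfdl:hidden and dfdl:minOccurs come from the summary; the values "true" and "1" are assumptions made for illustration, and the path expression is written the same way as in the other examples in this thread.

```xml
<!-- Sketch only: attribute values for dfdl:hidden and dfdl:minOccurs are assumed -->
<xs:element name="root">
  <xs:complexType>
    <xs:sequence>
      <!-- xs minOccurs is 0 so the element may be absent from the infoset;
           dfdl:minOccurs states the occurrence expected in the data -->
      <xs:element name="repeatCount" type="xs:int" minOccurs="0"
                  dfdl:hidden="true" dfdl:minOccurs="1"
                  dfdl:representation="binary" dfdl:lengthKind="implicit" />
      <xs:element name="array" type="xs:string" maxOccurs="unbounded"
                  dfdl:occursCountKind="expression"
                  dfdl:occursCount="{./repeatCount}" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

With the DFDL annotations stripped, this schema still permits repeatCount to appear, which is the infoset mismatch listed under the cons.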
Regards
Steve Hanson
Strategy, Common Transformation & DFDL
Co-Chair, OGF DFDL WG
IBM SWG, Hursley, UK,
smh at uk.ibm.com,
tel +44-(0)1962-815848
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU