[DFDL-WG] Fw: Hidden elements - summary of approaches

Alan Powell alan_powell at uk.ibm.com
Wed Sep 15 08:48:21 CDT 2010


All

Below is the proposed section on hidden sequences.

It replaces the current dfdl:hidden annotation section, which is removed.


14.5    Hidden Sequence Groups
Some fields in the physical stream provide information about other fields
in the stream and are not really part of the data. For example, a field
could give the number of repeats in a following array. Such fields may
not be of interest to an application, so they can be kept out of the
infoset by placing them in a hidden sequence group. Elements defined in a
hidden sequence group are not added to the infoset on parsing and are not
expected in the infoset on unparsing.

<xs:element name="root">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstElement" type="xs:int" />

      <!-- Empty sequence that stands in for the hidden group -->
      <xs:sequence>
        <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:sequence hiddenGroupRef="tns:hiddenRepeatCount" />
        </xs:appinfo></xs:annotation>
      </xs:sequence>

      <xs:element name="arrayElement" type="xs:int"
                  minOccurs="0" maxOccurs="unbounded"
                  dfdl:occursCountKind="expression"
                  dfdl:occursCount="{./repeatCount}" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:group name="hiddenRepeatCount">
  <xs:sequence>
    <!-- On unparsing the count is computed from the array -->
    <xs:element name="repeatCount" type="xs:int"
       dfdl:outputValueCalc="{count(./arrayElement)}"
       dfdl:representation="binary" dfdl:lengthKind="implicit" />
  </xs:sequence>
</xs:group>

Elements within a hidden sequence can be referenced in path expressions
using exactly the same DFDL expression as if they were not hidden.
Hidden elements can (and typically will) carry the regular DFDL
annotations that define their physical properties and, on unparsing, set
their value. They are processed in the same way as non-hidden elements.
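
As a worked illustration of the effect on the infoset, the sketch below
shows what a parse of the "root" example might yield. The byte layout is an
assumption made only for this illustration (every integer taken as a 4-byte
big-endian binary value); the section itself does not state the
representation of firstElement or arrayElement.

```xml
<!-- Assumed input bytes (hex), for illustration only:
       00 00 00 07   firstElement = 7
       00 00 00 02   repeatCount  = 2   (hidden)
       00 00 00 0A   arrayElement = 10
       00 00 00 14   arrayElement = 20
     repeatCount is read and drives the array, but is not in the infoset. -->
<root>
  <firstElement>7</firstElement>
  <arrayElement>10</arrayElement>
  <arrayElement>20</arrayElement>
</root>
```

On unparsing, the same infoset is supplied without repeatCount, and its
value (2 here) would be computed by the dfdl:outputValueCalc expression.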

When the dfdl:hiddenGroupRef property is specified, all other DFDL
properties on the sequence are ignored. It is a schema definition error if
the sequence is not empty.

A hidden sequence may appear within another hidden sequence.
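
A minimal sketch of such nesting, assuming the annotation form used in the
example above. The group name "hiddenHeader", the element "version" and its
constant dfdl:outputValueCalc expression are hypothetical and serve only to
illustrate the idea.

```xml
<xs:group name="hiddenHeader">
  <xs:sequence>
    <!-- Hypothetical hidden element; needs a value on unparsing -->
    <xs:element name="version" type="xs:int"
       dfdl:outputValueCalc="{1}"
       dfdl:representation="binary" dfdl:lengthKind="implicit" />

    <!-- Nested hidden sequence: empty, refers to another hidden group -->
    <xs:sequence>
      <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:sequence hiddenGroupRef="tns:hiddenRepeatCount" />
      </xs:appinfo></xs:annotation>
    </xs:sequence>
  </xs:sequence>
</xs:group>
```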


Property Name     Description
----------------  ------------------------------------------------------
hiddenGroupRef    QName
                  Reference to a global model group definition that
                  defines the hidden element or elements.
                  The model group within the model group definition
                  must be a sequence.
                  Annotation: dfdl:sequence

Table 11 Hidden sequence properties

 
Regards

 
Alan Powell
 
Development - MQSeries, Message Broker, ESB
IBM Software Group, Application and Integration Middleware Software
-------------------------------------------------------------------------------------------------------------------------------------------
IBM
MP211, Hursley Park
Hursley, SO21 2JN
United Kingdom
Phone: +44-1962-815073
e-mail: alan_powell at uk.ibm.com




From:   Steve Hanson/UK/IBM
To:     remcgrat at illinois.edu
Cc:     alejandr at ncsa.illinois.edu, Suman Kalia/Toronto/IBM at IBMCA, Alan 
Powell/UK/IBM at IBMGB, Stephanie Fetzer/Charlotte/IBM at IBMUS, Tim 
Kimber/UK/IBM at IBMGB, Sandy Gao/Toronto/IBM at IBMCA
Date:   08/09/2010 17:13
Subject:        Fw: Hidden elements - summary of approaches


Hi Bob

Alejandro included two extensions to the DFDL hidden syntax. 

Here's the current spec syntax for making a local element called
'repeatCount' hidden (exactly the same syntax applies to an element ref,
sequence, choice, or group ref):

<xs:element name="root">
  <xs:complexType>
    <xs:sequence>
      <xs:sequence>
         <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:hidden groupref="tns:hiddenRepeatCount" />
         </xs:appinfo></xs:annotation>
      </xs:sequence>
      <xs:element name="array" type="xs:string" maxOccurs="unbounded"
                  dfdl:occursCountKind="expression"
                  dfdl:occursCount="{./repeatCount}" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:group name="hiddenRepeatCount">
  <xs:sequence>
    <xs:element name="repeatCount" type="int"
       dfdl:representation="binary" dfdl:lengthKind="implicit" />
  </xs:sequence>
</xs:group>


1)  Hiding a local element 

Here's what I think Alejandro has added in Daffodil, as an optimised 
syntax for hiding a local element.

<xs:element name="root">
  <xs:complexType>
    <xs:sequence>
      <xs:sequence>
         <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:hidden elementref="tns:repeatCount" />
         </xs:appinfo></xs:annotation>
      </xs:sequence>
      <xs:element name="array" type="xs:string" maxOccurs="unbounded"
                  dfdl:occursCountKind="expression"
                  dfdl:occursCount="{./repeatCount}" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="repeatCount" type="int"
       dfdl:representation="binary" dfdl:lengthKind="implicit" />


There's a restriction though: it can only be used when
minOccurs=maxOccurs=1, because we have actually lost the XML Schema
particle. There's no loss of DFDL semantics as far as I am aware, because
we do not have particle-specific properties.

Applying my proposed simplified syntax to Alejandro's optimisation gives 
the syntax below.

<xs:element name="root">
  <xs:complexType>
    <xs:sequence>
      <xs:sequence dfdl:hiddenElementRef="tns:repeatCount" />
      <xs:element name="array" type="xs:string" maxOccurs="unbounded"
                  dfdl:occursCountKind="expression"
                  dfdl:occursCount="{./repeatCount}" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="repeatCount" type="int"
       dfdl:representation="binary" dfdl:lengthKind="implicit" />


This is quite compact, and does make the act of hiding easier. But it can
only be used under specific circumstances. It requires a new property,
dfdl:hiddenElementRef, or (better) we rename dfdl:hiddenGroupRef to
dfdl:hiddenRef and allow it to point at groups or elements.

It also allows an element reference with minOccurs=maxOccurs=1 and no
explicit DFDL properties to be hidden without creating a global group. (I
can hide a group reference with no explicit DFDL properties in this manner
today.)

2) Hiding a local choice

If the object to be hidden is a choice, I think Alejandro is allowing the 
xs:choice to be the content of the global group, instead of requiring the 
sequence to wrap it. 

<xs:group name="hiddenRepeatCount">
  <xs:choice>
    <xs:element name="repeatCount" type="int"
       dfdl:representation="binary" dfdl:lengthKind="implicit" />
    <xs:element name="repeatString" type="int"
       dfdl:representation="text" dfdl:lengthKind="explicit"
       dfdl:length="10" />
  </xs:choice>
</xs:group>


Makes sense - if I was hiding a local sequence, I wouldn't bother to wrap 
the sequence in yet another sequence in the global group, so why do so 
with a choice?
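
For contrast, a sketch of the wrapped form that the extension avoids,
assuming the current wording requires the global group's model group to be
a sequence. It reuses the element declarations from the choice example
above; nothing else is implied about the spec.

```xml
<xs:group name="hiddenRepeatCount">
  <!-- Outer sequence exists only to satisfy "must be a sequence" -->
  <xs:sequence>
    <xs:choice>
      <xs:element name="repeatCount" type="int"
         dfdl:representation="binary" dfdl:lengthKind="implicit" />
      <xs:element name="repeatString" type="int"
         dfdl:representation="text" dfdl:lengthKind="explicit"
         dfdl:length="10" />
    </xs:choice>
  </xs:sequence>
</xs:group>
```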

Thoughts welcome. Personally I like both.

Regards

Steve Hanson
Strategy, Common Transformation & DFDL
Co-Chair, OGF DFDL WG
IBM SWG, Hursley, UK,
smh at uk.ibm.com,
tel +44-(0)1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 08/09/2010 15:54 -----

From:   Steve Hanson/UK/IBM
To:     Sandy Gao/Toronto/IBM at IBMCA
Cc:     Alan Powell/UK/IBM at IBMGB, Michael Hudson/Boca Raton/IBM at IBMUS, Richard 
Schofield/UK/IBM at IBMGB, Stephanie Fetzer/Charlotte/IBM at IBMUS, Suman 
Kalia/Toronto/IBM at IBMCA, Tim Kimber/UK/IBM at IBMGB, dfdl-wg at ogf.org
Date:   08/09/2010 10:59
Subject:        Hidden elements - summary of approaches


Let's state the two options being considered, as I said I'd do this for 
the wider DFDL WG for the call today:

1) Global group approach
Summary:
- Particle to hide can be a local element, element ref, local sequence,
  local choice or group ref
- Particle is removed from its parent into a dedicated global group of
  composition sequence and replaced in the parent by a new empty local
  sequence
- The new empty local sequence carries a dfdl:hidden annotation that has a
  property dfdl:groupRef; other DFDL properties are not allowed
- Alternatively, the new empty local sequence carries a
  dfdl:hiddenGroupRef property; other DFDL properties are not allowed
Pros:
- Removal of all DFDL annotations and use of the resultant pure XSD
  results in the same infoset
- Global group can be reused
Cons:
- Making something hidden is a refactor operation
- Global group sequence needs its DFDL properties set correctly

2) Hidden flag approach
Summary:
- Particle to hide can be a local element or element ref
- Particle takes a dfdl:hidden property
- xs:minOccurs MUST be 0
- A dfdl:minOccurs property takes the place of xs:minOccurs
Pros:
- Easy to make something hidden
Cons:
- Removal of all DFDL annotations and using pure XSD does not guarantee
  the same infoset
- Breaks validation
- Duplication of minOccurs property
- Have to wrap a local sequence, choice or group ref in a complex element
  in order to hide it (they can't take minOccurs = 0)
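
No syntax for option 2 is shown above, so the fragment below is only a
rough sketch of how it might look. The attribute value dfdl:hidden="true"
and the dfdl:minOccurs value are assumptions based on the summary, not
agreed syntax.

```xml
<!-- Hypothetical option 2 syntax: the element stays in place and is
     flagged as hidden. xs:minOccurs is forced to 0; dfdl:minOccurs
     then carries the lower bound that the data actually has. -->
<xs:element name="repeatCount" type="xs:int"
   minOccurs="0" dfdl:minOccurs="1"
   dfdl:hidden="true"
   dfdl:representation="binary" dfdl:lengthKind="implicit" />
```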

Regards

Steve Hanson
Strategy, Common Transformation & DFDL
Co-Chair, OGF DFDL WG
IBM SWG, Hursley, UK,
smh at uk.ibm.com,
tel +44-(0)1962-815848







Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU












