[DFDL-WG] Repeating groups in DFDL with continuation indicators - action 146 - updated

Steve Hanson smh at uk.ibm.com
Wed Aug 1 05:39:28 EDT 2012


Hi Mike

A couple of changes in blue, main one being to use a common bit type for 
the repeat and present bits, otherwise may as well carry all DFDL 
properties on elements.

Suggest that we either have dfdl:arrayIndex() and dfdl:arrayCount() or 
dfdl:index() and dfdl:count().

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848



From:   Mike Beckerle <mbeckerle.dfdl at gmail.com>
To:     Steve Hanson/UK/IBM at IBMGB
Cc:     dfdl-wg at ogf.org
Date:   31/07/2012 17:32
Subject:        Re: [DFDL-WG] Repeating groups in DFDL with continuation 
indicators - action 146 - updated



Here are my revisions to the example to illustrate how to abstract and hide 
most of the presence bit machinery.

...mike

<!-- This is an example of how to implement presence and repeat bits for
    use in variable-occurrence but densely packed binary data formats. -->

<!-- The goal here is to avoid the details of this having to appear
    over and over again, but to package this somehow so that the
    complexities are encapsulated and hidden from all the places that
    use repeat bits. -->

<!-- Note: also, I renamed dfdl:position to dfdl:arrayIndex for clarity,
    and I am allowing it to have an argument (parent axis only) -->

<!-- Reusable global type for bit flags. -->
<xs:simpleType name="bitType"
    dfdl:representation="binary" dfdl:alignment="1" dfdl:alignmentUnits="bits"
    dfdl:byteOrder="bigEndian" dfdl:lengthKind="explicit" dfdl:length="1"
    dfdl:lengthUnits="bits">
  <xs:restriction base="xs:unsignedByte"/>
</xs:simpleType>

<!-- Reusable global element for repeat-control-flag bits. Works in
    conjunction with a presenceBit. -->
<xs:element name="repeatBit" type="bitType">
    <xs:annotation>
        <xs:appinfo source="...">
            <!-- repeatBit is a child of A1, so the sibling presenceBit of A1
                is two levels up from here. -->
            <dfdl:discriminator>
                {if ../../presenceBit eq 0 then false
                else if dfdl:arrayIndex(..) eq 1 then true
                else if ..[dfdl:arrayIndex(..)-1]/repeatBit eq 0 then false
                else true}
            </dfdl:discriminator>
            <dfdl:element
                outputValueCalc='{ if dfdl:arrayIndex(..) eq dfdl:count(..) then 0 else 1}' />
        </xs:appinfo>
    </xs:annotation>
</xs:element>
<!-- Note in the above that outputValueCalc is not allowed on a simpleType
    definition, only on elements. Similarly, a discriminator is not allowed
    on a simpleType. -->

<!-- Reusable global element for presence bit. -->
<xs:element name="presenceBit" type="bitType" />


<!-- Example of array A1 which repeats depending on a repeat flag. The flag,
    when set, means "there are more occurrences"; when clear, it means
    "this was the last occurrence". -->
<xs:element name="example">
    <xs:complexType>
        <xs:sequence>
            <!-- Note that the outputValueCalc can't be hidden because it
                references A1, which could be any name at all. -->
            <xs:element ref="presenceBit"
                dfdl:outputValueCalc='{ if dfdl:count(../A1) eq 0 then 0 else 1 }' />
            <xs:element name="A1" minOccurs="0" maxOccurs="10"
                dfdl:occursCountKind='parsed'>
                <xs:complexType>
                    <xs:sequence>
                        <xs:element ref="repeatBit" />
                        <xs:element name="someField" type="xs:unsignedByte"
                            dfdl:representation="binary" dfdl:alignment="1"
                            dfdl:alignmentUnits="bits" dfdl:byteOrder="bigEndian"
                            dfdl:lengthKind="explicit" dfdl:length="2" />
                    </xs:sequence>
                </xs:complexType>
            </xs:element>
            <xs:element name="aFieldAfterA1" type="xs:unsignedByte"
                dfdl:representation="binary" dfdl:alignment="1"
                dfdl:alignmentUnits="bits" dfdl:byteOrder="bigEndian"
                dfdl:lengthKind="explicit" dfdl:length="2" />
        </xs:sequence>
    </xs:complexType>
</xs:element>
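
For the unparse direction, the two outputValueCalc expressions above reduce to simple rules: presenceBit is 0 only when A1 is empty, and repeatBit is 0 only on the last occurrence. A minimal Python sketch of that calculation follows. It is an illustration only, not DFDL processor behavior: the function name `unparse_example` is hypothetical, and the 2-bit width of someField and aFieldAfterA1 is an assumption taken from the example inputs in the original question further down this thread.

```python
def unparse_example(some_fields, a_field_after_a1):
    """Serialize to a bit string, computing both flag bits from the array.

    Hypothetical helper: some_fields holds one someField value per A1
    occurrence; all values are assumed to fit in 2 bits.
    """
    for value in list(some_fields) + [a_field_after_a1]:
        if not 0 <= value <= 3:
            raise ValueError("value does not fit in 2 bits: %r" % (value,))

    count = len(some_fields)
    # presenceBit: 0 when A1 has no occurrences, otherwise 1.
    out = ["0" if count == 0 else "1"]
    for index, value in enumerate(some_fields, start=1):
        # repeatBit: 0 on the last occurrence, 1 on every earlier one.
        out.append("0" if index == count else "1")
        out.append(format(value, "02b"))
    out.append(format(a_field_after_a1, "02b"))
    return "".join(out)
```

Called as `unparse_example([1, 3], 0)`, this yields `110101100`, which matches the used bits of example input 3 in the original question.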

 

On Tue, Jul 31, 2012 at 6:44 AM, Steve Hanson <smh at uk.ibm.com> wrote:
Here's the proposal for discussion on today's call, updated to fix some 
issues: 

- Moved the dfdl:assert onto element A1 where it more naturally belongs 
- Added a variable $pos to save the dfdl:position() of A1 as this is 
needed by A1's child repeatBit element 
- Corrected repeatBit dfdl:outputValueCalc expression to use $pos 
- Corrected function prefixes 

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848 
----- Forwarded by Steve Hanson/UK/IBM on 30/07/2012 18:30 ----- 

From:        Steve Hanson/UK/IBM 
To:        "Mike Beckerle" <mbeckerle.dfdl at gmail.com> 
Cc:        dfdl-wg at ogf.org, "'Ryan Farrell'" <ryan.farrell.ctr at nrl.navy.mil> 
Date:        27/07/2011 09:29 
Subject:        RE: [DFDL-WG] [wg-all] Repeating groups in DFDL 


Hi Mike 

That looks pretty good.  A couple of minor changes in bold red (I don't 
think there's a length() function). 

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848 


From:        "Mike Beckerle" <mbeckerle.dfdl at gmail.com> 
To:        Steve Hanson/UK/IBM at IBMGB, "'Ryan Farrell'" <ryan.farrell.ctr at nrl.navy.mil> 
Cc:        <dfdl-wg at ogf.org> 
Date:        26/07/2011 18:04 
Subject:        RE: [DFDL-WG] [wg-all] Repeating groups in DFDL




  
From our call today, I suggest this schema, based on the assertion 
suggestion from Steve/Tim, adding in the unparser/output direction 
calculations, and putting this together to match Ryan’s example schema. 
I’m not certain of the XPath expression syntax. I may have the number of 
“..” moves wrong (at least). 
  
I added outputValueCalc properties to the flag bits, occursCountKind='parsed' 
to the array, and the assertion to the repeatBit element. I put in the check 
so that a zero-length array is also possible, based on the presentBit. 
  
  <xs:element name="example">
   <xs:annotation><xs:appinfo source="...">
      <dfdl:defineVariable name="$pos" type="int"/>
   </xs:appinfo></xs:annotation>
   <xs:complexType>
     <xs:sequence>

       <xs:element name="presentBit" type="xs:unsignedByte"
                   dfdl:representation="binary" dfdl:alignment="1"
                   dfdl:alignmentUnits="bits" dfdl:byteOrder="bigEndian"
                   dfdl:lengthKind="explicit" dfdl:length="1"
                   dfdl:outputValueCalc='{ if dfdl:count(../A1) eq 0 then 0 else 1 }'/>

       <xs:element name="A1" minOccurs="0" maxOccurs="10"
                   dfdl:occursCountKind='parsed'>
         <xs:annotation><xs:appinfo source="...">
           <dfdl:newVariableInstance ref="$pos" defaultValue="{ dfdl:position() }"/>
           <dfdl:assert>
                   {if ../presentBit eq 0 then false
                    else if $pos eq 1 then true
                    else if .[$pos-1]/repeatBit eq 0 then false
                    else true}
           </dfdl:assert>
         </xs:appinfo></xs:annotation>
         <xs:complexType>
           <xs:sequence>

             <xs:element name="repeatBit" type="xs:unsignedByte"
                         dfdl:representation="binary" dfdl:alignment="1"
                         dfdl:alignmentUnits="bits" dfdl:byteOrder="bigEndian"
                         dfdl:lengthKind="explicit" dfdl:length="1"
                         dfdl:outputValueCalc='{ if $pos eq dfdl:count(..) then 0 else 1 }'/>

             <xs:element name="someField" type="xs:unsignedByte"
                         dfdl:representation="binary" dfdl:alignment="1"
                         dfdl:alignmentUnits="bits" dfdl:byteOrder="bigEndian"
                         dfdl:lengthKind="explicit" dfdl:length="2"/>

           </xs:sequence>
         </xs:complexType>
       </xs:element>

       <xs:element name="aFieldAfterA1" type="xs:unsignedByte"
                   dfdl:representation="binary" dfdl:alignment="1"
                   dfdl:alignmentUnits="bits" dfdl:byteOrder="bigEndian"
                   dfdl:lengthKind="explicit" dfdl:length="2"/>

     </xs:sequence> 
   </xs:complexType> 
 </xs:element> 
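
The dfdl:assert on A1 in the schema above is a three-way decision. As a reading aid, here it is restated as a plain Python predicate. This is only a sketch: `occurrence_allowed` and its parameter names are hypothetical, and `index` plays the role of `$pos` (1-based).

```python
def occurrence_allowed(present_bit, index, previous_repeat_bit=None):
    """Return True if occurrence number `index` (1-based) of A1 may exist.

    Mirrors the branches of the assert expression:
      presentBit eq 0            -> false
      $pos eq 1                  -> true
      previous repeatBit eq 0    -> false
      otherwise                  -> true
    """
    if present_bit == 0:
        return False
    if index == 1:
        return True
    if previous_repeat_bit == 0:
        return False
    return True
```

For example, `occurrence_allowed(1, 2, previous_repeat_bit=0)` is `False`: a second occurrence is rejected when the first one cleared its repeatBit.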
    
  
From: Steve Hanson [mailto:smh at uk.ibm.com] 
Sent: Tuesday, July 26, 2011 6:40 AM
To: Mike Beckerle
Cc: dfdl-wg at ogf.org; 'Ryan Farrell'
Subject: Re: [DFDL-WG] [wg-all] Repeating groups in DFDL 
  

Yes, we need a friendly way of saying 'I am the last item' when 
occursCountKind is 'parsed'. 

But with what we have today, could we use a discriminator on the repeating 
element that fails when the indicator of the previous repeat was 0 ? So 
speculation starts to parse the thing after the last repeat on the 
assumption it is another repeat and then fails and backtracks. 

<dfdl:assert>{if dfdl:position() eq 1 then true else if 
'./present[dfdl:position()-1]' eq 0 then false else true}</dfdl:assert> 
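
To make the speculate-then-backtrack idea concrete, here is a small Python simulation. It is a sketch under assumptions, not a description of any DFDL processor: the names `parse_with_backtracking` and `ParseFailure` are hypothetical, the field widths (1-bit flags, 2-bit someField and aFieldAfterA1) come from the example inputs in the original question, and the presentBit check on the first occurrence is borrowed from the fuller schema elsewhere in this thread rather than from the one-line assert above.

```python
class ParseFailure(Exception):
    """Raised when a speculative occurrence must be rejected."""


def parse_with_backtracking(bits, max_occurs=10):
    """Speculatively parse A1 occurrences, rewinding when the check fails."""
    bits = bits.replace(" ", "")
    pos = 0

    def take(n):
        # Read n bits as an unsigned big-endian integer and advance.
        nonlocal pos
        if pos + n > len(bits):
            raise ParseFailure("ran out of data")
        value = int(bits[pos:pos + n], 2)
        pos += n
        return value

    present_bit = take(1)
    occurrences = []
    while len(occurrences) < max_occurs:
        saved = pos  # where this speculative attempt starts
        try:
            # Assume another occurrence exists and parse it.
            candidate = {"repeatBit": take(1), "someField": take(2)}
            # Then apply the check: the previous indicator must allow it.
            if occurrences:
                allowed = occurrences[-1]["repeatBit"] == 1
            else:
                allowed = present_bit == 1
            if not allowed:
                raise ParseFailure("previous indicator was 0")
        except ParseFailure:
            pos = saved  # backtrack: the bits belong to what follows A1
            break
        occurrences.append(candidate)

    return {
        "presentBit": present_bit,
        "A1": occurrences,
        "aFieldAfterA1": take(2),
    }
```

On the input `1101 0110 0000 0000` the third attempt reads a candidate occurrence, fails the check because the second occurrence had repeatBit 0, rewinds, and the same bits are then read as aFieldAfterA1.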

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848 

From:        "Mike Beckerle" <mbeckerle.dfdl at gmail.com> 
To:        "'Ryan Farrell'" <ryan.farrell.ctr at nrl.navy.mil>, <dfdl-wg at ogf.org> 
Date:        22/07/2011 23:44 
Subject:        Re: [DFDL-WG] [wg-all] Repeating groups in DFDL 
Sent by:        dfdl-wg-bounces at ogf.org

  






Great example Ryan,

We've spent quite a bit of time over the last several months on textual
formats with these kinds of characteristics, which I would describe as
"occurrences determined by parsing" - that is, you look at the data to
determine "is there another element or not" one by one as they are parsed.

We need to make sure the properties are there to make this work for the
binary case as well.  

I actually believe we cannot handle this format right now. We don't have a
property which says to look at the current element to determine whether to
expect a subsequent element, as a way of determining occurrence counts. 

Next week we'll add resolving this example to the workgroup actions list. 

...mikeb

Mike Beckerle
Senior Technology Leader/Manager
Deloitte Managed Analytics
100 Fifth Avenue, Waltham, MA 02451
Tel/Direct: +1 781 330 0412 | Fax: +1 866 253 3006
www.deloitte.com


-----Original Message-----
From: dfdl-wg-bounces at ogf.org [mailto:dfdl-wg-bounces at ogf.org] On Behalf Of Ryan Farrell
Sent: Tuesday, July 19, 2011 2:12 PM
To: dfdl-wg at ogf.org
Subject: [DFDL-WG] [wg-all] Repeating groups in DFDL

For anyone that was at the 19/07  phone meeting, this is what Adam Fox 
and I were trying to find a solution to. If anyone knows of a solution, 
please contact me or adam.fox at nrl.navy.mil.

----------------------------------------------------------------------------
------------------------

Please see the attached xml schema while reading this message.

The problem we want to solve is how to represent a structure that can 
have a variable number of occurrences (even zero), where the number of 
repeats is controlled by a one-bit field, which we will call "repeatBit". 
The repeating structure in this case is "A1". The element before it, 
"presentBit", determines if there is at least one occurrence of A1. If 
presentBit is zero, then A1 has zero occurrences. If presentBit is one, 
then A1 has AT LEAST one occurrence.

The first element in A1 is "repeatBit". It tells us whether there will 
be another occurrence of A1 after the current occurrence. 
When repeatBit is zero, we read through the rest of A1, and that is 
the last occurrence of A1. As long as repeatBit continues to be one, 
however, we read through the rest of A1, then start a new occurrence.

I have included all the known required DFDL notation. Please let me know 
what is missing.

      EXAMPLE INPUT (results are in base 10):
      NOTE: Not all bits are used in these examples. Unused bits should be
      ignored for the purpose of the examples.

      1) 0111 0101
      presentBit = 0
      aFieldAfterA1 = 3

      2) 1011 1010
      presentBit = 1
      A1
      -repeatBit = 0
      -someField = 3
      aFieldAfterA1 = 2

      3) 1101 0110 0000 0000
      presentBit = 1
      A1
      -repeatBit = 1
      -someField = 1
      A1
      -repeatBit = 0
      -someField = 3
      aFieldAfterA1 = 0
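
The three examples above can be checked mechanically. The following Python sketch decodes the layout exactly as described in prose: one presentBit, then zero or more A1 occurrences (a 1-bit repeatBit followed by a 2-bit someField), then a 2-bit aFieldAfterA1. The function name `parse_example` is hypothetical, the bit widths are read off the examples, and no limit on the number of occurrences is enforced here.

```python
def parse_example(bits):
    """Decode presentBit, the A1 occurrences, and aFieldAfterA1 from a bit string."""
    bits = bits.replace(" ", "")
    pos = 0

    def take(n):
        # Read n bits as an unsigned big-endian integer and advance.
        nonlocal pos
        value = int(bits[pos:pos + n], 2)
        pos += n
        return value

    result = {"presentBit": take(1), "A1": []}
    # presentBit says whether a first occurrence exists at all.
    more = result["presentBit"] == 1
    while more:
        repeat_bit = take(1)
        some_field = take(2)
        result["A1"].append({"repeatBit": repeat_bit, "someField": some_field})
        # repeatBit says whether another occurrence follows this one.
        more = repeat_bit == 1
    result["aFieldAfterA1"] = take(2)
    return result


for example in ("0111 0101", "1011 1010", "1101 0110 0000 0000"):
    print(example, "->", parse_example(example))
```

Any bits left over after aFieldAfterA1 are simply not read, in line with the note that unused bits should be ignored.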

--
dfdl-wg mailing list
dfdl-wg at ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg






  
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU 













-- 
Mike Beckerle | OGF DFDL WG Co-Chair 
Tel:  781-330-0412

