[DFDL-WG] Repeating groups in DFDL with continuation indicators - action 146 - updated

Mike Beckerle mbeckerle.dfdl at gmail.com
Tue Jul 31 12:32:20 EDT 2012


Here are my revisions to the example to illustrate how to abstract and hide
most of the presence-bit machinery.

...mike

<!-- This is an example of how to implement presence and repeat bits for
    use in variable-occurrence but densely packed binary data formats. -->

<!-- The goal here is to avoid having the details of this appear
    over and over again, and instead to package them so that the complexities
    are encapsulated and hidden from all the places that use repeat bits. -->

<!-- Note: also, I renamed dfdl:position to dfdl:arrayIndex for clarity,
    and I am allowing it to have an argument (parent axis only) -->

<!-- Reusable global element for repeat-control-flag bits. Works in
    conjunction with a presenceBit. -->
<xs:element name="repeatBit" type="repeatBitType">
    <xs:annotation>
        <xs:appinfo source="...">
            <dfdl:discriminator>
                {if ../presenceBit eq 0 then false
                else if
                dfdl:arrayIndex(..) eq 1 then true
                else if
                ..[dfdl:arrayIndex(..)-1]/repeatBit eq 0 then false
                else true}
            </dfdl:discriminator>
            <dfdl:element outputValueCalc='{ if dfdl:arrayIndex(..) eq
                dfdl:count(..) then 0 else 1 }' />
        </xs:appinfo>
    </xs:annotation>
</xs:element>
<!-- Note that, in the above, outputValueCalc is not allowed on a simpleType
    definition, only on elements. Similarly, a discriminator is not allowed
    on a simpleType. -->
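
To make the intent of the discriminator and the outputValueCalc expressions easier to check, here is a minimal Python sketch of the same decision rules. The function names (accept_occurrence, repeat_bit_output, presence_bit_output) are hypothetical and exist only for this illustration; they are not DFDL functions, and the sketch assumes 1-based array indexes as in the expressions above.

```python
def accept_occurrence(presence_bit, array_index, prev_repeat_bit):
    """Mirror of the discriminator: should occurrence `array_index` exist?

    presence_bit    -- value of the presence bit (0 or 1)
    array_index     -- 1-based index of the occurrence being speculated
    prev_repeat_bit -- repeat bit of the previous occurrence, or None
                       when array_index is 1 (assumption of this sketch)
    """
    if presence_bit == 0:
        return False          # no occurrences at all
    if array_index == 1:
        return True           # presence bit set: first occurrence exists
    return prev_repeat_bit != 0  # previous occurrence said "more follow"


def repeat_bit_output(array_index, count):
    """Mirror of repeatBit's outputValueCalc: 0 on the last occurrence."""
    return 0 if array_index == count else 1


def presence_bit_output(count):
    """Mirror of presenceBit's outputValueCalc: 0 only for an empty array."""
    return 0 if count == 0 else 1
```

When parsing, a False result corresponds to the discriminator failing, so the parser backtracks and treats the array as finished.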


<!-- Reusable global element for presence bit. -->
<xs:element name="presenceBit" type="xs:unsignedByte"
    dfdl:representation="binary" dfdl:alignment="1" dfdl:alignmentUnits="bits"
    dfdl:byteOrder="bigEndian" dfdl:lengthKind="explicit" dfdl:length="1" />

<!-- Example of array A1, which repeats depending on a repeat flag. When
    set, the flag means "there are more occurrences"; when clear, it means
    "this was the last occurrence". -->
<xs:element name="example">
    <xs:complexType>
        <xs:sequence>
            <!-- Note that the outputValueCalc can't be hidden because it
                references A1, which could be any name at all. -->
            <xs:element ref="presenceBit"
                dfdl:outputValueCalc='{ if dfdl:count(../A1) eq 0 then 0 else 1 }' />
            <xs:element name="A1" minOccurs="0" maxOccurs="10"
                dfdl:occursCountKind='parsed'>
                <xs:complexType>
                    <xs:sequence>
                        <xs:element ref="repeatBit" />
                        <xs:element name="someField" type="xs:unsignedByte"
                            dfdl:representation="binary" dfdl:alignment="1"
                            dfdl:alignmentUnits="bits" dfdl:byteOrder="bigEndian"
                            dfdl:lengthKind="explicit" dfdl:length="2" />
                    </xs:sequence>
                </xs:complexType>
            </xs:element>
            <xs:element name="aFieldAfterA1" type="xs:unsignedByte"
                dfdl:representation="binary" dfdl:alignment="1"
                dfdl:alignmentUnits="bits" dfdl:byteOrder="bigEndian"
                dfdl:lengthKind="explicit" dfdl:length="2" />
        </xs:sequence>
    </xs:complexType>
</xs:element>
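
As a rough cross-check of what this schema is meant to describe, here is a small Python sketch that reads and writes the same bit layout by hand: a 1-bit presence flag, then zero or more A1 occurrences (1-bit repeat flag plus a 2-bit someField), then the 2-bit aFieldAfterA1. This is not a DFDL processor. The names parse_example and unparse_example are hypothetical, and representing the input as a string of '0'/'1' characters is an assumption made only to keep the example short.

```python
MAX_OCCURS = 10  # maxOccurs on A1 in the schema above


def parse_example(bits):
    """Parse one 'example' record from a string of '0'/'1' characters.

    Spaces are ignored; any trailing unused bits are left unread.
    """
    data = bits.replace(" ", "")
    pos = 0

    def take(width):
        # Read `width` bits as an unsigned big-endian integer.
        nonlocal pos
        chunk = data[pos:pos + width]
        if len(chunk) != width:
            raise ValueError("ran out of bits")
        pos += width
        return int(chunk, 2)

    presence_bit = take(1)
    a1 = []
    more = presence_bit == 1  # presence bit set: at least one occurrence
    while more:
        if len(a1) == MAX_OCCURS:
            raise ValueError("more than MAX_OCCURS occurrences of A1")
        repeat_bit = take(1)
        some_field = take(2)
        a1.append({"repeatBit": repeat_bit, "someField": some_field})
        more = repeat_bit == 1  # a clear repeat bit marks the last occurrence
    return {"presenceBit": presence_bit, "A1": a1, "aFieldAfterA1": take(2)}


def unparse_example(some_fields, a_field_after_a1):
    """Write the record, computing the flag bits as outputValueCalc would."""
    count = len(some_fields)
    out = ["1" if count else "0"]                   # presenceBit
    for index, value in enumerate(some_fields, start=1):
        out.append("0" if index == count else "1")  # repeatBit
        out.append(format(value, "02b"))            # someField, 2 bits
    out.append(format(a_field_after_a1, "02b"))     # aFieldAfterA1, 2 bits
    return "".join(out)
```

For instance, parse_example("1011 1010") yields presenceBit 1, a single A1 occurrence with repeatBit 0 and someField 3, and aFieldAfterA1 2, which matches the second example input in Ryan's original message quoted below.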



On Tue, Jul 31, 2012 at 6:44 AM, Steve Hanson <smh at uk.ibm.com> wrote:

> Here's the proposal for discussion on today's call, updated to fix some
> issues:
>
> - Moved the dfdl:assert onto element A1 where it more naturally belongs
> - Added a variable $pos to save the dfdl:position() of A1 as this is
> needed by A1's child repeatBit element
> - Corrected repeatBit dfdl:outputValueCalc expression to use $pos
> - Corrected function prefixes
>
> Regards
>
> Steve Hanson
> Architect, Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel:+44-1962-815848
> ----- Forwarded by Steve Hanson/UK/IBM on 30/07/2012 18:30 -----
>
> From:        Steve Hanson/UK/IBM
> To:        "Mike Beckerle" <mbeckerle.dfdl at gmail.com>
> Cc:        dfdl-wg at ogf.org, "'Ryan Farrell'" <
> ryan.farrell.ctr at nrl.navy.mil>
> Date:        27/07/2011 09:29
> Subject:        RE: [DFDL-WG] [wg-all] Repeating groups in DFDL
> ------------------------------
>
>
> Hi Mike
>
> That looks pretty good.  A couple of minor changes in bold red (I don't
> think there's a length() function).
>
> Regards
>
> Steve Hanson
>
>
> From: "Mike Beckerle" <mbeckerle.dfdl at gmail.com>
> To: Steve Hanson/UK/IBM at IBMGB, "'Ryan Farrell'" <ryan.farrell.ctr at nrl.navy.mil>
> Cc: <dfdl-wg at ogf.org>
> Date: 26/07/2011 18:04
> Subject: RE: [DFDL-WG] [wg-all] Repeating groups in DFDL
> ------------------------------
>
>
>
>
> From our call today, I suggest this schema, based on the assertion
> suggestion from steve/tim, and adding in the unparser/output direction
> calculations, and putting this together to match Ryan’s example schema. I’m
> not certain on the xpath expressions syntax. I may have number of “..”
> moves wrong (at least).
>
> I added outputValueCalc properties to the flag bits, I added
> occursCountKind=’parsed’ to the array, and the assertion to the repeatBit
> element. I put in the check so that a zero length array is possible based
> on the presentBit as well.
>
>   <xs:element name="example">
>    <xs:annotation><xs:appinfo source="...">
>       <dfdl:defineVariable name="$pos" type="int"/>
>    </xs:appinfo></xs:annotation>
>    <xs:complexType>
>      <xs:sequence>
>
>        <xs:element name="presentBit" type="xs:unsignedByte"
>                    dfdl:representation="binary" dfdl:alignment="1"
> dfdl:alignmentUnits="bits"
>                    dfdl:byteOrder="bigEndian" dfdl:lengthKind="explicit"
> dfdl:length="1"
>                    dfdl:outputValueCalc='{ if dfdl:count(../A1) eq 0 then
> 0 else 1 }'/>
>
>        <xs:element name="A1" minOccurs="0" maxOccurs="10"
>                    dfdl:occursCountKind='parsed'>
>          <xs:annotation><xs:appinfo source="...">
>            <dfdl:newVariableInstance ref="$pos" defaultValue="{
> dfdl:position() }"/>
>            <dfdl:assert>
>                    {if ../presentBit eq 0 then false
>                     else if $pos eq 1 then true
>                     else if .[$pos-1]/repeatBit eq 0 then false
>                     else true}
>            </dfdl:assert>
>          </xs:appinfo></xs:annotation>
>          <xs:complexType>
>            <xs:sequence>
>
>              <xs:element name="repeatBit" type="xs:unsignedByte"
>                          dfdl:representation="binary" dfdl:alignment="1"
> dfdl:alignmentUnits="bits"
>                          dfdl:byteOrder="bigEndian"
> dfdl:lengthKind="explicit" dfdl:length="1"
>                          dfdl:outputValueCalc='{ if $pos eq
> dfdl:count(..) then 0 else 1 }'/>
>
>              <xs:element name="someField" type="xs:unsignedByte"
>                          dfdl:representation="binary" dfdl:alignment="1"
> dfdl:alignmentUnits="bits"
>                          dfdl:byteOrder="bigEndian"
> dfdl:lengthKind="explicit" dfdl:length="2"/>
>
>            </xs:sequence>
>          </xs:complexType>
>        </xs:element>
>
>        <xs:element name="aFieldAfterA1" type="xs:unsignedByte"
>                    dfdl:representation="binary" dfdl:alignment="1"
> dfdl:alignmentUnits="bits"
>                    dfdl:byteOrder="bigEndian" dfdl:lengthKind="explicit"
> dfdl:length="2"/>
>
>      </xs:sequence>
>    </xs:complexType>
>  </xs:element>
>
>
> From: Steve Hanson [mailto:smh at uk.ibm.com]
> Sent: Tuesday, July 26, 2011 6:40 AM
> To: Mike Beckerle
> Cc: dfdl-wg at ogf.org; 'Ryan Farrell'
> Subject: Re: [DFDL-WG] [wg-all] Repeating groups in DFDL
>
>
> Yes, we need a friendly way of saying 'I am the last item' when
> occursCountKind is 'parsed'.
>
> But with what we have today, could we use a discriminator on the repeating
> element that fails when the indicator of the previous repeat was 0 ? So
> speculation starts to parse the thing after the last repeat on the
> assumption it is another repeat and then fails and backtracks.
>
> <dfdl:assert>{if dfdl:position() eq 1 then true else if
> './present[dfdl:position()-1]' eq 0 then false else true}</dfdl:assert>
>
> Regards
>
> Steve Hanson
>
> From: "Mike Beckerle" <mbeckerle.dfdl at gmail.com>
> To: "'Ryan Farrell'" <ryan.farrell.ctr at nrl.navy.mil>, <dfdl-wg at ogf.org>
> Date: 22/07/2011 23:44
> Subject: Re: [DFDL-WG] [wg-all] Repeating groups in DFDL
> Sent by: dfdl-wg-bounces at ogf.org
>
>
> ------------------------------
>
>
>
>
> Great example Ryan,
>
> We've spent quite a bit of time over the last several months on textual
> formats with these kinds of characteristics, which I would describe as
> "occurrences determined by parsing" - that is, you look at the data to
> determine "is there another element or not" one by one as they are parsed.
>
> We need to make sure the properties are there to make this work for the
> binary case as well.
>
> I actually believe we cannot handle this format right now. We don't have a
> property which says to look at the current element to determine whether to
> expect a subsequent element, as a way of determining occurrence counts.
>
> Next week we'll add resolving this example to the workgroup actions list.
>
> ...mikeb
>
> Mike Beckerle
> Senior Technology Leader/Manager
> Deloitte Managed Analytics
> 100 Fifth Avenue, Waltham, MA 02451
> Tel/Direct: +1 781 330 0412 | Fax: +1 866 253 3006
> www.deloitte.com <http://www.deloitte.com>
>
>
> -----Original Message-----
> From: dfdl-wg-bounces at ogf.org [mailto:dfdl-wg-bounces at ogf.org]
> On Behalf Of Ryan Farrell
> Sent: Tuesday, July 19, 2011 2:12 PM
> To: dfdl-wg at ogf.org
> Subject: [DFDL-WG] [wg-all] Repeating groups in DFDL
>
> For anyone who was at the 19/07 phone meeting, this is what Adam Fox
> and I were trying to find a solution to. If anyone knows of a solution,
> please contact me or adam.fox at nrl.navy.mil.
>
>
> ------------------------------
>
> Please see the attached xml schema while reading this message.
>
> The problem we want to solve is how to represent a structure that can
> have a variable number of occurrences (even zero), where the number of
> repeats is controlled by a one-bit field, which we will call "repeatBit".
> The repeating structure in this case is "A1". The element before it,
> "presentBit", determines whether there is at least one occurrence of A1.
> If presentBit is zero, then A1 has zero occurrences. If presentBit is one,
> then A1 has AT LEAST one occurrence.
>
> The first element in A1 is "repeatBit". It tells us whether there will be
> another occurrence of A1 after the current one. When repeatBit is zero,
> we read through the rest of A1, and that is the last occurrence of A1.
> As long as repeatBit continues to be one, however, we read through the
> rest of A1, then start a new occurrence.
>
> I have included all the known required DFDL notation. Please let me know
> what is missing.
>
>       EXAMPLE INPUT (results are in base 10):
>       NOTE: Not all bits are used in these examples. Unused bits
> should be ignored for the purpose of the examples.
>
>       1) 0111 0101
>       presentBit = 0
>       aFieldAfterA1 = 3
>
>       2) 1011 1010
>       presentBit = 1
>       A1
>       -repeatBit = 0
>       -someField = 3
>       aFieldAfterA1 = 2
>
>       3) 1101 0110 0000 0000
>       presentBit = 1
>       A1
>       -repeatBit = 1
>       -someField = 1
>       A1
>       -repeatBit = 0
>       -someField = 3
>       aFieldAfterA1 = 0
>
> --
> dfdl-wg mailing list
> dfdl-wg at ogf.org
> http://www.ogf.org/mailman/listinfo/dfdl-wg
>
>
>
>
> ------------------------------
>
>
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU



-- 
Mike Beckerle | OGF DFDL WG Co-Chair
Tel:  781-330-0412