[DFDL-WG] Puzzle: unparsing format involving lengthKind='pattern'

Steve Hanson smh at uk.ibm.com
Mon Dec 3 07:21:50 EST 2012


Afraid not. The problem is that the NUL is not a terminator in the DFDL 
sense of the word (i.e., mandatory) but is an early-end-of-data indicator. 
I can't think of an elegant way to handle this, so I would simply model the 
data as a single string with a dfdl:lengthPattern that consumes either 
0-63 chars plus NUL or 64 chars. This puts the NUL in the infoset and puts 
the onus on the user to trim the NUL when reading the infoset, and to 
supply the NUL when creating the infoset. 
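
To make the single-string model concrete, here is a minimal Python sketch. It is only an illustration: the pattern is one reading of "0-63 chars plus NUL or 64 chars", Python's re module stands in for the DFDL regular expression engine, a single-byte encoding is assumed so that characters and bytes coincide, and the names parse_field, trim_nul and supply_nul are hypothetical.

```python
import re

# Assumed pattern for the single-string model: the match INCLUDES the NUL.
LENGTH_PATTERN = re.compile(r"[^\x00]{0,63}\x00|[^\x00]{64}")

def parse_field(data: str) -> str:
    """Return the infoset value (NUL included, if any) at the start of data."""
    m = LENGTH_PATTERN.match(data)
    if m is None:
        raise ValueError("no NUL found and fewer than 64 characters available")
    return m.group(0)

def trim_nul(infoset_value: str) -> str:
    """Reader side: drop the trailing NUL when the value carries one."""
    if infoset_value.endswith("\x00"):
        return infoset_value[:-1]
    return infoset_value

def supply_nul(text: str) -> str:
    """Writer side: append the NUL unless the text fills all 64 characters."""
    if "\x00" in text or len(text) > 64:
        raise ValueError("text must be at most 64 characters with no NUL")
    return text if len(text) == 64 else text + "\x00"
```

The application would call trim_nul on each value it reads from the infoset and supply_nul on each value it writes, which is exactly the burden described above.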

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848



From:   Mike Beckerle <mbeckerle.dfdl at gmail.com>
To:     dfdl-wg at ogf.org
Date:   30/11/2012 21:02
Subject:        [DFDL-WG] Puzzle: unparsing format involving lengthKind='pattern'
Sent by:        dfdl-wg-bounces at ogf.org




I have a file format where strings have this unusual discipline for 
termination.

The string is either exactly 64 characters long, or it is 0 to 63 
characters long followed by a NUL terminator.

I can parse this like so:

    <xs:complexType name="myStringType">
      <xs:sequence>
        <xs:element name="s" type="xs:string"
          dfdl:lengthKind="pattern">
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
              <!-- 0 to 63 occurrences of not a Nul, followed by a Nul
                   (final Nul non-captured in the pattern match result),
                   OR, just 64 non Nuls -->
              <dfdl:element>
                <dfdl:property name="lengthPattern"><![CDATA[([^\x00]{0,63})(?=\x00)|[^\x00]{64}]]></dfdl:property>
              </dfdl:element>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>
        <!-- Zero-length element whose initiator is the Nul; it occurs only
             when s is shorter than 64 characters -->
        <xs:element name="term" type="xs:string"
          dfdl:lengthKind="explicit" dfdl:length="0" dfdl:initiator="%NUL;"
          dfdl:outputValueCalc="{ '' }"
          minOccurs="0" maxOccurs="1" dfdl:occursCountKind="expression"
          dfdl:occursCount="{ if (fn:string-length(../tns:s) lt 64) then 1 else 0 }" />
      </xs:sequence>
    </xs:complexType>

What I did is use a lengthKind pattern to pick off the content, excluding 
the NUL, and then model the NUL explicitly as the initiator of an empty 
element which is optionally occurring depending on the length of the 
string.
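
As a rough illustration of that parse, the following Python sketch applies the same lengthPattern and then consumes the NUL only when the string is short. It is not how a DFDL processor is implemented: Python's re module stands in for the DFDL regular expression engine, a single-byte encoding is assumed, and parse_my_string is a hypothetical helper name.

```python
import re

# The lengthPattern from the schema: the lookahead leaves the NUL unconsumed.
S_PATTERN = re.compile(r"([^\x00]{0,63})(?=\x00)|[^\x00]{64}")

def parse_my_string(data: str, pos: int = 0):
    """Return (s, term_count, next_pos) for one myStringType at data[pos:]."""
    m = S_PATTERN.match(data, pos)
    if m is None:
        raise ValueError("neither alternative of the pattern matched")
    s = m.group(0)
    pos = m.end()
    # Mirrors the occursCount expression: term occurs once when s is short.
    term_count = 1 if len(s) < 64 else 0
    if term_count:
        # term itself is zero-length; only its initiator (the NUL) is consumed.
        if data[pos:pos + 1] != "\x00":
            raise ValueError("expected NUL initiator for term")
        pos += 1
    return s, term_count, pos
```

Calling parse_my_string repeatedly with the returned next_pos walks through consecutive strings, with term present for every string shorter than 64 characters and absent otherwise.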

That will work. I'd like to hide the "term" element in a hidden group, but 
other than that it will work for parsing.

Question: How can I unparse this? 

I want the schema and DFDL processor to put the "term" element into the 
infoset by itself depending on the length of the 's' element that I will 
place into the infoset. So I put an outputValueCalc on there to assign it 
a value of empty string, but that will only happen if the occursCount 
expression causes the optional element to exist at all.

Will this work for unparsing?

The spec currently says that the occursCount expression is used only when 
parsing; when unparsing, the number of occurrences in the infoset is used. 
But I don't want to have to put this syntax-modeling element into the 
infoset from the application. I just want the application to create a 
string, and then, based on whether it is 0-63 characters long, I want the 
Nul added or not.

Other ways I've tried to model these strings include a choice of two 
different elements. That has a similar issue: I then have to assemble my 
infoset for output using a length-dependent element. I can't just put a 
string into the infoset and have the processor decide when unparsing which 
of the two choice branches is the right one. (Asserts and discriminators 
are parse-only also.)

Anyone have better ideas?

...mike

-- 
Mike Beckerle | OGF DFDL WG Co-Chair 
Tel:  781-330-0412