[DFDL-WG] Problem: simple format that is impossible to model

Steve Hanson smh at uk.ibm.com
Fri Sep 27 04:39:23 EDT 2019


As there are no initiators or terminators, and your example infoset calls 
everything 'field', I am assuming that the element looks logically like:

<xs:element name="field" type="xs:string" minOccurs="0" 
maxOccurs="unbounded" /> 

You want to preserve the position of the occurrences in the infoset so 
that they re-appear on output. The agreed way to do this is:

<xs:element name="field" type="xs:string" minOccurs="0" 
maxOccurs="unbounded" nillable="true" dfdl:nilKind="literalValue" 
dfdl:nilValue="%ES;" /> 
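
For context, here is a minimal sketch of how that declaration might sit 
inside one line of "/"-separated fields. The enclosing element name 'line' 
and the separator properties on the sequence are assumptions made for 
illustration and are not taken from your schema. Namespace declarations 
and the remaining format properties (encoding, lengthKind and so on) are 
assumed to be supplied elsewhere.

```xml
<!-- Hypothetical enclosing element: the name 'line' and the
     sequence properties below are assumptions, not the original schema. -->
<xs:element name="line">
  <xs:complexType>
    <xs:sequence dfdl:separator="/" dfdl:separatorPosition="infix">
      <!-- A zero-length occurrence matches the nil value %ES;
           so it is kept in the infoset as a nil item. -->
      <xs:element name="field" type="xs:string" minOccurs="0"
                  maxOccurs="unbounded" nillable="true"
                  dfdl:nilKind="literalValue" dfdl:nilValue="%ES;" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

Under this sketch, data//data would be expected to parse to three 'field' 
items with the middle one nil (shown as <field xsi:nil="true"/> in an XML 
rendering of the infoset), and to unparse back to data//data.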

Regards
 
Steve Hanson
IBM Hybrid Integration, Hursley, UK
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group



From:   Mike Beckerle <mbeckerle.dfdl at gmail.com>
To:     DFDL-WG <dfdl-wg at ogf.org>
Date:   26/09/2019 19:11
Subject:        Re: [DFDL-WG] Problem: simple format that is impossible to model
Sent by:        "dfdl-wg" <dfdl-wg-bounces at ogf.org>




To start the discussion on my own issue...

The problem here may be that for a string (or hexBinary), if there is no 
initiator/terminator, there is no way to distinguish EmptyRep from 
NormalRep. I.e., an empty string is a "normal" value for a string. 

Sections 9.2.3 and 9.2.4 seem to define EmptyRep and NormalRep such that 
an empty string will be an EmptyRep, not a NormalRep.

However, section 9.2.5 on zero-length says:

   "The normal representation can be a zero-length representation if the 
type is xs:string or xs:hexBinary and there is no framing."

That suggests that when there is no framing, a zero-length string is 
NormalRep, not EmptyRep, which is the opposite conclusion from what is in 
sections 9.2.3 and 9.2.4.

If this latter clarification is correct, then my format *should* work as I 
expect, because the empty string elements will be considered NormalRep and 
infoset values will be created for them.
In that case, it fails only because of a bug in Daffodil, which has not 
interpreted this correctly. 

...mikeb



Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | 
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are 
subject to the OGF Intellectual Property Policy



On Thu, Sep 26, 2019 at 1:47 PM Mike Beckerle <mbeckerle.dfdl at gmail.com> 
wrote:
I have a dead-simple little format:

    data/data/data/data
    data/data/data/data

It is lines of "/"-separated strings. All elements are optional. 

I simply want this:

   data//data

to round trip. For that to happen I need it to parse into
 
   <field>data</field><field></field><field>data</field>

That is, I require that empty field element in the middle to be created 
and put into the infoset.

I can find no way to do this. 

The strings have no initiator/terminator, so 
dfdl:emptyValueDelimiterPolicy is not relevant. All the elements are 
optional, so default values aren't relevant.

The spec states:

9.4.2.2 Simple element (xs:string or xs:hexBinary)
Required occurrence: If the element has a default value then an item is 
added to the infoset using the default value, otherwise an item is added 
to the Infoset using empty string (type xs:string) or empty hexBinary 
(type xs:hexBinary) as the value.
Optional occurrence: If dfdl:emptyValueDelimiterPolicy is not 'none' then 
an item is added to the Infoset using empty string (type xs:string) or 
empty hexBinary (type xs:hexBinary) as the value, otherwise nothing is 
added to the Infoset.

There are errata/actions to clarify the wording here, to distinguish 
dfdl:emptyValueDelimiterPolicy not being in effect (because there is no 
initiator/terminator for it to use) from the property in isolation just 
being 'none'. 
But that doesn't change anything about this issue.

If this very simple format is not possible, then we need a new property, 
or a new enum value for an existing property, that makes it possible. 

Thoughts?

