[DFDL-WG] Problem: simple format that is impossible to model

Mike Beckerle mbeckerle.dfdl at gmail.com
Thu Sep 26 14:10:23 EDT 2019


To start the discussion on my own issue...

The problem here may be that for a string (or hexBinary), if there is no
initiator/terminator, there is no way to distinguish EmptyRep from
NormalRep. I.e., an empty string is a "normal" value for a string.

Sections 9.2.3 and 9.2.4 seem to define EmptyRep and NormalRep such that an
empty string will be an EmptyRep, not a NormalRep.

However section 9.2.5 on zero-length says:

   "The normal representation can be a zero-length representation if the
type is xs:string or xs:hexBinary and there is no framing."

That suggests that when there is no framing, a zero-length string is
NormalRep, not EmptyRep, which is the opposite conclusion from what is in
sections 9.2.3 and 9.2.4.

If the section 9.2.5 statement is the correct reading, then my format
*should* work as I expect, because the empty string elements will be
considered NormalRep and infoset items will be created for them.
In that case it fails today only because of a bug in Daffodil, which has not
interpreted this correctly.
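
To make the two readings concrete, here is a toy sketch in Python. It is not
DFDL or Daffodil code and does not model the spec's full rules; the function
names and the zero_length_is_normal_rep flag are hypothetical, invented only
for this illustration. It shows why the round trip of "data//data" depends on
which reading is applied to a zero-length, unframed string field.

```python
from typing import List

SEPARATOR = "/"  # the field separator used by the format in this thread


def parse_line(line: str, zero_length_is_normal_rep: bool) -> List[str]:
    """Split one line into the field values that end up in the 'infoset'.

    zero_length_is_normal_rep=True models the section 9.2.5 reading: a
    zero-length string with no framing is a normal representation, so an
    empty-string item is added.
    zero_length_is_normal_rep=False models the other reading: the field is
    an empty representation of an optional element, so nothing is added.
    """
    fields = []
    for raw in line.split(SEPARATOR):
        if raw == "" and not zero_length_is_normal_rep:
            continue  # optional element, nothing added to the infoset
        fields.append(raw)
    return fields


def unparse_line(fields: List[str]) -> str:
    """Write the fields back out, separated by SEPARATOR."""
    return SEPARATOR.join(fields)


original = "data//data"
kept = parse_line(original, zero_length_is_normal_rep=True)
dropped = parse_line(original, zero_length_is_normal_rep=False)

print(kept, unparse_line(kept) == original)
print(dropped, unparse_line(dropped) == original)
```

Under the first reading the middle empty field survives and the line round
trips; under the second it is dropped and unparsing yields "data/data".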

...mikeb



Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
<http://www.ogf.org/About/abt_policies.php>



On Thu, Sep 26, 2019 at 1:47 PM Mike Beckerle <mbeckerle.dfdl at gmail.com>
wrote:

> I have a dead-simple little format:
>
>     data/data/data/data
>     data/data/data/data
>
> it is lines of "/" separated strings. All elements are optional.
>
> I simply want this:
>
>    data//data
>
> to round trip. For that to happen I need it to parse into
>
>    <field>data</field><field></field><field>data</field>
>
> That is, I require that empty field element in the middle to be created
> and put into the infoset.
>
> I can find no way to do this.
>
> The strings have no initiator/terminator, so
> dfdl:emptyValueDelimiterPolicy is not relevant. All the elements are
> optional, so default values aren't relevant.
>
> The spec states:
>
> 9.4.2.2 Simple element (xs:string or xs:hexBinary)
>
> Required occurrence: If the element has a default value then an item is
> added to the infoset using the default value, otherwise an item is added to
> the Infoset using empty string (type xs:string) or empty hexBinary (type
> xs:hexBinary) as the value.
>
> Optional occurrence: If dfdl:emptyValueDelimiterPolicy is not 'none' then
> an item is added to the Infoset using empty string (type xs:string) or
> empty hexBinary (type xs:hexBinary) as the value, *otherwise nothing is
> added to the Infoset*.
>
> There are errata/actions to clarify the wording here, distinguishing
> dfdl:emptyValueDelimiterPolicy not being in effect (because there is no
> initiator/terminator for it to use) from the property simply being set to
> 'none'. But that doesn't change anything about this issue.
>
> If this very simple format is not possible, then we need a property or new
> property enum value that makes it possible.
>
> Thoughts?