[DFDL-WG] clarification: on suppressed ZL string/hexBinary - do we keep variable assignments?

Mike Beckerle mbeckerle.dfdl at gmail.com
Wed Aug 1 15:53:23 EDT 2018


I spoke too soon.

I'm good with this:

   - if EVDP is 'none', then empty strings have no representation syntax at
   all. So we don't create optional elements for them, period.


This statement I made is not correct.


   - If EVDP is not 'none', then empty string requires some syntax to be
   there, and based on this syntax appearing an empty string value is added to
   the infoset for the optional element.


Suppose EVDP is not 'none', but initiator and terminator are both "". Then
the empty string does *not* require syntax to be there. So the above
statement is simply wrong.

When EVDP is 'both' but neither initiator nor terminator is defined, then
since EVDP is not 'none', a zero-length string would still cause an
optional element to be added to the infoset. In a separated sequence, this
would mean one could control how many empty strings go into optional values
by providing more separators.

So "a,b,,,,,," would add 2 non-empty and 6 empty strings to the infoset,
regardless of whether they are required or optional.

If the element is named 'x', has minOccurs "3",  default="c" maxOccurs="12"
and occursCountKind 'implicit', then the first empty would trigger
defaulting, and the data would be

     <x>a</x><x>b</x><x>c</x><x/><x/><x/><x/><x/>

So we would get defaulting at the required index locations, but creation of
elements with empty string values at the optional index locations.
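
As a rough illustration only, the example above can be modelled in a few
lines of Python. This is a sketch of the behavior described in this message,
not of any DFDL processor; the function names parse_occurrences and to_xml
are hypothetical, and it assumes EVDP is not 'none' with initiator and
terminator both "", so every zero-length field counts as an empty
representation.

```python
def parse_occurrences(data, separator=",", min_occurs=3, max_occurs=12, default="c"):
    """Model the separated-sequence example for an xs:string element.

    Assumption: EVDP is not 'none' and initiator/terminator are both "",
    so a zero-length field is treated as an empty representation.
    """
    values = []
    for index, field in enumerate(data.split(separator)[:max_occurs]):
        required = index < min_occurs
        if field == "" and required and default is not None:
            # Required index location: the empty representation triggers defaulting.
            values.append(default)
        else:
            # Non-empty value, or an empty string kept as-is
            # (optional location, or required location with no default).
            values.append(field)
    return values


def to_xml(name, values):
    """Render the values as a flat run of elements, using <name/> for empty strings."""
    return "".join(f"<{name}>{v}</{name}>" if v else f"<{name}/>" for v in values)


infoset = parse_occurrences("a,b,,,,,,")
print(to_xml("x", infoset))
```

Seven separators give eight fields, so the third (required) position picks up
the default and the remaining five optional positions become empty strings.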

This allows us to construct an XSD-invalid document. I suppose that is no
big deal; there are many ways to construct data by parsing with DFDL where
the data proves to be invalid per XSD rules.

So we really have 3 cases:

   1. EVDP is 'none'
   2. EVDP not 'none' but initiator/terminator such that empty
   representation has no syntax
   3. EVDP not 'none' but initiator/terminator such that empty
   representation DOES have syntax.

Case 1 = optional elements never populated from ZL strings
Case 2 = optional elements are always populated from ZL strings
Case 3 = optional elements are populated from the non-ZL empty
representation, optional elements are not populated from ZL strings.
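
The three cases can be written down as a small decision function. This is a
sketch under assumptions, not text from the spec: the names
empty_rep_has_syntax, classify and optional_populated_from_zl are
hypothetical, and it assumes the empty representation "has syntax" exactly
when EVDP selects a delimiter that is defined as non-empty. It does not model
any schema definition errors a real processor might raise for such
combinations.

```python
def empty_rep_has_syntax(evdp, initiator="", terminator=""):
    """True if the empty representation would consume at least one delimiter."""
    if evdp == "none":
        return False
    uses_initiator = evdp in ("initiator", "both") and initiator != ""
    uses_terminator = evdp in ("terminator", "both") and terminator != ""
    return uses_initiator or uses_terminator


def classify(evdp, initiator="", terminator=""):
    """Return 1, 2 or 3 for the three cases listed above."""
    if evdp == "none":
        return 1
    return 3 if empty_rep_has_syntax(evdp, initiator, terminator) else 2


def optional_populated_from_zl(evdp, initiator="", terminator=""):
    """Is an optional string/hexBinary element added for a zero-length parse?

    Only Case 2 populates from a ZL string: Case 1 never does, and Case 3
    populates only from the non-ZL empty representation.
    """
    return classify(evdp, initiator, terminator) == 2


for args in [("none",), ("both",), ("both", "[", "]")]:
    print(args, classify(*args), optional_populated_from_zl(*args))
```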

Daffodil has heretofore been collapsing Cases 1 and 2 together, giving both
the behavior of Case 1.

Does IBM DFDL distinguish all 3 cases?


Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
<http://www.ogf.org/About/abt_policies.php>


On Wed, Aug 1, 2018 at 2:33 PM, Mike Beckerle <mbeckerle.dfdl at gmail.com>
wrote:

> ok, so here's the way I understand this now.
>
> if EVDP is 'none', then empty strings have no representation syntax at
> all. So we don't create optional elements for them period.
>
> If EVDP is not 'none', then empty string requires some syntax to be there,
> and based on this syntax appearing an empty string value is added to the
> infoset for the optional element.
>
> Seems simple in retrospect....
>
>
> On Wed, Aug 1, 2018 at 11:07 AM, Steve Hanson <smh at uk.ibm.com> wrote:
>
>> 9.4.2.2        Simple element (xs:string or xs:hexBinary)
>>
>> Required occurrence: If the element has a default value then an item is
>> added to the infoset using the default value, otherwise an item is added to
>> the Infoset using empty string (type xs:string) or empty hexBinary (type
>> xs:hexBinary) as the value.
>>
>> Optional occurrence: If dfdl:emptyValueDelimiterPolicy is not 'none'
>> then an item is added to the Infoset using empty string (type xs:string) or
>> empty hexBinary (type xs:hexBinary) as the value, otherwise nothing is
>> added to the Infoset.
>>
>> Note: To prevent unwanted empty strings or empty hexBinary values from
>> being added to the Infoset, use XSD minLength > '0' and a dfdl:assert that
>> uses the dfdl:checkConstraints() function, to raise a processing error.
>>
>> 9.2.2        Empty Representation
>>
>> An element occurrence has an empty representation if the occurrence does
>> not have a nil representation and it conforms to the grammar for
>> SimpleEmptyElementRep or ComplexEmptyElementRep. Specifically, the
>> EmptyElementInitiator and EmptyElementTerminator regions must be
>> conformant with dfdl:emptyValueDelimiterPolicy and the occurrence's content
>> in the data stream is of length zero. (If non-conformant it is not a
>> processing error and the representation is not empty).
>> LeadingAlignment, TrailingAlignment, PrefixLength regions may be present.
>>
>>
>> Regards
>>
>> Steve Hanson
>>
>> IBM Hybrid Integration, Hursley, UK
>> Architect, IBM DFDL
>> <http://www.ibm.com/developerworks/library/se-dfdl/index.html>
>> Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
>> smh at uk.ibm.com
>> tel:+44-1962-815848
>> mob:+44-7717-378890
>> Note: I work Tuesday to Friday
>>
>>
>>
>> From:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
>> To:        Steve Hanson <smh at uk.ibm.com>
>> Cc:        dfdl-wg at ogf.org
>> Date:        01/08/2018 15:13
>> Subject:        Re: [DFDL-WG] clarification: on suppressed ZL
>> string/hexBinary - do we keep variable assignments?
>> ------------------------------
>>
>>
>>
>> To follow up then,
>>
>> I have assumed that dfdl:emptyValueDelimiterPolicy isn't even examined
>> unless the element has default="..." i.e., a non-zero-length default value
>> specified. I.e., without a default value, there is no concept of emptiness
>> as distinct from "normal" representation.
>>
>> Are you suggesting that it is also used to control when an empty string
>> (or empty hexBinary) is accepted as a normal representation value for an
>> optional element, vs. treated as a missing value? That's a reasonable
>> interpretation that I would support, but I don't know that the spec says
>> that anywhere, so we need to add a sentence. (Unless I'm missing where this
>> is stated.)
>>
>> I have also thought that dfdl:emptyValueDelimiterPolicy must be combined
>> with dfdl:initiator and dfdl:terminator. If the combination of these is
>> such that the empty representation is zero-length, that is what creates the
>> situation of interest here, where it is ambiguous whether the value is the
>> official empty representation or is the normal representation that just so
>> happens to be of zero length.  That is, there's no special significance to
>> the 'none' EVDP property value.
>>
>> For example, if dfdl:emptyValueDelimiterPolicy is 'both', but
>> dfdl:initiator="" and dfdl:terminator="", then that's just as good as
>> dfdl:emptyValueDelimiterPolicy='none' in terms of whatever effect this
>> has on a decision about normal vs. missing.
>>
>> Does this match your understanding?
>>
>>
>> On Wed, Aug 1, 2018 at 9:26 AM, Steve Hanson <smh at uk.ibm.com> wrote:
>> Whether to add a zero-length string or hexBinary to the infoset for an
>> optional element depends on the setting of emptyValueDelimiterPolicy. A
>> setting of 'none' stops it from being added.
>>
>> Regardless, it does not give a processing error, so the occurrence is
>> known-to-exist and does not cause backtracking. Discriminators and
>> variable assignments are therefore preserved.
>>
>> Regards
>>
>> Steve Hanson
>>
>>
>>
>> From:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
>> To:        dfdl-wg at ogf.org
>> Date:        24/07/2018 15:15
>> Subject:        [DFDL-WG] clarification: on suppressed ZL
>> string/hexBinary - do we keep variable assignments?
>> Sent by:        "dfdl-wg" <dfdl-wg-bounces at ogf.org>
>> ------------------------------
>>
>>
>>
>>
>> In some situations we parse and get a successful zero-length parse for a
>> string or hexBinary.
>>
>> But because the occurrence is optional, we do NOT add an element to the
>> infoset.
>>
>> In that case, what happens to side-effects that occurred during the
>> successful parse? There are two possible kinds of side-effects: variables
>> can be set, and a discriminator can be set to true.
>>
>> It seems to me that if a discriminator is set, then that *must* be
>> preserved, and in that case it would seem the variable settings should be
>> retained as well.
>>
>> Comments?
>>
>>
>> Unless stated otherwise above:
>> IBM United Kingdom Limited - Registered in England and Wales with number
>> 741598.
>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>>
>>
>