[DFDL-WG] Clarification discussion points for next call(s) - DFDL Spec issues around nil, empty, normal, absent, defaulting, and separator suppression.

Mike Beckerle mbeckerle.dfdl at gmail.com
Thu Aug 2 14:33:31 EDT 2018


Before the next call, I'll prune these to only the ones that are unresolved
as yet in my mind, or where I think we really need to improve the spec.

The responses already provided address some of these; some of these points
are related to those threads, and others are redundant with them.

...mikeb

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
<http://www.ogf.org/About/abt_policies.php>


On Thu, Aug 2, 2018 at 12:33 PM, Steve Hanson <smh at uk.ibm.com> wrote:

> Mike
>
> I'm not sure if any of the other threads have answered any of these
> questions, I will not have time to look at this before the next call.
>
> I will say though that an assertion failure is just another way of getting
> a processing error. I think that's why it's not called out explicitly in
> 9.2.1 to 9.2.4.
>
> IBM DFDL has not implemented stopValue yet, but there was a lot of
> discussion whether stopValue was a point of uncertainty during action 140.
> Suggest reading the original action 140 document, which is on Redmine.
>
> Regards
>
> Steve Hanson
>
> IBM Hybrid Integration, Hursley, UK
> Architect, *IBM DFDL*
> <http://www.ibm.com/developerworks/library/se-dfdl/index.html>
> Co-Chair, *OGF DFDL Working Group* <http://www.ogf.org/dfdl/>
> *smh at uk.ibm.com* <smh at uk.ibm.com>
> tel:+44-1962-815848
> mob:+44-7717-378890
> Note: I work Tuesday to Friday
>
>
>
> From:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
> To:        dfdl-wg at ogf.org
> Date:        20/07/2018 20:51
> Subject:        [DFDL-WG] Clarification discussion points for next
> call(s) - DFDL Spec issues around nil, empty, normal, absent, defaulting,
> and separator suppression.
> Sent by:        "dfdl-wg" <dfdl-wg-bounces at ogf.org>
> ------------------------------
>
>
>
> I'm trying to tighten up my understanding of the section 9 material on
> establishing representation, known-to-exist and known-not-to-exist, and
> defaulting, particularly as they interact with separators and separator
> suppression for "absent" representations.
>
> Section 9.2.5
>
> This sentence
>
>      The nil representation can be a zero-length representation if
> dfdl:nilValue is "%ES;", and there is no framing or framing is suppressed
> by dfdl:nilValueDelimiterPolicy.
>
> Should say
>
>       ".... if dfdl:nilValue is a list containing either %ES; alone, or
> %WSP*; alone, and ...."
>
> This matches existing erratum 5.32. It's just another place that needs the
> same update.
>
> Sections 9.2.1 to 9.2.4 do not say what happens if an assertion failure
> occurs while trying to establish the representation. In particular, can an
> assertion failure cause establishing of nil, empty, or normal
> representation to fail, resulting in absent representation? For example, if
> an assert requires the length of the value to be greater than 1 character,
> then a zero-length string cannot be the normal representation.
>
> But do we parse for nil representation and run assertions; if they fail,
> parse for empty representation and re-run assertions; if they fail, parse
> for normal representation and re-run assertions; and if they fail again and
> the rep was "trivially ZL", then is it absent?
>
> Section 9.2.3 doesn't say what happens if a type conversion error happens
> when trying to establish Normal representation. E.g., if a non-defaultable
> text integer is parsed from a zero-length string.
>
> A section 9.2.5 should be added describing "No representation" - Computed
> elements (dfdl:inputValueCalc) have no representation at all. Not zero
> length, not anything. They have no implications for parsing of, or
> unparsing of, delimiters of any kind.
>
> Section 9.3
>
> Section 9.3.1.1 says that when establishing known-to-exist, no processing
> error can occur.
>
> For example, an element can have zero-length representation if it is
> nillable and nilValue=%ES; and there are no delimiters nor framing. But an
> assert can fail for this by specifically excluding dfdl:valueLength(.) eq
> 0.  Or an element of type string can have zero length, but an assert can
> insist the string contain 1 or more characters.
>
> Section 9.3.1.3 says known not to exist if the occurrence is Missing, which
> means absent representation is one way. So can an assert that causes nil,
> empty, and normal representation to fail (assuming asserts are evaluated
> and contribute to that decision about representation) cause the
> occurrence to be absent; hence, missing?  It also says a processing error
> when parsing the component means it is known not to exist.
>
>
> Section 9.3.2 Establishing Representation
>
> Section 9.3.2.1 Simple Element
> (1) already has an erratum allowing WSP* alone as well as ES.
> Does not say whether (2) empty representation - qualified by "empty
> representation must be able to be of zero-length".
> (3) normal representation - does not say if asserts can fail and prevent
> this zero-length from being acceptable. (Actually this point about
> assertions applies to all of 1, 2, and 3.)
>
> Section 9.3.2.2 Complex Element
> If a complex type element is lengthKind 'delimited' do we still have to
> recurse the type before deciding if it is trivially zero length, or can we
> look at the data stream without recursing in? Section 9.3.2 5th bullet says
> that we can check whether a delimiter is immediately encountered, and does
> not say this is for simple types only.
>
> A complex type element can have empty representation if the content region
> is empty and the delimiters with EVDP specify what is found in the data
> stream.
> Absent representation for a complex element can only occur if the
> representation is zero length after recursing through the type tree. This
> implies that if EVDP indicates delimiters for empty value, then ZL means
> absent.
>
> Section 9.3.3
>
> stopValue seems like it has a point of uncertainty for every occurrence.
> The fact that a stop value must exist doesn't mean there are no points of
> uncertainty. It is uncertain if the logical value will be the stop value or
> not.
>
> But another way of thinking about it is that the stopValue parser does not
> need to establish points for backtracking. All elements MUST succeed until
> it parses a successful stop value, and if any failure occurs we backtrack
> the entire array, not just an element.
>
> Use of stopValue with type xs:string creates lots of ambiguities. E.g., ZL
> can be a valid normal representation, but the stop value may be "stop",
> i.e., non-ZL. In that case, since all elements are optional according to
> minOccurs 0, then when a ZL is parsed is the optional element suppressed?
> Or not - meaning you get an array full of empty string "normal" values?
>
> Section 9.4.2
>
> Says a complex type must have descended into the type and returned with no
> processing error, but does not say whether processing errors signaled by
> asserts on simple type elements also disable empty representation from
> being established.
>
> Section 9.4.2.2
>
> Says if EVDP is not none, can an assert insist on something that subverts
> establishment of empty representation, such as that the length is > 0?
> Or the assert can test something orthogonal - entirely unrelated like some
> variable is set to a certain value?
>
> E.g., <defineVariable name="disallowEmptyValues" type="xs:boolean".../>
> Then an assert on the element says { if ($disallowEmptyValues) then false
> else true }.
>
> For optional occurrence, if EVDP is not none, then empty representation is
> established by the presence of some positive syntax - the representation is
> not ZL. However, this says an empty string or empty hexBinary becomes the
> value, not the default value of the element. Is that correct? I would think
> having positive syntax that matches the empty representation would satisfy
> known-to-exist, and then that would trigger assigning the default value.
> However, I suppose the rationale is that such an empty value for an
> optional element means there will be no defaulting, hence, the empty
> representation corresponds to empty string or empty hexBinary as "normal"
> representation.
>
> The suggestion to use an assert that checks a non-zero minLength facet
> only makes sense if the processing error will cause the end of the array
> (occursCountKind parsed or implicit). If the OCK is such that there is no
> point of uncertainty, then this processing error would cause the whole
> array-element to fail. That is, the assert suggested here really does too
> much to be used only to filter empty strings/hexBinary from going into the
> infoset.
>
> Or, for a delimited simpleType value where the text is ZL but the type
> conversion fails or an assert fails, does that failure create Absent
> representation, resulting in no empty string going into the infoset?
>
> I suspect this issue is tied up with separator suppression policy and when
> a ZL thing is suppressed, the separator absorbed, and nothing goes into the
> infoset.
>
> 9.4.2.3 - suggests that processors must keep track of the "all empty flag"
> for every infoset node and recursively all child nodes.
>
> This section should say that a complex type has empty representation if it
> is known to exist, and the position in the data doesn't change after a
> recursive traversal.
> But this last paragraph of the section contradicts what is said in the 2nd
> sentence of the section (maybe). The point the example is making, the
> principle it is illustrating, does not seem to be explicitly stated.
>
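
As a rough illustration of the sequencing question raised in the quoted
message for sections 9.2.1 to 9.2.4, here is a minimal Python sketch. It
assumes (the spec text discussed above does not confirm this) that assertions
are evaluated after each candidate representation and that a failure moves on
to the next candidate. The names establish_representation, nil_values, and
longer_than_one are hypothetical, %ES; is modelled simply as the empty
string, and delimiters and framing are ignored entirely.

```python
from typing import Callable, List, Optional

# Hypothetical type: an assertion receives the candidate value
# (None for a nil candidate) and returns True when it passes.
Assertion = Callable[[Optional[str]], bool]


def establish_representation(raw: str, nil_values: List[str],
                             assertion: Assertion) -> str:
    """Return 'nil', 'empty', 'normal', 'absent', or 'error'.

    Models only the ordering asked about in the thread: try nil, then
    empty, then normal, re-running the assertion on each candidate.
    """
    candidates = []
    if raw in nil_values:            # e.g. nilValue "%ES;" modelled as ""
        candidates.append(("nil", None))
    if raw == "":                    # zero-length content
        candidates.append(("empty", ""))
    candidates.append(("normal", raw))

    for name, value in candidates:
        if assertion(value):
            return name

    # Every candidate failed its assertion. Under the assumption being
    # illustrated, a trivially zero-length occurrence becomes absent;
    # anything else is treated as a processing error.
    return "absent" if raw == "" else "error"


def longer_than_one(value: Optional[str]) -> bool:
    """The example assert: value must be longer than 1 character."""
    return value is not None and len(value) > 1
```

Under these assumptions, a zero-length occurrence with nil_values [""] and
the longer_than_one assert falls through nil, empty, and normal and ends up
absent, which is exactly the behaviour the question asks the spec to confirm
or rule out.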