[DFDL-WG] Fw: Fw: Action 248 (was Thoughts on a discriminator scenario)

Mike Beckerle mbeckerle.dfdl at gmail.com
Tue Nov 25 12:37:45 EST 2014


As mentioned on the call, this is one way of dealing with the situation
where an array has 'implicit' OCK and minOccurs >= 1, but also needs a
discriminator for the optional elements.

I suggested on the call that this baggage be in a hidden group, but as
there are no elements involved, I think a hidden group is not advisable
here.

<xs:element name="a" dfdl:occursCountKind="implicit"
            minOccurs="1" maxOccurs="unbounded">
  <xs:complexType>
    <xs:sequence>
      <!-- This choice is DFDL's way of expressing this logic: -->
      <!-- IF the occursIndex is for the optional part of the array -->
      <!-- THEN evaluate the array-element discriminator -->
      <!-- ELSE don't evaluate discriminator. -->
      <xs:choice>
        <xs:sequence>
          <xs:annotation><xs:appinfo ...>
            <!-- IF occursIndex gt 1.... -->
            <dfdl:discriminator>{ dfdl:occursIndex() gt 1 }</dfdl:discriminator>
            <!-- THEN discriminate the optional array elements -->
            <dfdl:discriminator>{ ....optional array element discriminator... }</dfdl:discriminator>
          </xs:appinfo></xs:annotation>
        </xs:sequence>
        <xs:sequence>
          <!-- ELSE this is the occursIndex eq 1 case, we have no discriminator -->
          <!-- for the array element, since it is required. -->
        </xs:sequence>
      </xs:choice>
      .... array content goes here...
    </xs:sequence>
  </xs:complexType>
</xs:element>
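
To make the IF/THEN/ELSE in the comments concrete, here is a rough Python model of what the choice is meant to achieve. It is only a sketch: the function name, the `element_test` callback and the `min_occurs` parameter are made up for illustration (the schema above hard-codes 1), and it says nothing about which point of uncertainty each discriminator ends up resolving.

```python
from typing import Callable, Optional


def array_element_discriminator(occurs_index: int,
                                element_test: Callable[[], bool],
                                min_occurs: int = 1) -> Optional[bool]:
    """Model the two-branch choice for one array occurrence.

    Returns the result of the array-element discriminator when the
    occurrence is in the optional part of the array, or None when the
    occurrence is required and no discriminator is evaluated.
    """
    if occurs_index > min_occurs:
        # First branch: the occursIndex guard holds, so evaluate the
        # array-element discriminator (hypothetical callback).
        return element_test()
    # Second, empty branch: required occurrence, nothing is evaluated.
    return None


calls = []


def record_and_fail() -> bool:
    """Hypothetical discriminator that records each evaluation and fails."""
    calls.append("evaluated")
    return False


print(array_element_discriminator(1, record_and_fail))  # required occurrence
print(array_element_discriminator(2, record_and_fail))  # optional occurrence
```

The point of the model is that the callback is never invoked for the required occurrence, which is what the empty second branch is for.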


Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
<http://www.ogf.org/About/abt_policies.php>


On Tue, Nov 25, 2014 at 10:50 AM, Steve Hanson <smh at uk.ibm.com> wrote:

> I think some of your wording changes have changed my intent, which was
> that all arrays are potential PoUs.  The table now says that fixed,
> expression and stopValue are not potential PoUs, which implies that the
> discriminator never acts on the array but always on a higher PoU.  I was
> trying to avoid this, because it means that changing OCK can change the
> behaviour of the schema.  But I guess it's no different to changing the
> array to a scalar, which would have the same effect.
>
> Regarding the failure of the discriminator: the intent was that it should
> behave just like any assert failure or processing error. But I think your
> point is then right - it means that the phrase 'a discriminator only ever
> resolves that point of uncertainty' should actually be 'a discriminator
> only ever positively resolves that point of uncertainty' - which is
> asymmetric behaviour. Are we comfortable with that?
>
> Regards
>
> Steve Hanson
> Architect, *IBM DFDL*
> <http://www.ibm.com/developerworks/library/se-dfdl/index.html>
> Co-Chair, *OGF DFDL Working Group* <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> *smh at uk.ibm.com* <smh at uk.ibm.com>
> tel:+44-1962-815848
>
>
>
> From:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
> To:        Steve Hanson/UK/IBM at IBMGB
> Cc:        DFDL-WG <dfdl-wg at ogf.org>
> Date:        25/11/2014 15:35
> Subject:        Re: [DFDL-WG] Fw: Fw: Action 248 (was Thoughts on a
> discriminator scenario)
> ------------------------------
>
>
>
> My suggested additional wording in Red below. There is an issue with this
> where it was unclear to me whether we've defined exactly what happens.
>
> If you have say, an array with occursCountKind 'implicit', minOccurs '1',
> and the discriminator on the element evaluates to false for that required
> first element, what happens? Do we fail the whole array? This sounds
> contradictory to the notion that the discriminator "only resolves that
> element". But having the discriminator be ignored doesn't seem right either.
>
> ...mike
>
>
>
> On Mon, Nov 17, 2014 at 8:07 AM, Steve Hanson <*smh at uk.ibm.com*
> <smh at uk.ibm.com>> wrote:
> This action was raised because of concern with the behaviour of the
> discriminator in the following example.  Because OCK is 'implicit', the 1st
> occurrence is not an actual PoU but the other 9 occurrences are. This means
> that for the 1st occurrence, the discriminator actually acts on a higher PoU
> if one exists.
> <xs:element name="Type1" maxOccurs="10"
>             dfdl:occursCountKind="implicit">
>   <dfdl:discriminator test="{fn:exists(A)}" />
>   <xs:complexType>
>     <xs:sequence>
>       <xs:element name="A" dfdl:initiator="A:" ... />
>       <xs:element name="B" dfdl:initiator="B:" ... />
>       <xs:element name="C" dfdl:initiator="C:" ... />
>     </xs:sequence>
>   </xs:complexType>
> </xs:element>
>
> This led to the suggestion that a discriminator should not 'leak' beyond a
> potential PoU, regardless of whether it is an actual PoU. The argument for
> this is contained in the thread below, and on re-reading I still think it
> is the best solution to this, so that is what I propose.
>
> There were also issues about the wording in section 9.3.3.
>
> Sections 9.3.3 and 7.4 are reproduced below, and updated to address the
> wording and leaking issues.
>
> -------------------------------------------------
>
> 9.3.3        *Points of Uncertainty*
>
> A *point of uncertainty* occurs when parsing a schema component when an
> occurrence of that schema component might not be the next item encountered
> in the data stream. Points of uncertainty can be nested.
>
> Any one of the following schema constructs is a *potential* point of
> uncertainty:
>
> ·        A branch of xs:choice
>
> ·        All xs:elements in an unordered xs:sequence (dfdl:sequenceKind
> is 'unordered')
>
> ·        An optional xs:element
>
> ·        An array xs:element.
>
> ·        All xs:elements in an xs:sequence containing one or more
> floating xs:elements.
>
> The parser resolves these points of uncertainty by way of a set of
> construct-specific rules given below along with determining whether schema
> components are known-to-exist or known-not-to-exist. For some of these
> constructs, there are situations where while there is the potential for
> uncertainty, the circumstances are such that there isn't any actual
> uncertainty; hence, *potential* points of uncertainty are distinguished
> from *actual* points of uncertainty below.
>
> A branch of xs:choice is always an actual point of uncertainty. A choice is
> resolved sequentially, or by direct dispatch. Sequential choice resolution
> occurs by parsing each choice branch in schema definition order until one
> is known-to-exist. It is a processing error if none of the choice branches
> are known-to-exist. Direct-dispatch choice resolution occurs by matching
> the value of the dfdl:choiceDispatchKey property to the value of the
> dfdl:choiceBranchKey property of one of the choice branches. It is a
> processing error if none of the choice branches have a matching value in
> their dfdl:choiceBranchKey property.
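
The two resolution strategies in the paragraph above can be sketched roughly as follows. This is a simplified model, not parser code: `ProcessingError`, the branch predicates and the one-key-per-branch dictionary are assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple


class ProcessingError(Exception):
    """Stand-in for a DFDL processing error (name is illustrative)."""


def resolve_sequential(branches: List[Tuple[str, Callable[[str], bool]]],
                       data: str) -> str:
    """Try each branch in schema definition order until one is known-to-exist."""
    for name, known_to_exist in branches:
        if known_to_exist(data):
            return name
    raise ProcessingError("no choice branch is known-to-exist")


def resolve_direct_dispatch(dispatch_key: str,
                            branch_keys: Dict[str, str]) -> str:
    """Pick the branch whose branch key equals the dispatch key value."""
    for name, branch_key in branch_keys.items():
        if branch_key == dispatch_key:
            return name
    raise ProcessingError("no choice branch has a matching branch key")


# Hypothetical branches that recognise themselves by an initiator.
branches = [
    ("A", lambda data: data.startswith("A:")),
    ("B", lambda data: data.startswith("B:")),
]
print(resolve_sequential(branches, "B:hello"))
print(resolve_direct_dispatch("2", {"first": "1", "second": "2"}))
```

In the sequential case every branch before the matching one is attempted and abandoned, which is why each branch counts as a point of uncertainty; in the dispatch case no branch is attempted speculatively.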
>
> An element in an unordered xs:sequence is always an actual point of
> uncertainty. It is resolved by parsing for the child components of the
> sequence in schema definition order at each point in the data stream where
> a component can exist until the required number of occurrences of each
> child component is known-to-exist or the sequence is terminated by
> delimiters or specified length.
>
> An element in a sequence with one or more floating elements is always an
> actual point of uncertainty. It is resolved by parsing for the expected
> element at that point in the data stream. If the expected element is
> known-not-to-exist then an occurrence of each floating element is parsed in
> schema definition order.
>
> When parsing an array, points of uncertainty only occur for certain values
> of occursCountKind, as follows:
>
> occursCountKind   Details of Potential and Actual Points of Uncertainty
> ---------------   -----------------------------------------------------
> fixed             No potential point of uncertainty (maxOccurs
>                   occurrences expected).
> implicit          All occurrences are potential points of uncertainty.
>                   An actual point of uncertainty exists after minOccurs
>                   occurrences found and until maxOccurs occurrences
>                   have been found.
> parsed            All occurrences are actual points of uncertainty.
> expression        No potential point of uncertainty (dfdl:occursCount
>                   occurrences expected).
> stopValue         No potential point of uncertainty (the stopValue must
>                   always be present, even when minOccurs is 0).
>
> *Table 11: Points of Uncertainty and dfdl:occursCountKind*
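
As a cross-check of the table as worded in this draft, the rows can be written as a small lookup. This is a sketch only: the function name and the use of `None` for an unbounded maxOccurs are assumptions, and occurrence indexes are taken to be 1-based as with dfdl:occursIndex().

```python
from typing import Optional, Tuple


def pou_status(ock: str, index: int, min_occurs: int,
               max_occurs: Optional[int]) -> Tuple[bool, bool]:
    """Return (potential, actual) for one occurrence of an array element.

    index is the 1-based occurrence number; max_occurs=None models
    maxOccurs="unbounded".
    """
    if ock in ("fixed", "expression", "stopValue"):
        # Draft table: no potential point of uncertainty at all.
        return (False, False)
    if ock == "parsed":
        # Every occurrence is an actual point of uncertainty.
        return (True, True)
    if ock == "implicit":
        # Actual uncertainty starts after minOccurs occurrences and
        # lasts until maxOccurs occurrences have been found.
        within_max = max_occurs is None or index <= max_occurs
        return (True, index > min_occurs and within_max)
    raise ValueError("unknown occursCountKind: " + ock)


# The Type1 example: OCK 'implicit', minOccurs 1 (the default), maxOccurs 10.
for occurrence in (1, 2, 10):
    print(occurrence, pou_status("implicit", occurrence, 1, 10))
```

For the Type1 example this gives no actual uncertainty for occurrence 1 and actual uncertainty for occurrences 2 to 10, matching "the 1st occurrence is not an actual PoU but the other 9 occurrences are".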
>
> An optional element point of uncertainty is resolved by parsing the
> element until it is either known-to-exist or known-not-to-exist. Whether an
> optional element is an actual point of uncertainty depends on property
> dfdl:occursCountKind as described above. (Property dfdl:occursCountKind is
> defined in Section 16.1 dfdl:occursCountKind property.)
>
> For an array element, the point of uncertainty is resolved for each
> occurrence separately by parsing the occurrence until it is either
> known-to-exist or known-not-to-exist.
>
> Discriminators resolve potential points of uncertainty. A discriminator
> defined on, or contained by, a schema construct that is a potential point
> of uncertainty, will only ever resolve that point of uncertainty. This
> holds regardless of whether there is any actual uncertainty.
> For example, if a discriminator is defined on an array element which is
> contained within the branch of a choice, the discriminator will only
> resolve the existence of occurrences of the array element, and never the
> existence of the occurrence of the choice branch. As another example,
> consider an array element with dfdl:occursCountKind 'implicit' and
> minOccurs '1'. The first element of such an array must exist, so there is
> no actual uncertainty. A discriminator on such an element is redundant, but
> often must be expressed so as to discriminate the existence of the second
> and any subsequent array elements. If a discriminator evaluates to
> 'false' or causes a processing error on a potential point of uncertainty
> where there is no actual uncertainty, ..... TBD
>
>
> (I think this causes a processing error which will fail the whole
> array.....but that sounds like it contradicts the statement above that says
> "it only ever resolves that point of uncertainty" ?)
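
The scoping part of this rule (which construct a discriminator acts on) can be modelled on its own, leaving the open 'false' case aside. The sketch below assumes a simple list of enclosing constructs, outermost first; the `Scope` class and function name are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Scope:
    """One enclosing schema construct (illustrative model only)."""
    name: str
    potential_pou: bool


def discriminator_target(enclosing: List[Scope]) -> Optional[Scope]:
    """Return the construct a discriminator acts on.

    enclosing lists the constructs around the discriminator, outermost
    first. Under the proposed wording the target is the innermost
    potential point of uncertainty, whether or not it is an actual one,
    so the discriminator never reaches anything further out.
    """
    for scope in reversed(enclosing):
        if scope.potential_pou:
            return scope
    return None


# The first example in the text: an array element inside a choice branch.
stack = [
    Scope("choiceBranch", potential_pou=True),
    Scope("arrayElement", potential_pou=True),
    Scope("sequence", potential_pou=False),
]
print(discriminator_target(stack).name)
```

The lookup stops at the array element and never reaches the choice branch, which is the "no leaking" behaviour being proposed.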
>
> ------------------------------------
>
> 7.4        *The dfdl:discriminator Statement Annotation Element*
>
> DFDL discriminators are used during parsing to resolve points of
> uncertainty that cannot be resolved by speculative parsing. Discriminators
> are not used during unparsing.  They can also be used to force a resolution
> earlier during the parsing of a group so that subsequent parsing errors are
> treated as processing errors of a known component rather than a failure to
> find a component.
>
> A discriminator determines the existence or non-existence of a component.
> If the discriminator is successful then the component is known to exist and
> any subsequent errors will not cause backtracking at points of uncertainty.
> If a discriminator is unsuccessful then the component is known not to exist
> and backtracking occurs immediately.
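
Read as a decision table, the paragraph above says roughly the following. This is a loose sketch with invented names; the no-discriminator row is my reading of ordinary speculative parsing rather than something stated in this section.

```python
from typing import Optional


def parse_outcome(discriminator: Optional[bool], later_error: bool) -> str:
    """Outcome for a component parsed at a point of uncertainty.

    discriminator is True/False for the discriminator result, or None
    when the component has no discriminator. later_error says whether a
    processing error occurs after the discriminator position.
    """
    if discriminator is False:
        # Known not to exist: backtracking occurs immediately.
        return "backtrack"
    if discriminator is True:
        # Known to exist: later errors no longer cause backtracking.
        return "processing error" if later_error else "success"
    # No discriminator (assumed): a later error just means the
    # component was not found, so the parser backtracks.
    return "backtrack" if later_error else "success"


for disc in (True, False, None):
    for err in (False, True):
        print(disc, err, parse_outcome(disc, err))
```

The only row where the discriminator changes the result is a successful discriminator followed by an error: without it the parser would backtrack, with it the error is reported against a known component.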
>
> If the complex type of an element contains a sequence group as its content
> model and that sequence group is known not to exist, then the element is
> known not to exist.
>
> Examples of the dfdl:discriminator annotation are shown below:
>
> <dfdl:discriminator>
>   { ../recType eq 0 }
> </dfdl:discriminator>
>
> <dfdl:discriminator test="{ ../recType eq 0}" />
>
> When the discriminator's expression evaluates to "false", then it causes a
> processing error, and the discriminator is said to fail.
>
> A discriminator defined on, or contained by, a schema construct that is a
> potential point of uncertainty, will only ever resolve that point of
> uncertainty.
>
>
> Regards
>
> Steve Hanson
> ----- Forwarded by Steve Hanson/UK/IBM on 17/11/2014 11:54 -----
>
> From:        Steve Hanson/UK/IBM
> To:        Tim Kimber/UK/IBM at IBMGB
> Cc:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>, *dfdl-wg-bounces at ogf.org*
> <dfdl-wg-bounces at ogf.org>
> Date:        15/05/2014 10:48
> Subject:        Re: [DFDL-WG] Fw: Action 248 (was Thoughts on a
> discriminator        scenario)
> ------------------------------
>
>
> Tim - I've responded to your specific comments below in blue font.
>
> All - You will see that I have some concerns over the words used in the
> definition of a PoU, as we seem to be unclear as to whether a PoU is a
> point in the data stream or a point in the model. I am wondering whether
> the concepts of 'potential PoU' and 'actual PoU' can be better expressed as
> 'PoU in the model' and 'PoU in the data'. I want to mull this over for a
> while.  I'm not changing the rules by this, just how we express them.
>
> So please let me run with this before replying.
>
> Regards
>
> Steve Hanson
>
>
>
> From:        Tim Kimber/UK/IBM at IBMGB
> To:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>,
> Date:        14/05/2014 23:31
> Subject:        Re: [DFDL-WG] Fw: Action 248 (was Thoughts on a
> discriminator        scenario)
> Sent by:        *dfdl-wg-bounces at ogf.org* <dfdl-wg-bounces at ogf.org>
>  ------------------------------
>
>
>
> I agree that the wording is not easy to get right. However, I think the
> current wording needs some adjustment so I'm going to make some suggestions
> and see where it leads.
>
> "
> *A point of uncertainty occurs in the data stream when there is more than
> one schema component that might occur at that point.*"
>
> I don't think this is precise enough.
> SMH: Agree. I need to think about this sentence. There are several things
> potentially wrong. It is defining a PoU as occurring in the data stream,
> whereas elsewhere PoU is equated to a position in the model. It says 'more
> than one schema component that might occur' - maybe it should say 'a schema
> component may or may not occur'. And schema *components* don't occur in
> the data stream anyway - *occurrences* of them do.
>
> - if an optional element occurs at the end of the input data then there is
> only *one* schema component that might occur at that point. The end of the
> data stream might occur instead.
> SMH: Yes, but I raised some similar arguments earlier in the thread,
> about the last branch of a choice not being a PoU, or the last element in
> an unordered sequence when all the others had been found not being a PoU.
> We agreed that these are still all treated as PoUs for clarity. This is
> another example.
>
> - if an optional element occurs before the last required element in a
> sequence AND the separatorSuppressionPolicy is not 'anyEmpty' then there is
> exactly one schema component that can occur at that point in the data
> stream. But it might be 'empty', in which case it will not be put into the
> info set.
> This is not pedantry. The parser will never need to backtrack in either of
> these cases and in the second case it is obvious in advance which schema
> component the parser should select for parsing.
> SMH: We have agreed in the past that the presence of a separator is not
> enough to infer 'known-to-exist', so separators should not be brought into
> this definition. You are right that in a positional sequence the parser is
> looking for an occurrence of a component or its empty rep, and never an
> occurrence of the next schema component, so the parser can certainly
> optimise here. Let's take any discussion of separators out of this for the
> moment, and raise a separate action if needed.
>
> *Points of uncertainty can be nested. Any one of the following constructs
> is a potential point of uncertainty:*
>
> *1. An xs:choice*
> *2. All xs:elements in an unordered xs:sequence (dfdl:sequenceKind is
> 'unordered')*
> *3. An optional xs:element*
> *4. An array xs:element.*
> *5. All xs:elements in an xs:sequence containing one or more floating
> xs:elements.*
>
> 1. should say 'A member of an xs:choice' because it is the member, not the
> group itself, that is the point of uncertainty. I think the confusion has
> arisen because only one member of a choice group can exist in the data. So
> if any member exists, it automatically ends any speculation about the
> content of the choice group. But I insist that the real point of
> uncertainty is the member. A choice group is always 'known to exist'
> because according to DFDL rules it must have minOccurs=maxOccurs=1. FWIW, I
> have no problem with talking about 'resolving a choice', provided that we
> define that as 'Determining which member of a choice group ( if any ) is
> known to exist in the data'.
> SMH: I agree that it should say member.
>
> 2. Should say 'All members of an unordered xs:sequence' to keep the
> language consistent with 1. The section on unordered groups clearly
> restricts members to elements only.
> SMH: No. Using 'xs:element' is consistent with optionals & arrays in 3 and
> 4, which are also always elements. so xs:element is more consistent.
>
> 3. See above - an optional element is not always a 'point of uncertainty'
> according to the literal definition that we are currently using.
> SMH: Right, but the bullets are defining *potential* PoUs, so it is
> correct as it stands.
>
> 4. Should say 'An optional occurrence of an array element, unless the
> separator properties make it a positional array and the occurrence is
> required in the data'
> SMH: No. All occurrences can be PoUs, it depends on OCK. And separators do
> not resolve PoUs as noted. This definition 4 is the one that is key for
> Action 248, which is ultimately what led to this discussion and what needs
> to be resolved. The question is whether 4 should say *a)* all arrays are
> potential PoUs as it does now, or *b)* just some arrays are potential
> PoUs depending on OCK. Whatever we choose, a discriminator within that
> array must not leak beyond the array as explained *below in bold red font*.
> I think a) is clearer and we can then make a general statement about
> discriminators not leaking outside of any potential PoU. If we adopt b)
> then we need a separate statement about discriminators and arrays, which
> seems more bitty.
>
> 5. Should say 'All members...' for consistency.
> SMH: See 3.
>
> regards,
>
> Tim Kimber,
> IBM Integration Bus Development (Industry Packs)
> Hursley, UK
> Internet:  *kimbert at uk.ibm.com* <kimbert at uk.ibm.com>
> Tel. 01962-816742
> Internal tel. 37246742
>
>
>
>
> From:        Steve Hanson/UK/IBM at IBMGB
> To:        ,
> Date:        13/05/2014 10:28
> Subject:        [DFDL-WG] Fw: Action 248 (was Thoughts on a discriminator
> scenario)
> Sent by:        *dfdl-wg-bounces at ogf.org* <dfdl-wg-bounces at ogf.org>
>
>
>
> This will be discussed on today's call. Please have a position on the
> paragraph below that ends 'What do others think?'
>
> Thanks
>
> Steve Hanson
> ----- Forwarded by Steve Hanson/UK/IBM on 13/05/2014 10:19 -----
>
> From:        Steve Hanson/UK/IBM
> To:        Tim Kimber/UK/IBM at IBMGB,
> Cc:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>
> Date:        30/04/2014 12:25
> Subject:        Re: [DFDL-WG] Action 248 (was Thoughts on a discriminator
> scenario)
>  ------------------------------
>
>
> Tim
>
> Responses below.
>
> Regards
>
> Steve Hanson
>
>
>
> From:        Tim Kimber/UK/IBM at IBMGB
> To:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>,
> Date:        11/04/2014 14:03
> Subject:        Re: [DFDL-WG] Action 248 (was Thoughts on a discriminator
> scenario)
> Sent by:        *dfdl-wg-bounces at ogf.org* <dfdl-wg-bounces at ogf.org>
>  ------------------------------
>
>
>
> *"2. If a potential point of uncertainty is sometimes an actual point
> of uncertainty (ock 'implicit') then a discriminator that applies to it will
> only ever resolve, or have no effect on, that point of uncertainty. It
> never has an effect on any enclosing point of uncertainty."*
> This could be misinterpreted. The discriminator could evaluate to 'false'
> and thus cause the POU to be resolved negatively (the component would be
> 'known not to exist').
>
> SMH: Agree, and I can improve the words here.
>
> 1. and 3. will both apply if an element with ock='fixed' appears as a
> choice branch. Is the POU always an actual POU or never?
>
> SMH: No. There are two independent points of uncertainty, the choice
> branch and the array.
>
> The wording of 3. reads very strangely. 'If a potential point of
> uncertainty is *never* an actual point of uncertainty' begs the question
> 'why is it even a *potential *point of uncertainty?'. The current wording
> follows from our definition of the term 'point of uncertainty':
> "
> *A point of uncertainty occurs in the data stream when there is more than
> one schema component that might occur at that point.*"
>
> *Points of uncertainty can be nested. Any one of the following constructs
> is a potential point of uncertainty:*
>
> *1. An xs:choice*
> *2. All xs:elements in an unordered xs:sequence (dfdl:sequenceKind is
> 'unordered')*
> *3. An optional xs:element*
> *4. An array xs:element.*
> *5. All xs:elements in an xs:sequence containing one or more floating
> xs:elements.*
> I think this definition is too broad. It forces us to discuss potential
> POUs that will never be actual POUs according to the first sentence.
>
> * SMH: Yes it does read a bit strangely, but there's a reason for this. If
> we said that ock 'fixed', 'expression' or 'stopValue' are never POUs then
> what does it mean if a discriminator is placed on such an element?  A
> discriminator gets evaluated for each occurrence of an array. For that
> reason we can not let a discriminator within an array leak beyond the array
> - regardless of whether it is a POU or not - otherwise what does that mean
> to enclosing POUs? So even if we said that ock 'fixed', 'expression' or
> 'stopValue' are never POUs we would still need the spec to state that a
> discriminator never leaks beyond them. I think it is clearer to say that a
> discriminator never leaks beyond a potential POU and keep the existing
> definition.  What do others think?*
>
> regards,
>
> Tim Kimber,
>
>
>
>
> From:        Steve Hanson/UK/IBM at IBMGB
> To:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>,
> Date:        11/04/2014 11:44
> Subject:        Re: [DFDL-WG] Action 248 (was Thoughts on a discriminator
> scenario)
> Sent by:        *dfdl-wg-bounces at ogf.org* <dfdl-wg-bounces at ogf.org>
>  ------------------------------
>   *248*
> *Discriminators and potential points of uncertainty (Steve) *
> 28/1: Steve to write up a proposal to prevent a discriminator from
> behaving in a non-obvious manner when used with a potential point of
> uncertainty that turns out not to be an actual point of uncertainty.
> 5/2: Steve sent an email to check whether choice branches, unordered
> elements and floating elements should always be actual points of
> uncertainty, as there are times when there is no uncertainty, eg, last
> choice branch; all floating elements found. It was decided that they are
> always actual points of uncertainty. To do otherwise will complicate
> implementations and result in fragile schemas. Steve will proceed with the
> proposal on that basis.
>
> Based on the above, which reflects the email discussion below, here is
> what I propose to resolve this action.
> 1.        If a potential point of uncertainty is *always* an actual point
> of uncertainty (choice branch, element in unordered sequence, floating
> element, ock 'parsed') then a discriminator that applies to it will only
> ever resolve that point of uncertainty. It never has an effect on any
> enclosing point of uncertainty.
> 2.        If a potential point of uncertainty is *sometimes* an actual
> point of uncertainty (ock 'implicit') then a discriminator that applies to
> it will only ever resolve, or have no effect on, that point of uncertainty.
> It never has an effect on any enclosing point of uncertainty.
> 3.        If a potential point of uncertainty is *never* an actual point
> of uncertainty (ock 'fixed', 'expression', 'stopValue') then a
> discriminator that applies to it will never have an effect on that point of
> uncertainty. Nor does it ever have an effect on any enclosing point of
> uncertainty.
> I think 1 and 2 are not controversial, but there is an alternative for 3:
>
>  3.   If a potential point of uncertainty is *never* an actual point of
> uncertainty (ock 'fixed', 'expression', 'stopValue') then a discriminator
> that applies to it will never have an effect on that point of uncertainty.
> Instead the discriminator is applied to any enclosing point of uncertainty.
>
> The alternative means that changing an element from (say) ock 'parsed' to
> ock 'expression' has the same effect on a discriminator as changing the
> element to (1,1). The discriminator that applied to it now applies to any
> enclosing pou.
>
> SMH: Afternote: The alternative does not work for the reason given in my
> reply to Tim above.
>
> Regards
>
> Steve Hanson
>
>
>
> From:        Steve Hanson/UK/IBM
> To:        Tim Kimber/UK/IBM at IBMGB,
> Cc:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>, *dfdl-wg-bounces at ogf.org*
> <dfdl-wg-bounces at ogf.org>
> Date:        05/02/2014 12:04
> Subject:        Re: [DFDL-WG] Action 248 (was Thoughts on a discriminator
> scenario)
>
> ------------------------------
>
>
> Thanks Tim, all good points. Comments to your comments.
>
> Regards
>
> Steve Hanson
>
>
>
>
> From:        Tim Kimber/UK/IBM
> To:        Steve Hanson/UK/IBM at IBMGB,
> Cc:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>, *dfdl-wg-bounces at ogf.org*
> <dfdl-wg-bounces at ogf.org>
> Date:        05/02/2014 11:01
> Subject:        Re: [DFDL-WG] Action 248 (was Thoughts on a discriminator
> scenario)
>  ------------------------------
>
>
> A couple of comments below.
>
> regards,
>
> Tim Kimber,
>
>
>
>
>
> From:        Steve Hanson/UK/IBM at IBMGB
> To:        *dfdl-wg at ogf.org* <dfdl-wg at ogf.org>,
> Date:        05/02/2014 10:50
> Subject:        [DFDL-WG] Action 248 (was Thoughts on a discriminator
> scenario)
> Sent by:        *dfdl-wg-bounces at ogf.org* <dfdl-wg-bounces at ogf.org>
>  ------------------------------
>   *248*
> *Discriminators and potential points of uncertainty (Steve) *
> 28/1: Steve to write up a proposal to prevent a discriminator from
> behaving in a non-obvious manner when used with a potential point of
> uncertainty that turns out not to be an actual point of uncertainty.
> 5/2: With Steve
>
> I started on this by reading section 9.3.3 on points of uncertainty, which
> lists the potential PoUs. Here's the list to save getting the spec out.
>
> 1.        An xs:choice branch
>
> 2.        All xs:elements in an unordered xs:sequence (dfdl:sequenceKind
> is 'unordered')
>
> 3.        An optional xs:element
>
> 4.        An array xs:element
>
> 5.        All xs:elements in an xs:sequence containing one or more
> floating xs:elements.
>
> The section then looks at each in turn and gives the circumstances when it
> is an actual PoU or not. As currently written, it is only 3 and 4 where a
> potential PoU might not be an actual PoU. For 1, 2 and 5 it says they are
> always actual PoUs.
>
> But I'm not sure that's correct. A deeper analysis of what is actually
> going on with 1, 2 and 5 says to me that there are times when there might
> not be an actual PoU.
>
> 1. Given that there is no concept in DFDL of optional choice branches,
> then if the last branch is reached then there is no longer a PoU. It must
> be that branch else it is a processing error.
>
> TK: I think of it slightly differently. It is a PoU, even if the branch is
> the only remaining branch. If we say that the final choice branch is not a
> PoU then diagnostics become confused - the parser reports the error code as
> 'error while parsing root/choice/lastBranch/field1' when the correct error
> code would be 'none of the branches of root/choice were found in the data'.
>
> SMH: I see your point. My thinking was that choices have finite branches
> and a choice is (1,1). If I have got to the last branch then I am not one
> of the other branches so I *must* be this one. If there is any other
> possibility then the model is missing a branch, even if it is just one that
> contains an empty sequence with an assert {fn:false()}. In practice of
> course users forget to add that last branch (there's no XSDL equivalent to
> the 'default' branch of a switch/case statement), so yes they could end up
> with an unclear diagnostic.
>
> 2. There can come a point in an unordered sequence when all that can be
> encountered is one element, and if that is (1,1) then there is no longer a
> PoU.
>
> TK: It's still a PoU. The specification says that occursCountKind is
> 'parsed' for all members of an unordered group, so min/maxOccurs do not
> come into play.
>
> SMH: Interesting. The spec says that if a member is optional or an array
> then it must be 'parsed'. If it is (1,1) though it does not have an
> occursCountKind. The specific case I was thinking of is when *all*
> members are (1,1), so when you have one element to go there is no PoU.
> However, the rewrite into a repeating choice has the effect of making
> *everything* 'parsed', which is really the point you are making. So I
> agree with you, it is easier to say that everything is an actual PoU else
> it complicates the rewrite semantic.
>
> 5. If all floating elements are (1,1) and all are encountered, then from
> that point on there are no longer any PoUs due to floating elements.
>
> TK: I suspect that floating elements are somewhat like unordered branches
> - most users will not want min/maxOccurs to affect the parsing of the
> group. Schema validation (or more complex validation applied in the
> receiving application) will deal with non-conformances.
>
> SMH: Possibly yes. With something like X12 NTE segments, that is the case.
> But we don't express the floating semantic as a rewrite of the whole
> sequence like we do for unordered, it's more of a per element thing. And if
> that is done dynamically as we go through the sequence, having no PoU can
> result.
>
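> For reference, a floating element is declared per element rather than by
> rewriting the whole sequence. A minimal sketch, assuming hypothetical
> sibling elements X and Y and hypothetical initiators (only the NTE name
> comes from the discussion above):
>
> ```xml
> <xs:sequence dfdl:sequenceKind="ordered">
>   <xs:element name="X" type="xs:string" dfdl:initiator="X:"/>
>   <!-- NTE may turn up at any position among its siblings in the data,
>        although it is declared here between X and Y. -->
>   <xs:element name="NTE" type="xs:string" dfdl:initiator="NTE:"
>               dfdl:floating="yes"/>
>   <xs:element name="Y" type="xs:string" dfdl:initiator="Y:"/>
> </xs:sequence>
> ```
>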
> I'd like us to get straight on this before I proceed with the action
> proper.
>
> Regards
>
> Steve Hanson
> Architect, IBM DFDL
> <http://www.ibm.com/developerworks/library/se-dfdl/index.html>
> Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel: +44-1962-815848
> ----- Forwarded by Steve Hanson/UK/IBM on 05/02/2014 10:12 -----
>
> From:        Steve Hanson/UK/IBM
> To:        dfdl-wg at ogf.org
> Date:        27/01/2014 17:39
> Subject:        Fw: Thoughts on a discriminator scenario
>
> ------------------------------
>
>
> Been thinking some more on the discriminator scenario below that I mailed
> out before xmas, and discussing it with the IBM DFDL team.
>
> The 'confusing' aspect of the behaviour is that a discriminator within a
> potential PoU will act on a higher level PoU if the potential PoU is not an
> actual PoU. In the example, the array element 'Type1' is not an actual PoU
> for occurrence 1, only for occurrences 2+. So when the discriminator fires
> for occurrence 1 it will resolve a higher level unresolved PoU if one
> exists.
>
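> The 'leak' described above can be pictured with a sketch like the
> following. It is an illustration only: the enclosing choice, the element
> Other, and the named type Type1Type (standing for the A, B, C content)
> are hypothetical and not part of the original scenario.
>
> ```xml
> <xs:choice>
>   <!-- Branch 1: the enclosing choice is an unresolved PoU while this
>        branch is being attempted. -->
>   <xs:sequence>
>     <xs:element name="Type1" type="Type1Type" maxOccurs="10"
>                 dfdl:occursCountKind="implicit">
>       <xs:annotation>
>         <xs:appinfo source="http://www.ogf.org/dfdl/">
>           <!-- Occurrence 1 is required, so it is not an actual PoU and
>                this discriminator resolves the enclosing choice instead.
>                Occurrences 2+ are actual PoUs and are resolved here. -->
>           <dfdl:discriminator test="{ fn:exists(A) }"/>
>         </xs:appinfo>
>       </xs:annotation>
>     </xs:element>
>   </xs:sequence>
>   <!-- Branch 2: no longer tried once the discriminator on occurrence 1
>        of Type1 has evaluated to true. -->
>   <xs:element name="Other" type="xs:string"/>
> </xs:choice>
> ```
>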
> Perhaps the spec should say that a discriminator can't 'leak' beyond the
> potential PoU that encloses it? If so, then for occurrence 1 the
> discriminator has no effect, and only has an effect for occurrences 2+.
> This makes for more predictable and robust schemas.
>
> We'd need to go through spec section 9.3.3 carefully to check that this
> does not break any of the potential PoUs that are listed.
>
> Regards
>
> Steve Hanson
> Architect, IBM Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel: +44-1962-815848
> ----- Forwarded by Steve Hanson/UK/IBM on 16/01/2014 09:55 -----
>
> From:        Steve Hanson/UK/IBM
> To:        dfdl-wg at ogf.org
> Date:        20/12/2013 13:20
> Subject:        Thoughts on a discriminator scenario
>  ------------------------------
>
>
> Take the following schema (simplified) for element Type1 (1,10), which is a
> loop for elements A, B, C. Type1 does not have an initiator, so I need to
> use a discriminator to establish the existence of an occurrence of Type1 so
> that incorrect backtracking does not occur after an error. Because
> occursCountKind is 'implicit', the 1st occurrence is *not* a point of
> uncertainty, so the discriminator acts instead on any enclosing point of
> uncertainty, but for the 2nd and subsequent occurrences it acts on Type1.
> That is all working as designed, but I think users will find the 1st
> occurrence behaviour a bit confusing. There are workarounds to avoid the
> problem, e.g. use occursCountKind 'parsed' or split Type1 into two as (1,1)
> and (0,9). I think this is worth documenting in a tutorial as this is quite
> subtle stuff.
>
>    <xs:element name="Type1" maxOccurs="10" dfdl:occursCountKind="implicit">
>            <dfdl:discriminator test="{fn:exists(A)}" />
>            <xs:complexType>
>                    <xs:sequence>
>                            <xs:element name="A" dfdl:initiator="A:" ... />
>                            <xs:element name="B" dfdl:initiator="B:" ... />
>                            <xs:element name="C" dfdl:initiator="C:" ... />
>                    </xs:sequence>
>            </xs:complexType>
>    </xs:element>
>
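> The two workarounds mentioned above could look roughly like this. It is
> a sketch under assumptions, not a tested schema: Type1Type is a
> hypothetical named complex type holding the A, B, C sequence, and
> Type1First and Type1Rest are hypothetical names chosen so that the two
> halves of the split are distinct declarations.
>
> ```xml
> <!-- Workaround 1: occursCountKind 'parsed'. Every occurrence, including
>      the first, is a point of uncertainty, so the discriminator always
>      acts on Type1. minOccurs/maxOccurs then only matter to validation. -->
> <xs:element name="Type1" type="Type1Type" minOccurs="1" maxOccurs="10"
>             dfdl:occursCountKind="parsed">
>   <xs:annotation>
>     <xs:appinfo source="http://www.ogf.org/dfdl/">
>       <dfdl:discriminator test="{ fn:exists(A) }"/>
>     </xs:appinfo>
>   </xs:annotation>
> </xs:element>
>
> <!-- Workaround 2: split (1,10) into (1,1) followed by (0,9). The
>      required part carries no discriminator; each occurrence of the
>      optional part is a PoU, so its discriminator acts locally. -->
> <xs:sequence>
>   <xs:element name="Type1First" type="Type1Type"/>
>   <xs:element name="Type1Rest" type="Type1Type"
>               minOccurs="0" maxOccurs="9"
>               dfdl:occursCountKind="implicit">
>     <xs:annotation>
>       <xs:appinfo source="http://www.ogf.org/dfdl/">
>         <dfdl:discriminator test="{ fn:exists(A) }"/>
>       </xs:appinfo>
>     </xs:annotation>
>   </xs:element>
> </xs:sequence>
> ```
>
> The split changes the element names seen in the infoset, which may or
> may not be acceptable for a given application.
>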
>
> Regards
>
> Steve Hanson
> Architect, IBM Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel: +44-1962-815848
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
> --
> dfdl-wg mailing list
> dfdl-wg at ogf.org
> https://www.ogf.org/mailman/listinfo/dfdl-wg