[DFDL-WG] Fw: Fw: Action 248 (was Thoughts on a discriminator scenario)

Mike Beckerle mbeckerle.dfdl at gmail.com
Tue Nov 25 10:35:12 EST 2014


My suggested additional wording is in red below. It raises an issue where
it is unclear to me whether we have defined exactly what happens.

If you have, say, an array with occursCountKind 'implicit' and minOccurs '1',
and the discriminator on the element evaluates to false for that required
first element, what happens? Do we fail the whole array? That sounds
contradictory to the notion that the discriminator "only resolves that
element". But having the discriminator be ignored doesn't seem right either.

...mike

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
<http://www.ogf.org/About/abt_policies.php>


On Mon, Nov 17, 2014 at 8:07 AM, Steve Hanson <smh at uk.ibm.com> wrote:

> This action was raised because of concern with the behaviour of the
> discriminator in the following example. Because occursCountKind (OCK) is
> 'implicit', the 1st occurrence is not an actual PoU but the other 9
> occurrences are. This means that for the 1st occurrence, the discriminator
> actually acts on a higher PoU if one exists.
>
>     <xs:element name="Type1" maxOccurs="10" dfdl:occursCountKind="implicit">
>         <dfdl:discriminator test="{fn:exists(A)}" />
>         <xs:complexType>
>             <xs:sequence>
>                 <xs:element name="A" dfdl:initiator="A:" ... />
>                 <xs:element name="B" dfdl:initiator="B:" ... />
>                 <xs:element name="C" dfdl:initiator="C:" ... />
>             </xs:sequence>
>         </xs:complexType>
>     </xs:element>
>
> This led to the suggestion that a discriminator should not 'leak' beyond a
> potential PoU, regardless of whether it is an actual PoU. The argument for
> this is contained in the thread below, and on re-reading I still think it
> is the best solution to this, so that is what I propose.
>
> There were also issues about the wording in section 9.3.3.
>
> Sections 9.3.3 and 7.4 are reproduced below, and updated to address the
> wording and leaking issues.
>
> -------------------------------------------------
>
> 9.3.3        *Points of Uncertainty*
>
> A *point of uncertainty* occurs when parsing a schema component when an
> occurrence of that schema component might not be the next item encountered
> in the data stream. Points of uncertainty can be nested.
>
> Any one of the following schema constructs is a *potential* point of
> uncertainty:
>
> ·        A branch of xs:choice
>
> ·        All xs:elements in an unordered xs:sequence (dfdl:sequenceKind
> is 'unordered')
>
> ·        An optional xs:element
>
> ·        An array xs:element.
>
> ·        All xs:elements in an xs:sequence containing one or more
> floating xs:elements.
>
> The parser resolves these points of uncertainty by way of a set of
> construct-specific rules given below along with determining whether schema
> components are known-to-exist or known-not-to-exist. For some of these
> constructs, there are situations where while there is the potential for
> uncertainty, the circumstances are such that there isn't any actual
> uncertainty; hence, *potential* points of uncertainty are distinguished
> from *actual* points of uncertainty below.
>
A branch of xs:choice is always an actual point of uncertainty. A choice is
> resolved sequentially, or by direct dispatch. Sequential choice resolution
> occurs by parsing each choice branch in schema definition order until one
> is known-to-exist. It is a processing error if none of the choice branches
> are known-to-exist. Direct-dispatch choice resolution occurs by matching
> the value of the dfdl:choiceDispatchKey property to the value of the
> dfdl:choiceBranchKey property of one of the choice branches. It is a
> processing error if none of the choice branches have a matching value in
> their dfdl:choiceBranchKey property.
>
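
As a rough sketch of the direct-dispatch case described above (not taken
from the spec text): the element names record, header and detail and the
branch key values are hypothetical, recType is borrowed from the section 7.4
examples, and all other required DFDL properties are assumed to come from a
default format that is not shown.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <xs:element name="record">
    <xs:complexType>
      <xs:sequence>
        <!-- Parsed first; its value selects the choice branch. -->
        <xs:element name="recType" type="xs:string" />
        <!-- Dispatch on recType instead of trying branches in order. -->
        <xs:choice dfdl:choiceDispatchKey="{ ./recType }">
          <!-- Hypothetical branches and key values. -->
          <xs:element name="header" type="xs:string"
                      dfdl:choiceBranchKey="0" />
          <xs:element name="detail" type="xs:string"
                      dfdl:choiceBranchKey="1" />
        </xs:choice>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With this shape, a recType value that matches no dfdl:choiceBranchKey would
be the processing error mentioned in the paragraph above.
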
> An element in an unordered xs:sequence is always an actual point of
> uncertainty. It is resolved by parsing for the child components of the
> sequence in schema definition order at each point in the data stream where
> a component can exist until the required number of occurrences of each
> child component is known-to-exist or the sequence is terminated by
> delimiters or specified length.
>
> An element in a sequence with one or more floating elements is always an
> actual point of uncertainty. It is resolved by parsing for the expected
> element at that point in the data stream. If the expected element is
> known-not-to-exist then an occurrence of each floating element is parsed in
> schema definition order.
>
> When parsing an array, points of uncertainty only occur for certain values
> of occursCountKind, as follows:
>
> occursCountKind  Details of Potential and Actual Points of Uncertainty
> ---------------  -----------------------------------------------------
> fixed            No potential point of uncertainty (maxOccurs occurrences
>                  expected).
> implicit         All occurrences are potential points of uncertainty. An
>                  actual point of uncertainty exists after minOccurs
>                  occurrences found and until maxOccurs occurrences have
>                  been found.
> parsed           All occurrences are actual points of uncertainty.
> expression       No potential point of uncertainty (dfdl:occursCount
>                  occurrences expected).
> stopValue        No potential point of uncertainty (the stopValue must
>                  always be present, even when minOccurs is 0).
>
> *Table 11: Points of Uncertainty and dfdl:occursCountKind*
>
> An optional element point of uncertainty is resolved by parsing the
> element until it is either known-to-exist or known-not-to-exist. Whether an
> optional element is an actual point of uncertainty depends on property
> dfdl:occursCountKind as described above. (Property dfdl:occursCountKind is
> defined in Section 16.1 dfdl:occursCountKind property.)
>
> For an array element, the point of uncertainty is resolved for each
> occurrence separately by parsing the occurrence until it is either
> known-to-exist or known-not-to-exist.
>
> Discriminators resolve potential points of uncertainty. A discriminator
> defined on, or contained by, a schema construct that is a potential point
> of uncertainty, will only ever resolve that point of uncertainty. This
> holds regardless of whether there is any actual uncertainty.
>
For example, if a discriminator is defined on an array element which is
> contained within the branch of a choice, the discriminator will only
> resolve the existence of occurrences of the array element, and never the
> existence of the occurrence of the choice branch. As another example,
> consider an array element with dfdl:occursCountKind 'implicit' and
> minOccurs '1'. The first element of such an array must exist, so there is
> no actual uncertainty. A discriminator on such an element is redundant, but
> often must be expressed so as to discriminate the existence of the second
> and any subsequent array elements. If a discriminator evaluates to
> 'false' or causes a processing error on a potential point of uncertainty
> where there is no actual uncertainty, ..... TBD
>


> (I think this causes a processing error which will fail the whole
> array.....but that sounds like it contradicts the statement above that says
> "it only ever resolves that point of uncertainty" ?)
>
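
To make the first example in 9.3.3 concrete, here is a minimal sketch of a
discriminator on an array element that sits inside a choice branch. The
names msg, item and other are hypothetical, the discriminator test is
modelled on the Type1 example earlier in this mail, and the remaining DFDL
properties are assumed to come from a default format that is not shown.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:fn="http://www.w3.org/2005/xpath-functions"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <xs:element name="msg">
    <xs:complexType>
      <xs:choice>
        <!-- Branch 1: a sequence holding the array element. -->
        <xs:sequence>
          <xs:element name="item" minOccurs="1" maxOccurs="10"
                      dfdl:occursCountKind="implicit">
            <xs:annotation>
              <xs:appinfo source="http://www.ogf.org/dfdl/">
                <!-- Under the proposed wording this resolves only the
                     current occurrence of item, never the choice branch. -->
                <dfdl:discriminator test="{ fn:exists(./A) }" />
              </xs:appinfo>
            </xs:annotation>
            <xs:complexType>
              <xs:sequence>
                <xs:element name="A" type="xs:string"
                            dfdl:initiator="A:" />
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:sequence>
        <!-- Branch 2: hypothetical alternative branch. -->
        <xs:element name="other" type="xs:string" dfdl:initiator="O:" />
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The open question in the TBD above is what happens when the test is false
for the first, required occurrence of item.
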

> ------------------------------------
>
> 7.4        *The dfdl:discriminator Statement Annotation Element*
>
> DFDL discriminators are used during parsing to resolve points of
> uncertainty that cannot be resolved by speculative parsing. Discriminators
> are not used during unparsing.  They can also be used to force a resolution
> earlier during the parsing of a group so that subsequent parsing errors are
> treated as processing errors of a known component rather than a failure to
> find a component.
>
> A discriminator determines the existence or non-existence of a component.
> If the discriminator is successful then the component is known to exist and
> any subsequent errors will not cause backtracking at points of uncertainty.
> If a discriminator is unsuccessful then the component is known not to exist
> and backtracking occurs immediately.
>
> If the complex type of an element contains a sequence group as its content
> model then if the sequence group is known not to exist, then the element is
> known not to exist.
>
> Examples of the dfdl:discriminator annotation are below:
>
> <dfdl:discriminator>
>   { ../recType eq 0 }
> </dfdl:discriminator>
>
> <dfdl:discriminator test="{ ../recType eq 0}" />
>
> When the discriminator's expression evaluates to "false", then it causes a
> processing error, and the discriminator is said to fail.
>
> A discriminator defined on, or contained by, a schema construct that is a
> potential point of uncertainty, will only ever resolve that point of
> uncertainty.
>
>
> Regards
>
> Steve Hanson
> Architect, *IBM DFDL*
> <http://www.ibm.com/developerworks/library/se-dfdl/index.html>
> Co-Chair, *OGF DFDL Working Group* <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> *smh at uk.ibm.com* <smh at uk.ibm.com>
> tel:+44-1962-815848
> ----- Forwarded by Steve Hanson/UK/IBM on 17/11/2014 11:54 -----
>
> From:        Steve Hanson/UK/IBM
> To:        Tim Kimber/UK/IBM at IBMGB
> Cc:        dfdl-wg at ogf.org, dfdl-wg-bounces at ogf.org
> Date:        15/05/2014 10:48
> Subject:        Re: [DFDL-WG] Fw: Action 248 (was Thoughts on a
> discriminator        scenario)
> ------------------------------
>
>
> Tim - I've responded to your specific comments below in blue font.
>
> All - You will see that I have some concerns over the words used in the
> definition of a PoU, as we seem to be unclear as to whether a PoU is a
> point in the data stream or a point in the model. I am wondering whether
> the concepts of 'potential PoU' and 'actual PoU' can be better expressed as
> 'PoU in the model' and 'PoU in the data'. I want to mull this over for a
> while.  I'm not changing the rules by this, just how we express them.
>
> So please let me run with this before replying.
>
> Regards
>
> Steve Hanson
>
>
>
> From:        Tim Kimber/UK/IBM at IBMGB
> To:        dfdl-wg at ogf.org,
> Date:        14/05/2014 23:31
> Subject:        Re: [DFDL-WG] Fw: Action 248 (was Thoughts on a
> discriminator        scenario)
> Sent by:        dfdl-wg-bounces at ogf.org
> ------------------------------
>
>
>
> I agree that the wording is not easy to get right. However, I think the
> current wording needs some adjustment so I'm going to make some suggestions
> and see where it leads.
>
> "
> *A point of uncertainty occurs in the data stream when there is more than
> one schema component that might occur at that point.*"
>
> I don't think this is precise enough.
> SMH: Agree. I need to think about this sentence. There are several things
> potentially wrong. It is defining a PoU as occurring in the data stream,
> whereas elsewhere PoU is equated to a position in the model. It says 'more
> than one schema component that might occur' - maybe it should say 'a schema
> component may or may not occur'. And schema *components* don't occur in
> the data stream anyway - *occurrences* of them do.
>
> - if an optional element occurs at the end of the input data then there is
> only *one* schema component that might occur at that point. The end of the
> data stream might occur instead.
> SMH: Yes, but I raised some similar arguments earlier in the thread,
> about the last branch of a choice not being a PoU, or the last element in
> an unordered sequence when all the others had been found not being a PoU.
> We agreed that these are still all treated as PoUs for clarity. This is
> another example.
>
> - if an optional element occurs before the last required element in a
> sequence AND the separatorSuppressionPolicy is not 'anyEmpty' then there is
> exactly one schema component that can occur at that point in the data
> stream. But it might be 'empty', in which case it will not be put into the
> info set.
> This is not pedantry. The parser will never need to backtrack in either of
> these cases and in the second case it is obvious in advance which schema
> component the parser should select for parsing.
> SMH: We have agreed in the past that the presence of a separator is not
> enough to infer 'known-to-exist', so separators should not be brought into
> this definition. You are right that in a positional sequence the parser is
> looking for an occurrence of a component or its empty rep, and never an
> occurrence of the next schema component, so the parser can certainly
> optimise here. Let's take any discussion of separators out of this for the
> moment, and raise a separate action if needed.
>
> *Points of uncertainty can be nested. Any one of the following constructs
> is a potential point of uncertainty:*
>
> *1. An xs:choice*
> *2. All xs:elements in an unordered xs:sequence (dfdl:sequenceKind is
> 'unordered')*
> *3. An optional xs:element*
> *4. An array xs:element.*
> *5. All xs:elements in an xs:sequence containing one or more floating
> xs:elements.*
>
> 1. should say 'A member of an xs:choice' because it is the member, not the
> group itself, that is the point of uncertainty. I think the confusion has
> arisen because only one member of a choice group can exist in the data. So
> if any member exists, it automatically ends any speculation about the
> content of the choice group. But I insist that the real point of
> uncertainty is the member. A choice group is always 'known to exist'
> because according to DFDL rules it must have minOccurs=maxOccurs=1. FWIW, I
> have no problem with talking about 'resolving a choice', provided that we
> define that as 'Determining which member of a choice group ( if any ) is
> known to exist in the data'.
> SMH: I agree that it should say member.
>
> 2. Should say 'All members of an unordered xs:sequence' to keep the
> language consistent with 1. The section on unordered groups clearly
> restricts members to elements only.
> SMH: No. Using 'xs:element' is consistent with optionals & arrays in 3 and
> 4, which are also always elements, so xs:element is more consistent.
>
> 3. See above - an optional element is not always a 'point of uncertainty'
> according to the literal definition that we are currently using.
> SMH: Right, but the bullets are defining *potential* PoUs, so it is
> correct as it stands.
>
> 4. Should say 'An optional occurrence of an array element, unless the
> separator properties make it a positional array and the occurrence is
> required in the data'
> SMH: No. All occurrences can be PoUs, it depends on OCK. And separators do
> not resolve PoUs as noted. This definition 4 is the one that is key for
> Action 248, which is ultimately what led to this discussion and what needs
> to be resolved. The question is whether 4 should say *a)* all arrays are
> potential PoUs as it does now, or *b)* just some arrays are potential
> PoUs depending on OCK. Whatever we choose, a discriminator within that
> array must not leak beyond the array as explained *below in bold red font*.
> I think a) is clearer and we can then make a general statement about
> discriminators not leaking outside of any potential PoU. If we adopt b)
> then we need a separate statement about discriminators and arrays, which
> seems more bitty.
>
> 5. Should say 'All members...' for consistency.
> SMH: See 3.
>
> regards,
>
> Tim Kimber,
> IBM Integration Bus Development (Industry Packs)
> Hursley, UK
> Internet:  kimbert at uk.ibm.com
> Tel. 01962-816742
> Internal tel. 37246742
>
>
>
>
> From:        Steve Hanson/UK/IBM at IBMGB
> To:        ,
> Date:        13/05/2014 10:28
> Subject:        [DFDL-WG] Fw: Action 248 (was Thoughts on a discriminator
> scenario)
> Sent by:        dfdl-wg-bounces at ogf.org
>
>
>
> This will be discussed on today's call. Please have a position on the
> paragraph below that ends 'What do others think?'
>
> Thanks
>
> Steve Hanson
> ----- Forwarded by Steve Hanson/UK/IBM on 13/05/2014 10:19 -----
>
> From:        Steve Hanson/UK/IBM
> To:        Tim Kimber/UK/IBM at IBMGB,
> Cc:        dfdl-wg at ogf.org
> Date:        30/04/2014 12:25
> Subject:        Re: [DFDL-WG] Action 248 (was Thoughts on a discriminator
> scenario)
>  ------------------------------
>
>
> Tim
>
> Responses below.
>
> Regards
>
> Steve Hanson
>
>
>
> From:        Tim Kimber/UK/IBM at IBMGB
> To:        dfdl-wg at ogf.org,
> Date:        11/04/2014 14:03
> Subject:        Re: [DFDL-WG] Action 248 (was Thoughts on a discriminator
> scenario)
> Sent by:        dfdl-wg-bounces at ogf.org
>  ------------------------------
>
>
>
> "2. If a potential point of uncertainty is *sometimes* an actual point
> of uncertainty (ock 'implicit') then a discriminator that applies to it
> will only ever resolve, or have no effect on, that point of uncertainty.
> It never has an effect on any enclosing point of uncertainty."
> This could be misinterpreted. The discriminator could evaluate to 'false'
> and thus cause the PoU to be resolved negatively (the component would be
> 'known not to exist').
>
> SMH: Agree, and I can improve the words here.
>
> 1. and 3. will both apply if an element with ock='fixed' appears as a
> choice branch. Is the PoU always an actual PoU or never?
>
> SMH: No. There are two independent points of uncertainty, the choice
> branch and the array.
>
> The wording of 3. reads very strangely. 'If a potential point of
> uncertainty is *never* an actual point of uncertainty' begs the question
> 'why is it even a *potential* point of uncertainty?'. The current wording
> follows from our definition of the term 'point of uncertainty':
> "
> *A point of uncertainty occurs in the data stream when there is more than
> one schema component that might occur at that point.*"
>
> *Points of uncertainty can be nested. Any one of the following constructs
> is a potential point of uncertainty:*
>
> *1. An xs:choice*
> *2. All xs:elements in an unordered xs:sequence (dfdl:sequenceKind is
> 'unordered')*
> *3. An optional xs:element*
> *4. An array xs:element.*
> *5. All xs:elements in an xs:sequence containing one or more floating
> xs:elements.*
> I think this definition is too broad. It forces us to discuss potential
> POUs that will never be actual POUs according to the first sentence.
>
> *SMH: Yes it does read a bit strangely, but there's a reason for this. If
> we said that ock 'fixed', 'expression' or 'stopValue' are never POUs then
> what does it mean if a discriminator is placed on such an element?  A
> discriminator gets evaluated for each occurrence of an array. For that
> reason we can not let a discriminator within an array leak beyond the array
> - regardless of whether it is a POU or not - otherwise what does that mean
> to enclosing POUs? So even if we said that ock 'fixed', 'expression' or
> 'stopValue' are never POUs we would still need the spec to state that a
> discriminator never leaks beyond them. I think it is clearer to say that a
> discriminator never leaks beyond a potential POU and keep the existing
> definition.  What do others think?*
>
> regards,
>
> Tim Kimber,
>
>
>
>
> From:        Steve Hanson/UK/IBM at IBMGB
> To:        dfdl-wg at ogf.org,
> Date:        11/04/2014 11:44
> Subject:        Re: [DFDL-WG] Action 248 (was Thoughts on a discriminator
> scenario)
> Sent by:        dfdl-wg-bounces at ogf.org
>  ------------------------------
>   *248*
> *Discriminators and potential points of uncertainty (Steve) *
> 28/1: Steve to write up a proposal to prevent a discriminator from
> behaving in a non-obvious manner when used with a potential point of
> uncertainty that turns out not to be an actual point of uncertainty.
> 5/2: Steve sent an email to check whether choice branches, unordered
> elements and floating elements should always be actual points of
> uncertainty, as there are times when there is no uncertainty, eg, last
> choice branch; all floating elements found. It was decided that they are
> always actual points of uncertainty. To do otherwise will complicate
> implementations and result in fragile schemas. Steve will proceed with the
> proposal on that basis.
>
> Based on the above, which reflects the email discussion below, here is
> what I propose to resolve this action.
> 1.        If a potential point of uncertainty is *always* an actual point
> of uncertainty (choice branch, element in unordered sequence, floating
> element, ock 'parsed') then a discriminator that applies to it will only
> ever resolve that point of uncertainty. It never has an effect on any
> enclosing point of uncertainty.
> 2.        If a potential point of uncertainty is *sometimes* an actual
> point of uncertainty (ock 'implicit') then a discriminator that applies to
> it will only ever resolve, or have no effect on, that point of uncertainty.
> It never has an effect on any enclosing point of uncertainty.
> 3.        If a potential point of uncertainty is *never* an actual point
> of uncertainty (ock 'fixed', 'expression', 'stopValue') then a
> discriminator that applies to it will never have an effect on that point of
> uncertainty. Nor does it ever have an effect on any enclosing point of
> uncertainty.
> I think 1 and 2 are not controversial, but there is an alternative for 3:
>
>  3.   If a potential point of uncertainty is *never* an actual point of
> uncertainty (ock 'fixed', 'expression', 'stopValue') then a discriminator
> that applies to it will never have an effect on that point of uncertainty.
> Instead the discriminator is applied to any enclosing point of uncertainty.
>
> The alternative means that changing an element from (say) ock 'parsed' to
> ock 'expression' has the same effect on a discriminator as changing the
> element to (1,1). The discriminator that applied to it now applies to any
> enclosing pou.
>
> SMH: Afternote: The alternative does not work for the reason given in my
> reply to Tim above.
>
> Regards
>
> Steve Hanson
>
>
>
> From:        Steve Hanson/UK/IBM
> To:        Tim Kimber/UK/IBM at IBMGB,
> Cc:        dfdl-wg at ogf.org, dfdl-wg-bounces at ogf.org
> Date:        05/02/2014 12:04
> Subject:        Re: [DFDL-WG] Action 248 (was Thoughts on a discriminator
> scenario)
>  ------------------------------
>
>
> Thanks Tim, all good points. Comments to your comments.
>
> Regards
>
> Steve Hanson
>
>
>
>
> From:        Tim Kimber/UK/IBM
> To:        Steve Hanson/UK/IBM at IBMGB,
> Cc:        dfdl-wg at ogf.org, dfdl-wg-bounces at ogf.org
> Date:        05/02/2014 11:01
> Subject:        Re: [DFDL-WG] Action 248 (was Thoughts on a discriminator
> scenario)
>  ------------------------------
>
>
> A couple of comments below.
>
> regards,
>
> Tim Kimber,
>
>
>
>
>
> From:        Steve Hanson/UK/IBM at IBMGB
> To:        dfdl-wg at ogf.org,
> Date:        05/02/2014 10:50
> Subject:        [DFDL-WG] Action 248 (was Thoughts on a discriminator
> scenario)
> Sent by:        dfdl-wg-bounces at ogf.org
>  ------------------------------
>   *248*
> *Discriminators and potential points of uncertainty (Steve) *
> 28/1: Steve to write up a proposal to prevent a discriminator from
> behaving in a non-obvious manner when used with a potential point of
> uncertainty that turns out not to be an actual point of uncertainty.
> 5/2: With Steve
>
> I started on this by reading section 9.3.3 on points of uncertainty, which
> lists the potential PoUs. Here's the list to save getting the spec out.
>
> 1.        An xs:choice branch
>
> 2.        All xs:elements in an unordered xs:sequence (dfdl:sequenceKind
> is 'unordered')
>
> 3.        An optional xs:element
>
> 4.        An array xs:element
>
> 5.        All xs:elements in an xs:sequence containing one or more
> floating xs:elements.
>
> The section then looks at each in turn and gives the circumstances when it
> is an actual PoU or not. As currently written, it is only 3 and 4 where a
> potential PoU might not be an actual PoU. For 1, 2 and 5 it says they are
> always actual PoUs.
>
> But I'm not sure that's correct. A deeper analysis of what is actually
> going on with 1, 2 and 5 says to me that there are times when there might
> not be an actual PoU.
>
> 1. Given that there is no concept in DFDL of optional choice branches,
> then if the last branch is reached then there is no longer a PoU. It must
> be that branch else it is a processing error.
>
> TK: I think of it slightly differently. It is a PoU, even if the branch is
> the only remaining branch. If we say that the final choice branch is not a
> PoU then diagnostics become confused - the parser reports the error code as
> 'error while parsing root/choice/lastBranch/field1' when the correct error
> code would be 'none of the branches of root/choice were found in the data'.
>
> SMH: I see your point. My thinking was that choices have finite branches
> and a choice is (1,1). If I have got to the last branch then I am not one
> of the other branches so I *must* be this one. If there is any other
> possibility then the model is missing a branch, even if it is just one that
> contains an empty sequence with an assert {fn:false()}. In practice of
> course users forget to add that last branch (there's no XSDL equivalent to
> the 'default' branch of a switch/case statement), so yes they could end up
> with an unclear diagnostic.
>
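
The catch-all branch mentioned above (an empty sequence with an assert of
{fn:false()}) might be sketched as follows. The names root, branch1 and
branch2 and the initiators are hypothetical, the message text simply reuses
the wording of the preferred diagnostic quoted above, and other DFDL
properties are assumed to come from a default format that is not shown.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:fn="http://www.w3.org/2005/xpath-functions"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <xs:element name="root">
    <xs:complexType>
      <xs:choice>
        <!-- Hypothetical real branches. -->
        <xs:element name="branch1" type="xs:string" dfdl:initiator="1:" />
        <xs:element name="branch2" type="xs:string" dfdl:initiator="2:" />
        <!-- Last branch: an empty sequence whose assert always fails,
             standing in for the missing 'default' case. -->
        <xs:sequence>
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:assert test="{ fn:false() }"
                message="none of the branches of root/choice were found in the data" />
            </xs:appinfo>
          </xs:annotation>
        </xs:sequence>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```
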
> 2. There can come a point in an unordered sequence when all that can be
> encountered is one element, and if that is (1,1) then there is no longer a
> PoU.
>
> TK: It's still a PoU. The specification says that occursCountKind is
> 'parsed' for all members of an unordered group, so min/maxOccurs do not
> come into play.
>
> SMH: Interesting. The spec says that if a member is optional or an array
> then it must be 'parsed'. If it is (1,1) though it does not have an
> occursCountKind. The specific case I was thinking of is when *all*
> members are (1,1), so when you have one element to go there is no PoU.
> However, the rewrite into a repeating choice has the effect of making
> *everything* 'parsed', which is really the point you are making. So I
> agree with you, it is easier to say that everything is an actual PoU else
> it complicates the rewrite semantic.
>
> 5. If all floating elements are (1,1) and all are encountered, then from
> that point on there are no longer any PoUs due to floating elements.
>
> TK: I suspect that floating elements are somewhat like unordered branches
> - most users will not want min/maxOccurs to affect the parsing of the
> group. Schema validation ( or more complex validation applied in the
> receiving application ) will deal with non-conformances.
>
> SMH: Possibly yes. With something like X12 NTE segments, that is the case.
> But we don't express the floating semantic as a rewrite of the whole
> sequence like we do for unordered, it's more of a per element thing. And if
> that is done dynamically as we go through the sequence, having no PoU can
> result.
>
> I'd like us to get straight on this before I proceed with the action
> proper.
>
> Regards
>
> Steve Hanson
> ----- Forwarded by Steve Hanson/UK/IBM on 05/02/2014 10:12 -----
>
> From:        Steve Hanson/UK/IBM
> To:        dfdl-wg at ogf.org,
> Date:        27/01/2014 17:39
> Subject:        Fw: Thoughts on a discriminator scenario
>  ------------------------------
>
>
> Been thinking some more on the discriminator scenario below that I mailed
> out before xmas, and discussing it with the IBM DFDL team.
>
> The 'confusing' aspect of the behaviour is that a discriminator within a
> potential PoU will act on a higher level PoU if the potential PoU is not an
> actual PoU. In the example, the array element 'Type1' is not an actual PoU
> for occurrence 1, only for occurrences 2+. So when the discriminator fires
> for occurrence 1 it will resolve a higher level unresolved PoU if one
> exists.
>
> Perhaps the spec should say that a discriminator can't 'leak' beyond the
> potential PoU that encloses it ? If so, then for occurrence 1 the
> discriminator has no effect, and only has an effect for occurrences 2+.
> This makes for more predictable and robust schemas.
>
> We'd need to go through spec section 9.3.3 carefully to see if this does
> not break any of the potential PoUs that are listed.
>
> Regards
>
> Steve Hanson
> Architect, IBM Data Format Description Language (DFDL)
> Co-Chair, *OGF DFDL Working Group* <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> *smh at uk.ibm.com* <smh at uk.ibm.com>
> tel:+44-1962-815848
> ----- Forwarded by Steve Hanson/UK/IBM on 16/01/2014 09:55 -----
>
> From:        Steve Hanson/UK/IBM
> To:        dfdl-wg at ogf.org,
> Date:        20/12/2013 13:20
> Subject:        Thoughts on a discriminator scenario
>  ------------------------------
>
>
> Take the following schema (simplified) for element Type1 (1,10) being a
> loop for elements A, B, C. Type1 does not have an initiator so I need to
> use a discriminator to establish the existence of an occurrence of Type1 so
> that incorrect backtracking does not occur after an error. Because
> occursCountKind is 'implicit', the 1st occurrence is *not* a point of
> uncertainty so the discriminator acts instead on any enclosing point of
> uncertainty, but for 2nd and subsequent occurrences it acts on Type1. That
> is all working as designed, but I think users will find the 1st occurrence
> behaviour a bit confusing. There are workarounds to avoid the problem, eg,
> use occursCountKind 'parsed' or split Type1 into two as (1,1) and (0,9). I
> think this is worth documenting in a tutorial as this is quite subtle stuff.
>
>     <xs:element name="Type1" maxOccurs="10" dfdl:occursCountKind="implicit">
>         <dfdl:discriminator test="{fn:exists(A)}" />
>         <xs:complexType>
>             <xs:sequence>
>                 <xs:element name="A" dfdl:initiator="A:" ... />
>                 <xs:element name="B" dfdl:initiator="B:" ... />
>                 <xs:element name="C" dfdl:initiator="C:" ... />
>             </xs:sequence>
>         </xs:complexType>
>     </xs:element>
>
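
The second workaround mentioned above, splitting Type1 into (1,1) and (0,9),
could be sketched roughly as below. This is only one possible reading: the
names loop, Type1Type, Type1First and Type1Rest are hypothetical, the
children are given type xs:string purely for illustration, and placing the
discriminator only on the (0,9) element is an assumption rather than
something the mail spells out. Other DFDL properties are assumed to come
from a default format that is not shown.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:fn="http://www.w3.org/2005/xpath-functions"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <!-- Hypothetical named type so both declarations share one model. -->
  <xs:complexType name="Type1Type">
    <xs:sequence>
      <xs:element name="A" type="xs:string" dfdl:initiator="A:" />
      <xs:element name="B" type="xs:string" dfdl:initiator="B:" />
      <xs:element name="C" type="xs:string" dfdl:initiator="C:" />
    </xs:sequence>
  </xs:complexType>

  <xs:element name="loop">
    <xs:complexType>
      <xs:sequence>
        <!-- (1,1): required and not an array, so it is not a potential
             point of uncertainty and carries no discriminator here. -->
        <xs:element name="Type1First" type="Type1Type" />
        <!-- (0,9): with minOccurs 0 and occursCountKind 'implicit',
             every occurrence is an actual point of uncertainty. -->
        <xs:element name="Type1Rest" type="Type1Type"
                    minOccurs="0" maxOccurs="9"
                    dfdl:occursCountKind="implicit">
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:discriminator test="{ fn:exists(./A) }" />
            </xs:appinfo>
          </xs:annotation>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The cost of this shape is that the first occurrence gets a different element
name in the infoset, which is one reason the occursCountKind 'parsed'
workaround may be preferable in some schemas.
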
>
> Regards
>
> Steve Hanson
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>
> --
> dfdl-wg mailing list
> dfdl-wg at ogf.org
> *https://www.ogf.org/mailman/listinfo/dfdl-wg*
> <https://www.ogf.org/mailman/listinfo/dfdl-wg>