[DFDL-WG] Direct dispatch choice clarifications

Mike Beckerle mbeckerle.dfdl at gmail.com
Fri Aug 16 13:54:43 EDT 2013


I am ok with leaving the name elementID as is purely due to our schedule.
However, I really do think it would be good to change it now.

I think it is that name which is creating all the confusion, and we really
ought to be more careful.

I think the important thing is to stop thinking of elementID as any sort of
XSD/DFDL language ID. It is not. It is a DFDL String Literal, meaning its
value is matched against data.

For example, it could be dfdl:elementID="%NUL;%NUL;%NUL;%NUL;", meaning 4
NUL characters. Then when choiceBranchRef="{ ../hdr/tag }" evaluates to
a 4-character string and all four characters are NUL, the branch is taken
by direct dispatch. The NUL character isn't allowed in any namespaced
identifier in XSD or DFDL. It isn't even allowed in XML.
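
To make the string-literal point concrete, here is a minimal Python sketch of
how an elementID such as "%NUL;%NUL;%NUL;%NUL;" might be expanded and then
compared with the string returned by choiceBranchRef. This is not taken from
any DFDL implementation: the helper names expand_literal and dispatch are
hypothetical, only a handful of character entities are handled, and the
branch names are made up for illustration.

```python
import re
from typing import Dict, Optional

# Small, illustrative subset of DFDL character entities (assumption: a real
# processor supports many more, plus rules this sketch ignores).
NAMED_ENTITIES = {"NUL": "\x00", "SP": " ", "CR": "\r", "LF": "\n", "HT": "\t"}

# Matches %#xHH; (hex code point), %#DD; (decimal code point) or %NAME;
_ENTITY = re.compile(r"%(?:#x([0-9A-Fa-f]+)|#([0-9]+)|([A-Z0-9]+));")


def expand_literal(literal: str) -> str:
    """Expand entities in a string-literal-like value (hypothetical helper)."""
    def repl(m):
        hex_digits, dec_digits, name = m.groups()
        if hex_digits is not None:
            return chr(int(hex_digits, 16))
        if dec_digits is not None:
            return chr(int(dec_digits))
        return NAMED_ENTITIES[name]  # KeyError for entities outside the subset
    return _ENTITY.sub(repl, literal)


def dispatch(tag_value: str, branches: Dict[str, str]) -> Optional[str]:
    """Return the branch whose expanded elementID equals tag_value, else None.

    `branches` maps a branch label to its elementID literal. The comparison is
    plain string equality: no namespaces or QNames are involved.
    """
    for branch, element_id in branches.items():
        if expand_literal(element_id) == tag_value:
            return branch
    return None


branches = {"emptyRecord": "%NUL;%NUL;%NUL;%NUL;", "typeA": "AAAA"}
selected = dispatch("\x00" * 4, branches)  # tag as read from ../hdr/tag
```

The lookup is a plain string comparison against data, which is why asking
what namespace the value lives in has no meaningful answer.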

If I had dfdl:terminator="%NUL;" you wouldn't ask "what namespace is that
%NUL; in?"

So I think elementID is not, in any way, a namespace-qualified identifier
any more than a delimiter is.

So I think uniqueness of the elementID within the alternatives of a choice
should be the only requirement. This stuff about unique within a namespace
if on a global element should be dropped.
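
As a rough illustration of "unique within the alternatives of a choice", the
following Python sketch checks one choice at a time, using the rules discussed
in this thread (an elementID on an element ref overrides the one on the global
element, and every branch needs a non-empty elementID). The Branch type and
the function names are hypothetical, not anything from the spec.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional


class Branch(NamedTuple):
    name: str                     # label for the branch (illustrative only)
    global_id: Optional[str]      # elementID on the global declaration, if any
    ref_id: Optional[str] = None  # elementID on the element ref, if any


def effective_id(b: Branch) -> Optional[str]:
    # An elementID on the element ref overrides the one on the global element.
    return b.ref_id if b.ref_id is not None else b.global_id


def check_choice(branches: List[Branch]) -> List[str]:
    """Return error messages for one choice; an empty list means it is fine."""
    errors: List[str] = []
    seen: Dict[str, List[str]] = defaultdict(list)
    for b in branches:
        eid = effective_id(b)
        if not eid:  # missing or empty string
            errors.append(f"branch {b.name} has no elementID")
        else:
            seen[eid].append(b.name)
    for eid, names in seen.items():
        if len(names) > 1:
            errors.append(f"elementID {eid!r} shared by {', '.join(names)}")
    return errors


# Two element refs that inherit the same elementID clash within one choice...
clash = check_choice([Branch("TNS1:E1", "E1_ID"), Branch("TNS2:E1", "E1_ID")])
# ...unless one of the refs overrides it.
fixed = check_choice([Branch("TNS1:E1", "E1_ID"),
                      Branch("TNS2:E1", "E1_ID", ref_id="E1_B")])
```

Nothing here looks at other choices or at the set of global elements, so two
global declarations with the same elementID only conflict when they are used
from the same choice.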

The other argument for changing the name from elementID to something more
choice-dispatch specific is that this workgroup seems to be speculating
about using elementID for some sort of future wildcard-oriented feature. I
think we should reserve names we want for that future by NOT using them
now. If we use them now, we lock down part of the semantics in a way which
may just not work properly with some future addition to DFDL. Do we really
want elementID, which is about direct dispatch choices but is not an
identifier, and then in the future have dfdl:wildcardElementID, which is
about wildcards and IS an identifier? That's a mess.

If you think elementID is a really good name for some future wildcard
feature, then you should be advocating to NOT use it now for direct
dispatch choices. That is, unless you have a complete design in mind for
the wildcard stuff and are confident that elementID can be overloaded in a
backward compatible way. I've seen nothing like this articulated.

...mike



On Fri, Aug 16, 2013 at 12:32 PM, Steve Hanson <smh at uk.ibm.com> wrote:

> IBM has started to implement choiceBranchRef, in so far as the XML schemas
> for DFDL have been updated to include elementID and we have regenerated all
> our model code. I'd like to stick with elementID please.
>
> Regards
>
> Steve Hanson
> Architect, IBM Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel:+44-1962-815848
>
>
>
> From:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
> To:        Steve Hanson/UK/IBM at IBMGB,
> Cc:        Tim Kimber/UK/IBM at IBMGB, "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>, "
> dfdl-wg-bounces at ogf.org" <dfdl-wg-bounces at ogf.org>
> Date:        16/08/2013 17:09
> Subject:        Re: [DFDL-WG] Direct dispatch choice clarifications
> ------------------------------
>
>
>
> I think we need to stay out of identifiers and namespaces and qnames for
> this feature.
>
> The elementID should have to be locally unique within the alternatives of
> the choice that is dispatching to it. If that elementID lives on a global
> element declaration, that means nothing at all on its own. In the context
> of a choice that has an element ref to that global element, it has to be
> unique within the arms of that choice.
>
> This means it is possible to have two global element decls which have
> elementID="X", and there is no conflict unless they are both used via
> element refs from the same choice.
>
> I think this handles every use case I know of. The "namespace" requirement
> for elementID is only "unique within alternatives of a choice".
>
> Since nobody has implemented this feature yet, I would posit that we
> should change elementID to something less suggestive of an XSD
> identifier/QName-ish thing. Such as choiceDispatchTag.
>
> ...mike
>
>
>
> On Fri, Aug 16, 2013 at 5:18 AM, Steve Hanson <smh at uk.ibm.com> wrote:
> You are right about the xs:QName constructor, it takes a string which is
> prefix plus name. If we supplied a dfdl:QName constructor that took a URI
> and a name, that would simplify things.
>
> For example, the choiceBranchRef expression for SWIFT would be as below.
> (The element name is always Document and the namespace is used to
> distinguish the different messages).
>
> {dfdl:QName(fn:concat(fn:concat('urn:swift:xsd:fin.',
> /FinMessage/Block2/MessageType), ".2011"), 'Document')}
>
> If we stick with elementID for DFDL 1.0, then I agree with your 3 bullets,
> but not the defaulting to element name as it is setting a behaviour that
> may not be where we want to go for 2.0.
>
> Regards
>
> Steve Hanson
> Architect, IBM Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel:+44-1962-815848
>
>
>
> From:        Tim Kimber/UK/IBM at IBMGB
> To:        dfdl-wg at ogf.org
> Date:        15/08/2013 20:34
> Subject:        Re: [DFDL-WG] Direct dispatch choice clarifications
> Sent by:        dfdl-wg-bounces at ogf.org
>  ------------------------------
>
>
>
> The other problem with using a QName is that it involves using namespace
> prefixes. That means that there needs to be a mapping between prefixes and
> namespace URIs. I can see that getting very problematic if the choice group
> is located in a different xsd from the global elements that it references.
>
> I think we should
> - keep the elementID as a simple string
> - insist that all branches of a choice have different elementIDs
> - remove the global uniqueness constraint, for the reasons explained in
> this email chain
>
> I think it would be easier for modellers if the elementID defaulted to the
> local name of the element. I understand that name clashes can, in
> principle, occur. If users want to avoid that then they can be explicit
> about elementIDs and they could even define a naming convention for their
> elementIDs to make them look very much like QNames. Sounds like a lot of
> work, but DFDL models that are complex enough to need that approach will
> often be code-genned anyway.
>
> regards,
>
> Tim Kimber, DFDL Team,
> Hursley, UK
> Internet:  kimbert at uk.ibm.com
> Tel. 01962-816742
> Internal tel. 37246742
>
>
>
>
> From:        Steve Hanson/UK/IBM at IBMGB
> To:        dfdl-wg at ogf.org
> Date:        15/08/2013 18:37
> Subject:        Re: [DFDL-WG] Direct dispatch choice clarifications
> Sent by:        dfdl-wg-bounces at ogf.org
>  ------------------------------
>
>
>
> The more I think it through, the more I see the use of a string elementID
> (or local element name) causing problems when/if we extend to support
> xs:any in the future.  In my original proposal for direct dispatch choice I
> proposed that choiceBranchRef returned a QName which therefore
> automatically selected the element, and coped with any namespace issues.
> The problem with QName though is that the expression to build it can become
> a big case statement, negating some of the performance gain, if there is no
> automatic way of mapping from the 'tag' to the QName. That is why we
> introduced elementID.
>
> Regards
>
> Steve Hanson
> Architect, IBM Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel:+44-1962-815848
>
>
>
> From:        Steve Hanson/UK/IBM
> To:        Suman Kalia <kalia at ca.ibm.com>
> Cc:        dfdl-wg at ogf.org, dfdl-wg-bounces at ogf.org,
> Tim Kimber/UK/IBM at IBMGB
> Date:        15/08/2013 17:20
> Subject:        Re: [DFDL-WG] Direct dispatch choice clarifications
>  ------------------------------
>
>
> Suman, comments to yours in pink
>
> Regards
>
> Steve Hanson
> Architect, IBM Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel:+44-1962-815848
>
>
>
>
> From:        Suman Kalia <kalia at ca.ibm.com>
> To:        Tim Kimber/UK/IBM at IBMGB,
> Cc:        dfdl-wg at ogf.org, dfdl-wg-bounces at ogf.org
> Date:        15/08/2013 15:31
> Subject:        Re: [DFDL-WG] Direct dispatch choice clarifications
> Sent by:        dfdl-wg-bounces at ogf.org
>  ------------------------------
>
>
>
> comments in green
>
> Suman Kalia
> IBM Canada Lab
> WMB Toolkit Architect and Development Lead
> Tel: 905-413-3923 T/L 313-3923
> Email: kalia at ca.ibm.com
>
> For info on Message broker
> http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.html
>
>
>
>
>
> From:        Tim Kimber <KIMBERT at uk.ibm.com>
> To:        dfdl-wg at ogf.org
> Date:        08/15/2013 09:58 AM
> Subject:        Re: [DFDL-WG] Direct dispatch choice clarifications
> Sent by:        dfdl-wg-bounces at ogf.org
>  ------------------------------
>
>
>
> See comment in <TK> tags.
>
> regards,
>
> Tim Kimber, DFDL Team,
> Hursley, UK
> Internet:  kimbert at uk.ibm.com
> Tel. 01962-816742
> Internal tel. 37246742
>
>
>
>
> From:        Steve Hanson/UK/IBM at IBMGB
> To:        dfdl-wg at ogf.org
> Date:        15/08/2013 12:57
> Subject:        [DFDL-WG] Direct dispatch choice clarifications
> Sent by:        dfdl-wg-bounces at ogf.org
>  ------------------------------
>
>
>
> Looking at this in more detail prior to writing up behaviour for section
> 9, there are a couple of things missing from the spec or that need
> clarification:
>
> 1) Description of elementID property should say that empty string is not
> allowed (this was in the erratum).
>
> 2) Should say that an elementID on an elementRef overrides any elementID
> on the global element (this was in the erratum).
>
> 3) Section 15.1.2 says that it is a schema definition error if the elementID
> values of global elements are not unique within a given namespace. I don't
> see where namespace comes into this, the elementID is just a string so
> surely it needs to be unique across namespaces? (Strictly elementID needs
> only to be unique across the global elements involved in each specific
> choice, but it was minuted that *global* uniqueness was desirable to
> allow future xs:any support).
> <TK>
> In XML Schema, an xs:any does not, in general, match all global elements.
> The 'namespace' attribute can narrow the set to elements from a specified
> list of namespaces. There is no way in XML Schema 1.0 to further narrow the
> xs:any. So the rule is designed to ensure that future usage of xs:any *when
> a single namespace is specified and processContents!='skip'* does not
> throw up schema definition errors. However, I note that XML Schema 1.1
> allows a new way to narrow the scope of an xs:any (by specifying a list of
> not-included QNames). My feeling is that the unique-within-namespace check
> is fragile.
> </TK>
>
> <SK>
> As per my recollection, we put the uniqueness rules across namespaces to
> accommodate chameleon namespaces. Consider a global element E1 in no
> target namespace, with elementID E1_ID, that is included in 2 schemas with
> different target namespaces, say TNS1 and TNS2. Consider a choice
> containing element references TNS1:E1 and TNS2:E1. In order to
> disambiguate these elements in the context of the choice, the elementID has
> to be unique in the context of the namespace. This is somewhat of an edge
> case but can become more prevalent when support for xs:any is provided.
> </SK>
>
> SMH: In the choice, the element refs to TNS1:E1 and TNS2:E1 *both* have
> an elementID string 'E1_ID' from the original E1 global element. In the
> choice, this is an error because the elementID is not unique in the choice
> (we match the result of the choiceBranchRef expression, which returns a
> string not a QName). The only way round this is to override the elementID
> on one of the element refs (see 2 above) and set a value that is unique.
> That then works. But that does not help the (future) xs:any scenario, where
> there is no element ref to carry the override. I think the chameleon
> namespace scenario will always cause a problem with xs:any because our
> elementIDs are strings not QNames.
>
> I think we should leave a global element uniqueness check out of DFDL 1.0.
> It doesn't actually future proof anything, as once I use xs:any the whole
> nature of the xsd changes.
>
> 4) Spec does not explicitly say that when choiceBranchRef is present each
> branch of the choice must have an elementID. This must be the case, as
> otherwise a choice branch will never be accessible.
>
> 5) Tim has suggested that if an element was silent about elementID, the
> *local* name of the element could be used instead. So conceptually an
> element would have an 'effective elementID'.  This makes modelling easier
> if the 'tag' in the data is the same as the element name.
> <TK>...or if the element name is derivable from the 'tag' using a simple
> XPath expression</TK>   SMH: True.
> The validation checks would need to ensure that the set of 'effective
> elementIDs' was unique; for the global element check as currently specified
> (see 3) this would mean that all global elements must have unique local
> names, unless an elementID is carried - I think this is too limiting.
>
> SMH: While defaulting to the local name sounds attractive, I can't
> convince myself that it won't cause problems if we add xs:any in DFDL 2.0
> and multiple/chameleon namespaces are involved.
>
> SMH: Conclusion: For DFDL 1.0 we take the conservative position and say
> that you must specify an elementID on an element that is used in a choice
> with choiceBranchRef and it must be unique in the context of the choice
> only. No global uniqueness check is made.
>
>
> From minutes of 17th April 2012.
> 145
> Provide a 'dispatch' way of discriminating a choice for better
> performance of the envelope/payload use case (Steve, Mike, Suman)
> 12/7: See minutes. Need to choose a proposal and flesh out.
> 19/07:  Waiting for proposals
> 26/07:  Waiting for proposals
> 16/08: Waiting for proposals. Suman added to action.
> ...
> 1/11:  Steve to send a proposal
> ...
> 21/03: Steve has sent a proposal. Mike has sent a counter proposal. Steve
> to respond.
> 28/03: Steve has sent a revised proposal.  Review for discussion next
> week. Ensure proposal handles Mike's scenario where tag value to branch
> mapping is not 1-1.
> 05/04: Discussed Mike's review comments and Suman's concerns. Agreed that
> name should be elementID, should be a single DFDL String Literal value, and
> that matching of choiceBranchRef expression result should only be against
> elementID to avoid QName v String confusion. Steve to recirculate with a
> schema example.
> 17/04: Closed. Discussion on whether the choiceBranchRef expression
> should return xs:string or something else. Agreed on xs:string. Discussed
> whether elementID should be a pure xs:string or a DFDL String Literal. For
> consistency with other DFDL properties it should be a string literal, but
> raw byte entities and character classes should be disallowed to avoid
> complications. Discussed scope of uniqueness of elementIDs. Agreed that
> uniqueness is both local to a choice, and across all global elements in the
> same namespace (the latter is not strictly needed right now but
> accommodates any future addition of xs:any). Agreed that elementID should
> be on global element, local element, and element ref (in which case it
> overrides any elementID on the global element, which is ok as the property
> does not follow the usual scoping rules). Errata taken.
>
>
> Regards
>
> Steve Hanson
> Architect, IBM Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel:+44-1962-815848
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
> --
> dfdl-wg mailing list
> dfdl-wg at ogf.org
> https://www.ogf.org/mailman/listinfo/dfdl-wg
>
> --
> Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
> www.tresys.com <http://www.tresys.com/>
>



-- 
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com