[DFDL-WG] Action 269: clarification on dfdl:occursIndex() function

Mike Beckerle mbeckerle.dfdl at gmail.com
Thu Sep 4 13:28:49 EDT 2014


I understand the principle you are suggesting, but this is a DFDL-specific
function whose name shares a concept with DFDL properties.

It feels very inconsistent to say that scalar elements disregard
dfdl:occursCountKind and dfdl:occursCount, but at runtime they respond to
dfdl:occursIndex(...) with a value.

My view: where we are using XPath functions directly, keeping their
semantics makes some sense. Where we are introducing new dfdl functions I
believe the semantics should be as strict and narrow and schema-aware as
possible.
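
To make the contrast concrete, here is a minimal Python sketch (not Daffodil
code) of the strict reading of dfdl:occursIndex(N) proposed earlier in this
thread, next to the lenient one that masks the error by returning 1. The Node
class, the strict flag, and the exception names are all assumptions invented
for illustration.

```python
from dataclasses import dataclass
from typing import Optional


class SchemaDefinitionError(Exception):
    """Stands in for a DFDL schema definition error (SDE)."""


class ProcessingError(Exception):
    """Stands in for a DFDL processing error."""


@dataclass
class Node:
    # Hypothetical infoset node; is_array would come from the schema.
    name: str
    is_array: bool = False
    occurs_index: int = 1            # 1-based position within its array
    parent: Optional["Node"] = None


def occurs_index(node: Node, levels_up: int, strict: bool = True) -> int:
    """0 means the current element, 1 its parent, N its Nth parent."""
    if levels_up < 0:
        raise SchemaDefinitionError("argument must be non-negative")
    target = node
    for _ in range(levels_up):
        if target.parent is None:
            raise ProcessingError("argument reaches beyond the root")
        target = target.parent
    if not target.is_array:
        if strict:
            # Strict, schema-aware behavior: refuse the meaningless call.
            raise SchemaDefinitionError(f"{target.name} is not an array")
        # Lenient behavior: mask the mistake by answering 1.
        return 1
    return target.occurs_index


record = Node("record", is_array=True, occurs_index=3)
field = Node("field", parent=record)
print(occurs_index(field, 1))                # 3
print(occurs_index(field, 0, strict=False))  # 1
```

With strict=True, occurs_index(field, 0) raises instead of quietly returning
1, which is the behavior I am arguing for.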

My bias comes from discovering the hard way that XPath is full of
compromises that are justifiable only if you have to consider executing
XPath against documents for which no schema is available.

In DFDL we're never in that situation, so we can simply do much better at
helping users create correct expressions that do what is intended.

My favorite whipping boy is fn:nilled(...). In XPath, fn:nilled(...) always
returns a value or an empty node set, even when used in entirely nonsensical
ways. Consider fn:nilled($x), where x is a DFDL variable. DFDL variables can
only hold simple-type values, so they can never meaningfully be nilled. But
fn:nilled is defined to return an empty node list when called on a
non-element value, so technically fn:nilled($x) is a valid expression.

I can accept this because fn:nilled is already documented as having this
behavior. But these are miserable semantics when schema information is
available that can tell you where calling fn:nilled is meaningful and
where it is simply always going to produce an empty node list.

I'm reworking the Daffodil expression language implementation to replace
Saxon with our own. I'm planning some flags for expression-strictness
checking so that users can either have robust schema-aware semantics or
insist on XPath's gooey semantics.
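
As a rough illustration of what such a flag could look like, here is a Python
sketch of a compile-time check for fn:nilled. It is not Daffodil code; the
Strictness enum, the function name, and the returned strings are all
hypothetical, and the permissive branch simply mirrors the behavior described
above.

```python
from enum import Enum


class Strictness(Enum):
    SCHEMA_AWARE = "schema-aware"   # reject calls the schema shows are meaningless
    XPATH = "xpath"                 # keep the permissive XPath-style behavior


class SchemaDefinitionError(Exception):
    """Stands in for a DFDL schema definition error (SDE)."""


def check_nilled_call(arg_is_element: bool, mode: Strictness) -> str:
    """Decide, from schema information alone, how fn:nilled(arg) is treated."""
    if arg_is_element:
        # The call can be meaningful, so leave it to runtime evaluation.
        return "evaluate at runtime"
    if mode is Strictness.SCHEMA_AWARE:
        raise SchemaDefinitionError(
            "fn:nilled called on something that can never be nilled")
    # Permissive mode: accept the call even though it can only ever
    # produce an empty result.
    return "always empty"


# fn:nilled($x), where $x is a DFDL variable holding a simple-type value.
print(check_nilled_call(False, Strictness.XPATH))
```

Under Strictness.SCHEMA_AWARE the same call raises at schema compile time
instead of being accepted.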




Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
<http://www.ogf.org/About/abt_policies.php>



On Wed, Sep 3, 2014 at 4:28 PM, Tim Kimber <KIMBERT at uk.ibm.com> wrote:

> I think it would be better to allow dfdl:occursIndex() to be used on
> non-array elements. XML Schema and XPath do not distinguish between 'array'
> and 'non-array'. DFDL does, but only in the representation properties. I
> think the parts of the specification that build purely on XSDL and XPath
> should conform to the norms of those languages.
>
> regards,
>
> Tim Kimber,
> Technical Lead for IBM Integration Bus Healthcare Pack
> Hursley, UK
> Internet:  kimbert at uk.ibm.com
> Tel. 01962-816742
> Internal tel. 37246742
>
>
>
>
> From:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
> To:        Steve Hanson/UK/IBM at IBMGB
> Cc:        "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>, "dfdl-wg-bounces at ogf.org"
> <dfdl-wg-bounces at ogf.org>
> Date:        03/09/2014 18:21
> Subject:        Re: [DFDL-WG] Action 269: clarification on dfdl:occursIndex() function
> Sent by:        dfdl-wg-bounces at ogf.org
> ------------------------------
>
>
>
> I need some further clarification on this.
>
> Current description says "This function may be used on non-Array
> elements." In the old no-arg version, that meant that you didn't have to
> hang this directly on the array, but could be on some contained child
> element.
>
> I believe what we mean now is this:
>
> dfdl:occursIndex(0) = the current element itself must be an array (SDE
> otherwise), return its occurs index.
> dfdl:occursIndex(1) = the parent element must be an array (SDE otherwise),
> return its occurs index.
> ...
> dfdl:occursIndex(N) = the Nth-parent element must be an array (SDE
> otherwise), return its occurs index.
>
>  Specifically note that 0 is a valid argument (I think we just neglected
> this case in prior discussion).
>
> The alternative to SDE if the path is incorrect here is to mask the error
> by returning 1. But I don't think this is helpful really. At least I can't
> think of a use case for why this particular polymorphism would be useful.
> An SDE is the more conservative choice.
>
>
>
>
>
> On Wed, Sep 3, 2014 at 11:27 AM, Steve Hanson <smh at uk.ibm.com> wrote:
> Mark
>
> Yes, that makes more sense. The existing wording says:  dfdl:occursIndex() Returns
> the position of the current item within an array as an xs:long.
>
> The first element is at position 1.
>
> The function may be used on non-array elements.
>
>
>
>
> So I blindly used xs:long for the argument type as well.  On consideration
> the argument type should be xs:nonNegativeInteger and so should the return
> type.
>
> Could be xs:positiveInteger, but as you say it's not used anywhere in the
> spec (no XPath constructor, etc).
>
> Regards
>
> Steve Hanson
> Architect, *IBM DFDL*
> <http://www.ibm.com/developerworks/library/se-dfdl/index.html>
> Co-Chair, *OGF DFDL Working Group* <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> *smh at uk.ibm.com* <smh at uk.ibm.com>
> tel:*+44-1962-815848* <%2B44-1962-815848>
>
>
>
> From:        Mark Frost/UK/IBM
> To:        Steve Hanson/UK/IBM at IBMGB,
> Cc:        Mike Beckerle <mbeckerle.dfdl at gmail.com>, "dfdl-wg at ogf.org"
> <dfdl-wg at ogf.org>, dfdl-wg-bounces at ogf.org
> Date:        03/09/2014 13:35
> Subject:        Re: [DFDL-WG] Action 269: clarification on dfdl:occursIndex() function
>  ------------------------------
>
>
> Would a parameter type xs:positiveInteger (right value space) or
>  xs:nonNegativeInteger (less wrong value space, and already used in DFDL)
>  be more appropriate?
>
> Regards,
> Mark
> ------------------------------
> Mark Frost
> Software Engineer, IBM DFDL, IBM Integration Bus
> IBM United Kingdom, Hursley Park, Winchester SO21 2JN, England
> Phone: +44 (0)1962 817009
> e-mail: frostmar at uk.ibm.com
>
>
>
>
>
>
> From:        Steve Hanson/UK/IBM at IBMGB
> To:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
> Cc:        "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>, dfdl-wg-bounces at ogf.org
> Date:        03/09/2014 13:16
> Subject:        Re: [DFDL-WG] Action 269: clarification on dfdl:occursIndex() function
> Sent by:        dfdl-wg-bounces at ogf.org
>  ------------------------------
>
>
>
> Closed.  After discussion, settled on an xs:long argument being a 1-based
> count up the parent axis. It is a processing error if the argument reaches
> beyond the root. It is a schema definition error if the argument is <= 0.
> Erratum 2.167 created. To be included in draft r23.
>
> Regards
>
> Steve Hanson
>
>
>
> From:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
> To:        "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>,
> Date:        02/09/2014 15:21
> Subject:        Re: [DFDL-WG] Action 269: clarification on dfdl:occursIndex() function
> Sent by:        dfdl-wg-bounces at ogf.org
>  ------------------------------
>
>
>
> I am currently implementing functionality in the area where
> dfdl:occursIndex( path ) will be in the Daffodil implementation.
>
> It doesn't look very hard to put in a specific argument that is restricted
> to be upward only and which identifies exactly which array.
>
> This function is a special case in any DFDL runtime no matter what, and
> will require a fair bit of test attention for any implementation so making
> it simpler by allowing only the innermost array to be examined isn't saving
> a lot of work really.
>
> Question: is a crazy path like ../foo/../../bar/.././.. (which just so
> happens to be equivalent to ../../..), a schema definition error?
>
>
>
>
>
> On Tue, Sep 2, 2014 at 9:28 AM, Mike Beckerle <mbeckerle at tresys.com> wrote:
>
> ________________________________
> From: Steve Hanson [smh at uk.ibm.com]
> Sent: Monday, September 01, 2014 11:08 AM
> To: Mike Beckerle
> Subject: Fw: [DFDL-WG] Action 269: clarification on dfdl:occursIndex() function
>
> 269
>        dfdl:occursIndex() function (Steve)
> 29/7: Clarify behaviour and decide whether an argument is needed to make
> it context independent. Noted that fn:position() also does not take an
> argument, but it returns position in current sequence and not position in
> array.
> 4/8: With Steve but consensus is that an argument is needed.
> 26/8: Mike to review. If an argument is needed, we should make sure this
> is in next published spec as the function is used in MIL-STD-2045 schemas.
>
> Regards
>
> Steve Hanson
> ----- Forwarded by Steve Hanson/UK/IBM on 01/09/2014 16:03 -----
>
> From:        Steve Hanson/UK/IBM
> To:        dfdl-wg at ogf.org,
> Cc:        dfdl-wg-bounces at ogf.org
> Date:        31/07/2014 10:57
> Subject:        Re: [DFDL-WG] clarification on dfdl:occursIndex() function
>
> ________________________________
>
>
> Note:  The reason why fn:position() takes no argument is that it gives the
> position in the current sequence, which is also why we have a separate
> DFDL-specific function, dfdl:occursIndex(), intended to give the position
> in the enclosing array. I think we need an argument that targets the
> specific enclosing array.
>
> Regards
>
> Steve Hanson
>
>
>
> From:        Tim Kimber/UK/IBM at IBMGB
> To:        dfdl-wg at ogf.org,
> Date:        28/07/2014 23:51
> Subject:        Re: [DFDL-WG] clarification on dfdl:occursIndex() function
> Sent by:        dfdl-wg-bounces at ogf.org
> ________________________________
>
>
>
> I agree that it is inconvenient for the 'nearest array parent' to be
> inaccessible. However, experience with discriminators makes me fearful of
> any rule that includes the phrase 'nearest enclosing' :-)
> I think at least one other DFDL function allows the target of the function
> to be specified as an argument, but insists that the argument must be in
> the dynamic scope of the element (i.e. its parent/grandparent etc.). I
> would be much happier with that solution for dfdl:occursIndex().
>
> I can think of a use case where it may be useful to get a consistent
> behaviour for dfdl:occursIndex(). If a DFDL schema is generated from some
> other data format description then the model generation code may want to
> refer to the nth occurrence of something else in the model, where n is the
> occurs index of the current element - regardless of whether this particular
> element is an array.
>
> regards,
>
> Tim Kimber,
>
>
>
>
> From:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
> To:        "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>
> Date:        28/07/2014 23:27
> Subject:        [DFDL-WG] clarification on dfdl:occursIndex() function
> Sent by:        dfdl-wg-bounces at ogf.org
> ________________________________
>
>
>
> This function says it can be called on non-array elements. However, it
> does not say what the result is.
>
> If called when "." is not itself an array element there are only two
> possible behaviors consistent with the fact that it is explicitly allowed
> on non-array elements.
>
> The result has to be either
>
> (a) 1
> (b) the occursIndex of the nearest enclosing array parent, or 1 if there
> is no enclosing array parent.
>
> I claim (a) is fairly pointless. You will just end up having to create
> newVariableInstances to carry the array current index downward into
> expressions.
>
> I cannot think of a use case where one would want to call occursIndex()
> polymorphically, i.e., where you want a number in the case of an array, but
> 1 otherwise.
>
> So (b) is the preferable behavior.
>
>
> --
> dfdl-wg mailing list
> dfdl-wg at ogf.org
> https://www.ogf.org/mailman/listinfo/dfdl-wg
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
> 3AU