[DFDL-WG] clarification - argument expression to fn:count() function

Mike Beckerle mbeckerle at apache.org
Tue Jan 9 12:57:39 PST 2024


The description of fn:count in section 18.5.2.5 says fn:count() can be
called on a node-sequence. It does not seem to require an array/optional
path. This seems to suggest that if you have, say:

<element name="b" type="xs:int"/>
<element name="a" type="xs:int"/>
<element name="b" type="xs:int" minOccurs="0"/>

then one could write fn:count(b) and it should return 1 or 2. The Daffodil
project calls this a "query-style" expression, as opposed to just a basic
path expression.

Furthermore, the first paragraph of section 18.5.2.5 says "(Note that
DFDL v1.0 does not support sequences of length > 1 as the final results of
expressions.)" which suggests that some expressions can legitimately
return node sequences of length > 1 as intermediate results. Presumably
fn:count(b) above is such a situation, as the expression 'b' can return 2
nodes.

All of the above suggests to me that fn:count(a) would be legal where 'a'
is a scalar. The result would of course always be 1.
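
To make the query-style reading concrete, here is a minimal Python sketch.
It is not DFDL and not how Daffodil evaluates expressions; it only treats
the path step as a query over same-named children of one parent. The
infoset documents, the wrapper element name 'r', and the helper
count_children are all made up for illustration.

```python
import xml.etree.ElementTree as ET

def count_children(parent, name):
    """Query-style count: every direct child of parent called name."""
    return len(parent.findall(name))

# Hypothetical infosets for the three declarations above (b, a, optional b).
with_optional_b = ET.fromstring("<r><b>1</b><a>2</a><b>3</b></r>")
without_optional_b = ET.fromstring("<r><b>1</b><a>2</a></r>")

print(count_children(with_optional_b, "b"))     # 2
print(count_children(without_optional_b, "b"))  # 1
print(count_children(with_optional_b, "a"))     # 1, 'a' is scalar
```

Under this reading the scalar case is legal and simply yields 1, which is
the behavior in question.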

But Section 35 says:

  o  Expression value is not single node

       -  Most DFDL expression contexts require an expression to identify
          a single node, not an array (aka sequence of nodes). There are a
          few exceptions such as the fn:count(…) function, where the path
          expression must be to an array or optional element.

  o  Expression value is not array element or optional element.

       -  Some DFDL expression contexts require an array or an optional
          element.

       -  Example: The fn:count(...) function argument must be to an array
          or optional element. It is a Schema Definition Error if the
          argument expression is otherwise.


Experience at the Daffodil project is that allowing the fn:count argument
expression to be a non-array, non-optional element, where fn:count would
always return 1, just hides errors that are very hard to find. This
situation comes up often as a schema is written. Usually the argument to
fn:count is initially correct, with an array/optional element as the
argument. Then element nesting evolves and the paths need updating, but a
path ends up referring not to the array/optional element but to a scalar
element of the same name that now encloses the array. The fn:count is then
always 1, and the schema is incorrect because the expression is not doing
what is intended, yet no error is detected. This is quite hard to isolate
and fix.

As a concrete example of this, you start with a schema like:

<element name="record" maxOccurs="unbounded">
    <complexType>
       <sequence>
           .... elements of the record

But then you need the valueLength of the whole array of all the records, to
store the length for unparsing, so you revise this to:

<element name="record">
    <complexType>
        <sequence>
            <element name="item" maxOccurs="unbounded">
                <complexType>
                    <sequence>
                        .... elements of each record 'item'.


And now, paths you had like fn:count(foo/bar/record) no longer refer to an
array. They refer to a scalar, so they always return 1. This is decidedly
unhelpful in a large schema. It is far better if fn:count(foo/bar/record)
becomes an SDE because record is now scalar.
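
The restriction can be sketched as a static check on the final step of the
path. This is a toy Python model, not Daffodil code: the Elem class, the
check_count_arg function, the 'root' context element, and the reading of
"array or optional" as minOccurs = 0 or maxOccurs > 1 are all assumptions
made for illustration.

```python
from dataclasses import dataclass, field
import math

class SchemaDefinitionError(Exception):
    """Stands in for a DFDL Schema Definition Error (SDE)."""

@dataclass
class Elem:
    name: str
    min_occurs: int = 1
    max_occurs: float = 1  # math.inf models maxOccurs="unbounded"
    children: list = field(default_factory=list)

    def is_array_or_optional(self):
        # Assumed reading of "array or optional element".
        return self.min_occurs == 0 or self.max_occurs > 1

def check_count_arg(context, path):
    """Resolve path step by step from context; reject a scalar final step."""
    node = context
    for step in path.split("/"):
        matches = [c for c in node.children if c.name == step]
        if len(matches) != 1:
            raise SchemaDefinitionError(f"cannot resolve '{step}' in '{path}'")
        node = matches[0]
    if not node.is_array_or_optional():
        raise SchemaDefinitionError(
            f"fn:count({path}): '{node.name}' is neither array nor optional")
    return node

# Before the revision: 'record' is the array.
before = Elem("root", children=[
    Elem("foo", children=[
        Elem("bar", children=[
            Elem("record", max_occurs=math.inf)])])])

# After the revision: 'record' is scalar and wraps the array 'item'.
after = Elem("root", children=[
    Elem("foo", children=[
        Elem("bar", children=[
            Elem("record", children=[
                Elem("item", max_occurs=math.inf)])])])])

check_count_arg(before, "foo/bar/record")      # accepted
check_count_arg(after, "foo/bar/record/item")  # accepted
try:
    check_count_arg(after, "foo/bar/record")   # stale path, now scalar
except SchemaDefinitionError as e:
    print("SDE:", e)
```

With a check like this, the stale path is reported when the schema is
compiled instead of silently evaluating to 1.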

So the clarification I'm seeking is whether section 35 was just missed when
updates were made about this node-sequence stuff, or if it is reasonable to
implement the restrictions in Section 35.

I am biased. I want the restrictions in Section 35, but this was muddy
enough that I thought we should get a clarification first.

Daffodil does not currently implement any query-style expressions, so the
fn:count(b) example above would be an SDE in Daffodil.

Mike Beckerle
Apache Daffodil PMC | daffodil.apache.org
OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
Owl Cyber Defense | www.owlcyberdefense.com