[DFDL-WG] clarification - argument expression to fn:count() function

Steve Hanson smhdfdl at gmail.com
Thu Jan 11 05:06:39 PST 2024


Hi Mike

I would go along with the restrictions in Section 35. They were deliberate.
It is a Schema Definition Error (SDE) if the argument to fn:count() is not
an array or optional element.
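
To make the rule concrete, here is a rough sketch in Python of the kind of static check I have in mind. It is not Daffodil's code or any implementation's actual code. The names SchemaDefinitionError, is_array_or_optional and check_count_argument are made up for illustration. It assumes the usual reading that an array is an element with maxOccurs > 1 (or unbounded) and an optional element is one with minOccurs="0" and maxOccurs="1", and it ignores path resolution and everything else a real processor has to do.

```python
import xml.etree.ElementTree as ET


class SchemaDefinitionError(Exception):
    """Hypothetical stand-in for a DFDL Schema Definition Error (SDE)."""


def occurs(decl):
    """Return (minOccurs, maxOccurs) for an element declaration.

    Both default to 1, as in XSD. An unbounded maxOccurs is returned as None.
    """
    min_occurs = int(decl.get("minOccurs", "1"))
    raw_max = decl.get("maxOccurs", "1")
    max_occurs = None if raw_max == "unbounded" else int(raw_max)
    return min_occurs, max_occurs


def is_array_or_optional(decl):
    """Assumed definitions: array = maxOccurs > 1 or unbounded;
    optional = minOccurs 0 and maxOccurs 1."""
    min_occurs, max_occurs = occurs(decl)
    is_array = max_occurs is None or max_occurs > 1
    is_optional = min_occurs == 0 and max_occurs == 1
    return is_array or is_optional


def check_count_argument(decl):
    """Raise an SDE if the declaration targeted by fn:count() is a scalar."""
    if not is_array_or_optional(decl):
        name = decl.get("name")
        raise SchemaDefinitionError(
            "fn:count() argument '%s' is not an array or optional element" % name
        )


if __name__ == "__main__":
    # Declarations modelled on the examples in Mike's note below.
    scalar_a = ET.fromstring('<element name="a" type="xs:int"/>')
    optional_b = ET.fromstring('<element name="b" type="xs:int" minOccurs="0"/>')
    array_record = ET.fromstring('<element name="record" maxOccurs="unbounded"/>')

    check_count_argument(optional_b)    # accepted
    check_count_argument(array_record)  # accepted
    try:
        check_count_argument(scalar_a)
    except SchemaDefinitionError as err:
        print("SDE:", err)
```

With a check like this, the refactoring case Mike describes, where 'record' changes from an array to a scalar wrapper, is reported when the schema is compiled rather than silently yielding a count of 1.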

I can make the call today, so we can discuss further then if necessary.

Regards
Steve

On Tue, Jan 9, 2024 at 8:58 PM Mike Beckerle <mbeckerle at apache.org> wrote:

> The description of fn:count in section 18.5.2.5 says fn:count() can be
> called on a node-sequence. It does not seem to require an array/optional
> path. This seems to be suggesting that if you have say....
>
> <element name="b" type="xs:int"/>
> <element name="a" type="xs:int"/>
> <element name="b" type="xs:int" minOccurs="0"/>
>
> one could write fn:count(b) and it should return 1 or 2. The Daffodil
> project calls this a "query-style" expression, as opposed to just a basic
> path expression.
>
> Furthermore, in the first paragraph of section 18.5.2.5 it says "(Note
> that DFDL v1.0 does not support sequences of length > 1 as the final
> results of expressions.)" which suggests that some expressions can
> constructively return node sequences of length > 1 as intermediate results.
> Presumably fn:count(b) above is such a situation, as the expression 'b'
> returns 2 nodes.
>
> The above all suggests to me that fn:count(a) would be legal where 'a' is
> a scalar. This is just always 1 of course.
>
> But Section 35 says:
>
> - Expression value is not single node
>
>   - Most DFDL expression contexts require an expression to identify a
>     single node, not an array (aka sequence of nodes). There are a few
>     exceptions such as the fn:count(…) function, where the path expression
>     must be to an array or optional element.
>
> - Expression value is not array element or optional element.
>
>   - Some DFDL expression contexts require an array or an optional element.
>
>   - Example: The fn:count(...) function argument must be to an array or
>     optional element. It is a Schema Definition Error if the argument
>     expression is otherwise.
>
>
> Experience at the Daffodil project is that allowing the fn:count argument
> expression to be a non-array, non-optional element, where fn:count would
> always return 1, just hides errors that are very hard to find, and this
> situation comes up often as a schema is written. Usually the argument to
> fn:count is initially correct, with an array/optional element as the
> target. Then element nesting evolves and the paths need updating, but a
> path ends up referring not to the array/optional element but to a scalar
> enclosing element that now has that name. The fn:count is then always 1,
> and the schema is incorrect because the expression is not doing what was
> intended, yet no error is detected. This is quite hard to isolate and fix.
>
> A concrete example of this experience is you start with a schema like:
>
> <element name="record" maxOccurs="unbounded">
>     <complexType>
>        <sequence>
>            .... elements of the record
>
> But then you need the valueLength of the whole array of all the records,
> to store the length for unparsing, so you revise this to:
>
> <element name="record">
>     <complexType>
>         <sequence>
>             <element name="item" maxOccurs="unbounded">
>                 <complexType>
>                      <sequence>
>                         .... elements of each record 'item'.
>
>
> And now paths you had, like fn:count(foo/bar/record), no longer refer to
> an array; they refer to a scalar, so they always return 1. This is
> decidedly unhelpful in a large schema.
> It is far better if fn:count(foo/bar/record) becomes an SDE because record
> is now scalar.
>
> So the clarification I'm seeking is whether section 35 was just missed
> when updates were made about this node-sequence stuff, or if it is
> reasonable to implement the restrictions in Section 35.
>
> I am biased. I want the restrictions in Section 35, but this was muddy
> enough that I thought we should get a clarification first.
>
> Daffodil does not currently implement any query-style expressions, so the
> fn:count(b) example above would be an SDE in Daffodil.
>
> Mike Beckerle
> Apache Daffodil PMC | daffodil.apache.org
> OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
> Owl Cyber Defense | www.owlcyberdefense.com
>

