[DFDL-WG] is this legal

Steve Hanson smh at uk.ibm.com
Fri Mar 1 05:50:39 EST 2013


What does 'current element declaration' mean?  Typically I would want to 
use dfdl:occursCount() from an element that is not an array, to get the 
count of an element that is.  And if used when parsing from within an 
array, it can only give the current count which might change as parsing 
proceeds.
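
A toy sketch of that last point, in Python rather than DFDL. The parser loop and the name parse_array are hypothetical, and it assumes the count includes the occurrence just parsed; it only illustrates that a count taken from inside the array is a running value.

```python
def parse_array(raw_items):
    """Toy parser: append occurrences one by one, recording the count at each step."""
    parsed = []
    counts_seen = []
    for item in raw_items:
        parsed.append(item)
        # An expression evaluated here only sees the occurrences parsed so far.
        counts_seen.append(len(parsed))
    return parsed, counts_seen


parsed, counts_seen = parse_array([10, 20, 30])
print(counts_seen)   # counts as seen from inside the array while parsing
print(len(parsed))   # count as seen once the array is complete
```

An element outside the array that is evaluated after the array has been parsed would only ever see the final value.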

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848



From:   Tim Kimber/UK/IBM at IBMGB
To:     dfdl-wg at ogf.org, 
Date:   01/03/2013 10:15
Subject:        Re: [DFDL-WG] is this legal
Sent by:        dfdl-wg-bounces at ogf.org



Interesting thread. 

I agree that the xsd is a valid DFDL xsd. No problem with same-named 
siblings as long as they obey XML Schema rules. 

Steve has raised a related question: what is dfdl:occursCount() for? How 
does it differ from XPath's fn:count()? We could define it like this: 
- fn:count() returns the count of all occurrences from all element 
declarations. 
- dfdl:occursCount() returns the number of occurrences of the current DFDL 
array. That means the current element declaration, excluding occurrences 
from same-named previous siblings. 
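
A minimal sketch of the two proposed definitions, written in Python because XPath has no way to name an element declaration. The Node class and its decl field are hypothetical: they assume the processor remembers which declaration produced each occurrence. This reflects the definition proposed here, not settled spec behaviour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str  # element name, as a path expression would see it
    decl: int  # hypothetical id of the element declaration that produced it


def fn_count(nodes, name):
    """Count every occurrence with this name, across all declarations."""
    return sum(1 for n in nodes if n.name == name)


def occurs_count(nodes, name, decl):
    """Count only the occurrences produced by one element declaration."""
    return sum(1 for n in nodes if n.name == name and n.decl == decl)


# Two occurrences from a first "foo" declaration (id 0), an optional "bar"
# (id 1) absent from the data, and one occurrence from a second "foo" (id 2).
infoset = [Node("foo", 0), Node("foo", 0), Node("foo", 2)]

print(fn_count(infoset, "foo"))
print(occurs_count(infoset, "foo", 0))
print(occurs_count(infoset, "foo", 2))
```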

Why is this useful? I don't know. But I do know that a DFDL array involves 
exactly one element declaration, whereas fn:count() can involve two or 
more element declarations. Furthermore, the DFDL properties (especially 
dfdl:occursCountKind) might be different on the two same-named element 
declarations, so it is possible that the author of the DFDL schema might 
want to treat them differently. 

It may sound unlikely that two element declarations will have the same 
name and different DFDL array properties - but I reckon I could invent 
some plausible scenarios where it could happen. 

regards,

Tim Kimber, DFDL Team,
Hursley, UK
Internet:  kimbert at uk.ibm.com
Tel. 01962-816742 
Internal tel. 37246742




From:        Steve Hanson/UK/IBM at IBMGB 
To:        Suman Kalia <kalia at ca.ibm.com>, 
Cc:        dfdl-wg at ogf.org, dfdl-wg-bounces at ogf.org 
Date:        01/03/2013 08:56 
Subject:        Re: [DFDL-WG] is this legal 
Sent by:        dfdl-wg-bounces at ogf.org 



Additionally, the Unique Particle Attribution (UPA) rules apply. Your example 
is fine as long as the first "foo" and "bar" do not have minOccurs="0". 

Using your example, in standard XPath the path expression "foo" would 
return a sequence of length 2. 
A more interesting example is: 

<sequence>
 <element name="foo" type="int" dfdl:lengthKind="explicit" 
dfdl:length="1" minOccurs="2" maxOccurs="2"/>
 <element name="bar" type="int" dfdl:lengthKind="explicit" 
dfdl:length="1" minOccurs="0"/>
 <element name="foo" type="int" dfdl:lengthKind="explicit" 
dfdl:length="1"/>
</sequence>

In standard XPath the path expression "foo" would now return a sequence of 
length 3, as it would just lift the 3 occurrences from the infoset. Note 
they could all be adjacent if "bar" was not in the data. 
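
A small sketch of this, assuming the parsed infoset is rendered as plain XML and queried with Python's xml.etree.ElementTree (a limited XPath subset, not a DFDL processor). The wrapper element "seq", the helper count_foo and the element values are invented for illustration.

```python
import xml.etree.ElementTree as ET


def count_foo(xml_text):
    """Count children named foo, the way the plain path "foo" matches them."""
    return len(ET.fromstring(xml_text).findall("foo"))


# Hypothetical infosets for the schema above; element values are invented.
with_bar = "<seq><foo>1</foo><foo>2</foo><bar>9</bar><foo>3</foo></seq>"
without_bar = "<seq><foo>1</foo><foo>2</foo><foo>3</foo></seq>"

# The path matches by name only, so both infosets give the same count.
print(count_foo(with_bar), count_foo(without_bar))
```

In the second infoset nothing in the result separates the occurrences of the first declaration from the occurrence of the second.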

Given the examples, I don't see how a DFDL path expression can distinguish 
between the different occurrences of elements with the same name. There is 
no way in XPath to ask for a count of the number of element occurrences 
that match a specific element declaration, because there is no way in the 
language to identify such an element. 

The DFDL spec in section 23 says "DFDL expressions never return 
node-sequences having more than one node. DFDL expressions either return a 
simple value, a node sequence containing exactly one node/value, or an 
empty node sequence." and "The result of evaluating the expression must be 
a single atomic value of the type expected by the context, and it is a 
schema definition error otherwise. Some XPath expressions naturally return 
a sequence of values, and in this case it is also schema definition error 
if an expression returns a sequence containing more than one item". That 
talks about what is ultimately returned by a DFDL expression. Later it 
says "(Note that DFDL v1.0 does not support sequences of length > 1.)". 
And says "DFDL implementations may use off-the-shelf XPath 2.0 processors, 
but will need to pre-process DFDL expressions to ensure that the behaviour 
matches the DFDL specification:  Wrap path locations in a call to 
fn:exactly-one() except when the path location occurs within certain 
functions which operate on arrays". We also said on a recent WG call that 
dfdl:occursCount() is allowed on non-arrays. 

If the real requirement here is that a DFDL expression should not return a 
sequence of length > 1, then is there a problem with allowing intermediate 
steps to return sequences of length > 1 as long as the final result does not? 
Then, couldn't we drop dfdl:occursCount() and just use fn:count()? 
Are we just making things hard for implementers? 
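
A rough sketch of that idea, in Python rather than XPath. The helper exactly_one is a hypothetical stand-in for fn:exactly-one(), and a plain list stands in for the node sequence; it assumes only the final result of an expression is checked.

```python
def exactly_one(seq):
    """Stand-in for fn:exactly-one(): return the single item or raise."""
    items = list(seq)
    if len(items) != 1:
        raise ValueError(f"expected exactly one item, got {len(items)}")
    return items[0]


foo = [1, 2, 3]   # intermediate step: a sequence of length 3
total = len(foo)  # like fn:count(foo): the final result is one atomic value

try:
    exactly_one(foo)  # the bare path as the final result is rejected
    rejected = False
except ValueError:
    rejected = True

print(total, rejected)
```

Under this reading the count is acceptable as a final result even though the path inside it matched three nodes.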

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848 



From:        Suman Kalia <kalia at ca.ibm.com> 
To:        Mike Beckerle <mbeckerle.dfdl at gmail.com>, 
Cc:        dfdl-wg at ogf.org, dfdl-wg-bounces at ogf.org 
Date:        01/03/2013 01:11 
Subject:        Re: [DFDL-WG] is this legal 
Sent by:        dfdl-wg-bounces at ogf.org 



This is certainly allowed in XML Schema. In a sequence you can have 
multiple elements with the same name as long as their type is identical, which 
is the case in your example. I think from an XPath perspective it would be 
treated like an array, and if so dfdl:occursCount should return 2. 

Suman Kalia 
IBM Canada Lab 
WMB Toolkit Architect and Development Lead 
Tel: 905-413-3923 T/L 313-3923 
Email: kalia at ca.ibm.com 

For info on Message broker 
http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.html 






From:        Mike Beckerle <mbeckerle.dfdl at gmail.com> 
To:        dfdl-wg at ogf.org, 
Date:        02/28/2013 07:44 PM 
Subject:        [DFDL-WG] is this legal 
Sent by:        dfdl-wg-bounces at ogf.org 




I can't find clarity on this:

<sequence>
 <element name="foo" type="int" dfdl:lengthKind="explicit" 
dfdl:length="1"/>
  <element name="bar" type="int" dfdl:lengthKind="explicit" 
dfdl:length="1"/>
 <element name="foo" type="int" dfdl:lengthKind="explicit" 
dfdl:length="1"/>
 <element name="bar" type="int" dfdl:lengthKind="explicit" 
dfdl:length="1"/>
</sequence>

Is this allowed?

If so, then the XPaths for accessing the 2nd foo would be foo[2], and the 
path "foo" would be ambiguous or
could be treated as identifying an array. In which case one could do an 
expression dfdl:occursCount("foo") and get back 2 ??
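
A minimal sketch of the XPath side of the question, assuming the infoset is rendered as plain XML and queried with Python's xml.etree.ElementTree (a limited XPath subset, not a DFDL processor). The wrapper element "seq" and the values 1 to 4 are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical infoset for the four-element sequence above; values are invented.
infoset = ET.fromstring(
    "<seq><foo>1</foo><bar>2</bar><foo>3</foo><bar>4</bar></seq>"
)

# The plain path "foo" matches every child named foo, whatever its declaration.
foo_count = len(infoset.findall("foo"))

# The positional predicate foo[2] selects the second of the foo children.
second_foo = infoset.find("foo[2]").text

print(foo_count, second_foo)
```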

Or am I completely missing the boat here?

-- 
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | 
www.tresys.com
--
dfdl-wg mailing list
dfdl-wg at ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU