Fw: [dfdl-wg] Ambiguous XPaths to hidden elements

Mike Beckerle beckerle at us.ibm.com
Thu Jan 19 14:01:16 CST 2006


I agree with the sentiment here of not putting limitations on the 
user-defined schema.

However, I believe the example we are looking at does not add any problem 
that does not already exist in XML Schema when XPath expressions are 
referring to it.

For example, consider this XSD fragment. It is the same as before, but 
without any DFDL annotations or "hidden stuff":

<xs:element name="root">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="repeats" type="xs:integer"/>
      <xs:element name="testElement" type="xs:integer"
                  minOccurs="0" maxOccurs="unbounded"/>
      <xs:group ref="groupFromOtherSchemaFile"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

Now consider an XPath for manipulating this document "/root/repeats". 
Perhaps I am using it in an XSLT stylesheet.

Now, what happens in my stylesheet depends on what the definition of the 
group "groupFromOtherSchemaFile" is. That is, this XSD is legal regardless 
of whether that group contains an element named "repeats". But my 
stylesheet could be broken if it was expecting there to be only one such 
element and more than one is returned.
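
To make that failure mode concrete, here is a minimal Python sketch using 
only the standard library's xml.etree.ElementTree. Both instance documents 
are hypothetical (they are not from the thread): the second one assumes 
that "groupFromOtherSchemaFile" happens to contribute its own "repeats" 
element. ElementTree supports only a subset of XPath, so the path is 
evaluated relative to the root element rather than written as an absolute 
path.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance documents, invented for illustration.
# Here the external group contributes an unrelated element ("other").
DOC_WITHOUT_CLASH = """
<root>
  <repeats>2</repeats>
  <testElement>10</testElement>
  <testElement>20</testElement>
  <other>x</other>
</root>
"""

# Here the external group contributes a second element named "repeats".
DOC_WITH_CLASH = """
<root>
  <repeats>2</repeats>
  <testElement>10</testElement>
  <testElement>20</testElement>
  <repeats>99</repeats>
</root>
"""


def select_repeats(xml_text):
    """Return the text of every node matched by /root/repeats."""
    root = ET.fromstring(xml_text)
    if root.tag != "root":
        return []
    # "repeats" relative to <root> selects the same nodes as /root/repeats.
    return [node.text for node in root.findall("repeats")]


print(select_repeats(DOC_WITHOUT_CLASH))  # ['2']
print(select_repeats(DOC_WITH_CLASH))     # ['2', '99']
```

A stylesheet that assumed a single "repeats" node would silently get two 
in the second case, even though both documents could be valid against a 
legal XSD.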

My point is this: anything that couples an XSD with XPaths can have this 
kind of dependency where the meaning of the paths might not be respected.

DFDL is not the only thing that bundles XPaths with XSD. The new XSLT 
(2.0?) specification reportedly allows the XSD for the input and the XSD 
for the output to be bundled with the stylesheet. If the XSD were to 
include an externally defined group as above, then the stylesheet can be 
broken even though the XSDs are bundled.

This is why I think it is OK for our hidden layer example to simply fail: 
anything that couples XPaths with XSD can have this class of failure.

...mikeb
 
Mike Beckerle
STSM, Architect, Scalable Computing
IBM Software Group
Information Integration Solutions
Westborough, MA 01581
voice and FAX 508-599-7148
home/mobile office 508-915-4767





Suman Kalia <kalia at ca.ibm.com> 
Sent by: owner-dfdl-wg at ggf.org
01/19/2006 11:52 AM

To
Mike Beckerle/Worcester/IBM at IBMUS
cc
dfdl-wg at ggf.org, owner-dfdl-wg at ggf.org
Subject
Fw: [dfdl-wg] Ambiguous XPaths to hidden elements







As a design point, we should strive not to put limitations on 
user-defined schemas; it just works out better in the long run. 

Note that xsd:groups can be nested many levels deep, and this problem is 
not restricted to groups included from no target namespace; it could 
arise with groups from any namespace. As per schema rules, all local 
elements defined in groups or complex types belong to no namespace unless 
elementFormDefault is explicitly set to "qualified" at the schema level or 
form="qualified" is set on the specific element. 

Detecting such conflicts could be quite expensive, particularly when you 
have very large schemas. The industry-standard ACORD messaging schema is a 
good example: it is about 1.5 M and takes awfully long (hours) to 
validate. Putting additional constraints like this will further slow 
down validation. 

Suman Kalia
IBM Toronto Lab
WebSphere Business Integration Application Connectivity Tools 
Tel : 905-413-3923  T/L  969-3923
Fax : 905-413-4850
Internet ID : kalia at ca.ibm.com 
----- Forwarded by Suman Kalia/Toronto/IBM on 01/19/2006 11:39 AM ----- 
Mike Beckerle <beckerle at us.ibm.com> 
Sent by: owner-dfdl-wg at ggf.org 
01/19/2006 10:48 AM 


To
"Robert E. McGrath" <mcgrath at ncsa.uiuc.edu> 
cc
dfdl-wg at ggf.org, owner-dfdl-wg at ggf.org 
Subject
Re: Fw: [dfdl-wg] Ambiguous XPaths to hidden elements









One idea that hasn't been advanced yet is ruling out the problematic case. 


Let me illustrate. Here's the example, modified to have a model group 
reference which can introduce the name conflict: 

<xs:element name="root">
  <xs:complexType>
    <xs:sequence>
      <xs:annotation>
        <xs:appinfo source="http://dataformat.org">
          <hidden>
            <xs:element name="repeats" type="xs:integer"/>
          </hidden>
        </xs:appinfo>
      </xs:annotation>
      <xs:element name="testElement" type="xs:integer"
                  minOccurs="0" maxOccurs="unbounded"
                  dfdl:repeatCount="../repeats"/>
      <xs:group ref="groupFromOtherSchemaFile"/>
      <!-- what if this has an element decl named "repeats"? -->
    </xs:sequence>
  </xs:complexType>
</xs:element>
 
So, what hasn't been suggested yet is this: What if we just say DFDL 
doesn't allow this? It's an error which must be detected. This DFDL schema 
is broken because the path "../repeats" cannot be analyzed along with the 
DFDL schema to return only a single node. 
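
As a rough illustration of what such a data-independent check might look 
like, here is a Python sketch using the standard library's 
xml.etree.ElementTree. It is not a proposal for how a DFDL processor must 
do it. The schema text, the function names, and the assumption that the 
group definition is available in the same schema document are all mine. 
The sketch only counts declarations by name among the children of "root"; 
it ignores namespaces, occurrence constraints, and type references.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema: the group is defined inline to keep the sketch
# self-contained; in the thread it comes from another schema file.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:group name="groupFromOtherSchemaFile">
    <xs:sequence>
      <xs:element name="repeats" type="xs:string"/>
    </xs:sequence>
  </xs:group>
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <xs:annotation>
          <xs:appinfo source="http://dataformat.org">
            <hidden>
              <xs:element name="repeats" type="xs:integer"/>
            </hidden>
          </xs:appinfo>
        </xs:annotation>
        <xs:element name="testElement" type="xs:integer"
                    minOccurs="0" maxOccurs="unbounded"/>
        <xs:group ref="groupFromOtherSchemaFile"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def child_declarations(container, groups):
    """Yield the names of element declarations that become children of the
    enclosing element: direct, hidden, and those pulled in by group refs."""
    for node in container:
        if node.tag == XS + "element":
            yield node.get("name")
        elif node.tag == XS + "annotation":
            # Hidden declarations sit under xs:annotation/xs:appinfo/hidden.
            for decl in node.iterfind(XS + "appinfo/hidden/" + XS + "element"):
                yield decl.get("name")
        elif node.tag == XS + "group" and node.get("ref") in groups:
            # Expand the referenced group; groups may nest.
            yield from child_declarations(groups[node.get("ref")], groups)
        elif node.tag in (XS + "sequence", XS + "choice"):
            yield from child_declarations(node, groups)


def count_child_declarations(schema_text, element_name, target):
    """Count declarations named `target` among the children of the global
    element `element_name`."""
    schema = ET.fromstring(schema_text)
    groups = {g.get("name"): g for g in schema.findall(XS + "group")}
    element = next(e for e in schema.findall(XS + "element")
                   if e.get("name") == element_name)
    complex_type = element.find(XS + "complexType")
    return list(child_declarations(complex_type, groups)).count(target)


# "../repeats" from testElement resolves among the children of "root".
count = count_child_declarations(SCHEMA, "root", "repeats")
print(count)  # 2, so this schema would be rejected under the proposed rule
```

Under the rule suggested above, a count other than 1 would be reported as 
a schema error before any data is processed.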

I believe name conflicts like this are what namespace management is for. 
XSD has truly great namespace management. You can solve the problem that 
way. 

Furthermore, when you define a reusable named group like the definer of 
the "groupFromOtherSchemaFile" above, and you put it in no target 
namespace, that's the situation where this conflict can arise. Expecting 
that your names are never going to conflict with anything in that case is 
just naive. It's equivalent to having global variables in a C program 
module and expecting you can never link it to something else that uses the 
same names. Those name conflicts can occur, and someone has to change the 
conflicting name. In XSD we can do that by including the group in a schema 
which puts it into a target namespace so that after that the namespaces 
can be used to disambiguate. 
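
Here is a small Python sketch (standard library only) of what that 
disambiguation buys you at the instance level. The document and the 
namespace URI "urn:example:other" are placeholders of mine. The sketch 
also assumes the group's elements really do end up namespace-qualified in 
instances, for example because the including schema uses 
elementFormDefault="qualified".

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: the group's "repeats" now lives in the placeholder
# namespace "urn:example:other"; the original "repeats" stays unqualified.
DOC = """
<root xmlns:o="urn:example:other">
  <repeats>2</repeats>
  <testElement>10</testElement>
  <testElement>20</testElement>
  <o:repeats>99</o:repeats>
</root>
"""

# Prefix-to-URI map used when evaluating the qualified path.
NAMESPACES = {"o": "urn:example:other"}

root = ET.fromstring(DOC)

# An unprefixed name test matches only elements in no namespace.
unqualified = [node.text for node in root.findall("repeats")]

# The prefixed name test matches only the group's element.
qualified = [node.text for node in root.findall("o:repeats", NAMESPACES)]

print(unqualified)  # ['2']
print(qualified)    # ['99']
```

With the names in different namespaces, the path "repeats" under "root" 
selects a single node again.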

The approach above is consistent with the path "../repeats" still being 
officially an "XPath"; it just adds the semantic restriction that it must 
be an XPath that identifies a single node unambiguously, independent of 
what data is being processed. This is one of these "data independent" 
notions (what I had previously been calling "static"), as we discussed 
yesterday. 

...mikeb 




"Robert E. McGrath" <mcgrath at ncsa.uiuc.edu> 
Sent by: owner-dfdl-wg at ggf.org 
01/19/2006 10:00 AM 


To
dfdl-wg at ggf.org 
cc

Subject
Re: Fw: [dfdl-wg] Ambiguous XPaths to hidden elements










I would want to change XPath only as a last resort.  (Any of the
options is OK by me, assuming we have to mess with the XPath
at all.)

Can we deal with this some other way?

Can we document the problematic cases, and suggest best practices that
will minimize the problem?

On Thursday 19 January 2006 08:45, Suman Kalia wrote:
> I fully agree with Steve - let's not invent another XPath-like syntax.
>
> Suman Kalia
> IBM Toronto Lab
> WebSphere Business Integration Application Connectivity Tools
> Tel : 905-413-3923  T/L  969-3923
> Fax : 905-413-4850
> Internet ID : kalia at ca.ibm.com
> ----- Forwarded by Suman Kalia/Toronto/IBM on 01/19/2006 09:43 AM -----
>
> Steve Hanson <smh at uk.ibm.com>
> Sent by: owner-dfdl-wg at ggf.org
> 01/19/2006 04:43 AM
>
> To
> "Westhead, Martin (Martin)" <westhead at avaya.com>
> cc
> dfdl-wg at ggf.org, owner-dfdl-wg at ggf.org
> Subject
> Re: [dfdl-wg] Ambiguous XPaths to hidden elements
>
>
>
>
>
>
> As a DFDL parser implementor I do not want modifications to the XPath
> syntax. I want to be able to reuse existing XPath implementations. It's
> also something else for the user to have to learn. So 2a/b/c are not
> attractive.
>
> Regards, Steve
>
> Steve Hanson
> WebSphere Message Brokers,
> IBM Hursley, England
> Internet: smh at uk.ibm.com
> Phone (+44)/(0) 1962-815848
>
>
>
> "Westhead, Martin (Martin)" <westhead at avaya.com>
> Sent by: owner-dfdl-wg at ggf.org
> 18/01/2006 20:24
>
> To
> <dfdl-wg at ggf.org>
> cc
>
> Subject
> [dfdl-wg] Ambiguous XPaths to hidden elements
>
>
>
>
>
>
>
>
>
> Hi folks,
>
> This is to try to pick up on the issue identified by Suman in today's
> call.
>
> The Issue
> Consider the following example:
>
> <xs:element name="root">
>   <xs:complexType>
>     <xs:sequence>
>       <xs:annotation>
>         <xs:appinfo source="http://dataformat.org">
>           <hidden>
>             <xs:element name="repeats" type="xs:integer"/>
>           </hidden>
>         </xs:appinfo>
>       </xs:annotation>
>       <xs:element name="testElement" type="xs:integer"
>                   minOccurs="0" maxOccurs="unbounded"
>                   dfdl:repeatCount="../repeats"/>
>     </xs:sequence>
>   </xs:complexType>
> </xs:element>
>
> The problem is that the path "../repeats" can be broken by modifications
> to the logical model due to name clashes on "repeats", and there are cases
> that can be constructed where this would not be obvious to a user.
>
> Possible Solutions
> Possible fixes to this include:
>    1. Disallow XPath references to hidden elements; the user is forced to
>       place the material into the global context to refer to it.
>    2. Provide a special XPath operator to indicate we are referencing a
>       hidden element; possibilities include:
>          a. "../hidden(repeats)"
>          b. "hidden(../repeats)"
>          c. "../dfdl:hidden/repeats"
>    3. Only allow hidden elements to be present in top-level global complex
>       types. These can then be included where needed. (This is the solution
>       that Suman was pushing, but I don't fully understand it; in
>       particular I don't see how it resolves the ambiguity issue.)
>
>
> I believe my preference here is 2a or 2b followed by 1.
>
> Comments/suggestions/opinions?
>
> Thanks,
>
> Martin

-- 
---
Robert E. McGrath, Ph.D.
National Center for Supercomputing Applications
University of Illinois, Urbana-Champaign
1205 West Clark
Urbana, Illinois 61801
(217)-333-6549

mcgrath at ncsa.uiuc.edu
