Fw: [dfdl-wg] Ambiguous XPaths to hidden elements

Steve Hanson smh at uk.ibm.com
Thu Jan 19 09:49:18 CST 2006


I think there is a wider issue here that we need to consider before we can
answer this question, and that is what do we plan to say in the DFDL
standard about the output we create? Let me explain this further.

In IBM we have several different products that could exploit a DFDL parser,
but these products all have a different way of exposing the output to the
user, normally some kind of DOM-like tree model with an accompanying API.
This is almost certainly the case elsewhere too because there is no
equivalent standard to DOM for non-XML data. A DFDL parser must therefore
provide a SAX-like interface and generate events, leaving it up to the
caller to create the tree model (even if it also provided the capability to
build an XML DOM model).
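
To make the SAX-like idea concrete, here is a minimal sketch of a parser that only emits events while the caller assembles whatever tree it likes. The names ParseEventHandler, TreeBuilder, Node and fake_parse are hypothetical and used purely for illustration; nothing here is a proposed DFDL API.

```python
from dataclasses import dataclass, field


class ParseEventHandler:
    """Hypothetical callback interface; an assumption, not a DFDL API."""

    def start_element(self, name: str) -> None: ...
    def value(self, text: str) -> None: ...
    def end_element(self, name: str) -> None: ...


@dataclass
class Node:
    name: str
    text: str = ""
    children: list = field(default_factory=list)


class TreeBuilder(ParseEventHandler):
    """Caller-side handler that assembles its own tree from the events."""

    def __init__(self) -> None:
        self.root = None
        self._stack = []

    def start_element(self, name: str) -> None:
        node = Node(name)
        if self._stack:
            self._stack[-1].children.append(node)
        else:
            self.root = node
        self._stack.append(node)

    def value(self, text: str) -> None:
        self._stack[-1].text = text

    def end_element(self, name: str) -> None:
        self._stack.pop()


def fake_parse(handler: ParseEventHandler) -> None:
    """Stand-in for a real parser: emits events for a root with two children."""
    handler.start_element("root")
    for v in ("10", "20"):
        handler.start_element("testElement")
        handler.value(v)
        handler.end_element("testElement")
    handler.end_element("root")


builder = TreeBuilder()
fake_parse(builder)
```

A different product would supply its own handler and end up with its own tree type, without the parser knowing anything about it.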

However, both the use of XPath during the parse and hidden elements pose
problems for a DFDL parser.
- All the XPath implementations I have encountered are bound to the type of
tree that is being created - DOM or SDO or WMB for example. With SAX there
is no standard tree.
- Hidden elements will not by definition appear in a DOM tree.

So even in DOM mode a DFDL parser needs some internal mechanism for
retaining parsed values that could be referenced by XPath, and its XPath
implementation must therefore use this mechanism. Note that it should
retain values only for those elements that it knows could be referenced by
XPath, i.e., it is a sparse mechanism. Otherwise the amount of memory used
for large messages becomes unacceptable.
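
A rough sketch of what such a sparse mechanism might look like, assuming the set of referenced paths can be worked out from the schema before the parse starts. SparseValueStore and the absolute path strings are hypothetical; a real implementation would key on whatever the XPath engine's model interface needs.

```python
class SparseValueStore:
    """Keeps values only for element paths that some expression refers to."""

    def __init__(self, referenced_paths):
        # Assumption: these paths were collected by scanning the schema
        # for XPath expressions before parsing began.
        self._wanted = frozenset(referenced_paths)
        self._values = {}

    def record(self, path, value):
        # Called for every parsed element; everything not referenced is dropped.
        if path in self._wanted:
            self._values[path] = value

    def lookup(self, path):
        return self._values[path]

    def __len__(self):
        return len(self._values)


store = SparseValueStore({"/root/repeats"})
store.record("/root/repeats", 3)
for i in range(1000):
    store.record(f"/root/testElement[{i}]", i)
```

However many elements the message contains, the store holds only the values that an expression can actually reach.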

An alternative is to force the caller to provide a handler that gets
invoked when XPath needs to access the caller's tree (implies disallowing
XPath for hidden elements). I don't think a handler approach would be
acceptable, even if a DOM alternative was provided.
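
For comparison, a sketch of the handler approach, assuming the caller registers a function that maps an XPath string to a value in its own tree. The names repeat_count, resolve and the element "length" are hypothetical. Since hidden elements never reach the caller's tree, a reference to one cannot be resolved this way.

```python
def repeat_count(expr, resolve):
    """Parser side: ask the caller's handler to evaluate the expression."""
    return int(resolve(expr))


# Caller's tree, reduced to a dict for illustration; it holds only
# visible elements, so the hidden "repeats" is absent.
visible_tree = {"../length": "3"}


def resolve(expr):
    return visible_tree[expr]


count = repeat_count("../length", resolve)

try:
    repeat_count("../repeats", resolve)
    hidden_resolved = True
except KeyError:
    hidden_resolved = False
```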

Perhaps it is time to give some thought to the API aspect of DFDL?

(The implication above is that an existing XPath implementation can't be
used as is. True, but it is only the tree/model interface that needs
changing, and this is internal to the parser. I'm still not happy with
2a/b/c below, which change the XPath externals. I prefer 1 below.)

Regards, Steve

Steve Hanson
WebSphere Message Brokers,
IBM Hursley, England
Internet: smh at uk.ibm.com
Phone (+44)/(0) 1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 19/01/2006 09:53 -----
                                                                           
From:    Steve Hanson/UK/IBM
Date:    19/01/2006 09:43
To:      "Westhead, Martin (Martin)" <westhead at avaya.com>
cc:      dfdl-wg at ggf.org, owner-dfdl-wg at ggf.org
Subject: Re: [dfdl-wg] Ambiguous XPaths to hidden elements

As a DFDL parser implementor I do not want modifications to the XPath
syntax. I want to be able to reuse existing XPath implementations. It's
also something else for the user to have to learn. So 2a/b/c are not
attractive.

Regards, Steve



From:    "Westhead, Martin (Martin)" <westhead at avaya.com>
Sent by: owner-dfdl-wg at ggf.org
Date:    18/01/2006 20:24
To:      <dfdl-wg at ggf.org>
Subject: [dfdl-wg] Ambiguous XPaths to hidden elements

Hi folks,

This is to try to pick up on the issue identified by Suman in today’s call.

The Issue
Consider the following example:

<xs:element name="root">
  <xs:complexType>
    <xs:sequence>
      <xs:annotation>
        <xs:appinfo source="http://dataformat.org">
          <hidden>
            <xs:element name="repeats" type="xs:integer"/>
          </hidden>
        </xs:appinfo>
      </xs:annotation>
      <xs:element name="testElement" type="xs:integer"
                  minOccurs="0" maxOccurs="unbounded"
                  dfdl:repeatCount="../repeats"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

The problem is that the path “../repeats” can be broken by modifications to
the logical model due to name clashes on “repeats” and there are cases that
can be constructed where this would not be obvious to a user.
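
A small sketch of how the clash arises, assuming hidden and visible elements share a single set of names under the same parent. Element and resolve_sibling are hypothetical helpers, not part of any XPath library; the point is only that a name-based step cannot tell the two apart.

```python
from dataclasses import dataclass


@dataclass
class Element:
    name: str
    hidden: bool = False


def resolve_sibling(siblings, name):
    """Return every sibling that a '../name' step could select."""
    return [e for e in siblings if e.name == name]


# The model as first written: one hidden "repeats" next to testElement.
original = [Element("repeats", hidden=True), Element("testElement")]

# A later change to the logical model adds a visible element of the same name.
modified = original + [Element("repeats")]

before = resolve_sibling(original, "repeats")
after = resolve_sibling(modified, "repeats")
```

Before the change the step selects exactly one element; afterwards it selects two, and nothing in the path itself says which one was meant.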

Possible Solutions
Possible fixes to this include:
   1. Disallow XPath references to hidden elements; the user is forced to
      place the material into the global context to refer to it.
   2. Provide a special XPath operator to indicate we are referencing a
      hidden element; possibilities include:
         a. “../hidden(repeats)”
         b. “hidden(../repeats)”
         c. “../dfdl:hidden/repeats”
   3. Only allow hidden elements to be present in top level global complex
      types. These can then be included where needed. (This is the solution
      that Suman was pushing, but I don’t fully understand it; in
      particular I don’t see how it resolves the ambiguity issue.)


I believe my preference here is 2a or 2b followed by 1.

Comments/suggestions/opinions?

Thanks,

Martin

