[dfdl-wg] RE: Attributes - (was Ambiguous XPaths to hidden elements)

Steve Hanson smh at uk.ibm.com
Fri Jan 20 03:55:36 CST 2006


Regarding attributes, it is not true to say that there is no extra work to
do. Once you allow attributes into the equation you have to consider not
only Mike's points below but also questions like where you put
construct-specific annotations for groups of attributes, and how you
handle complex types with simple content. Attributes are not a small delta
on the XML spec. In the existing IBM model we started with a Schema-centric
view and allowed all this and it has caused us many problems in the
implementation. It certainly muddies the waters for non-XML users starting
off with the IBM model, most of whom aren't interested in attributes, to
the extent that many users are put off from using the model altogether. I
think DFDL should be approaching the problem from the "I have some non-XML
data let me model it" angle and not "Here is XML Schema let's provide a way
of representing all its constructs in non-XML form". I strongly believe
that to get DFDL adoption going, the 1.0 spec should do the minimum
possible to permit the modelling of all types of non-XML data. That's a big
enough task in itself. Stuff like allowing an easy transformation into XML
with attributes or substitution groups can wait for a second release, if it
is something that users want. If you make the scope of a DFDL parser too
wide, no-one will write one.
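
For readers unfamiliar with the construct, a complex type with simple
content is an element that carries a plain text value plus attributes. A
minimal XSD sketch follows; the names price and currency are illustrative
assumptions and do not come from the IBM model.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Instance form: <price currency="GBP">12.50</price> -->
  <xs:element name="price">
    <xs:complexType>
      <xs:simpleContent>
        <!-- the element's text value is a decimal -->
        <xs:extension base="xs:decimal">
          <!-- the attribute rides alongside that value -->
          <xs:attribute name="currency" type="xs:string"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A description of non-XML data in this shape would have to say where both
the value and the attribute live in the physical data, which is the kind of
extra question raised above.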

Regards, Steve

Steve Hanson
WebSphere Message Brokers,
IBM Hursley, England
Internet: smh at uk.ibm.com
Phone (+44)/(0) 1962-815848


From: "Westhead, Martin (Martin)" <westhead at avaya.com>
Sent: 19/01/2006 20:59
To: "Mike Beckerle" <beckerle at us.ibm.com>
Cc: <dfdl-wg at ggf.org>, "Suman Kalia" <kalia at ca.ibm.com>, <owner-dfdl-wg at ggf.org>, Steve Hanson/UK/IBM at IBMGB
Subject: RE: Attributes - (was Ambiguous XPaths to hidden elements)

Well I wasn’t saying that. I was saying that you could do that if things
didn’t line up.

I suppose I don’t feel strongly about the restriction; I just don’t really
see the need.

I have a number of examples of XML which are essentially tables with lists
of elements with just attributes. I could imagine it would be easy and
convenient to populate such a logical model directly using annotations
without forcing the user to add hidden layers…

Martin


From: Mike Beckerle [mailto:beckerle at us.ibm.com]
Sent: Thursday, January 19, 2006 12:55 PM
To: Westhead, Martin (Martin)
Cc: dfdl-wg at ggf.org; Suman Kalia; owner-dfdl-wg at ggf.org; Steve Hanson
Subject: RE: Attributes - (was Ambiguous XPaths to hidden elements)


I agree with you.

There is one caveat, though: you are saying in effect that attributes are
only supported via population to/from other elements or attributes (which
can be hidden). The bottom layer must be elements only.

With this restriction I am fine with attributes. The vast bulk of the DFDL
spec will simply be unconcerned with them.

...mikeb

Mike Beckerle
STSM, Architect, Scalable Computing
IBM Software Group
Information Integration Solutions
Westborough, MA 01581
voice and FAX 508-599-7148
home/mobile office 508-915-4767

From: "Westhead, Martin (Martin)" <westhead at avaya.com>
Sent: 01/19/2006 03:45 PM
To: Mike Beckerle/Worcester/IBM at IBMUS
Cc: <dfdl-wg at ggf.org>, "Suman Kalia" <kalia at ca.ibm.com>, <owner-dfdl-wg at ggf.org>, "Steve Hanson" <smh at uk.ibm.com>
Subject: RE: Attributes - (was Ambiguous XPaths to hidden elements)

Perhaps there is a discussion here for later, but with hidden elements I
don’t think there’s much of an issue here.

My take on this is that we are using the XSD to do more than it is supposed
to: we are not just using it to describe valid XML, we are using it
constructively to describe how to build an XML instance. We are
establishing a semantics for how the DFDL parser constructs this XML
document from the XSD and the data.

Now, it is easy to imagine cases where you have a particular XML model you
want to populate which is similar to the data but is perhaps a subset with
slightly different ordering. Hidden elements make this easy to handle. You
put the stuff that is to be omitted in a hidden element. Stuff that is out
of order can be put in hidden elements and then referenced by correctly
ordered elements for value.

In this framework attributes are just another node for data population. If
it happens that you can arrange the XML Schema so that the order in which
attributes are laid out in the file corresponds to the ordering of the
data, well then it’s no different from elements. If there are constraints
that prevent that ordering then you are in a situation just like the one I
describe above and you solve it with hidden elements.

So I don’t really disagree with your points below; it just seems to me that
there’s nothing much to worry about here. I can’t see why we would put it
off – there’s no extra work to do?

Cheers,

Martin






From: Mike Beckerle [mailto:beckerle at us.ibm.com]
Sent: Thursday, January 19, 2006 12:27 PM
To: Westhead, Martin (Martin)
Cc: dfdl-wg at ggf.org; Suman Kalia; owner-dfdl-wg at ggf.org; Steve Hanson
Subject: Re: Attributes - (was Ambiguous XPaths to hidden elements)


Here are some of the issues with attributes:

1) Every element can now have two kinds of children: attributes and
sub-elements. Actual DFDL-described data will typically have just one kind
of sub-field, but some of those want to be mapped to attributes and others
to elements. How do you specify how the attributes are separated from the
elements? Do the attributes have to be first, or last, or can they be
interleaved with the sub-elements' physical data?
2) In XSD syntax, if you define an element with both sub-elements and
attributes, the attributes textually come after the sub-elements. This
means you may or may not have them in the right order for DFDL annotation
to be convenient. E.g., suppose you had 5 children: the first 2 you want to
be attributes and the remaining 3 sub-elements in a sequence. How can you
annotate this conveniently?
3) Attributes are always unordered w.r.t. the XSD concept, but the
sub-elements can be either ordered or unordered.

These all seemed like avoidable issues. That is, I'm not saying there is no
good solution here, I'm just saying it's probably uninteresting.
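
To make issue 2 concrete, here is a minimal XSD sketch of the five-children
case. The names record and a through e are hypothetical. XSD requires the
attribute declarations to follow the content model, so a and b appear last
in the schema even if they come first in the data.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="record">
    <xs:complexType>
      <!-- physical fields 3 to 5, modelled as ordered sub-elements -->
      <xs:sequence>
        <xs:element name="c" type="xs:string"/>
        <xs:element name="d" type="xs:string"/>
        <xs:element name="e" type="xs:string"/>
      </xs:sequence>
      <!-- physical fields 1 and 2: attribute declarations must come
           after the sequence, and XSD gives them no order -->
      <xs:attribute name="a" type="xs:string"/>
      <xs:attribute name="b" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Any annotation scheme that walks the schema top to bottom would meet c, d
and e before a and b, which is the inconvenience described in issue 2.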

One solution to the whole issue is to say that attributes can't be in the
bottom-most layer. I.e., you create them by referencing element values in
hidden or non-hidden layers. This eliminates all the above problems. It is
equivalent to using a stylesheet to create the attributes you want from
elements-only data. However, by use of the layers you can embed it all
right into the DFDL schema.

I also believe attributes are to some extent beside the point for DFDL.
What we needed from XSD is a standardized hierarchical type system that we
can hang representation properties on. We get that by using the
elements-only subset of XSD. Attributes are only needed if we broaden our
agenda to include "creating the XML I want out of my data", which layering
can do.

Mike Beckerle
STSM, Architect, Scalable Computing
IBM Software Group
Information Integration Solutions
Westborough, MA 01581
voice and FAX 508-599-7148
home/mobile office 508-915-4767


From: "Westhead, Martin (Martin)" <westhead at avaya.com>
Sent: 01/19/2006 03:05 PM
To: Mike Beckerle/Worcester/IBM at IBMUS
Cc: <dfdl-wg at ggf.org>, "Suman Kalia" <kalia at ca.ibm.com>, <owner-dfdl-wg at ggf.org>, "Steve Hanson" <smh at uk.ibm.com>
Subject: Attributes - (was Ambiguous XPaths to hidden elements)

I think (hope) that with the emerging operational semantics, that there
will not be any problems putting annotations into attributes and having
them evaluated just like an element. It is not obvious to me that they are
any different.

Martin






From: Mike Beckerle [mailto:beckerle at us.ibm.com]
Sent: Thursday, January 19, 2006 12:03 PM
To: Westhead, Martin (Martin)
Cc: dfdl-wg at ggf.org; Suman Kalia; owner-dfdl-wg at ggf.org; Steve Hanson
Subject: RE: [dfdl-wg] Ambiguous XPaths to hidden elements


I think we did not agree formally to this. It was one of those things where
we're trying to find something we can cut out with low risk of later
complications.

Mike Beckerle
STSM, Architect, Scalable Computing
IBM Software Group
Information Integration Solutions
Westborough, MA 01581
voice and FAX 508-599-7148
home/mobile office 508-915-4767

From: "Westhead, Martin (Martin)" <westhead at avaya.com>
Sent by: owner-dfdl-wg at ggf.org
Sent: 01/19/2006 01:59 PM
To: "Suman Kalia" <kalia at ca.ibm.com>, "Steve Hanson" <smh at uk.ibm.com>
Cc: Mike Beckerle/Worcester/IBM at IBMUS, <dfdl-wg at ggf.org>, <owner-dfdl-wg at ggf.org>
Subject: RE: [dfdl-wg] Ambiguous XPaths to hidden elements

Why are we not allowing attributes?

Martin








From: owner-dfdl-wg at ggf.org [mailto:owner-dfdl-wg at ggf.org] On Behalf Of
Suman Kalia
Sent: Thursday, January 19, 2006 10:57 AM
To: Steve Hanson
Cc: Mike Beckerle; dfdl-wg at ggf.org; owner-dfdl-wg at ggf.org
Subject: Fw: [dfdl-wg] Ambiguous XPaths to hidden elements


The main problem will be performance and excessively long validation times,
and either asking the user to change their schema or to model it a different
way. These are all undesirable. Attributes, I hope, will be supported in a
future release. The redefine construct is hardly used in practical
applications; at least I haven't come across any customer that uses this
construct.

Suman Kalia
IBM Toronto Lab
WebSphere Business Integration Application Connectivity Tools
Tel : 905-413-3923  T/L  969-3923
Fax : 905-413-4850
Internet ID : kalia at ca.ibm.com
----- Forwarded by Suman Kalia/Toronto/IBM on 01/19/2006 01:49 PM -----

From: Steve Hanson <smh at uk.ibm.com>
Sent: 01/19/2006 01:15 PM
To: Suman Kalia/Toronto/IBM at IBMCA
Cc: Mike Beckerle <beckerle at us.ibm.com>, dfdl-wg at ggf.org, owner-dfdl-wg at ggf.org
Subject: Re: Fw: [dfdl-wg] Ambiguous XPaths to hidden elements

We are already putting constraints on user-defined schema, by saying that
we don't support redefines and attributes for example. I don't see an issue
with further constraints if they make DFDL easier to understand and/or
make a DFDL parser easier to create.

I don't have a problem with saying that an XPath must return a single
unambiguous node else it is an error.
I don't have a problem with saying that XPaths can't reference hidden
elements, and that context must be used instead.

Regards, Steve

Steve Hanson
WebSphere Message Brokers,
IBM Hursley, England
Internet: smh at uk.ibm.com
Phone (+44)/(0) 1962-815848

From: Suman Kalia <kalia at ca.ibm.com>
Sent by: owner-dfdl-wg at ggf.org
Sent: 19/01/2006 18:02
To: Mike Beckerle <beckerle at us.ibm.com>
Cc: dfdl-wg at ggf.org, owner-dfdl-wg at ggf.org
Subject: Fw: [dfdl-wg] Ambiguous XPaths to hidden elements

Well, if we go with the global complex type approach (which I described as
option 1 in my previous append) then it is not an issue. XPaths work and
there are no conflicts with user-defined schemas.

Suman Kalia
IBM Toronto Lab
WebSphere Business Integration Application Connectivity Tools
Tel : 905-413-3923  T/L  969-3923
Fax : 905-413-4850
Internet ID : kalia at ca.ibm.com
----- Forwarded by Suman Kalia/Toronto/IBM on 01/19/2006 12:59 PM -----

From: Mike Beckerle/Worcester/IBM at IBMUS
Sent: 01/19/2006 12:59 PM
To: Suman Kalia/Toronto/IBM at IBMCA
Cc: dfdl-wg at ggf.org, owner-dfdl-wg at ggf.org
Subject: Re: Fw: [dfdl-wg] Ambiguous XPaths to hidden elements

So we have a quandary here:

on one hand we don't want to change the XPath syntax to include a device
that would let us be clear that we're navigating a hidden layer

on the other hand we don't want to constrain what can be included so that
we wouldn't need such a device.

...mikeb

Mike Beckerle
STSM, Architect, Scalable Computing
IBM Software Group
Information Integration Solutions
Westborough, MA 01581
voice and FAX 508-599-7148
home/mobile office 508-915-4767

From: Suman Kalia/Toronto/IBM at IBMCA
Sent: 01/19/2006 11:52 AM
To: Mike Beckerle/Worcester/IBM at IBMUS
Cc: dfdl-wg at ggf.org, owner-dfdl-wg at ggf.org
Subject: Fw: [dfdl-wg] Ambiguous XPaths to hidden elements

As a design point, we should strive not to put limitations on
user-defined schemas - it just works out better in the long run.

Note that xsd:groups can be nested, possibly many levels deep, and this
problem is not restricted to groups included from the noTarget namespace;
it could be from any namespace. As per schema rules, all local elements
defined in groups or complex types belong to the noTarget namespace unless
elementFormDefault is explicitly set to "qualified" at the schema level or
form="qualified" is set on the specific element.
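
As an illustration of the schema rule above, the sketch below shows how the
form settings decide the namespace of a local element. The namespace URI
and the names counted and payload are assumptions made for the example.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/a"
           elementFormDefault="unqualified">
  <xs:group name="counted">
    <xs:sequence>
      <!-- local element with the default form: it is in no namespace,
           even though the schema has a target namespace -->
      <xs:element name="repeats" type="xs:integer"/>
      <!-- per-element override: this one is in http://example.com/a -->
      <xs:element name="payload" type="xs:string" form="qualified"/>
    </xs:sequence>
  </xs:group>
</xs:schema>
```

Because "unqualified" is the default when elementFormDefault is omitted, a
local element such as repeats can clash with a same-named local element
from a schema in a completely different target namespace.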

Detecting such conflicts could be quite expensive, particularly when you
have very large schemas. The industry standard ACORD messaging schema is a
good example: it is about 1.5 M and it takes awfully long (hours) to
validate. Putting additional constraints like this will further slow
down validation.

Suman Kalia
IBM Toronto Lab
WebSphere Business Integration Application Connectivity Tools
Tel : 905-413-3923  T/L  969-3923
Fax : 905-413-4850
Internet ID : kalia at ca.ibm.com
----- Forwarded by Suman Kalia/Toronto/IBM on 01/19/2006 11:39 AM -----

From: Mike Beckerle <beckerle at us.ibm.com>
Sent by: owner-dfdl-wg at ggf.org
Sent: 01/19/2006 10:48 AM
To: "Robert E. McGrath" <mcgrath at ncsa.uiuc.edu>
Cc: dfdl-wg at ggf.org, owner-dfdl-wg at ggf.org
Subject: Re: Fw: [dfdl-wg] Ambiguous XPaths to hidden elements

One idea that hasn't been advanced yet is ruling out the problematic case.

Let me illustrate. Here's the example, modified to have a model group
reference which can introduce the name conflict:

<xs:element name="root">
  <xs:complexType>
    <xs:sequence>
      <xs:annotation>
        <xs:appinfo source="http://dataformat.org">
          <hidden>
            <xs:element name="repeats" type="xs:integer"/>
          </hidden>
        </xs:appinfo>
      </xs:annotation>
      <xs:element name="testElement" type="xs:integer"
                  minOccurs="0" maxOccurs="unbounded"
                  dfdl:repeatCount="../repeats"/>
      <xs:group ref="groupFromOtherSchemaFile"/>
      <!-- what if this has an element decl named "repeats"? -->
    </xs:sequence>
  </xs:complexType>
</xs:element>

So, what hasn't been suggested yet is this: What if we just say DFDL
doesn't allow this? It's an error which must be detected. This DFDL schema
is broken because the path "../repeats" cannot be analyzed along with the
DFDL schema to return only a single node.

I believe name conflicts like this are what namespace management is for.
XSD has truly great namespace management. You can solve the problem that
way.

Furthermore, when you define a reusable named group like the definer of the
"groupFromOtherSchemaFile" above, and you put it in no target namespace,
that's the situation where this conflict can arise. Expecting that your
names are never going to conflict with anything in that case is just naive.
It's equivalent to having global variables in a C program module and
expecting you can never link it to something else that uses the same names.
Those name conflicts can occur, and someone has to change the conflicting
name. In XSD we can do that by including the group in a schema which puts
it into a target namespace so that after that the namespaces can be used to
disambiguate.

The approach above is consistent with the path "../repeats" still being
officially an "XPath"; it just adds the semantic restriction that it must be
an XPath that identifies a single node unambiguously, independent of what
data is being processed. This is one of these "data independent" notions
(what I had previously been calling "static"), as we discussed yesterday.
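
The namespace-based fix described earlier in this message might look
roughly like the following. The file names, the namespace URI and the
"other" prefix are hypothetical, and the sketch assumes the group's schema
sets elementFormDefault="qualified" so that its local elements pick up the
wrapper's namespace.

```xml
<!-- group.xsd (hypothetical): reusable group with no target namespace -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">
  <xs:group name="groupFromOtherSchemaFile">
    <xs:sequence>
      <!-- the declaration that clashes with the hidden "repeats" -->
      <xs:element name="repeats" type="xs:integer"/>
    </xs:sequence>
  </xs:group>
</xs:schema>

<!-- wrapper.xsd (hypothetical): including group.xsd from a schema that
     has a target namespace places the group's components in it -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/other">
  <xs:include schemaLocation="group.xsd"/>
</xs:schema>
```

A schema that imports wrapper.xsd and binds the prefix other to
http://example.com/other would refer to the group as
other:groupFromOtherSchemaFile. The group's element then has the qualified
name other:repeats, so the unprefixed path "../repeats" can only match the
hidden element, assuming that one stays in no namespace.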

...mikeb

From: "Robert E. McGrath" <mcgrath at ncsa.uiuc.edu>
Sent by: owner-dfdl-wg at ggf.org
Sent: 01/19/2006 10:00 AM
To: dfdl-wg at ggf.org
Subject: Re: Fw: [dfdl-wg] Ambiguous XPaths to hidden elements

I would want to change XPath only as a last resort.  (Any of the
options is OK by me, assuming we have to mess with the XPath
at all.)

Can we deal with this some other way?

Can we document the problematic cases, and suggest best practices that
will minimize the problem?

On Thursday 19 January 2006 08:45, Suman Kalia wrote:
> I fully agree with Steve - let's not invent another XPath-like syntax.
>
> Suman Kalia
> IBM Toronto Lab
> WebSphere Business Integration Application Connectivity Tools
> Tel : 905-413-3923  T/L  969-3923
> Fax : 905-413-4850
> Internet ID : kalia at ca.ibm.com
> ----- Forwarded by Suman Kalia/Toronto/IBM on 01/19/2006 09:43 AM -----
>
> Steve Hanson <smh at uk.ibm.com>
> Sent by: owner-dfdl-wg at ggf.org
> 01/19/2006 04:43 AM
>
> To
> "Westhead, Martin (Martin)" <westhead at avaya.com>
> cc
> dfdl-wg at ggf.org, owner-dfdl-wg at ggf.org
> Subject
> Re: [dfdl-wg] Ambiguous XPaths to hidden elements
>
>
>
>
>
>
> As a DFDL parser implementor I do not want modifications to the XPath
> syntax. I want to be able to reuse existing XPath implementations. It's
> also something else for the user to have to learn. So 2a/b/c are not
> attractive.
>
> Regards, Steve
>
> Steve Hanson
> WebSphere Message Brokers,
> IBM Hursley, England
> Internet: smh at uk.ibm.com
> Phone (+44)/(0) 1962-815848
>
>
>
> From: "Westhead, Martin (Martin)" <westhead at avaya.com>
> Sent by: owner-dfdl-wg at ggf.org
> Sent: 18/01/2006 20:24
> To: <dfdl-wg at ggf.org>
> Subject: [dfdl-wg] Ambiguous XPaths to hidden elements
>
> Hi folks,
>
> This is to try to pick up on the issue identified by Suman in today's
> call.
>
> The Issue
> Consider the following example:
>
> <xs:element name="root">
>   <xs:complexType>
>     <xs:sequence>
>       <xs:annotation>
>         <xs:appinfo source="http://dataformat.org">
>           <hidden>
>             <xs:element name="repeats" type="xs:integer"/>
>           </hidden>
>         </xs:appinfo>
>       </xs:annotation>
>       <xs:element name="testElement" type="xs:integer"
>                   minOccurs="0" maxOccurs="unbounded"
>                   dfdl:repeatCount="../repeats"/>
>     </xs:sequence>
>   </xs:complexType>
> </xs:element>
>
> The problem is that the path "../repeats" can be broken by modifications to
> the logical model due to name clashes on "repeats" and there are cases that
> can be constructed where this would not be obvious to a user.
>
> Possible Solutions
> Possible fixes to this include:
>    1.  Disallow XPath references to hidden elements; the user is forced to
>        place the material into the global context to refer to it.
>    2.  Provide a special XPath operator to indicate we are referencing a
>        hidden element, possibilities include:
>          a.  "../hidden(repeats)"
>          b.  "hidden(../repeats)"
>          c.  "../dfdl:hidden/repeats"
>    3.  Only allow hidden elements to be present in top level global complex

