[dfdl-wg] RE: Attributes - (was Ambiguous XPaths to hidden elements)

Westhead, Martin (Martin) westhead at avaya.com
Fri Jan 20 09:58:02 CST 2006


I'm afraid I don't understand. I guess I need to see some real examples.

I hear you saying that including attributes would require significant
extra work. That it would make the language more complicated for the
users and harder to write implementations of. Unfortunately I don't have
enough of an implementation perspective to understand or argue with you.

What I propose is this. Let us finish this discussion of the underlying
semantics; I have more material to present. And then perhaps we can have
this discussion with some examples. From where I am sitting I don't see
attributes introducing all the corner cases you are worried about. If
the generalization is right some of these may simplify.

Cheers,

Martin

>-----Original Message-----
>From: owner-dfdl-wg at ggf.org [mailto:owner-dfdl-wg at ggf.org] On Behalf Of
>Steve Hanson
>Sent: Friday, January 20, 2006 1:56 AM
>To: Westhead, Martin (Martin)
>Cc: dfdl-wg at ggf.org
>Subject: [dfdl-wg] RE: Attributes - (was Ambiguous XPaths to hidden
>elements)
>
>Regarding attributes, it is not true to say that there is no extra work to
>do. Once you allow attributes into the equation you have to consider not
>only Mike's points below but also questions like where you put
>construct-specific annotations for groups of attributes, and how you
>handle complex types with simple content. Attributes are not a small delta
>on the XML spec. In the existing IBM model we started with a Schema-centric
>view and allowed all this, and it has caused us many problems in the
>implementation. It certainly muddies the waters for non-XML users starting
>off with the IBM model, most of whom aren't interested in attributes, to
>the extent that many users are put off from using the model altogether. I
>think DFDL should be approaching the problem from the "I have some non-XML
>data, let me model it" angle and not "Here is XML Schema, let's provide a
>way of representing all its constructs in non-XML form". I strongly believe
>that to get DFDL adoption going, the 1.0 spec should do the minimum
>possible to permit the modelling of all types of non-XML data. That's a big
>enough task in itself. Stuff like allowing an easy transformation into XML
>with attributes or substitution groups can wait for a second release, if it
>is something that users want. If you make the scope of a DFDL parser too
>wide, no-one will write one.
>
>Regards, Steve
>
>Steve Hanson
>WebSphere Message Brokers,
>IBM Hursley, England
>Internet: smh at uk.ibm.com
>Phone (+44)/(0) 1962-815848
>
>From: "Westhead, Martin (Martin)" <westhead at avaya.com>
>Sent: 19/01/2006 20:59
>To: "Mike Beckerle" <beckerle at us.ibm.com>
>Cc: <dfdl-wg at ggf.org>, "Suman Kalia" <kalia at ca.ibm.com>,
>    <owner-dfdl-wg at ggf.org>, Steve Hanson/UK/IBM at IBMGB
>Subject: RE: Attributes - (was Ambiguous XPaths to hidden elements)
>
>Well, I wasn't saying that. I was saying that you could do that if things
>didn't line up.
>
>I suppose I don't feel strongly about the restriction; I just don't really
>see the need.
>
>I have a number of examples of XML which are essentially tables with lists
>of elements with just attributes. I could imagine it would be easy and
>convenient to populate such a logical model directly using annotations
>without forcing the user to add hidden layers...
>
>Martin
>
>
>From: Mike Beckerle [mailto:beckerle at us.ibm.com]
>Sent: Thursday, January 19, 2006 12:55 PM
>To: Westhead, Martin (Martin)
>Cc: dfdl-wg at ggf.org; Suman Kalia; owner-dfdl-wg at ggf.org; Steve Hanson
>Subject: RE: Attributes - (was Ambiguous XPaths to hidden elements)
>
>
>I agree with you.
>
>There is one caveat though: you are saying in effect that attributes are
>only supported via population to/from other elements or attributes (which
>can be hidden). The bottom layer must be elements only.
>
>With this restriction I am fine with attributes. The vast bulk of the DFDL
>spec will simply be unconcerned with them.
>
>...mikeb
>
>Mike Beckerle
>STSM, Architect, Scalable Computing
>IBM Software Group
>Information Integration Solutions
>Westborough, MA 01581
>voice and FAX 508-599-7148
>home/mobile office 508-915-4767
>
>From: "Westhead, Martin (Martin)" <westhead at avaya.com>
>Sent: 01/19/2006 03:45 PM
>To: Mike Beckerle/Worcester/IBM at IBMUS
>Cc: <dfdl-wg at ggf.org>, "Suman Kalia" <kalia at ca.ibm.com>,
>    <owner-dfdl-wg at ggf.org>, "Steve Hanson" <smh at uk.ibm.com>
>Subject: RE: Attributes - (was Ambiguous XPaths to hidden elements)
>
>Perhaps there is a discussion here for later, but with hidden elements I
>don't think there's much of an issue here.
>
>My take on this is that we are using the XSD to do more than it is supposed
>to: we are not just using it to describe valid XML, we are using it
>constructively to describe how to build an XML instance. We are
>establishing a semantics for how the DFDL parser constructs this XML
>document from the XSD and the data.
>
>Now, it is easy to imagine cases where you have a particular XML model you
>want to populate which is similar to the data but is perhaps a subset with
>slightly different ordering. Hidden elements make this easy to handle. You
>put the stuff that is to be omitted in a hidden element. Stuff that is out
>of order can be put in hidden elements and then referenced by correctly
>ordered elements for value.
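
A minimal sketch of the reorder-and-omit idea described above. The field names, the `hidden` dictionary and the `LOGICAL_ORDER` list are all hypothetical; nothing here is DFDL syntax. It only illustrates parsing everything into a hidden layer in physical order and then building the logical model by reference.

```python
# Hypothetical fields, in the order they appear in the physical data.
physical = [("checksum", "9f"), ("last", "Smith"), ("first", "Ann")]

# Hidden layer: every parsed field, addressable by name.
hidden = dict(physical)

# Logical model: a subset of the hidden fields, in a different order.
# "checksum" is omitted simply by never being referenced.
LOGICAL_ORDER = ["first", "last"]

# Each logical element takes its value from the hidden element of that name.
logical = [(name, hidden[name]) for name in LOGICAL_ORDER]
print(logical)
```

The physical order only matters when filling the hidden layer; the logical layer is free to pick what it needs.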
>
>In this framework attributes are just another node for data population. If
>it happens that you can arrange the XML Schema so that the order in which
>attributes are laid out in the file corresponds to the ordering of the
>data, well then it's no different from elements. If there are constraints
>that prevent that ordering then you are in a situation just like the one I
>describe above and you solve it with hidden elements.
>
>So I don't really disagree with your points below; it just seems to me that
>there's nothing much to worry about here. I can't see why we would put it
>off - there's no extra work to do?
>
>Cheers,
>
>Martin
>
>
>
>
>
>
>From: Mike Beckerle [mailto:beckerle at us.ibm.com]
>Sent: Thursday, January 19, 2006 12:27 PM
>To: Westhead, Martin (Martin)
>Cc: dfdl-wg at ggf.org; Suman Kalia; owner-dfdl-wg at ggf.org; Steve Hanson
>Subject: Re: Attributes - (was Ambiguous XPaths to hidden elements)
>
>
>Here are some of the issues with attributes:
>
>1) Every element can now have two kinds of children, attributes and
>sub-elements. Actual DFDL-described data will typically have just one kind
>of sub-field, but some want to be mapped to attributes, others to
>elements. How do you specify how the attributes are separated from the
>elements? Do the attributes have to be first, or last, or can they be
>interleaved with the sub-elements' physical data?
>2) In XSD syntax, if you define an element with both sub-elements and
>attributes, the attributes textually come after the sub-elements. This
>means you may or may not have them in the right order for DFDL annotation
>to be convenient. E.g., suppose you had 5 children, the first 2 of which
>you want to be attributes and the remaining 3 sub-elements in a sequence.
>How can you annotate this conveniently?
>3) The attributes are always unordered w.r.t. the XSD concept, but the
>sub-elements can be either ordered or unordered.
>
>These all seemed like avoidable issues. That is, I'm not saying there is no
>good solution here, I'm just saying it's probably uninteresting.
>
>One solution to the whole issue is to say that attributes can't be in the
>bottom-most layer. I.e., you create them by referencing element values in
>hidden or non-hidden layers. This eliminates all the above problems. It is
>equivalent to using a stylesheet to create the attributes you want from
>elements-only data. However, by use of the layers you can embed it all
>right into the DFDL schema.
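
A minimal sketch of the "stylesheet-like" step described above, using Python's standard `xml.etree.ElementTree`. The element names and the `AS_ATTRIBUTES` mapping are hypothetical; the point is only that the bottom layer stays elements-only and attributes are created afterwards from element values.

```python
import xml.etree.ElementTree as ET

# Bottom layer: elements only, in physical data order (hypothetical names).
bottom = ET.fromstring(
    "<row><id>42</id><unit>kg</unit><value>3.5</value></row>"
)

# Hypothetical mapping: which bottom-layer children become attributes.
AS_ATTRIBUTES = ("id", "unit")


def to_logical(row: ET.Element) -> ET.Element:
    """Build the logical element, turning selected children into attributes."""
    out = ET.Element(row.tag)
    for child in row:
        if child.tag in AS_ATTRIBUTES:
            # Attribute value is taken by reference from the element value.
            out.set(child.tag, child.text or "")
        else:
            ET.SubElement(out, child.tag).text = child.text
    return out


logical = to_logical(bottom)
print(ET.tostring(logical, encoding="unicode"))
```

Because the parser only ever sees elements, questions such as where attributes sit in the physical data never arise in the bottom layer.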
>
>I also believe attributes are to some extent beside the point for DFDL.
>What we needed from XSD is a standardized hierarchical type system that we
>can hang representation properties on. We get that by using the
>elements-only subset of XSD. Attributes are only needed if we broaden our
>agenda to include "creating the XML I want out of my data", which layering
>can do.
>
>Mike Beckerle
>STSM, Architect, Scalable Computing
>IBM Software Group
>Information Integration Solutions
>Westborough, MA 01581
>voice and FAX 508-599-7148
>home/mobile office 508-915-4767
>
>From: "Westhead, Martin (Martin)" <westhead at avaya.com>
>Sent: 01/19/2006 03:05 PM
>To: Mike Beckerle/Worcester/IBM at IBMUS
>Cc: <dfdl-wg at ggf.org>, "Suman Kalia" <kalia at ca.ibm.com>,
>    <owner-dfdl-wg at ggf.org>, "Steve Hanson" <smh at uk.ibm.com>
>Subject: Attributes - (was Ambiguous XPaths to hidden elements)
>
>I think (hope) that with the emerging operational semantics, there
>will not be any problems putting annotations into attributes and having
>them evaluated just like an element. It is not obvious to me that they are
>any different.
>
>Martin
>
>
>
>
>
>
>From: Mike Beckerle [mailto:beckerle at us.ibm.com]
>Sent: Thursday, January 19, 2006 12:03 PM
>To: Westhead, Martin (Martin)
>Cc: dfdl-wg at ggf.org; Suman Kalia; owner-dfdl-wg at ggf.org; Steve Hanson
>Subject: RE: [dfdl-wg] Ambiguous XPaths to hidden elements
>
>
>I think we did not agree formally to this. It was one of those things where
>we're trying to find something we can cut out with low risk of later
>complications.
>
>Mike Beckerle
>STSM, Architect, Scalable Computing
>IBM Software Group
>Information Integration Solutions
>Westborough, MA 01581
>voice and FAX 508-599-7148
>home/mobile office 508-915-4767
>
>From: "Westhead, Martin (Martin)" <westhead at avaya.com>
>Sent by: owner-dfdl-wg at ggf.org
>Sent: 01/19/2006 01:59 PM
>To: "Suman Kalia" <kalia at ca.ibm.com>, "Steve Hanson" <smh at uk.ibm.com>
>Cc: Mike Beckerle/Worcester/IBM at IBMUS, <dfdl-wg at ggf.org>,
>    <owner-dfdl-wg at ggf.org>
>Subject: RE: [dfdl-wg] Ambiguous XPaths to hidden elements
>
>Why are we not allowing attributes?
>
>Martin
>
>
>
>
>
>
>
>
>From: owner-dfdl-wg at ggf.org [mailto:owner-dfdl-wg at ggf.org] On Behalf Of
>Suman Kalia
>Sent: Thursday, January 19, 2006 10:57 AM
>To: Steve Hanson
>Cc: Mike Beckerle; dfdl-wg at ggf.org; owner-dfdl-wg at ggf.org
>Subject: Fw: [dfdl-wg] Ambiguous XPaths to hidden elements
>
>
>The main problems will be performance, excessively long validation times,
>and having to ask users either to change their schema or to model it a
>different way. These are all undesirable. Attributes, I hope, will be
>supported in a future release. The redefine construct is hardly used in
>practical applications; at least I haven't come across any customer that
>uses it.
>
>Suman Kalia
>IBM Toronto Lab
>WebSphere Business Integration Application Connectivity Tools
>Tel : 905-413-3923  T/L  969-3923
>Fax : 905-413-4850
>Internet ID : kalia at ca.ibm.com
>----- Forwarded by Suman Kalia/Toronto/IBM on 01/19/2006 01:49 PM -----
>
>From: Steve Hanson <smh at uk.ibm.com>
>Sent: 01/19/2006 01:15 PM
>To: Suman Kalia/Toronto/IBM at IBMCA
>Cc: Mike Beckerle <beckerle at us.ibm.com>, dfdl-wg at ggf.org,
>    owner-dfdl-wg at ggf.org
>Subject: Re: Fw: [dfdl-wg] Ambiguous XPaths to hidden elements
>
>We are already putting constraints on user-defined schemas, by saying that
>we don't support redefines and attributes for example. I don't see an issue
>with further constraints if they make DFDL easier to understand and/or
>make it easier to create a DFDL parser.
>
>I don't have a problem with saying that an XPath must return a single
>unambiguous node else it is an error.
>I don't have a problem with saying the XPaths can't reference hidden
>elements, and that context must be used instead.
>
>Regards, Steve
>
>Steve Hanson
>WebSphere Message Brokers,
>IBM Hursley, England
>Internet: smh at uk.ibm.com
>Phone (+44)/(0) 1962-815848
>
>From: Suman Kalia <kalia at ca.ibm.com>
>Sent by: owner-dfdl-wg at ggf.org
>Sent: 19/01/2006 18:02
>To: Mike Beckerle <beckerle at us.ibm.com>
>Cc: dfdl-wg at ggf.org, owner-dfdl-wg at ggf.org
>Subject: Fw: [dfdl-wg] Ambiguous XPaths to hidden elements
>
>Well, if we go with the global complex type approach (which I described as
>option 1 in my previous append) then it is not an issue: XPaths work and
>there are no conflicts with user-defined schemas.
>
>Suman Kalia
>IBM Toronto Lab
>WebSphere Business Integration Application Connectivity Tools
>Tel : 905-413-3923  T/L  969-3923
>Fax : 905-413-4850
>Internet ID : kalia at ca.ibm.com
>----- Forwarded by Suman Kalia/Toronto/IBM on 01/19/2006 12:59 PM -----
>
>From: Mike Beckerle/Worcester/IBM at IBMUS
>Sent: 01/19/2006 12:59 PM
>To: Suman Kalia/Toronto/IBM at IBMCA
>Cc: dfdl-wg at ggf.org, owner-dfdl-wg at ggf.org
>Subject: Re: Fw: [dfdl-wg] Ambiguous XPaths to hidden elements
>
>So we have a quandary here:
>
>On one hand, we don't want to change the XPath syntax to include a device
>that would let us be clear that we're navigating a hidden layer.
>
>On the other hand, we don't want to constrain what can be included so that
>we wouldn't need such a device.
>
>...mikeb
>
>Mike Beckerle
>STSM, Architect, Scalable Computing
>IBM Software Group
>Information Integration Solutions
>Westborough, MA 01581
>voice and FAX 508-599-7148
>home/mobile office 508-915-4767
>
>From: Suman Kalia/Toronto/IBM at IBMCA
>Sent: 01/19/2006 11:52 AM
>To: Mike Beckerle/Worcester/IBM at IBMUS
>Cc: dfdl-wg at ggf.org, owner-dfdl-wg at ggf.org
>Subject: Fw: [dfdl-wg] Ambiguous XPaths to hidden elements
>
>As a design point, we should strive not to put limitations on
>user-defined schemas - it just works out better in the long run.
>
>Note that xsd:groups can be nested, possibly many levels deep, and
>this problem is not restricted to groups included from the noTarget
>namespace; it could be from any namespace. As per schema rules, all local
>elements defined in groups or complex types belong to the noTarget
>namespace unless elementFormDefault is explicitly set to "qualified" at
>schema level, or form is set to "qualified" on the specific element.
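
The schema rule just mentioned can be sketched as a small function. The function name and the URN are hypothetical; the rule it encodes is the XSD one: a local element declaration is in the schema's target namespace only when its effective `form` is "qualified", where a `form` on the element overrides the schema-level `elementFormDefault`.

```python
from typing import Optional


def local_element_namespace(target_namespace: str,
                            element_form_default: str = "unqualified",
                            form: Optional[str] = None) -> str:
    """Return the namespace of a local element declaration.

    An empty string stands for "no target namespace".
    """
    # A form attribute on the element itself wins over the schema default.
    effective = form if form is not None else element_form_default
    return target_namespace if effective == "qualified" else ""


# Default: local elements are unqualified, whatever the target namespace is.
print(repr(local_element_namespace("urn:example:group")))
# Qualified for the whole schema.
print(repr(local_element_namespace("urn:example:group", "qualified")))
# Qualified for one element only.
print(repr(local_element_namespace("urn:example:group", form="qualified")))
```

This is why giving a group's schema a target namespace does not by itself separate its local element names from others: with the default form they all remain unqualified.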
>
>Detecting such conflicts could be quite expensive, particularly when you
>have very large schemas. The industry-standard ACORD messaging schema is a
>good example: it is about 1.5 M and it takes an awfully long time (hours)
>to validate. Putting additional constraints like this will further slow
>down validation.
>
>Suman Kalia
>IBM Toronto Lab
>WebSphere Business Integration Application Connectivity Tools
>Tel : 905-413-3923  T/L  969-3923
>Fax : 905-413-4850
>Internet ID : kalia at ca.ibm.com
>----- Forwarded by Suman Kalia/Toronto/IBM on 01/19/2006 11:39 AM -----
>
>From: Mike Beckerle <beckerle at us.ibm.com>
>Sent by: owner-dfdl-wg at ggf.org
>Sent: 01/19/2006 10:48 AM
>To: "Robert E. McGrath" <mcgrath at ncsa.uiuc.edu>
>Cc: dfdl-wg at ggf.org, owner-dfdl-wg at ggf.org
>Subject: Re: Fw: [dfdl-wg] Ambiguous XPaths to hidden elements
>
>One idea that hasn't been advanced yet is ruling out the problematic case.
>
>Let me illustrate. Here's the example, modified to have a model group
>reference which can introduce the name conflict:
>
><xs:element name="root">
>  <xs:complexType>
>    <xs:sequence>
>      <xs:annotation>
>        <xs:appinfo source="http://dataformat.org">
>          <hidden>
>            <xs:element name="repeats" type="xs:integer"/>
>          </hidden>
>        </xs:appinfo>
>      </xs:annotation>
>      <xs:element name="testElement" type="xs:integer"
>                  minOccurs="0" maxOccurs="unbounded"
>                  dfdl:repeatCount="../repeats"/>
>      <xs:group ref="groupFromOtherSchemaFile"/>
>      <!-- what if this has an element decl named "repeats"? -->
>    </xs:sequence>
>  </xs:complexType>
></xs:element>
>
>So, what hasn't been suggested yet is this: What if we just say DFDL
>doesn't allow this? It's an error which must be detected. This DFDL schema
>is broken because the path "../repeats" cannot be analyzed along with the
>DFDL schema to return only a single node.
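
A minimal sketch of such a schema-time check. The `Decl` record, the flattened list of sibling declarations and the `AmbiguousPathError` name are all hypothetical modelling choices, not part of any DFDL proposal. The check inspects declarations only, never data.

```python
from typing import List, NamedTuple


class Decl(NamedTuple):
    namespace: str   # "" stands for no target namespace
    name: str
    origin: str      # where the declaration came from, for error messages


class AmbiguousPathError(Exception):
    """Raised when a path does not identify exactly one declaration."""


def resolve_sibling(decls: List[Decl], namespace: str, name: str) -> Decl:
    """Resolve a '../name' style reference against sibling declarations."""
    matches = [d for d in decls if d.namespace == namespace and d.name == name]
    if len(matches) != 1:
        origins = ", ".join(d.origin for d in matches) or "nothing"
        raise AmbiguousPathError(
            f"'../{name}' matches {len(matches)} declarations ({origins})"
        )
    return matches[0]


# Declarations visible inside "root": the hidden element, the local
# element, and whatever the referenced group happens to contribute.
decls = [
    Decl("", "repeats", "hidden element in root"),
    Decl("", "testElement", "local element in root"),
    Decl("", "repeats", "groupFromOtherSchemaFile"),
]

try:
    resolve_sibling(decls, "", "repeats")
except AmbiguousPathError as err:
    print(err)
```

A real implementation would have to flatten nested group references to build the declaration list, which is where the validation cost raised earlier in the thread comes in.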
>
>I believe name conflicts like this are what namespace management is for.
>XSD has truly great namespace management. You can solve the problem that
>way.
>Furthermore, when you define a reusable named group like the definer of the
>"groupFromOtherSchemaFile" above, and you put it in no target namespace,
>that's the situation where this conflict can arise. Expecting that your
>names are never going to conflict with anything in that case is just naive.
>It's equivalent to having global variables in a C program module and
>expecting you can never link it to something else that uses the same names.
>Those name conflicts can occur, and someone has to change the conflicting
>name. In XSD we can do that by including the group in a schema which puts
>it into a target namespace so that after that the namespaces can be used to
>disambiguate.
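
A minimal sketch of namespace disambiguation, using Python's standard `xml.etree.ElementTree` on two small instance documents. The documents and the `urn:example:group` namespace are hypothetical, and the hidden element is shown as an ordinary child purely for illustration.

```python
import xml.etree.ElementTree as ET

# Both "repeats" elements are in no namespace: the name clashes.
CLASH = """
<root>
  <repeats>3</repeats>
  <testElement>1</testElement>
  <repeats>7</repeats>
</root>
"""

# The group's element now lives in its own (hypothetical) namespace.
QUALIFIED = """
<root xmlns:g="urn:example:group">
  <repeats>3</repeats>
  <testElement>1</testElement>
  <g:repeats>7</g:repeats>
</root>
"""
NS = {"g": "urn:example:group"}

clash_root = ET.fromstring(CLASH)
qualified_root = ET.fromstring(QUALIFIED)

# Unqualified lookup is ambiguous in the first document...
ambiguous = clash_root.findall("repeats")
# ...but selects exactly one node once the group's element is qualified.
hidden_only = qualified_root.findall("repeats")
group_only = qualified_root.findall("g:repeats", NS)

print(len(ambiguous), len(hidden_only), len(group_only))
```

The path text "repeats" is unchanged in both lookups; only the namespace of the conflicting element moved, which is the point of fixing the clash on the schema side rather than in the XPath syntax.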
>
>The approach above is consistent with the path "../repeats" still being
>officially an "XPath"; it just adds the semantic restriction that it must
>be an XPath that identifies a single node unambiguously, independent of
>what data is being processed. This is one of these "data independent"
>notions (what I had previously been calling "static"), as we discussed
>yesterday.
>
>...mikeb
>
>From: "Robert E. McGrath" <mcgrath at ncsa.uiuc.edu>
>Sent by: owner-dfdl-wg at ggf.org
>Sent: 01/19/2006 10:00 AM
>To: dfdl-wg at ggf.org
>Subject: Re: Fw: [dfdl-wg] Ambiguous XPaths to hidden elements
>
>I would want to change XPath only as a last resort.  (Any of the
>options is OK by me, assuming we have to mess with the XPath
>at all.)
>
>Can we deal with this some other way?
>
>Can we document the problematic cases, and suggest best practices that
>will minimize the problem?
>
>On Thursday 19 January 2006 08:45, Suman Kalia wrote:
>> I fully agree with Steve - let's not invent another XPath-like syntax.
>>
>> Suman Kalia
>> IBM Toronto Lab
>> WebSphere Business Integration Application Connectivity Tools
>> Tel : 905-413-3923  T/L  969-3923
>> Fax : 905-413-4850
>> Internet ID : kalia at ca.ibm.com
>> ----- Forwarded by Suman Kalia/Toronto/IBM on 01/19/2006 09:43 AM -----
>>
>> Steve Hanson <smh at uk.ibm.com>
>> Sent by: owner-dfdl-wg at ggf.org
>> 01/19/2006 04:43 AM
>>
>> To
>> "Westhead, Martin (Martin)" <westhead at avaya.com>
>> cc
>> dfdl-wg at ggf.org, owner-dfdl-wg at ggf.org
>> Subject
>> Re: [dfdl-wg] Ambiguous XPaths to hidden elements
>>
>>
>>
>>
>>
>>
>> As a DFDL parser implementor I do not want modifications to the XPath
>> syntax. I want to be able to reuse existing XPath implementations. It's
>> also something else for the user to have to learn. So 2a/b/c are not
>> attractive.
>>
>> Regards, Steve
>>
>> Steve Hanson
>> WebSphere Message Brokers,
>> IBM Hursley, England
>> Internet: smh at uk.ibm.com
>> Phone (+44)/(0) 1962-815848
>>
>>
>> From: "Westhead, Martin (Martin)" <westhead at avaya.com>
>> Sent by: owner-dfdl-wg at ggf.org
>> Sent: 18/01/2006 20:24
>> To: <dfdl-wg at ggf.org>
>> Subject: [dfdl-wg] Ambiguous XPaths to hidden elements
>>
>> Hi folks,
>>
>> This is to try to pick up on the issue identified by Suman in today's
>> call.
>>
>> The Issue
>> Consider the following example:
>>
>> <xs:element name="root">
>>   <xs:complexType>
>>     <xs:sequence>
>>       <xs:annotation>
>>         <xs:appinfo source="http://dataformat.org">
>>           <hidden>
>>             <xs:element name="repeats" type="xs:integer"/>
>>           </hidden>
>>         </xs:appinfo>
>>       </xs:annotation>
>>       <xs:element name="testElement" type="xs:integer"
>>                   minOccurs="0" maxOccurs="unbounded"
>>                   dfdl:repeatCount="../repeats"/>
>>     </xs:sequence>
>>   </xs:complexType>
>> </xs:element>
>>
>> The problem is that the path "../repeats" can be broken by modifications
>> to the logical model due to name clashes on "repeats", and there are
>> cases that can be constructed where this would not be obvious to a user.
>>
>> Possible Solutions
>> Possible fixes to this include:
>>    1. Disallow XPath references to hidden elements; the user is forced to
>>       place the material into the global context to refer to it.
>>    2. Provide a special XPath operator to indicate we are referencing a
>>       hidden element; possibilities include:
>>          a. "../hidden(repeats)"
>>          b. "hidden(../repeats)"
>>          c. "../dfdl:hidden/repeats"
>>    3. Only allow hidden elements to be present in top level global
>>       complex types. ([...] but I don't fully understand it - in
>>       particular I don't see how it resolves the ambiguity issue.)
>>
>>
>> I believe my preference here is 2a or 2b followed by 1.
>>
>> Comments/suggestions/opinions?
>>
>> Thanks,
>>
>> Martin
>
>--
>---
>Robert E. McGrath, Ph.D.
>National Center for Supercomputing Applications
>University of Illinois, Urbana-Champaign
>1205 West Clark
>Urbana, Illinois 61801
>(217)-333-6549
>
>mcgrath at ncsa.uiuc.edu





