[dfdl-wg] Annotation complexity

Fri Nov 19 17:13:00 CST 2004

In my experience the rep properties fall in two classes with respect to
being inherited by contained elements.

Type 1:  e.g., byteOrder - applies to the element and is inherited by the
contained scope. It can be overridden by a more local definition.

Type 2: e.g., delimiters (terminators and separators),  or alignment -
applies only to the element or grouping structure to which it is attached.

Example: Consider a CSV file.

Each row is terminated by a CRLF.

A comma separates the fields of a row. These fields are "contained" within
the row. But the fields are NOT terminated by a CRLF, so we don't want to
inherit that terminator def down the containment hierarchy.

At some point we'll have to categorize all the rep properties into these two
classes.
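
A minimal sketch of the two classes, in Python, may make the distinction
concrete. The property names (byteOrder, terminator, separator), the
INHERITED set, and the Node class are assumptions made up for illustration;
none of this is proposed DFDL syntax.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumption: which properties flow down the containment hierarchy (Type 1).
# Anything not listed here is treated as Type 2 (local to its own node).
INHERITED = {"byteOrder"}


@dataclass
class Node:
    name: str
    props: dict = field(default_factory=dict)
    parent: Optional["Node"] = None

    def child(self, name, **props):
        # Create a contained element and remember its container.
        return Node(name, props, parent=self)


def resolve(node, prop):
    """Return the effective value of prop on node, or None if unset."""
    if prop in node.props:
        return node.props[prop]            # a local definition always wins
    if prop in INHERITED and node.parent is not None:
        return resolve(node.parent, prop)  # Type 1: look in enclosing scope
    return None                            # Type 2: never inherited


# CSV example from above: rows end in CRLF, fields are comma separated.
csv_file = Node("file", {"byteOrder": "bigEndian"})
row = csv_file.child("row", terminator="\r\n", separator=",")
fld = row.child("field")

print(resolve(fld, "byteOrder"))   # inherited from the file
print(resolve(fld, "terminator"))  # None: not inherited from the row
```

Categorizing a property then amounts to deciding whether its name belongs
in INHERITED.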

...mikeb

  _____  

From: Myers, James D [mailto:jim.myers at pnl.gov] 
Sent: Friday, November 19, 2004 4:13 PM
To: dfdl-wg at gridforum.org
Subject: RE: [dfdl-wg] Annotation complexity

I think I get the issue, but not necessarily the proposed solution. I can
have an element with a complex type that contains other elements with
complex types and I might want params such as littleendian to follow that
hierarchy, independent of the type derivation hierarchy. Are you saying that
inheritance should never flow down the 'contains' hierarchy?

  Jim

PS. After two days of this in December, I may need help in getting myself to
the airport ... ;-)

-----Original Message-----
From: owner-dfdl-wg at ggf.org [mailto:owner-dfdl-wg at ggf.org] On Behalf Of
Suman Kalia
Sent: Friday, November 19, 2004 3:47 PM
To: dfdl-wg at gridforum.org
Subject: Fw: [dfdl-wg] Annotation complexity

You are beginning to see the complexities of overrides in this very simple
example. Consider a complex type hierarchy that is four or five levels deep,
and then try to determine which override is applicable and where in the tree.
I briefly mentioned in the call on Wednesday that we should carefully
determine which annotations are applicable to which constructs of XML
Schema and try to avoid the override mechanism.

Annotations that exist on the element belong only to the element; there is
no inheritance or override issue here.  Annotations that exist on structural
constructs (groups, complex types, etc.) truly belong to the structure only
(for example data element separators and delimiters, which cannot be
associated with elements because elements could be reused via element ref in
other structures, where they could have a different delimiter).
Once we have separated the types and elements, annotations defined on
derived types can follow the well-established rules of inheritance, i.e.
inherit an annotation from the parent unless it is explicitly overridden.

Then comes the issue of defaults: where to locate them and where to apply
them.  Possible options are a) the top-level type (which, for example, could
be the type corresponding to an 01-level COBOL structure), or b) a separate
structure, available at tooling time and at runtime, which contains the
defaults.  We used the latter in our implementation.
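
The derived-type rule and option b) can be sketched together in Python as
follows. This is only an illustration under stated assumptions: the TYPES
table, the DEFAULTS table, and every type and property name in them are
hypothetical, and this is not a description of the IBM implementation.

```python
# Assumption: a separate table of defaults, consulted last (option b above).
DEFAULTS = {"byteOrder": "bigEndian", "encoding": "ascii"}

# Hypothetical type table: each type names its base (or None) and carries
# only the annotations written directly on it.
TYPES = {
    "Base":    {"base": None,      "props": {"separator": ","}},
    "Derived": {"base": "Base",    "props": {"encoding": "ebcdic"}},
    "Leaf":    {"base": "Derived", "props": {"separator": ";"}},
}


def type_property(type_name, prop):
    """Walk the derivation chain from the type up to its root base type."""
    current = type_name
    while current is not None:
        entry = TYPES[current]
        if prop in entry["props"]:
            return entry["props"][prop]   # nearest definition overrides
        current = entry["base"]
    return DEFAULTS.get(prop)             # nothing on the chain: use default


print(type_property("Leaf", "separator"))   # overridden on Leaf
print(type_property("Leaf", "encoding"))    # inherited from Derived
print(type_property("Leaf", "byteOrder"))   # falls back to DEFAULTS
```

Because the lookup follows only the derivation chain, the containment
hierarchy plays no part in it.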

Suman Kalia
IBM Toronto Lab
WebSphere Business Integration Application Connectivity Tools 
Tel : 905-413-3923  T/L  969-3923
Fax : 905-413-4850
Internet ID : kalia at ca.ibm.com 
----- Forwarded by Suman Kalia/Toronto/IBM on 11/19/2004 03:04 PM ----- 

"Myers, James D" <jim.myers at pnl.gov> 
Sent by: owner-dfdl-wg at ggf.org
11/19/2004 02:25 PM
To: dfdl-wg at gridforum.org
cc:
Subject: RE: [dfdl-wg] Annotation complexity

I guess I'm not sure how restricting annotations to elements solves
things. I think I can recreate the problems in Martin's examples without
putting annotations on types:

The issue of it being hard to understand that triple overrides the
dfdlfromstrings param would seem to be the same whether the triple type
has an annotation or if some subelements within it get annotations
(either first second and third, or consider a triple type that specifies
an annotated element containing those three). In all these cases, it is
clear that you have to walk down the logical hierarchy which is broken
into parts in the dfdl/xsd file and keep a stack of contexts if we allow
any default/scoped annotations.
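
A rough Python sketch of such a walk with a stack of contexts, using
Martin's data/triple example. The TREE structure is a hypothetical inlined
view of the schema, and the rule that the innermost annotation wins is an
assumption taken from Martin's reading, not something we have agreed on.

```python
# Hypothetical inlined view of Martin's example: the "triple" type has been
# placed inside "data", as if all type definitions were written inline.
TREE = {
    "name": "data", "annot": "dfdlFromStrings", "children": [
        {"name": "triple", "annot": "dfdlFromBinary", "children": [
            {"name": "first", "annot": None, "children": []},
            {"name": "second", "annot": None, "children": []},
            {"name": "third", "annot": None, "children": []},
        ]},
    ],
}


def walk(node, stack=None, path="", out=None):
    """Depth-first walk that records the innermost annotation in scope."""
    stack = [] if stack is None else stack
    out = {} if out is None else out
    pushed = node["annot"] is not None
    if pushed:
        stack.append(node["annot"])       # enter a new context
    here = f"{path}/{node['name']}"
    out[here] = stack[-1] if stack else None
    for child in node["children"]:
        walk(child, stack, here, out)
    if pushed:
        stack.pop()                       # leave the context
    return out


for path, annot in walk(TREE).items():
    print(path, annot)
```

The walk itself is easy; the hard part is that the user never sees TREE
written out in one place.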

If annotations are allowed on both types and elements, what I find even
more difficult are situations where the triple type has one default, and
the element in "data" with that type has an annotation specifying the
opposite param value. Do we consider the element to be above the type in
the scope hierarchy? 

For more fun, what if triple is derived from some other type where the
annotation is defined? Would an annotation on the "data" element be
inherited by the sub-element of type triple, or would the inheritance
from the triple base type win (i.e. neither the element of the type
triple nor the triple type itself is directly annotated)? (Or consider
an annotation on the "first" element defined in the base type for triple
rather than on the base type directly - does the element annotation
inherited from the type hierarchy trump the one from the element
hierarchy?)

An attempt at a picture where only elements have annotations:

Element A: param=littleendian
  SubElement B: type ST

Type T:
  SubElement C: param=bigendian

Type ST: subtype of T

What is the param value of element C at A/B/C?

I guess I see a need to keep some hierarchically scoped defaults (a file
that has some ascii info and then a base64 encoded section of
littleendian stuff), but xsd makes it hard to define a single hierarchy.
Perhaps some rule of precedence - resolve annotations from type to
subtype first, then push those onto the stack of element scopes - would
make things unambiguous, if not user friendly.
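
Here is one possible reading of that precedence rule, sketched in Python
against the A/B/C picture above. The data structures and function names are
hypothetical, and other orderings of the two steps are equally conceivable.

```python
# Hypothetical model of the picture above. Only elements carry annotations.
TYPES = {
    "T":  {"base": None, "elements": {"C": {"param": "bigendian"}}},
    "ST": {"base": "T",  "elements": {}},
}


def type_elements(type_name):
    """Step 1: flatten the derivation chain, base first, subtype overriding."""
    chain = []
    while type_name is not None:
        chain.append(type_name)
        type_name = TYPES[type_name]["base"]
    merged = {}
    for name in reversed(chain):          # base type first
        for elem, annots in TYPES[name]["elements"].items():
            merged.setdefault(elem, {}).update(annots)
    return merged


def resolve_param(scopes):
    """Step 2: the innermost element scope that sets param wins."""
    for annots in reversed(scopes):
        if "param" in annots:
            return annots["param"]
    return None


a_scope = {"param": "littleendian"}       # Element A
b_scope = {}                              # SubElement B, of type ST
c_scope = type_elements("ST")["C"]        # C as seen through ST

print(resolve_param([a_scope, b_scope, c_scope]))  # value at A/B/C
print(resolve_param([a_scope, b_scope]))           # value at A/B
```

Under this reading A/B/C comes out bigendian, because C's own annotation
(reached through ST from T) is pushed last, while A/B stays littleendian.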

 Jim

> -----Original Message-----
> From: owner-dfdl-wg at ggf.org [mailto:owner-dfdl-wg at ggf.org] On 
> Behalf Of Martin Westhead
> Sent: Friday, November 19, 2004 1:38 PM
> To: Martin Westhead
> Cc: dfdl-wg at gridforum.org
> Subject: Re: [dfdl-wg] Annotation complexity
> 
> 
> Sorry the elements in the triple were all supposed to be of a simple 
> type e.g.:
> 
>     <xs:complexType name="triple">
>       <xs:annotation>
>         <xs:appinfo>
>           <dfdlFromBinary/>
>         </xs:appinfo>
>       </xs:annotation>
>       <xs:sequence>
>         <xs:element name="first" type="xs:int"/>
>         <xs:element name="second" type="xs:int"/>
>         <xs:element name="third" type="xs:int"/>
>       </xs:sequence>
>     </xs:complexType>
> 
> 
>     <xs:complexType name="data">
>       <xs:annotation>
>         <xs:appinfo>
>           <dfdlFromStrings/>
>         </xs:appinfo>
>       </xs:annotation>
>       <xs:sequence>
>         <xs:element name="triple"/>
>       </xs:sequence>
>     </xs:complexType>
> 
> Martin Westhead wrote:
> > Hi,
> > 
> > I think I understand Suman's issue with annotations on the Schema
> > tree. (Please Suman tell me if I am right here).  The problem is, that
> > lexically there are many trees in an XSD. Whilst in practice these can
> > clearly be considered as a single tree (including, I think, even the
> > simple type hierarchies) by placing all the type definitions inline,
> > this is not the way they appear to the user. So for example if I have a
> > file with conflicting annotations looking like:
> > 
> >   <xs:complexType name="triple">
> >     <xs:annotation>
> >       <xs:appinfo>
> >         <dfdlFromBinary/>
> >       </xs:appinfo>
> >     </xs:annotation>
> >     <xs:sequence>
> >       <xs:element name="first" type="xs:int"/>
> >       <xs:element name="second"/>
> >       <xs:element name="third"/>
> >     </xs:sequence>
> >   </xs:complexType>
> > 
> > 
> >   <xs:complexType name="data">
> >     <xs:annotation>
> >       <xs:appinfo>
> >         <dfdlFromStrings/>
> >       </xs:appinfo>
> >     </xs:annotation>
> >     <xs:sequence>
> >       <xs:element name="triple"/>
> >     </xs:sequence>
> >   </xs:complexType>
> > 
> > So what I imagined is that we would assume that the "triple" type is
> > considered _inside_ the scope of the "data" type and so the 
> > "dfdlFromBinary" tag wins.
> > 
> > On the other hand the user sees two trees of equal depth with
> > conflicting annotations. The examples can obviously get much more 
> > intricate.
> > 
> > The issue is really that the scope of the annotations is not lexically
> > defined.  At some level this is just like having globally included
> > variables in a programming language. On the other hand we have arbitrary
> > levels of these.
> > 
> > Suman is this the problem?
> > 
> > If this is the problem, and we agree that it is too confusing to the
> > user (my opinion is still out on this). Then I see that the conclusion
> > is to adopt an approach similar to IBM's that annotations can appear
> > only on <element> and <attribute> tags. Even the top level of the file
> > is confusing since there may be many files involved. I guess we can also
> > have runtime defaults and default settings set in the standard. I don't
> > like this conclusion incidentally, can someone convince me it is the
> > wrong one?
> > 
> > Martin
