[DFDL-WG] Go Meta - RE: DFDL Schema - Multiple forms of Annotation syntax

DFDL mbeckerle.dfdl at gmail.com
Tue Apr 28 22:01:17 CDT 2009


Suman,

I think your email just documented the algorithm needed to make sense  
of the annotation properties with the mix of unqualified long form and  
qualified short form.

So while I might have been concerned about this complexity before, you
talked me out of it. :-)

...mikeb


On Apr 27, 2009, at 7:31 PM, Suman Kalia <kalia at ca.ibm.com> wrote:

>
> Mike - We can either create a meta schema and write an XSL utility to
> generate the final DFDL schema, or code the schema by hand. The latter
> is a one-time effort. There would undoubtedly be some overhead in
> updating the schema by hand from one version to the next, but I don't
> expect many drastic changes from one version of the spec to another.
> The key issue I am trying to resolve is whether the attributes in the
> special annotations will be namespace qualified or local attributes
> (i.e., without namespace qualification). This affects how the final
> DFDL schema describing annotations will be used by users and tools.
>
> Say we go with local attributes in the complex types plus global
> attributes (which, by the way, are required to support the short form
> of annotations). Then there will be a processing overhead, because you
> cannot easily determine whether a particular attribute inherited from
> a parent container, or specified through the short form, is overriding
> the attribute in the special annotation. To make such decisions you
> would have to compare on the base name of the attribute rather than on
> its explicit qualified name. You would also have to ensure that you
> take only those attributes from the short form that belong to the dfdl
> namespace, strip their namespace part before comparing with the local
> attribute name, and check that the local attributes being matched are
> part of the enumeration set defined by the DFDL spec.
>
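
A minimal sketch of the comparison described above, in Python. It assumes the dfdl prefix is bound to http://www.ogf.org/dfdl/ (the thread only shows that URI as the appinfo source), and KNOWN_PROPERTIES is a tiny stand-in for the enumeration in the spec. All function names are hypothetical. The sketch does not decide which form wins; it only reports base names that are given in both forms.

```python
import xml.etree.ElementTree as ET

# Assumptions for illustration only: the namespace URI and the property set.
DFDL_NS = "http://www.ogf.org/dfdl/"
XS_NS = "http://www.w3.org/2001/XMLSchema"
KNOWN_PROPERTIES = {"representation", "initiator", "terminator",
                    "encoding", "separator"}


def _check_known(base_name):
    """Reject names that are not in the (stand-in) property enumeration."""
    if base_name not in KNOWN_PROPERTIES:
        raise ValueError(f"unknown DFDL property: {base_name}")


def short_form_properties(element):
    """Short form: dfdl-qualified attributes, returned keyed by base name."""
    prefix = "{" + DFDL_NS + "}"
    props = {}
    for name, value in element.attrib.items():
        if name.startswith(prefix):          # keep dfdl namespace only
            base_name = name[len(prefix):]   # strip the namespace part
            _check_known(base_name)
            props[base_name] = value
    return props


def long_form_properties(element):
    """Long form: unqualified attributes on a nested dfdl:format element."""
    path = f"{{{XS_NS}}}annotation/{{{XS_NS}}}appinfo/{{{DFDL_NS}}}format"
    props = {}
    for fmt in element.findall(path):
        for name, value in fmt.attrib.items():
            if name.startswith("{"):         # skip qualified attributes
                continue
            _check_known(name)
            props[name] = value
    return props


def overlapping_properties(element):
    """Base names that appear in both the short form and the long form."""
    short_props = short_form_properties(element)
    long_props = long_form_properties(element)
    return sorted(short_props.keys() & long_props.keys())


SAMPLE = """
<xs:element xmlns:xs="http://www.w3.org/2001/XMLSchema"
            xmlns:dfdl="http://www.ogf.org/dfdl/"
            name="foo" type="xs:int"
            dfdl:initiator="[" dfdl:encoding="utf-8">
  <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
    <dfdl:format representation="text" initiator="{" terminator="}"/>
  </xs:appinfo></xs:annotation>
</xs:element>
"""

if __name__ == "__main__":
    print(overlapping_properties(ET.fromstring(SAMPLE)))
```

With namespace-qualified attributes in both forms, the prefix stripping step would disappear and the two dictionaries could be compared on the full attribute name directly.
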
> However, if you make the attributes in the complex types of the
> special annotations namespace qualified, then you have something of a
> usability issue, as I highlighted in the example, but processing-wise
> it is quite efficient and consistent.
>
> Suman Kalia
> IBM Toronto Lab
> WMB Toolkit Architect and Development Lead
> WebSphere Business Integration Application Connectivity Tools
>
> http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.html
>
> Tel : 905-413-3923  T/L  969-3923
> Fax : 905-413-4850 T/L  969-4850
> Internet ID : kalia at ca.ibm.com
>
>
> From:	"Mike Beckerle" <mbeckerle.dfdl at gmail.com>
> To:	Suman Kalia/Toronto/IBM at IBMCA, "'Alan Powell'" <alan_powell at uk.ibm.com>, "'Steve Hanson'" <smh at uk.ibm.com>
> Cc:	<dfdl-wg at ogf.org>
> Date:	04/22/2009 08:30 AM
> Subject: 	Go Meta - RE: DFDL Schema - Multiple forms of Annotation syntax
>
>
>
>
> These difficulties all seem to arise from the objective of leveraging
> XML Schema validation of DFDL schema annotations as much as possible.
> Given the three syntaxes, this is a bit painful if done directly.
>
> I suggest the following approach. Write a "Meta XML Schema" for DFDL  
> annotations. This XML document would contain exactly one list of all  
> the DFDL annotations. E.g., something like:
>
> <!-- This is not a DFDL Schema - it's a meta document -->
> <dfdl:defineProperty name="initiator"
>     allowedOn="complexType sequence element"
>     type="string" expressionAllowed="yes"/>
> <dfdl:defineProperty name="terminator"
>     allowedOn="complexType sequence element"
>     type="string" expressionAllowed="yes"/>
> ...
>
> This document could be turned into the set of redundant declarations
> in a "real" XML Schema for DFDL annotations by an XSL stylesheet
> transformation. This would actually be a pretty easy XSL stylesheet.
> Each dfdl:defineProperty in the meta schema would end up putting
> content into the big enum, a local and a global attribute definition,
> and anywhere else needed in the "real" XML Schema for DFDL
> annotations.
>
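
A rough sketch of the generation step, written in Python rather than XSL so it can be run standalone. The wrapper element dfdl:properties, the names PropertyNameEnum and LocalProperties, the namespace URI, and the mapping of type="string" to xs:string are all assumptions made for illustration; the thread does not fix any of them.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; the dfdl one is taken from the appinfo source.
DFDL_NS = "http://www.ogf.org/dfdl/"
XS_NS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS_NS)

# Hypothetical meta document: one defineProperty per DFDL property.
META = """
<dfdl:properties xmlns:dfdl="http://www.ogf.org/dfdl/">
  <dfdl:defineProperty name="initiator"
      allowedOn="complexType sequence element"
      type="string" expressionAllowed="yes"/>
  <dfdl:defineProperty name="terminator"
      allowedOn="complexType sequence element"
      type="string" expressionAllowed="yes"/>
</dfdl:properties>
"""


def xs(tag):
    """Return the ElementTree name for a tag in the XML Schema namespace."""
    return f"{{{XS_NS}}}{tag}"


def generate_schema(meta_xml):
    """Expand each defineProperty into its redundant schema declarations."""
    meta = ET.fromstring(meta_xml)
    schema = ET.Element(xs("schema"), {"targetNamespace": DFDL_NS})

    # The "big enum" of property base names.
    enum = ET.SubElement(schema, xs("simpleType"),
                         {"name": "PropertyNameEnum"})
    restriction = ET.SubElement(enum, xs("restriction"),
                                {"base": "xs:string"})

    # Local (unqualified) attributes, shared through an attribute group.
    group = ET.SubElement(schema, xs("attributeGroup"),
                          {"name": "LocalProperties"})

    for prop in meta.iter(f"{{{DFDL_NS}}}defineProperty"):
        name = prop.get("name")
        xs_type = "xs:" + prop.get("type")   # assumed type mapping
        ET.SubElement(restriction, xs("enumeration"), {"value": name})
        # Global attribute: qualified, needed for the short form.
        ET.SubElement(schema, xs("attribute"),
                      {"name": name, "type": xs_type})
        # Local attribute: unqualified, used by the long form.
        ET.SubElement(group, xs("attribute"),
                      {"name": name, "type": xs_type})
    return schema


if __name__ == "__main__":
    print(ET.tostring(generate_schema(META), encoding="unicode"))
```

Because all three declarations come out of the same loop, a property cannot be added to the enum without also getting its global and local attribute.
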
> I would suggest that doc strings might also be good to put in: e.g.,
>
> <dfdl:defineProperty name="initiator"
>     allowedOn="complexType sequence element"
>     type="string" expressionAllowed="yes">
>   Defines a string used to recognize the beginning of a
>   construct..blah blah blah
> </dfdl:defineProperty>
>
> This could then be used to generate online doc also.
>
> I always use this sort of "go meta" approach whenever I have  
> redundant, but isomorphic things happening in code. I always create  
> a spec and generate the redundant instances from that. This helps  
> tremendously with QA, because whole classes of inconsistency errors  
> are ruled out.
>
> ...mike
>
>
>
>
>
> 1. Short form syntax
> <xs:element name="foo" type="xs:int" maxOccurs="unbounded"
>     dfdl:representation="text" dfdl:initiator="{" dfdl:terminator="}"/>
>
> Note: This requires all properties to be defined as global attributes
> in the schema, as each attribute has to be uniquely identifiable and
> accessible within the DFDL namespace.
>
> 2. Long form syntax ( used for dfdl:format and specialized  
> annotations such as dfdl:element etc)
>
> <xs:element name="foo" type="xs:int" maxOccurs="unbounded">
>      <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
>        <dfdl:format representation="text" initiator="{" terminator="}"/>
>      </xs:appinfo></xs:annotation>
> </xs:element>
>
> Note: Attributes specified here are not qualified with the dfdl
> namespace prefix. This means the attributes have to be defined locally
> within the complex type and are not accessible directly (unlike in the
> short form syntax).
>
> 3. Long form property representation, to ease the syntactic
> difficulty of expressing some data in attribute values:
>
> <xs:annotation>
>   <xs:appinfo source="http://www.ogf.org/dfdl/">
>     <dfdl:format>
>       <dfdl:property name='encoding'>utf-8</dfdl:property>
>       <dfdl:property name='separator'>\n</dfdl:property>
>       <dfdl:property name='initiator'><![CDATA[<!-- ]]></dfdl:property>
>     </dfdl:format>
>   </xs:appinfo>
> </xs:annotation>
>
> Note: The name attribute will be an enumeration of the base names of
> all DFDL properties (i.e., without namespace prefix). On the
> specialized annotations (e.g., dfdl:element) we will also need a
> repeating child element definition for dfdl:property. This child
> element will appear on a specialized annotation element for those DFDL
> properties whose value cannot be stored as an XML attribute.
>
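
A small sketch of how a tool might read this third form, again in Python. The function name element_form_properties and the namespace URI are assumptions. It relies only on the standard parser, which hands back CDATA content as ordinary text.

```python
import xml.etree.ElementTree as ET

# Assumed binding for the dfdl prefix (the thread shows it as appinfo source).
DFDL_NS = "http://www.ogf.org/dfdl/"


def element_form_properties(format_element):
    """Read dfdl:property children of a dfdl:format, keyed by base name."""
    props = {}
    for child in format_element.findall(f"{{{DFDL_NS}}}property"):
        name = child.get("name")
        if name is None:
            raise ValueError("dfdl:property is missing its name attribute")
        # An empty element has text None; treat that as an empty string.
        props[name] = child.text or ""
    return props


# Raw string so that the two characters backslash and n reach the parser
# unchanged, as they would in a schema file.
SAMPLE_FORMAT = r"""
<dfdl:format xmlns:dfdl="http://www.ogf.org/dfdl/">
  <dfdl:property name='encoding'>utf-8</dfdl:property>
  <dfdl:property name='separator'>\n</dfdl:property>
  <dfdl:property name='initiator'><![CDATA[<!-- ]]></dfdl:property>
</dfdl:format>
"""

if __name__ == "__main__":
    print(element_form_properties(ET.fromstring(SAMPLE_FORMAT)))
```

The resulting dictionary is keyed by base name, so it can be merged with the attribute-based forms using the same base-name comparison.
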
> PS: Alan - when you update the spec for specialized annotation  
> please mention this use case  and put an example also.
>
> Supporting syntax forms 1 and 2 requires defining the attribute for
> each DFDL property multiple times: one global attribute, plus a local
> attribute in the complex type of each long form annotation element.
> Using attribute groups we can address the latter issue, i.e., define
> related local attributes once in an attribute group and include that
> group in one or more complex types. But it still requires defining the
> attribute for each DFDL property at least twice (one global and one
> local attribute definition). This duplication can lead to
> inconsistency, by inadvertently updating one definition and not the
> other.
>
> One option we can consider to remove this redundancy is to define only
> the global attribute for each DFDL property and include it through an
> attribute reference in each of the complex types of dfdl:format and
> the specialized annotations. The downside of this approach is that
> attributes will appear qualified in the dfdl:format and specialized
> annotation elements. Given below is how the dfdl:format annotation
> will appear in a user-defined schema, where each attribute is prefixed
> with dfdl (highlighted in green).
>
> <xs:element name="foo" type="xs:int" maxOccurs="unbounded">
>      <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
>        <dfdl:format dfdl:representation="text"
>            dfdl:initiator="{" dfdl:terminator="}"/>
>      </xs:appinfo></xs:annotation>
> </xs:element>
>
> From a technical perspective (e.g., schema integrity, avoiding
> duplicate definitions, etc.), this is better, but it does have a
> usability issue.
>
>
> Suman Kalia
> IBM Toronto Lab
> WMB Toolkit Architect and Development Lead
> WebSphere Business Integration Application Connectivity Tools
>
> http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.html
>
> Tel : 905-413-3923  T/L  969-3923
> Fax : 905-413-4850 T/L  969-4850
> Internet ID : kalia at ca.ibm.com