[DFDL-WG] Go Meta - RE: DFDL Schema - Multiple forms of Annotation syntax

Mike Beckerle mbeckerle.dfdl at gmail.com
Wed Apr 22 07:27:29 CDT 2009


These difficulties all seem to arise from the objective of leveraging XML
Schema validation against DFDL schema annotations as fully as possible.
Given the 3 syntaxes, this is a bit painful if done directly.
 
I suggest the following approach. Write a "Meta XML Schema" for DFDL
annotations. This XML document would contain exactly one list of all the
DFDL annotations. E.g., something like:
 
<!-- This is not a DFDL Schema - it's a meta document -->
<dfdl:defineProperty name="initiator" allowedOn="complexType sequence
element" type="string" expressionAllowed="yes"/>
<dfdl:defineProperty name="terminator" allowedOn="complexType sequence
element" type="string" expressionAllowed="yes"/>
...
 
This document could be turned into the set of redundant declarations in a
"real" XML Schema for DFDL annotations by an XSL stylesheet translation. This
would actually be a pretty easy XSL stylesheet. Each dfdl:defineProperty in
the meta schema would end up putting content into the big enum, a local and
global attribute definition, and anywhere else needed in the "real" XML
Schema for DFDL annotations. 
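
To make the generation step concrete, here is a minimal sketch. It uses
Python instead of XSL purely so that it can be run as-is; it is not the
stylesheet itself. The wrapper element dfdl:properties, the namespace URI,
and the generated names PropertyNameEnum and LocalProperties are all
assumptions made for illustration, since none of them is fixed in this
thread.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the DFDL annotations (hypothetical).
DFDL_NS = "http://www.ogf.org/dfdl/"
XS_NS = "http://www.w3.org/2001/XMLSchema"

# Meta document; the dfdl:properties wrapper element is hypothetical.
META = f"""
<dfdl:properties xmlns:dfdl="{DFDL_NS}">
  <dfdl:defineProperty name="initiator" allowedOn="complexType sequence element"
                       type="string" expressionAllowed="yes"/>
  <dfdl:defineProperty name="terminator" allowedOn="complexType sequence element"
                       type="string" expressionAllowed="yes"/>
</dfdl:properties>
"""


def generate_schema(meta_xml):
    """Build an xs:schema holding the redundant declarations for each property."""
    ET.register_namespace("xs", XS_NS)
    meta = ET.fromstring(meta_xml)
    schema = ET.Element(f"{{{XS_NS}}}schema", {"targetNamespace": DFDL_NS})

    # The "big enum" of property names (hypothetical type name).
    enum_type = ET.SubElement(
        schema, f"{{{XS_NS}}}simpleType", {"name": "PropertyNameEnum"})
    restriction = ET.SubElement(
        enum_type, f"{{{XS_NS}}}restriction", {"base": "xs:string"})

    # Local attributes shared by the long form annotation elements
    # (hypothetical group name).
    group = ET.SubElement(
        schema, f"{{{XS_NS}}}attributeGroup", {"name": "LocalProperties"})

    for prop in meta.findall(f"{{{DFDL_NS}}}defineProperty"):
        name = prop.get("name")
        xs_type = "xs:" + prop.get("type")
        ET.SubElement(restriction, f"{{{XS_NS}}}enumeration", {"value": name})
        ET.SubElement(group, f"{{{XS_NS}}}attribute",
                      {"name": name, "type": xs_type})
        # Global attribute, needed for the short form (dfdl:initiator="...").
        ET.SubElement(schema, f"{{{XS_NS}}}attribute",
                      {"name": name, "type": xs_type})
    return schema


schema = generate_schema(META)
print(ET.tostring(schema, encoding="unicode"))
```

Because every declaration is produced from the same dfdl:defineProperty
entry, the global and local definitions cannot drift apart.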
 
I would suggest that doc strings might also be good to put in: e.g., 
 
<dfdl:defineProperty name="initiator" allowedOn="complexType sequence
element" type="string" expressionAllowed="yes">
Defines a string used to recognize the beginning of a construct..blah blah
blah
</dfdl:defineProperty>
 
This could then be used to generate online doc also.
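
As a rough sketch of the documentation side, the same meta document can be
read and each doc string turned into one line of reference text. Again this
is Python rather than XSL, and the dfdl:properties wrapper element, the
namespace URI, and the output layout are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the DFDL annotations (hypothetical).
DFDL_NS = "http://www.ogf.org/dfdl/"

# Meta document with a doc string; the wrapper element is hypothetical.
META = """
<dfdl:properties xmlns:dfdl="http://www.ogf.org/dfdl/">
  <dfdl:defineProperty name="initiator" allowedOn="complexType sequence element"
                       type="string" expressionAllowed="yes">
    Defines a string used to recognize the beginning of a construct.
  </dfdl:defineProperty>
</dfdl:properties>
"""


def property_docs(meta_xml):
    """Return one line of documentation per dfdl:defineProperty entry."""
    root = ET.fromstring(meta_xml)
    lines = []
    for prop in root.findall(f"{{{DFDL_NS}}}defineProperty"):
        allowed = ", ".join(prop.get("allowedOn", "").split())
        # Collapse the whitespace that XML indentation adds around the text.
        summary = " ".join((prop.text or "").split())
        lines.append(
            f"{prop.get('name')} ({prop.get('type')}; "
            f"allowed on {allowed}): {summary}")
    return lines


for line in property_docs(META):
    print(line)
```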
 
I always use this sort of "go meta" approach whenever I have redundant, but
isomorphic things happening in code. I always create a spec and generate the
redundant instances from that. This helps tremendously with QA, because
whole classes of inconsistency errors are ruled out.
 
...mike
 
 
 
 

1. Short form syntax
<xs:element name="foo" type="xs:int" maxOccurs="unbounded"
dfdl:representation="text" dfdl:initiator="{" dfdl:terminator="}"/> 

Note: This requires all properties to be defined as global attributes in
the schema, as each attribute has to be uniquely identifiable and accessible
within the DFDL namespace.

2. Long form syntax (used for dfdl:format and specialized annotations such
as dfdl:element etc.)

<xs:element name="foo" type="xs:int" maxOccurs="unbounded"> 
      <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/"> 
        <dfdl:format representation="text" initiator="{"  terminator="}"/> 
      </xs:appinfo></xs:annotation> 
</xs:element> 

Note: Attributes specified here are not qualified with the dfdl namespace
prefix. This means the attributes have to be defined locally within the
complex type and are not accessible directly (unlike in the short form
syntax).

3.  Long form property representation to ease syntactic expression
difficulties with data in attribute values: 
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:format>
        <dfdl:property name='encoding'>utf-8</dfdl:property>
        <dfdl:property name='separator'>\n</dfdl:property>
        <dfdl:property name='initiator'><![CDATA[<!-- ]]></dfdl:property>
      </dfdl:format>
    </xs:appinfo>
  </xs:annotation>

Note: The name attribute will be an enumeration of the base names of all
DFDL properties (i.e. without namespace prefix). On the specialized
annotations (such as dfdl:element) we will also need a repeating child
element definition for dfdl:property. This child element will appear on a
specialized annotation element for DFDL properties whose value cannot be
stored as an XML attribute.
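
To show that the three forms carry the same information, here is a minimal
Python sketch that reads all of them into one dictionary of property names
and values. The namespace URI, the function name dfdl_properties, and the
rule that a later form overrides an earlier one are assumptions for
illustration; the thread does not define precedence between the forms.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
DFDL = "http://www.ogf.org/dfdl/"  # assumed namespace URI (hypothetical)

# Raw string so that \n stays a literal backslash-n, as in the example above.
SCHEMA = r"""
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/">
  <xs:element name="short" type="xs:int"
              dfdl:representation="text" dfdl:initiator="{" dfdl:terminator="}"/>
  <xs:element name="long" type="xs:int">
    <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:format representation="text" initiator="{" terminator="}"/>
    </xs:appinfo></xs:annotation>
  </xs:element>
  <xs:element name="props" type="xs:int">
    <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:format>
        <dfdl:property name="encoding">utf-8</dfdl:property>
        <dfdl:property name="separator">\n</dfdl:property>
        <dfdl:property name="initiator"><![CDATA[<!-- ]]></dfdl:property>
      </dfdl:format>
    </xs:appinfo></xs:annotation>
  </xs:element>
</xs:schema>
"""


def dfdl_properties(element):
    """Collect DFDL properties from all three syntax forms on one element."""
    props = {}
    # Form 1: namespace-qualified attributes on the xs: element itself.
    prefix = f"{{{DFDL}}}"
    for key, value in element.attrib.items():
        if key.startswith(prefix):
            props[key[len(prefix):]] = value
    # Forms 2 and 3: dfdl:format inside xs:annotation/xs:appinfo.
    path = f"{{{XS}}}annotation/{{{XS}}}appinfo/{{{DFDL}}}format"
    for fmt in element.findall(path):
        for key, value in fmt.attrib.items():
            if not key.startswith("{"):  # form 2: unqualified attributes
                props[key] = value
        for prop in fmt.findall(f"{{{DFDL}}}property"):  # form 3
            props[prop.get("name")] = prop.text or ""
    return props


root = ET.fromstring(SCHEMA)
by_name = {e.get("name"): dfdl_properties(e)
           for e in root.findall(f"{{{XS}}}element")}
print(by_name)
```

The short and long form elements yield the same dictionary, while the
property form can hold values such as "<!-- " that are awkward to write in
an attribute.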

  
PS: Alan - when you update the spec for specialized annotations, please
mention this use case and include an example as well.

To support syntax forms 1 and 2, the attribute definition for each DFDL
property has to be written multiple times, i.e. one global attribute plus a
local attribute in the complex type of each long form annotation element.
Using attribute groups we can address the latter issue, i.e. define related
local attributes once in an attribute group and include that attribute
group in one or more complex types. But this still requires defining the
attribute for each DFDL property at least twice (i.e. one global and one
local attribute definition). This duplication can lead to inconsistency,
for example by inadvertently updating one definition and not the other.
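
The kind of inconsistency described above can be detected mechanically. The
following is a minimal Python sketch, assuming a schema in which global
attributes are direct children of xs:schema and local ones sit inside
attribute groups or complex types. The sample schema, the group name
LocalProperties, and the function name find_inconsistencies are
hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ATTRIBUTE = f"{{{XS}}}attribute"

# Hypothetical schema in which "terminator" was updated in one place only.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.ogf.org/dfdl/">
  <xs:attribute name="initiator" type="xs:string"/>
  <xs:attribute name="terminator" type="xs:string"/>
  <xs:attributeGroup name="LocalProperties">
    <xs:attribute name="initiator" type="xs:string"/>
    <xs:attribute name="terminator" type="xs:token"/>
  </xs:attributeGroup>
</xs:schema>
"""


def find_inconsistencies(schema_xml):
    """Compare each local attribute with the global one of the same name."""
    schema = ET.fromstring(schema_xml)
    # Global attributes are direct children of xs:schema.
    global_types = {a.get("name"): a.get("type")
                    for a in schema.findall(ATTRIBUTE)}
    problems = []
    for container in schema:
        if container.tag == ATTRIBUTE:
            continue  # skip the global definitions themselves
        # Local attributes live inside attribute groups or complex types.
        for local in container.iter(ATTRIBUTE):
            name, local_type = local.get("name"), local.get("type")
            if name is None:
                continue  # an attribute reference, so nothing is duplicated
            if name not in global_types:
                problems.append(
                    f"{name}: local definition has no global counterpart")
            elif global_types[name] != local_type:
                problems.append(
                    f"{name}: global type {global_types[name]} "
                    f"!= local type {local_type}")
    return problems


print(find_inconsistencies(SCHEMA))
```

A check like this only catches the problem after the fact; defining each
property once avoids it altogether.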

One option that we can consider to remove this redundancy is to define only
the global attribute for each DFDL property and include it through an
attribute reference in each of the complex types of dfdl:format and the
specialized annotations. The downside of this approach is that attributes
will appear qualified in the dfdl:format and specialized annotation
elements. Given below is how the dfdl:format annotation will appear in the
user defined schema, where each attribute is prefixed with dfdl.

<xs:element name="foo" type="xs:int" maxOccurs="unbounded"> 
      <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/"> 
        <dfdl:format dfdl:representation="text" dfdl:initiator="{"
dfdl:terminator="}"/> 
      </xs:appinfo></xs:annotation> 
</xs:element> 

From a technical perspective (e.g. schema integrity, avoiding duplicate
definitions etc.), this is better, but it does have a usability issue.


Suman Kalia
IBM Toronto Lab
WMB Toolkit Architect and Development Lead
WebSphere Business Integration Application Connectivity Tools 

 
http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.html

Tel : 905-413-3923  T/L  969-3923
Fax : 905-413-4850 T/L  969-4850
Internet ID : kalia at ca.ibm.com 