[DFDL-WG] Go Meta - RE: DFDL Schema - Mulitple forms of Annotation syntax
Suman Kalia
kalia at ca.ibm.com
Mon Apr 27 18:31:11 CDT 2009
Mike - We can either create a meta schema and write an XSL utility to
generate the ultimate DFDL schema, or code the schema by hand. Coding by
hand is a one-time effort; there would undoubtedly be some overhead in
updating the schema by hand from one version to the next, but I don't
expect many drastic changes from one version of the spec to another. The
key part of the issue I am trying to resolve is whether the attributes in
the special annotations will be namespace qualified or local attributes
(i.e. without namespace qualification). This affects how the ultimate DFDL
schema describing the annotations will be used by users and tools.
Say we go with local attributes in the complex types plus global attributes
(which, by the way, are required to support the short form of annotations).
Then there will be a processing overhead, as you cannot easily determine
whether a particular attribute inherited from the parent container or
specified through the short form is overriding the attribute in the special
annotation. To make such decisions you would have to compare on the base
name of the attribute rather than on its namespace-qualified name. You
would also have to ensure that you take only those attributes from the
short form that belong to the dfdl namespace, strip their namespace part
before comparing with the local attribute names, and ensure that the local
attributes being matched are part of the enumeration set defined by the
DFDL spec.
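To make the overhead concrete, here is a rough Python sketch of the base-name matching described above. It is only an illustration: the namespace URI is an assumption borrowed from the appinfo source in the examples, and KNOWN_PROPERTIES, split_forms and overlapping_properties are hypothetical names standing in for the spec's enumeration and for whatever a real processor would use.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
DFDL = "http://www.ogf.org/dfdl/"      # assumed namespace URI, for illustration only
DFDL_PREFIX = "{" + DFDL + "}"
# Hypothetical stand-in for the enumeration of property names in the DFDL spec.
KNOWN_PROPERTIES = {"representation", "initiator", "terminator"}

SAMPLE = """
<xs:element xmlns:xs="http://www.w3.org/2001/XMLSchema"
            xmlns:dfdl="http://www.ogf.org/dfdl/"
            name="foo" type="xs:int"
            dfdl:initiator="[" dfdl:representation="text">
  <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
    <dfdl:format initiator="{" terminator="}"/>
  </xs:appinfo></xs:annotation>
</xs:element>
"""

def split_forms(element):
    """Return (short_form, long_form) property dicts keyed by base name."""
    short_form = {}
    for qname, value in element.attrib.items():
        # Keep only attributes in the dfdl namespace, then strip the namespace.
        if qname.startswith(DFDL_PREFIX):
            base = qname[len(DFDL_PREFIX):]
            if base in KNOWN_PROPERTIES:
                short_form[base] = value
    long_form = {}
    fmt = element.find(f"{{{XS}}}annotation/{{{XS}}}appinfo/{{{DFDL}}}format")
    if fmt is not None:
        # Local attributes carry no namespace, so the enumeration check is
        # the only thing tying them to DFDL properties.
        long_form = {n: v for n, v in fmt.attrib.items() if n in KNOWN_PROPERTIES}
    return short_form, long_form

def overlapping_properties(element):
    """Base names that appear in both forms and so need an override decision."""
    short_form, long_form = split_forms(element)
    return set(short_form) & set(long_form)

overlap = overlapping_properties(ET.fromstring(SAMPLE))
```

The sketch only detects which properties collide; which form wins is left open here, since that is a matter for the spec.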
However, if you make the attributes in the complex types of the special
annotations namespace qualified, then you have something of a usability
issue, as I highlighted in the example, but processing-wise it is quite
efficient and consistent.
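As a rough illustration of why the qualified option is simpler to process, the Python sketch below compares attributes on their full namespace-qualified names, with no stripping and no lookup against an enumeration. The namespace URI is an assumption taken from the appinfo source in the examples, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
DFDL = "http://www.ogf.org/dfdl/"      # assumed namespace URI, for illustration only
DFDL_PREFIX = "{" + DFDL + "}"

SAMPLE = """
<xs:element xmlns:xs="http://www.w3.org/2001/XMLSchema"
            xmlns:dfdl="http://www.ogf.org/dfdl/"
            name="foo" type="xs:int" dfdl:initiator="[">
  <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
    <dfdl:format dfdl:representation="text" dfdl:initiator="{" dfdl:terminator="}"/>
  </xs:appinfo></xs:annotation>
</xs:element>
"""

def dfdl_attributes(node):
    """All attributes of node in the dfdl namespace, keyed by qualified name."""
    return {q: v for q, v in node.attrib.items() if q.startswith(DFDL_PREFIX)}

def shared_names(element):
    """Qualified names present both on the element and on its dfdl:format."""
    fmt = element.find(f"{{{XS}}}annotation/{{{XS}}}appinfo/{{{DFDL}}}format")
    if fmt is None:
        return set()
    # The same qualified name identifies the same property in both places.
    return dfdl_attributes(element).keys() & dfdl_attributes(fmt).keys()

shared = shared_names(ET.fromstring(SAMPLE))
```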
Suman Kalia
IBM Toronto Lab
WMB Toolkit Architect and Development Lead
WebSphere Business Integration Application Connectivity Tools
http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.html
Tel : 905-413-3923 T/L 969-3923
Fax : 905-413-4850 T/L 969-4850
Internet ID : kalia at ca.ibm.com
From: "Mike Beckerle" <mbeckerle.dfdl at gmail.com>
To: Suman Kalia/Toronto/IBM at IBMCA, "'Alan Powell'" <alan_powell at uk.ibm.com>, "'Steve Hanson'" <smh at uk.ibm.com>
Cc: <dfdl-wg at ogf.org>
Date: 04/22/2009 08:30 AM
Subject: Go Meta - RE: DFDL Schema - Mulitple forms of Annotation syntax
These difficulties all seem to arise from the objective of leveraging XML
Schema validation of DFDL schema annotations as much as possible. Given
the 3 syntaxes, this is a bit painful if done directly.
I suggest the following approach. Write a "Meta XML Schema" for DFDL
annotations. This XML document would contain exactly one list of all the
DFDL annotations. E.g., something like:
<!-- This is not a DFDL Schema - it's a meta document -->
<dfdl:defineProperty name="initiator" allowedOn="complexType sequence element"
    type="string" expressionAllowed="yes"/>
<dfdl:defineProperty name="terminator" allowedOn="complexType sequence element"
    type="string" expressionAllowed="yes"/>
...
This document could be turned into the set of redundant definitions in a
"real" XML Schema for DFDL annotations by an XSL stylesheet translation.
This would actually be a pretty easy XSL stylesheet. Each
dfdl:defineProperty in the meta schema would end up putting content into
the big enum, a local and a global attribute definition, and anywhere else
needed in the "real" XML Schema for DFDL annotations.
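The generation step can be sketched roughly as follows. This uses Python rather than XSL purely to keep the illustration short. It rests on several assumptions not in the proposal: a wrapping dfdl:properties root element, the namespace URI, the names PropertyNameEnum and LocalProperties, and a simple mapping of type="string" to xs:string.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
DFDL = "http://www.ogf.org/dfdl/"      # assumed namespace URI, for illustration only

# The dfdl:properties wrapper is hypothetical; it just makes the list well-formed.
META = """
<dfdl:properties xmlns:dfdl="http://www.ogf.org/dfdl/">
  <dfdl:defineProperty name="initiator" allowedOn="complexType sequence element"
                       type="string" expressionAllowed="yes"/>
  <dfdl:defineProperty name="terminator" allowedOn="complexType sequence element"
                       type="string" expressionAllowed="yes"/>
</dfdl:properties>
"""

def generate_schema(meta_xml):
    """Build the redundant schema definitions from the single property list."""
    ET.register_namespace("xs", XS)
    meta = ET.fromstring(meta_xml)
    schema = ET.Element(f"{{{XS}}}schema", {"targetNamespace": DFDL})
    # The "big enum" of property base names.
    enum_type = ET.SubElement(schema, f"{{{XS}}}simpleType",
                              {"name": "PropertyNameEnum"})
    restriction = ET.SubElement(enum_type, f"{{{XS}}}restriction",
                                {"base": "xs:string"})
    # One attribute group holding the local (unqualified) definitions.
    group = ET.SubElement(schema, f"{{{XS}}}attributeGroup",
                          {"name": "LocalProperties"})
    for prop in meta.findall(f"{{{DFDL}}}defineProperty"):
        name = prop.get("name")
        xsd_type = "xs:" + prop.get("type")   # assumed type mapping
        ET.SubElement(restriction, f"{{{XS}}}enumeration", {"value": name})
        # Global definition, used by the short form.
        ET.SubElement(schema, f"{{{XS}}}attribute",
                      {"name": name, "type": xsd_type})
        # Local definition, used by the long form.
        ET.SubElement(group, f"{{{XS}}}attribute",
                      {"name": name, "type": xsd_type})
    return schema

schema = generate_schema(META)
print(ET.tostring(schema, encoding="unicode"))
```

Because every definition comes from the same defineProperty entry, the global and local attributes cannot drift apart.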
I would suggest that doc strings might also be good to put in: e.g.,
<dfdl:defineProperty name="initiator" allowedOn="complexType sequence
element" type="string" expressionAllowed="yes">
Defines a string used to recognize the beginning of a construct..blah blah
blah
</dfdl:defineProperty>
This could then be used to generate online doc also.
I always use this sort of "go meta" approach whenever I have redundant,
but isomorphic things happening in code. I always create a spec and
generate the redundant instances from that. This helps tremendously with
QA, because whole classes of inconsistency errors are ruled out.
...mike
1. Short form syntax
<xs:element name="foo" type="xs:int" maxOccurs="unbounded"
dfdl:representation="text" dfdl:initiator="{" dfdl:terminator="}"/>
Note: This requires all properties to be defined as global attributes in
the schema, as each attribute has to be uniquely identifiable and
accessible within the DFDL namespace.
2. Long form syntax (used for dfdl:format and specialized annotations
such as dfdl:element etc.)
<xs:element name="foo" type="xs:int" maxOccurs="unbounded">
<xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
<dfdl:format representation="text" initiator="{" terminator="}"/>
</xs:appinfo></xs:annotation>
</xs:element>
Note: Attributes specified here are not qualified with the dfdl namespace
prefix. This means the attributes have to be defined locally within the
complex type and are not directly accessible (as they are in the short
form syntax).
3. Long form property representation to ease syntactic expression
difficulties with data in attribute values:
<xs:annotation>
<xs:appinfo source="http://www.ogf.org/dfdl/">
<dfdl:format>
<dfdl:property name='encoding'>utf-8</dfdl:property>
<dfdl:property name='separator'>\n</dfdl:property>
<dfdl:property name='initiator'><![CDATA[<!-- ]]></dfdl:property>
</dfdl:format>
</xs:appinfo>
</xs:annotation>
Note: The name attribute will be an enumeration of the base names of all
DFDL properties (i.e. without namespace prefix). On the specialized
annotations (e.g. dfdl:element) we will also need a repeating child
element definition for dfdl:property. This child element will appear on a
specialized annotation element for DFDL properties whose value cannot be
stored as an XML attribute.
PS: Alan - when you update the spec for specialized annotations, please
mention this use case and include an example.
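For the third form, a rough Python sketch of how a tool might read the dfdl:property children and check each name against the enumeration of base names. The namespace URI is an assumption taken from the appinfo source above, and KNOWN_PROPERTIES and element_form_properties are hypothetical names used only for this illustration.

```python
import xml.etree.ElementTree as ET

DFDL = "http://www.ogf.org/dfdl/"      # assumed namespace URI, for illustration only
# Hypothetical stand-in for the enumeration of property base names.
KNOWN_PROPERTIES = {"encoding", "separator", "initiator"}

# Raw string so that \n stays as the two characters written in the schema.
SAMPLE = r"""
<dfdl:format xmlns:dfdl="http://www.ogf.org/dfdl/">
  <dfdl:property name='encoding'>utf-8</dfdl:property>
  <dfdl:property name='separator'>\n</dfdl:property>
  <dfdl:property name='initiator'><![CDATA[<!-- ]]></dfdl:property>
</dfdl:format>
"""

def element_form_properties(fmt):
    """Collect name/value pairs from the repeating dfdl:property children."""
    props = {}
    for prop in fmt.findall(f"{{{DFDL}}}property"):
        name = prop.get("name")
        if name not in KNOWN_PROPERTIES:
            raise ValueError(f"not a known DFDL property: {name!r}")
        # The parser has already unwrapped the CDATA section into plain text.
        props[name] = prop.text or ""
    return props

props = element_form_properties(ET.fromstring(SAMPLE))
```

The CDATA section lets the initiator value contain markup characters that would be awkward to write inside an XML attribute.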
Supporting syntax forms 1 and 2 requires defining the attribute for each
DFDL property multiple times, i.e. one global attribute and a local
attribute in the complex type of each long form annotation element. Using
attribute groups we can address the latter issue, i.e. define related
local attributes once in an attribute group and include the attribute
group in one or more complex types. But it still requires defining the
attribute for each DFDL property at least twice (i.e. one global and one
local attribute definition). This duplication can lead to inconsistency,
for example by inadvertently updating one definition and not the other.
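The kind of inconsistency I mean could be caught with a small check like the Python sketch below. It is only illustrative: the sample schema, the attribute group name LocalProperties, the types used, and the function find_inconsistencies are all hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical hand-maintained schema: terminator has drifted and
# separator was never added to the attribute group.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.ogf.org/dfdl/">
  <xs:attribute name="initiator" type="xs:string"/>
  <xs:attribute name="terminator" type="xs:string"/>
  <xs:attribute name="separator" type="xs:string"/>
  <xs:attributeGroup name="LocalProperties">
    <xs:attribute name="initiator" type="xs:string"/>
    <xs:attribute name="terminator" type="xs:token"/>
  </xs:attributeGroup>
</xs:schema>
"""

def find_inconsistencies(schema):
    """List (name, global_type, local_type) where the two definitions differ."""
    # Direct children of xs:schema are the global attribute definitions.
    global_defs = {a.get("name"): a.get("type")
                   for a in schema.findall(f"{{{XS}}}attribute")}
    local_defs = {}
    for group in schema.findall(f"{{{XS}}}attributeGroup"):
        for a in group.findall(f"{{{XS}}}attribute"):
            local_defs[a.get("name")] = a.get("type")
    problems = []
    for name in sorted(global_defs.keys() | local_defs.keys()):
        g, l = global_defs.get(name), local_defs.get(name)
        if g != l:
            problems.append((name, g, l))   # None marks a missing definition
    return problems

problems = find_inconsistencies(ET.fromstring(SCHEMA))
```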
One option we can consider to remove this redundancy is to define only
the global attribute for each DFDL property and include it by attribute
reference in each of the complex types of dfdl:format and the specialized
annotations. The downside of this approach is that attributes will appear
qualified in the dfdl:format and specialized annotation elements. Given
below is how the dfdl:format annotation will appear in a user-defined
schema, where each attribute is prefixed with dfdl (highlighted in green
in the original message).
<xs:element name="foo" type="xs:int" maxOccurs="unbounded">
<xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
<dfdl:format dfdl:representation="text" dfdl:initiator="{" dfdl:terminator="}"/>
</xs:appinfo></xs:annotation>
</xs:element>
From a technical perspective (e.g. schema integrity, avoiding duplicate
definitions), this is better, but it does have a usability issue.