Arrays issue - Re: [dfdl-wg] Issues: additional data types
Mike Beckerle
beckerle at us.ibm.com
Tue Sep 6 10:48:01 CDT 2005
The need for an approach to arrays is clear, and it is acute for many DFDL
constituencies.
The first step in any approach to arrays for DFDL is an XML model for
array data and an XSD for describing it. Then DFDL can put properties on
this.
I suggest the following model. Consider a 2-d case. This will generalize
to N dimensions.
Each axis is named. The array itself is represented as elements, with
attributes used to identify the position of the value on each axis
conceptually like so:
<a x="5" y="-2">51</a>
That is, you think of each array element as having attributes identifying
its position in the array. Of course DFDL allows data to be processed
without ever creating elements like that, so this is a conceptual model
only, particularly for a dense array.
That element is of an array named 'a', at position x=5, y=-2, having value
51.
The declaration in XSD would be like this:
<element name="a" maxOccurs="unbounded">
  <complexType>
    <simpleContent>
      <extension base="int">
        <attribute name="x">
          <simpleType>
            <restriction base="int">
              <maxInclusive value="5"/>
              <minInclusive value="-5"/>
            </restriction>
          </simpleType>
        </attribute>
        <attribute name="y">
          <simpleType>
            <restriction base="int">
              <maxInclusive value="10"/>
              <minInclusive value="-10"/>
            </restriction>
          </simpleType>
        </attribute>
      </extension>
    </simpleContent>
  </complexType>
</element>
Notice how the ranges of the index values are captured in XSD by use of
the simple type restriction, and can cover arbitrary sections of the
integer space, including negative indices.
DFDL would then provide properties for
1) declaring that 'a' is an array and that 'x' and 'y' are array indices
(and therefore do not have values stored anywhere in the data).
2) declaring the storage-order of the array. This can be an ordered list
of the dimension names. E.g., "x y" or "y x" depending on which index
changes fastest in the storage ordering.
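To make the storage-order idea concrete, here is a minimal Python sketch of how a processor might turn an (x, y) index into a position in the stored data. The function name linear_offset and the RANGES table are hypothetical, not part of any DFDL proposal. The ranges are the ones from the XSD above. The sketch also assumes that the LAST name in the list is the one that changes fastest; the proposal itself does not yet say which end of the list that is.

```python
# Inclusive index ranges, taken from the XSD restrictions on 'x' and 'y'.
RANGES = {"x": (-5, 5), "y": (-10, 10)}


def linear_offset(index, storage_order, ranges=RANGES):
    """Return the 0-based position of one array element in storage.

    index         -- dict of axis name to index value, e.g. {"x": 5, "y": -2}
    storage_order -- space-separated axis names, e.g. "x y"
    Assumption of this sketch: the last-listed axis changes fastest.
    """
    offset = 0
    for name in storage_order.split():
        lo, hi = ranges[name]
        value = index[name]
        if not lo <= value <= hi:
            raise IndexError(f"{name}={value} is outside {lo}..{hi}")
        extent = hi - lo + 1          # number of positions along this axis
        offset = offset * extent + (value - lo)
    return offset


if __name__ == "__main__":
    element = {"x": 5, "y": -2}
    print(linear_offset(element, "x y"))
    print(linear_offset(element, "y x"))
```

Because the lower bound of each axis is subtracted before the multiply, negative indices need no special handling.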
Access to elements would be by XPath expressions like this: ..../a[@x='5'
and @y='-2']. Processors would recognize that x and y are array indices
based on DFDL annotations and would thereby recognize predicates involving
the indices and treat them specially. For example, we could preclude
slicing arrays like this: ..../a[@x='0'], that is, where the 'y' axis is
unconstrained.
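As a rough illustration of access by index predicate, the following Python sketch runs such a lookup over the conceptual element form using the standard xml.etree.ElementTree module. Since x and y are declared as attributes, the predicates are written with @x and @y. ElementTree's limited XPath subset has no 'and' operator, so the sketch chains two predicates, which selects the same elements. The <data> wrapper element, the lookup function, and every sample value other than the 51 at x=5, y=-2 are made up for the example.

```python
import xml.etree.ElementTree as ET

# Conceptual element form of a few array cells. Only the first cell comes
# from the example in this note; the wrapper and other cells are invented.
doc = ET.fromstring(
    "<data>"
    '<a x="5" y="-2">51</a>'
    '<a x="5" y="-1">52</a>'
    '<a x="0" y="-2">7</a>'
    "</data>"
)


def lookup(root, x, y):
    """Return the value at (x, y), requiring both indices to be given."""
    # Chained predicates stand in for a[@x='..' and @y='..'].
    matches = root.findall(f"a[@x='{x}'][@y='{y}']")
    if len(matches) != 1:
        raise KeyError((x, y))
    return int(matches[0].text)


if __name__ == "__main__":
    print(lookup(doc, 5, -2))
    # A slice that leaves 'y' unconstrained matches several elements,
    # which is the case a processor could choose to preclude.
    print(len(doc.findall("a[@x='5']")))
```

A real DFDL processor would presumably never build these elements for a dense array; it would recognize the index predicates and compute the storage position directly.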