Enums - Re: split into multiple topics - Re: [dfdl-wg] Issues: additional data types

Mike Beckerle beckerle at us.ibm.com
Tue Sep 6 11:40:12 CDT 2005


About enums. Here are some starting thoughts:

Here's a real-world example from COBOL:

01  AS-CPST-REC.
06  AS-CPCOM. 
        09  AS-COM-STORE-TYPE           PIC X.
        09  AS-COM-STORE-NO             PIC 9(05).
        09  AS-COM-TRAN-ID              PIC X(04).
                88  TRAN-COUPON          VALUE 'CP80'.
                88  TRAN-REVENUE         VALUE 'IC40'
                                               'RA40'.
                88  TRAN-SALES           VALUE 'IC40'.
                88  TRAN-DELIVER         VALUE 'IC44'.
                88  TRAN-RENTS           VALUE 'RA40' 
                                               'RA42'.
                88  TRAN-RENT-RETURN     VALUE 'RA41'.
        09  AS-COM-QUANTITY             PIC S9(05).
        09  AS-COM-PART-NO              PIC 9(06).
        .... more fields elided ....

Those "88" entries are enumerated constants (COBOL condition names). Note 
that for the TRAN-REVENUE and TRAN-RENTS constants, multiple values are 
associated with the same name. On reference, this means the constant 
matches either value. When written, the first value is used. COBOL doesn't 
strongly associate these enumerated values with the field to which they 
can be assigned, but usually it's obvious. In this case it is the 
AS-COM-TRAN-ID field, a 4-character string, which has the string constants 
associated with it.
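
To make the reference/write behaviour concrete, here is a rough sketch in Python rather than COBOL. The table is copied from the record above; the names CONDITION_VALUES, matches and value_to_write are made up for this note and are not part of any proposal.

```python
# Condition names (88-levels) for AS-COM-TRAN-ID, copied from the record above.
# Each name maps to its values in declaration order.
CONDITION_VALUES = {
    "TRAN-COUPON": ["CP80"],
    "TRAN-REVENUE": ["IC40", "RA40"],
    "TRAN-SALES": ["IC40"],
    "TRAN-DELIVER": ["IC44"],
    "TRAN-RENTS": ["RA40", "RA42"],
    "TRAN-RENT-RETURN": ["RA41"],
}


def matches(name, field_value):
    """On reference: the condition holds if the field equals any listed value."""
    return field_value in CONDITION_VALUES[name]


def value_to_write(name):
    """When written: the first listed value is the one stored in the field."""
    return CONDITION_VALUES[name][0]


print(matches("TRAN-RENTS", "RA42"))   # either listed value satisfies the name
print(value_to_write("TRAN-RENTS"))    # only the first value is ever written
```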

This particular example is a common one. The record has variant structure 
(not shown above) depending on the tag field which is this AS-COM-TRAN-ID 
field.

Working out how we want this example to work in XSD is the first step. 

I like using a hidden field here. Here's one possible way to represent 
this in XSD: 


<xs:element name="AS-COM-TRAN-ID"> <!-- logical tag element -->
    <xs:simpleType>
        <xs:restriction base="xs:NCName">
            <xs:enumeration value="TRAN-COUPON"/>
            <xs:enumeration value="TRAN-REVENUE"/>
            <xs:enumeration value="TRAN-SALES"/>
            <xs:enumeration value="TRAN-DELIVER"/>
            <xs:enumeration value="TRAN-RENTS"/>
            <xs:enumeration value="TRAN-RENT-RETURN"/>
        </xs:restriction>
    </xs:simpleType>
</xs:element>
<xs:sequence>
   <xs:annotation>
     <xs:appinfo source="http://dataformat.org/">
         <xs:layer name="rep" type="AS-COM-TRAN-ID-repType"/> <!-- hidden physical tag rep -->
    </xs:appinfo>
  </xs:annotation>
</xs:sequence>

<xs:simpleType name="AS-COM-TRAN-ID-repType">
    <xs:restriction base="xs:string">
        <xs:enumeration value="CP80"/>
        <xs:enumeration value="IC40"/>
        <!-- .... rest of the tag values elided here .... -->
    </xs:restriction>
</xs:simpleType>

Now in an annotation (not shown above) on the element AS-COM-TRAN-ID there 
would be a dfdl:valueCalc property which would compute the value of 
AS-COM-TRAN-ID based on the value of the hidden field. Symmetrically, the 
'rep' hidden field would have a dfdl:repCalc property which would give the 
inverse formula for output.
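
As a sketch of what that pair of calculations would have to do, here is the mapping written out in Python. The function names value_calc and rep_calc are just stand-ins for the dfdl:valueCalc and dfdl:repCalc formulas, not a proposed expression syntax. The rule that the first declared name wins when a physical value appears under more than one name is an assumption made only for this example.

```python
# (logical name, physical values) in COBOL declaration order.
# Assumption: when a physical value is listed under several names,
# the first declared name wins in the parse direction.
TAGS = [
    ("TRAN-COUPON", ["CP80"]),
    ("TRAN-REVENUE", ["IC40", "RA40"]),
    ("TRAN-SALES", ["IC40"]),
    ("TRAN-DELIVER", ["IC44"]),
    ("TRAN-RENTS", ["RA40", "RA42"]),
    ("TRAN-RENT-RETURN", ["RA41"]),
]


def value_calc(rep):
    """Parse direction: hidden physical tag -> logical AS-COM-TRAN-ID value."""
    for name, values in TAGS:
        if rep in values:
            return name
    raise ValueError(f"unknown tag value: {rep!r}")


def rep_calc(name):
    """Output direction: logical value -> physical tag (first listed value)."""
    for candidate, values in TAGS:
        if candidate == name:
            return values[0]
    raise ValueError(f"unknown tag name: {name!r}")


print(value_calc("RA42"))
print(rep_calc("TRAN-RENTS"))
```

Under these assumptions the two directions are not strict inverses: a record read with 'RA40' would come back out as 'IC40', because 'RA40' maps to TRAN-REVENUE and TRAN-REVENUE writes its first value. Whatever rule we pick for the overlapping values needs to be stated explicitly.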

One difficulty I have with this is the notion that we're projecting into 
the string type. I.e., these symbolic constants aren't names for integers; 
rather, we're expressing operations on strings. In the above example 
the enumerated constants actually are strings, but in other examples they 
would be integers. The next tier of interpretation, i.e., where we're 
deciding the variant based on the value of AS-COM-TRAN-ID, would be 
expressed as string comparisons, which is potentially inefficient. This is 
part of the problem with using XSD as our type system basis: XSD doesn't 
have a notion of a symbolic named constant.
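
For contrast, this is roughly what a symbolic named constant looks like in a language that has one, sketched here with Python's enum module. Only the single-valued tags are used, to keep the overlap question out of the picture. The variant names are hypothetical, since the variant structure of the record isn't shown above.

```python
from enum import Enum


class TranId(Enum):
    """Symbolic tags; member values are the physical 4-character codes."""
    TRAN_COUPON = "CP80"
    TRAN_SALES = "IC40"
    TRAN_DELIVER = "IC44"
    TRAN_RENT_RETURN = "RA41"


# Hypothetical variant names, invented for illustration only.
VARIANT_FOR = {
    TranId.TRAN_COUPON: "coupon-variant",
    TranId.TRAN_SALES: "sales-variant",
    TranId.TRAN_DELIVER: "deliver-variant",
    TranId.TRAN_RENT_RETURN: "rent-return-variant",
}


def select_variant(raw_tag):
    tag = TranId(raw_tag)     # the only place the string itself is examined
    return VARIANT_FOR[tag]   # later tiers work with the symbol, not the string


print(select_variant("IC44"))
```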

Alternative: There is the DTD named entity stuff. Does anybody want to 
propose that? 

Mike Beckerle
Architect, Scalable Computing
IBM Software Group
Information Integration Solutions
Westborough, MA



Mike Beckerle/Worcester/IBM at IBMUS 
Sent by: owner-dfdl-wg at ggf.org
09/02/2005 04:34 PM

To: "Robert E. McGrath" <mcgrath at ncsa.uiuc.edu>
cc: dfdl-wg at gridforum.org, owner-dfdl-wg at ggf.org
Subject: split into multiple topics - Re: [dfdl-wg] Issues: additional data types

I'd like to split this topic into several distinct ones: 

Arrays - I have a placeholder for this in the doc. 

Opaque and "code" types are separate. This is related also to the concept 
of "open content". 

Enums 

Bitfields 

Pointers 


Mike Beckerle
Architect, Scalable Computing
IBM Software Group
Information Integration Solutions
Westborough, MA 


"Robert E. McGrath" <mcgrath at ncsa.uiuc.edu> 
Sent by: owner-dfdl-wg at ggf.org 
09/02/2005 03:13 PM 

To: dfdl-wg at gridforum.org 
cc:
Subject: [dfdl-wg] Issues: additional data types


Greetings,

Here is an "issue" for the DFDL: additional data types that should
be considered.

Please see attached.

---
Robert E. McGrath
National Center for Supercomputing Applications
University of Illinois, Urbana-Champaign
Champaign, Illinois 61820
(217)-333-6549

mcgrath at ncsa.uiuc.edu 

