Fw: Enums - Re: split into multiple topics - Re: [dfdl-wg] Issues: additional data types

Steve Hanson smh at uk.ibm.com
Tue Sep 6 13:49:01 CDT 2005


You can consider the MRM model as an IBM proprietary equivalent to DFDL.
The logical model is XML schema, the physical model is our equivalent of
DFDL annotations.

I don't see a real-world case for having xs:enums on the schema simple
types whose values are the names of the 88s. The only requirement I have
seen is for level 88 values to participate in validation of the byte
stream, which means the xs:enums should be the values of the 88s, not the
names.
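
As a rough illustration of that validation requirement, here is a minimal
Python sketch. It is not how the MRM parser works; the function name, the
fixed 4-byte width check and the ASCII default are assumptions made for the
example. Only the set of values comes from the AS-COM-TRAN-ID example in the
forwarded thread.

```python
# Hypothetical sketch: check a PIC X(04) field against the level 88 VALUES.
# The value set is copied from the AS-COM-TRAN-ID example in the thread;
# the function name, width check and default encoding are assumptions.

ALLOWED_TRAN_IDS = {"CP80", "IC40", "RA40", "IC44", "RA42", "RA41"}


def validate_tran_id(raw: bytes, encoding: str = "ascii") -> str:
    """Decode a 4-byte field and reject it if it is not an enumerated value."""
    if len(raw) != 4:
        raise ValueError(f"expected 4 bytes, got {len(raw)}")
    value = raw.decode(encoding)
    if value not in ALLOWED_TRAN_IDS:
        raise ValueError(f"{value!r} is not one of the level 88 values")
    return value
```

Note that the names of the 88s (TRAN-COUPON and so on) play no part in this
check, which is the point being made above.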

Regards, Steve

Steve Hanson
WebSphere Business Integration Brokers,
IBM Hursley, England
Internet: smh at uk.ibm.com
Phone (+44)/(0) 1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 06/09/2005 19:43 -----

From:    "Robert E. McGrath" <mcgrath at ncsa.uiuc.edu>
Sent by: owner-dfdl-wg at ggf.org
Date:    06/09/2005 18:49
To:      dfdl-wg at gridforum.org
cc:
Subject: Re: Enums - Re: split into multiple topics - Re: [dfdl-wg] Issues:
         additional data types

On Tuesday 06 September 2005 12:21, Steve Hanson wrote:
> The MRM model COBOL importer gives the user an option to create
> xs:enumeration values from level 88 values, and that is all.  This is so
> the MRM parser can validate a byte stream against the model. This is in
> keeping with the rule that values supplied with metadata are part of the
> logical model, not the physical model. What's the motivation for doing
> anything more than this?

Sorry, I don't know what "the logical model" vs. "the physical model"
means.

IMO, the goal is to have a way for a producer to generate data
(e.g., with 88s as discussed below), which a consumer can interpret
as either the 88 values or the symbolic names, depending on the purpose
of the consumer.
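
A minimal Python sketch of that two-way interpretation follows, using the
names and values from the COBOL example quoted below. The table layout and
the function names are assumptions made for illustration and are not part of
any DFDL proposal; the "first value on write" rule is the one described in
the quoted text.

```python
# Level 88 names and values from the AS-COM-TRAN-ID example quoted below.
# CONDITIONS, names_for and value_for are hypothetical names for this sketch.
CONDITIONS = {
    "TRAN-COUPON": ["CP80"],
    "TRAN-REVENUE": ["IC40", "RA40"],
    "TRAN-SALES": ["IC40"],
    "TRAN-DELIVER": ["IC44"],
    "TRAN-RENTS": ["RA40", "RA42"],
    "TRAN-RENT-RETURN": ["RA41"],
}


def names_for(value):
    """Reading: return every symbolic name whose value list contains value."""
    return [name for name, values in CONDITIONS.items() if value in values]


def value_for(name):
    """Writing: use the first value listed for the name."""
    return CONDITIONS[name][0]
```

With this table, names_for("IC40") yields both TRAN-REVENUE and TRAN-SALES,
so the mapping from stored values to symbols is not one-to-one. A consumer
that wants symbols has to cope with that, while a consumer that wants the
raw values can ignore the table entirely.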


>
> Using your example that gives:
>
> <xs:element  name="AS-COM-TRAN-ID">
>    <xs:simpleType>
>         <xs:restriction base="xs:string">
>             <xs:enumeration value="CP80"/>
>             <xs:enumeration value="IC40"/>
> ...
>         </xs:restriction>
>     </xs:simpleType>
> </xs:element>
>
> Regards, Steve
>
> Steve Hanson
> WebSphere Business Integration Brokers,
> IBM Hursley, England
> Internet: smh at uk.ibm.com
> Phone (+44)/(0) 1962-815848
>
> From:    Mike Beckerle <beckerle at us.ibm.com>
> Sent by: owner-dfdl-wg at ggf.org
> Date:    06/09/2005 17:40
> To:      dfdl-wg at gridforum.org
> cc:
> Subject: Enums - Re: split into multiple topics - Re: [dfdl-wg] Issues:
>          additional data types
>
> About enums. Here's starting thoughts:
>
> Here's a real-world example from COBOL:
>
> 01  AS-CPST-REC.
> 06  AS-CPCOM.
>         09  AS-COM-STORE-TYPE           PIC X.
>         09  AS-COM-STORE-NO             PIC 9(05).
>         09  AS-COM-TRAN-ID              PIC X(04).
>                 88  TRAN-COUPON          VALUE 'CP80'.
>                 88  TRAN-REVENUE         VALUE 'IC40'
>                                                'RA40'.
>                 88  TRAN-SALES           VALUE 'IC40'.
>                 88  TRAN-DELIVER         VALUE 'IC44'.
>                 88  TRAN-RENTS           VALUE 'RA40'
>                                                'RA42'.
>                 88  TRAN-RENT-RETURN         VALUE 'RA41'.
>         09  AS-COM-QUANTITY             PIC S9(05).
>         09  AS-COM-PART-NO              PIC 9(06).
>         .... more fields elided ....
>
> Those "88" entries in there are enumerated constants. Note that for the
> TRAN-REVENUE and TRAN-RENTS constants, multiple values are associated with
> the same name. On reference this means that the constant matches either
> value. When written, this means the first value is used. COBOL doesn't
> strongly associate these enumerated values with the field to which they can
> be assigned, but usually it's obvious. In this case it is the
> AS-COM-TRAN-ID field which is the 4-character-long string which has the
> string constants associated with it.
>
> This particular example is a common one. The record has variant structure
> (not shown above) depending on the tag field which is this AS-COM-TRAN-ID
> field.
>
> Working out how we want this example to work in XSD is the first step.
>
> I like using a hidden field here. For example: Here's a possible idea for
> how this is represented in XSD:
>
>
> <xs:element  name="AS-COM-TRAN-ID"> <!-- logical tag element -->
>    <xs:simpleType>
>         <xs:restriction base="xs:NCName">
>             <xs:enumeration value="TRAN-COUPON"/>
>             <xs:enumeration value="TRAN-REVENUE"/>
>             <xs:enumeration value="TRAN-SALES"/>
>             <xs:enumeration value="TRAN-DELIVER"/>
>            <xs:enumeration value="TRAN-RENTS"/>
>         </xs:restriction>
>     </xs:simpleType>
> </xs:element>
> <xs:sequence>
>    <xs:annotation>
>      <xs:appinfo source="http://dataformat.org/">
>          <xs:layer name="rep" type="AS-COM-TRAN-ID-repType"/> <!-- hidden physical tag rep -->
>     </xs:appinfo>
>   </xs:annotation>
> </xs:sequence>
>
> <xs:simpleType name="AS-COM-TRAN-ID-repType">
>      <xs:restriction base="xs:string">
>          <xs:enumeration value="CP80"/>
>          <xs:enumeration value="IC40"/> <!-- rest of the tag values elided here -->
>          ....
>       </xs:restriction>
> </xs:simpleType>
>
> Now in an annotation (not shown above) on the element AS-COM-TRAN-ID there
> would be a dfdl:valueCalc property which would compute the value of
> AS-COM-TRAN-ID based on the value of the hidden field. Symmetrically, the
> 'rep' hidden field would have a dfdl:repCalc property which would give the
> inverse formula for output.
>
> One difficulty I have with this is the notion that we're projecting into
> the string type. I.e., these symbolic constants aren't names for integers,
> but rather we're expressing operations on strings. In the above example the
> enumerated constants actually are strings, but in other examples they would
> be integers. The next tier of interpretation, i.e., where we're deciding the
> variant based on the value of AS-COM-TRAN-ID, would be expressed as string
> comparisons, which is potentially inefficient. This is part of the problem
> with using XSD as our type system basis. XSD doesn't have a notion of a
> symbolic named constant.
>
> Alternative: There is the DTD named entity stuff. Does anybody want to
> propose that?
>
> Mike Beckerle
> Architect, Scalable Computing
> IBM Software Group
> Information Integration Solutions
> Westborough, MA
>
>
> From:    Mike Beckerle/Worcester/IBM at IBMUS
> Sent by: owner-dfdl-wg at ggf.org
> Date:    09/02/2005 04:34 PM
> To:      "Robert E. McGrath" <mcgrath at ncsa.uiuc.edu>
> cc:      dfdl-wg at gridforum.org, owner-dfdl-wg at ggf.org
> Subject: split into multiple topics - Re: [dfdl-wg] Issues: additional
>          data types
>
> I'd like to split this topic into several distinct ones:
>
> Arrays - I have a placeholder for this in the doc.
>
> Opaque and "code" types are separate. This is related also to the concept
> of "open content".
>
> Enums
>
> Bitfields
>
> Pointers
>
>
> Mike Beckerle
> Architect, Scalable Computing
> IBM Software Group
> Information Integration Solutions
> Westborough, MA
>
> From:    "Robert E. McGrath" <mcgrath at ncsa.uiuc.edu>
> Sent by: owner-dfdl-wg at ggf.org
> Date:    09/02/2005 03:13 PM
> To:      dfdl-wg at gridforum.org
> cc:
> Subject: [dfdl-wg] Issues: additional data types
>
> Greetings,
>
> Here is an "issue" for the DFDL: additional data types that should
> be considered.
>
> Please see attached.
>
> ---
> Robert E. McGrath
> National Center for Supercomputing Applications
> University of Illinois, Urbana-Champaign
> Champaign, Illinois 61820
> (217)-333-6549
>
> mcgrath at ncsa.uiuc.edu (See attached file: DT.htm)

--
---
Robert E. McGrath
National Center for Supercomputing Applications
University of Illinois, Urbana-Champaign
Champaign, Illinois 61820
(217)-333-6549

mcgrath at ncsa.uiuc.edu





