[DFDL-WG] DFDL infoset and hexBinary

Suman Kalia kalia at ca.ibm.com
Tue Nov 20 14:08:20 CST 2007


Replication failed, so I am not sure whether this was ever distributed.

Mike - You are right. Here is a simplified description of these types. 
They are both binary. 

One alternative is to treat them as native XSD types, honor the encoding 
schemes as defined in XML Schema, and allow only the DFDL annotations 
that pertain to the valid XSD facets: 
enumeration, length, maxLength, minLength, pattern

hexBinary
The value space of xsd:hexBinary is the set of all binary contents; its 
lexical space is a simple coding of each octet as its hexadecimal value.

base64Binary
The value space of xsd:base64Binary is the set of arbitrary binary 
contents. Its lexical space is the same set after base64 coding. This 
coding is described in Section 6.8 of RFC 2045.
Restrictions
RFC 2045 describes the transfer of binary contents over text-based mail 
systems. It imposes a line break at least every 76 characters to avoid the 
inclusion of arbitrary line breaks by the mail systems. Sending base64 
content without line breaks is nevertheless common practice in 
applications such as SOAP. After a 
request from other W3C Working Groups, the W3C XML Schema Working Group 
decided to remove the obligation to include these line breaks from the 
constraints on the lexical space. (This decision was made after the 
publication of the W3C XML Schema Recommendation. It is now noted in the 
errata.)
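
As a rough illustration of the line-break point, here is a minimal Python sketch using only the standard library base64 module. The sample data and variable names are made up for the example. It shows that the MIME-style encoder wraps its output at 76 characters, the plain encoder does not, and both forms decode to the same octets.

```python
import base64

# Hypothetical sample data: 100 arbitrary octets.
data = bytes(range(100))

# RFC 2045 (MIME) style: a newline after every 76 output characters.
wrapped = base64.encodebytes(data)

# Unwrapped style, as commonly sent by applications such as SOAP.
unwrapped = base64.b64encode(data)

# No wrapped line is longer than 76 characters.
print(max(len(line) for line in wrapped.splitlines()))

# Both lexical forms decode to the same value; by default the decoder
# discards characters outside the base64 alphabet, such as newlines.
print(base64.b64decode(wrapped) == base64.b64decode(unwrapped) == data)
```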

Suman Kalia
IBM Toronto Lab
WebSphere Business Integration Application Connectivity Tools 
Tel : 905-413-3923  T/L  969-3923
Fax : 905-413-4850 T/L  969-4850
Internet ID : kalia at ca.ibm.com



Mike Beckerle <beckerle at us.ibm.com> 
Sent by: dfdl-wg-bounces at ogf.org
11/16/2007 04:55 PM

To: Mike Beckerle <beckerle at us.ibm.com>
cc: dfdl-wg at ogf.org, dfdl-wg-bounces at ogf.org
Subject: Re: [DFDL-WG] DFDL infoset and hexBinary







Let me start answering my own question by way of these two definitions 
found in the XSD spec: 

The ·value space· of hexBinary is the set of finite-length sequences of 
binary octets. 

The ·value space· of base64Binary is the set of finite-length sequences of 
binary octets. 

Note that the value of these types is not some sort of encoded string 
form. That is, hexBinary values do not involve the characters 0-9A-F; 
rather, the octets themselves are the value. 

To me this means that hexBinary and base64Binary are really identical in 
DFDL, unless we provide some sort of expressions for converting them 
to and from strings. In that case the conversions differ, and they 
produce and consume different strings. 
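
A minimal Python sketch of this point, using only the standard library: the same octets have two different lexical forms, and decoding either one yields the identical value. The variable names are illustrative only and do not come from any DFDL document.

```python
import base64

# The 12 octets from the original question, in hexBinary lexical form.
hex_lexical = "003100320033004100420043"
value_from_hex = bytes.fromhex(hex_lexical)

# The same octets in base64Binary lexical form.
b64_lexical = base64.b64encode(value_from_hex).decode("ascii")
value_from_b64 = base64.b64decode(b64_lexical)

# The strings differ, but the values (the octets) are equal.
print(hex_lexical == b64_lexical)
print(value_from_hex == value_from_b64)
```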

...mike




Mike Beckerle/Worcester/IBM at IBMUS 
Sent by: dfdl-wg-bounces at ogf.org 
11/16/2007 04:23 PM 


To: dfdl-wg at ogf.org 
cc: 
Subject: [DFDL-WG] DFDL infoset and hexBinary









What is the DFDL infoset value for a hexBinary type element? 

Our doc says an element of type hexBinary has a value member in the 
infoset whose type is a hexBinary, but what is a hexBinary anyway? 

Consider these 12 bytes of data, dumped here as hex: 
003100320033004100420043 

If that's a string in UTF-16BE encoding, it looks like "123ABC" 

It is 12 bytes long. 
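
This can be checked with a short Python sketch (standard library only; the variable name is arbitrary):

```python
# Parse the hex dump into raw octets.
data = bytes.fromhex("003100320033004100420043")

print(len(data))                 # 12
print(data.decode("utf-16-be"))  # 123ABC
```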

Suppose we describe this using this DFDL: 

<element name="a" type="hexBinary" length='12'  /> 

As I understand it, this is correct XSD in that the length of a hexBinary 
is always specified as the number of 'octets' of the binary data, not 
characters. 

Now, the contents of the basic XML infoset for an element matching this 
schema are these 24 characters: "003100320033004100420043". This is 
because the XML infoset is not schema aware, so it has no idea what this 
string represents. 

An XML document corresponding to this might look like: 

<?xml version='1.0' encoding="UTF-16BE"?> 
<a>003100320033004100420043</a> 

So, the question is: 

What are the contents of the DFDL Infoset for a hexBinary like this? Same 
24 characters? or 12 'octets' of data? 

This is essentially the answer to the question "what IS a hexBinary?" or 
maybe "what is a hexBinary in the PSVI other than just a string?" 
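
To make the two candidate answers concrete, here is a small Python sketch. It assumes nothing about how a DFDL processor would actually store the value; the names are hypothetical and only contrast the lexical form with the octet value.

```python
# Candidate 1: the 24 characters that the plain XML infoset holds.
lexical = "003100320033004100420043"

# Candidate 2: the 12 octets that the lexical form denotes.
octets = bytes.fromhex(lexical)

print(len(lexical))  # 24
print(len(octets))   # 12

# The XSD length facet for hexBinary counts octets, so a declared
# length of 12 matches candidate 2, not candidate 1.
declared_length = 12
print(len(octets) == declared_length)

# Converting the octets back gives the original lexical form.
print(octets.hex().upper() == lexical)
```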


...mikeb
--
dfdl-wg mailing list
dfdl-wg at ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg