[DFDL-WG] Action 254: Updated words for zoned numbers
Steve Hanson
smh at uk.ibm.com
Thu Apr 10 09:29:55 EDT 2014
Updated description for textZonedSignStyle to reflect that not all EBCDIC
encodings use the same characters for the code points. Text in red is
from public comment 116.
----------------------
Specifies the code points that are used to overpunch the sign nibble when
the encoding is an ASCII-derived character set encoding. The location of
this sign nibble is indicated in the dfdl:textNumberPattern.
This property is applicable when dfdl:textNumberRep is 'zoned'.
Used only when dfdl:encoding is an ASCII-derived character set encoding.
The encoding must provide the character to single byte code point mapping
used by the specified value of dfdl:textZonedSignStyle, as stated below.
Valid values are 'asciiStandard', 'asciiTranslatedEBCDIC',
'asciiCARealiaModified', and 'asciiTandemModified'.
The characters used to represent 'overpunched' (included) positive and
negative signs vary by encoding, COBOL compiler, and system. The code
points are fixed for EBCDIC systems but not for ASCII.
In EBCDIC-based encodings, code points 0xC0 to 0xC9 or 0xF0 to 0xF9
represent a positive sign and digits 0 to 9 (typically characters
'{ABCDEFGHI' or '0123456789'), and code points 0xD0 to 0xD9 or 0xB0 to
0xB9 represent a negative sign and digits 0 to 9 (typically characters
'}JKLMNOPQR' or '^£¥·©§¶¼½¾'). On parsing, both ranges will be accepted
for each sign. On unparsing, the range 0xC0 to 0xC9 will be produced for
positive signs and the range 0xD0 to 0xD9 will be produced for negative
signs.
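As a rough illustration of the EBCDIC rule above, the following Python
sketch parses and unparses a zoned decimal held as raw bytes. It assumes
the sign is overpunched on the last digit (in DFDL the position comes from
dfdl:textNumberPattern). The function names are hypothetical and are not
part of any DFDL implementation.

```python
# Hypothetical helpers; the sign is assumed to be overpunched on the LAST byte.
POSITIVE_ZONES = {0xC, 0xF}  # 0xC0-0xC9 or 0xF0-0xF9
NEGATIVE_ZONES = {0xD, 0xB}  # 0xD0-0xD9 or 0xB0-0xB9


def parse_ebcdic_zoned(data: bytes) -> int:
    """Decode EBCDIC zoned decimal bytes, accepting both ranges per sign."""
    if not data:
        raise ValueError("empty zoned decimal")
    value = 0
    for byte in data[:-1]:
        # Unsigned positions must be plain EBCDIC digits 0xF0-0xF9.
        if (byte >> 4) != 0xF or (byte & 0x0F) > 9:
            raise ValueError(f"not an EBCDIC digit: 0x{byte:02X}")
        value = value * 10 + (byte & 0x0F)
    zone, digit = data[-1] >> 4, data[-1] & 0x0F
    if digit > 9 or zone not in (POSITIVE_ZONES | NEGATIVE_ZONES):
        raise ValueError(f"bad sign byte: 0x{data[-1]:02X}")
    value = value * 10 + digit
    return -value if zone in NEGATIVE_ZONES else value


def unparse_ebcdic_zoned(value: int, width: int) -> bytes:
    """Encode with 0xC0-0xC9 for positive and 0xD0-0xD9 for negative."""
    digits = str(abs(value)).zfill(width)
    if len(digits) > width:
        raise ValueError("value does not fit in width")
    body = bytes(0xF0 | int(d) for d in digits[:-1])
    zone = 0xD0 if value < 0 else 0xC0
    return body + bytes([zone | int(digits[-1])])
```

In this sketch, the byte sequences F1 F2 C3 and F1 F2 F3 both parse as
+123, and F1 F2 D3 and F1 F2 B3 both parse as -123, while unparsing only
ever emits the 0xC and 0xD forms.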
asciiStandard: ASCII characters '0123456789' represent a positive sign and
the corresponding digit. (The sign nibble for '+' is 0x3, which is the
unmodified high nibble of these code points.) ASCII characters
'pqrstuvwxy' (code points 0x70 to 0x79) represent a negative sign and
digits 0 to 9.
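The nibble arithmetic behind asciiStandard can be sketched in Python as
follows. The helper names are hypothetical, and the sketch assumes the
sign sits on the last character; in DFDL the actual position is given by
dfdl:textNumberPattern.

```python
ASCII_DIGITS = "0123456789"


def unparse_ascii_standard(value: int, width: int) -> str:
    """Overpunch the last digit: high nibble 0x3 if positive, 0x7 if negative."""
    digits = str(abs(value)).zfill(width)
    if len(digits) > width:
        raise ValueError("value does not fit in width")
    zone = 0x70 if value < 0 else 0x30
    return digits[:-1] + chr(zone | int(digits[-1]))


def parse_ascii_standard(text: str) -> int:
    """Decode a string whose last character carries the sign nibble."""
    if not text or any(c not in ASCII_DIGITS for c in text[:-1]):
        raise ValueError("leading characters must be ASCII digits")
    code = ord(text[-1])
    zone, digit = code >> 4, code & 0x0F
    if zone not in (0x3, 0x7) or digit > 9:
        raise ValueError(f"bad sign character: {text[-1]!r}")
    magnitude = int(text[:-1] + str(digit))
    return -magnitude if zone == 0x7 else magnitude
```

Under these assumptions, -123 in a five-digit field unparses to '0012s',
because 0x70 combined with digit 3 is 0x73, the code point of 's'.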
asciiTranslatedEBCDIC: The overpunched character is the ASCII equivalent
of the typical EBCDIC character above. So the characters '{ABCDEFGHI'
still represent a positive sign and digits 0 to 9. (These are code points
0x7B and 0x41 through 0x49.) The characters '}JKLMNOPQR' still represent a
negative sign and digits 0 to 9. (These are code points 0x7D and 0x4A
through 0x52.) This case comes up if EBCDIC zoned decimal data is
translated to ASCII as if it were textual data.
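Because these characters do not share a sign nibble in ASCII, a lookup
table is a natural way to handle this style. The Python sketch below is
illustrative only: the names are hypothetical, the sign is assumed to be
on the last character, and rejecting any other final character is a
choice made for the sketch rather than something stated above.

```python
POSITIVE_CHARS = "{ABCDEFGHI"  # index in the string is the digit 0-9
NEGATIVE_CHARS = "}JKLMNOPQR"  # index in the string is the digit 0-9


def parse_translated_ebcdic(text: str) -> int:
    """Decode a zoned decimal whose last character is an overpunch letter."""
    if not text or any(c not in "0123456789" for c in text[:-1]):
        raise ValueError("leading characters must be ASCII digits")
    last = text[-1]
    if last in POSITIVE_CHARS:
        sign, digit = 1, POSITIVE_CHARS.index(last)
    elif last in NEGATIVE_CHARS:
        sign, digit = -1, NEGATIVE_CHARS.index(last)
    else:
        # Assumption of this sketch: anything else is treated as an error.
        raise ValueError(f"bad overpunch character: {last!r}")
    return sign * int(text[:-1] + str(digit))
```

For example, '12C' parses as +123 and '12L' parses as -123 in this sketch.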
asciiCARealiaModified: In this style, the ASCII characters '0123456789'
represent a positive sign and digits 0 to 9, as in asciiStandard. However,
ASCII code points 0x20 to 0x29 are used for a negative sign and the
corresponding decimal digit. This doesn't translate well into printing
characters. These characters are the space (' ') for zero, characters
'!"#$%&' for 1 through 6, the single quote character "'" for 7, and the
parentheses '(' and ')' for 8 and 9.
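The negative-sign characters for this style can be derived directly from
the code point range, as in the following Python sketch. The names are
hypothetical and the sign is assumed to be on the last digit.

```python
# Code points 0x20-0x29: the string index is the digit being overpunched.
NEGATIVE_CHARS = "".join(chr(0x20 + digit) for digit in range(10))


def unparse_ca_realia(value: int, width: int) -> str:
    """Positive values stay plain digits; negatives use 0x20-0x29 last."""
    digits = str(abs(value)).zfill(width)
    if len(digits) > width:
        raise ValueError("value does not fit in width")
    if value < 0:
        return digits[:-1] + NEGATIVE_CHARS[int(digits[-1])]
    return digits
```

Note that a negative number ending in zero, such as -120, unparses to
'12 ' with a trailing space, so any processing that trims whitespace
would lose both the sign and the final digit.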
asciiTandemModified: In this style, the ASCII characters '0123456789'
represent a positive sign and digits 0 to 9, but code points 0x80 to 0x89
are used to represent a negative sign and a digit. These values are all
128 (decimal) or greater, so the resultant bytes are not code points in
standard ASCII, and the modeller must specify an encoding like ISO-8859-1
in order for such zoned decimals to parse without an encoding error.
(Note that neither ISO-8859-1 nor Unicode assigns glyphs to these code
points. They are considered control characters.)
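The encoding requirement can be seen with Python's standard codecs, used
here only as a stand-in for a DFDL processor's decoding step. The helper
name is hypothetical and the sign is assumed to be on the last byte.

```python
import unicodedata


def parse_tandem(raw: bytes, encoding: str = "iso-8859-1") -> int:
    """Decode bytes to text first, then interpret a 0x80-0x89 overpunch."""
    text = raw.decode(encoding)  # raises UnicodeDecodeError for "ascii"
    if not text or any(c not in "0123456789" for c in text[:-1]):
        raise ValueError("leading characters must be ASCII digits")
    code = ord(text[-1])
    if 0x80 <= code <= 0x89:
        return -int(text[:-1] + str(code - 0x80))
    if text[-1] in "0123456789":
        return int(text)
    raise ValueError(f"bad sign character: {text[-1]!r}")
```

With the bytes 0x31 0x32 0x83, decoding as ISO-8859-1 yields -123, while
passing "ascii" as the encoding fails before any sign handling happens.
unicodedata.category() reports 'Cc' (control) for the decoded character
U+0083.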
---------------------
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848