[DFDL-WG] DFDL Bidirectional support
Alan Powell
alan_powell at uk.ibm.com
Tue Jun 30 10:21:16 CDT 2009
I thought I would tackle one of the "easier" work items so I started
looking at BiDi support assuming it would just be a matter of bringing the
existing supplement up to date. Of course it hasn't turned out that simple
as the supplement only covers one aspect.
There are a couple of web sites that give helpful overviews of BiDi:
Bidirectional script support: A primer
http://www.ibm.com/developerworks/websphere/library/techarticles/bidi/bidigen.html
Unicode Bidirectional Algorithm
http://www.unicode.org/reports/tr9/
Most of the complexities occur when strings contain a mixture of LTR and
RTL segments.
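
As a rough illustration of what "a mixture of LTR and RTL segments" means
in practice, the sketch below uses Python's standard unicodedata module to
check whether a string contains strongly typed characters of both
directions. The helper names strong_directions and is_mixed are
hypothetical and only for illustration; this is not part of DFDL and it is
not an implementation of the Unicode Bidirectional Algorithm.

```python
import unicodedata

# Strong bidirectional classes from the Unicode Character Database:
# "L" is strong left-to-right; "R" (e.g. Hebrew) and "AL" (Arabic letter)
# are strong right-to-left. Digits, spaces and punctuation are not strong.
STRONG_LTR = {"L"}
STRONG_RTL = {"R", "AL"}


def strong_directions(text):
    """Return the set of strong directions ("LTR", "RTL") found in text."""
    found = set()
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in STRONG_LTR:
            found.add("LTR")
        elif bidi_class in STRONG_RTL:
            found.add("RTL")
    return found


def is_mixed(text):
    """True when text contains both strong LTR and strong RTL characters."""
    return strong_directions(text) == {"LTR", "RTL"}
```

A string such as "abc" followed by three Hebrew letters would be reported
as mixed, while "abc 123" would not, since digits are not strong
characters.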
For DFDL there are two main aspects to consider: 1) using BiDi strings
within a DFDL schema and 2) using BiDi strings within DFDL described data.
Using BiDi within a DFDL schema would be needed if we have to support BiDi
strings for syntax (initiators, etc) or representation values (nilValues,
etc).
Arabic and Hebrew characters have normal codepoints in Unicode so DFDL
presumably already allows them to be used in a schema and the initiators
etc will inherit whatever BiDi properties we put on the component. This
clearly has tooling implications.
For BiDi strings in data we need to decide how they should be represented
in the infoset. The choice is between 1) the same ordering scheme as in
the data but converted to Unicode or 2) transformed to a canonical
ordering scheme.
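
The two choices can be sketched for the simplest possible case. The
example below assumes, purely for illustration, that the data is a single
run of Hebrew letters encoded in ISO 8859-8 and stored in visual order
(leftmost displayed character first). Under that assumption, converting to
logical order is just a reversal; real data containing digits, neutrals or
mixed LTR/RTL runs would need a proper inverse of the Unicode
Bidirectional Algorithm. The function names are hypothetical.

```python
def decode_same_order(raw, encoding="iso8859_8"):
    """Choice 1: convert to Unicode, keeping the ordering used in the data."""
    return raw.decode(encoding)


def decode_to_logical(raw, encoding="iso8859_8"):
    """Choice 2: convert to Unicode and transform to logical order.

    ASSUMPTION: raw is a single run of RTL letters stored in visual order,
    so reversing it yields logical (reading) order. This shortcut is wrong
    for mixed-direction text, digits or combining marks.
    """
    visual = raw.decode(encoding)
    return visual[::-1]


# Visual-order bytes for gimel, bet, alef (displayed left to right);
# the word is read right to left as alef, bet, gimel.
raw = bytes([0xE2, 0xE1, 0xE0])
same_order = decode_same_order(raw)
logical = decode_to_logical(raw)
```

With choice 1 the application sees the characters in the order they were
stored and has to know the ordering scheme itself; with choice 2 it always
sees reading order regardless of how the data was laid out.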
I suggest that we should convert to a canonical logical, LTR ordering
scheme to hide the complexities from the application. However, because
the bidirectional transformation is not transitive, this might lead to
ambiguity in the interpretation of the bidirectional data.
Note this is how some IBM products work.
The existing supplement is attached.
Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM email: alan_powell at uk.ibm.com
Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ggf-dfdl-supplement-bidirectional-text-properties-v1.0-001.doc
Type: application/octet-stream
Size: 156160 bytes
Desc: not available
Url : http://www.ogf.org/pipermail/dfdl-wg/attachments/20090630/e74a1b21/attachment-0001.obj