[DFDL-WG] DFDL Bidirectional support

Alan Powell alan_powell at uk.ibm.com
Tue Jun 30 10:21:16 CDT 2009


I thought I would tackle one of the "easier" work items so I started 
looking at BiDi support assuming it would just be a matter of bringing the 
existing supplement up to date. Of course it hasn't turned out that simple 
as the supplement only covers one aspect.

There are a couple of web sites that give helpful overviews of BiDi:
Bidirectional script support: A primer
http://www.ibm.com/developerworks/websphere/library/techarticles/bidi/bidigen.html

Unicode Bidirectional Algorithm
http://www.unicode.org/reports/tr9/

Most of the complexities occur when strings contain a mixture of LTR and 
RTL segments.
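
As a rough illustration of what "a mixture of LTR and RTL segments" means 
in practice, the sketch below uses Python's standard unicodedata module to 
look up the Unicode bidirectional class of each character and report 
whether a string contains strong characters of both directions. The helper 
names (strong_directions, is_mixed) are made up for this example, and the 
check deliberately ignores weak and neutral characters, explicit 
directional controls, and everything else the full Unicode Bidirectional 
Algorithm handles.

```python
import unicodedata

# Bidirectional classes of strong right-to-left characters:
# "R" (e.g. Hebrew letters) and "AL" (Arabic letters).
RTL_CLASSES = {"R", "AL"}


def strong_directions(text):
    """Return the set of strong directions ("LTR", "RTL") found in text.

    Hypothetical helper for illustration only; digits, spaces and
    punctuation are weak or neutral and are skipped.
    """
    directions = set()
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            directions.add("LTR")
        elif bidi_class in RTL_CLASSES:
            directions.add("RTL")
    return directions


def is_mixed(text):
    """True if text contains both strong LTR and strong RTL characters."""
    return strong_directions(text) == {"LTR", "RTL"}


if __name__ == "__main__":
    # Latin letters followed by three Hebrew letters (alef, bet, gimel).
    print(is_mixed("abc \u05d0\u05d1\u05d2"))
    # Latin letters and European digits only.
    print(is_mixed("abc 123"))
```

A string for which is_mixed returns False can be stored and displayed in a 
single direction; it is the mixed case that needs reordering.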


For DFDL there are two main aspects to consider: 1) using BiDi strings 
within a DFDL schema and 2) using BiDi strings within DFDL described data.

Using BiDi within a DFDL schema would be needed if we have to support BiDi 
strings for syntax (initiators, etc) or representation values (nilValues, 
etc). 
Arabic and Hebrew characters have normal codepoints in Unicode so DFDL 
presumably already allows them to be used in a schema and the initiators 
etc will inherit whatever BiDi properties we put on the component.  This 
clearly has tooling implications.

For BiDi strings in data we need to decide how they should be represented 
in the infoset. The choice is between 1) the same order scheme as in the 
data but converted to Unicode or 2) transformed to a canonical order 
scheme.

I suggest that we should convert to a canonical logical, LTR ordering 
scheme to hide the complexities from the application. However, because 
bidirectional transformation is not transitive, this might lead to 
ambiguity in how the bidirectional data is interpreted. 
Note that this is how some IBM products work.
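
One way such ambiguity can arise is sketched below with a toy model; it is 
not the Unicode Bidirectional Algorithm and not taken from the supplement. 
The assumptions are: uppercase ASCII letters stand in for RTL characters 
(the convention used in the examples in UAX #9), lowercase letters are LTR, 
and the text contains strong characters only. The function name to_visual 
and its base parameter are invented for this illustration. Under those 
assumptions, two different logical strings produce the same visual string, 
so converting from visual order back to logical order needs to know the 
base direction.

```python
from itertools import groupby


def to_visual(logical, base):
    """Toy logical-to-visual reordering (illustrative assumption only).

    Uppercase letters play the role of RTL characters, lowercase letters
    the role of LTR characters. `base` is "ltr" or "rtl".
    """
    # Split the logical string into runs of the same direction.
    runs = [
        (is_rtl, "".join(chars))
        for is_rtl, chars in groupby(logical, key=str.isupper)
    ]
    # Characters inside an RTL run are displayed in reverse order.
    rendered = [text[::-1] if is_rtl else text for is_rtl, text in runs]
    # With an RTL base direction the runs themselves are laid out
    # from right to left.
    if base == "rtl":
        rendered.reverse()
    return "".join(rendered)


if __name__ == "__main__":
    first = to_visual("ABCabc", base="ltr")
    second = to_visual("abcABC", base="rtl")
    # Both logical strings yield the same visual string.
    print(first, second, first == second)
```

Going the other way, a parser that sees the visual string alone cannot 
tell which of the two logical strings was intended.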

The existing supplement is attached.



Alan Powell

 MP 211, IBM UK Labs, Hursley,  Winchester, SO21 2JN, England
 Notes Id: Alan Powell/UK/IBM     email: alan_powell at uk.ibm.com 
 Tel: +44 (0)1962 815073                  Fax: +44 (0)1962 816898






-------------- next part --------------
A non-text attachment was scrubbed...
Name: ggf-dfdl-supplement-bidirectional-text-properties-v1.0-001.doc
Type: application/octet-stream
Size: 156160 bytes
Desc: not available
Url : http://www.ogf.org/pipermail/dfdl-wg/attachments/20090630/e74a1b21/attachment-0001.obj 

