[DFDL-WG] DFDL regular expressions and Unicode

Cranford, Jonathan W. jcranford at mitre.org
Fri Jul 5 19:55:00 EDT 2013


Update: I just found errata 3.29, which answers this question, I think.

From the description in the errata, and from the documentation for Java 7 regular expressions, it looks like DFDL regular expressions conform to Level 1 of Unicode Regular Expressions (UTS #18).

I still think there would be value in stating such conformance in the DFDL spec, but I suppose someone would first have to do the legwork of confirming that ICU and Java 7 actually conform to Level 1.
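
To make concrete why the conformance level matters (the case conversion and canonical equivalence details raised in my original message below), here is a minimal sketch using java.util.regex. It only shows how Java's own flags change what a pattern matches; it says nothing about what a DFDL processor actually does. The class name and constants are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical class name; this exercises java.util.regex only, not a DFDL processor.
public class UnicodeRegexSketch {
    // "a" followed by COMBINING RING ABOVE (U+030A)
    static final String A_RING_DECOMPOSED = "a\u030A";
    // LATIN SMALL LETTER A WITH RING ABOVE (U+00E5), the precomposed form
    static final String A_RING_COMPOSED = "\u00E5";
    // Lower-case and upper-case E WITH ACUTE (U+00E9, U+00C9)
    static final String E_ACUTE_LOWER = "\u00E9";
    static final String E_ACUTE_UPPER = "\u00C9";

    // Compile the regex with the given flags and test for a full match.
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Canonical equivalence: without CANON_EQ the two spellings differ.
        System.out.println(matches(A_RING_DECOMPOSED, 0, A_RING_COMPOSED));
        // With CANON_EQ they are treated as the same character
        // (this is the example given in the Pattern Javadoc).
        System.out.println(matches(A_RING_DECOMPOSED, Pattern.CANON_EQ, A_RING_COMPOSED));

        // Case folding: CASE_INSENSITIVE alone only folds US-ASCII letters.
        System.out.println(matches(E_ACUTE_LOWER, Pattern.CASE_INSENSITIVE, E_ACUTE_UPPER));
        // Adding UNICODE_CASE makes the folding Unicode-aware.
        System.out.println(matches(E_ACUTE_LOWER,
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE, E_ACUTE_UPPER));
    }
}
```

The same pattern and the same input give different answers depending on flags that a schema author never sees, which is why I would like the spec to state what behaviour is expected.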

Very respectfully,

-- Jonathan Cranford


>-----Original Message-----
>From: Cranford, Jonathan W.
>Sent: Friday, July 05, 2013 1:36 PM
>To: dfdl-wg at ogf.org
>Subject: DFDL regular expressions and Unicode
>
>I've been going through the spec recently, and I have a few questions about DFDL
>regular expressions.
>
>Rather than put them into one long email, I'll break them up into separate emails.
>
>First question:  What level of conformance to Unicode Technical Standard #18,
>UNICODE REGULAR EXPRESSIONS, do DFDL regular expressions claim?
>
>    For example,
>    * XML Schema regular expressions are "targeted at support of 'Level 1' features"
>        (http://www.w3.org/TR/xmlschema-2/#dt-ccesN)
>    * Java 1.4 regular expressions "implement its second level of support"
>        (http://docs.oracle.com/javase/1.4.2/docs/api/java/util/regex/Pattern.html)
>    * Perl 5.18 seems to implement most of Level 1
>        (http://perldoc.perl.org/perlunicode.html#Unicode-Regular-Expression-Support-Level)
>
>    I think the conformance level should be specified in the DFDL spec so that it is
>    clear to schema designers what a regular expression would really match against.
>    Details like case conversion and canonical equivalence make a difference when
>    matching against a Unicode string.
>
>Thanks in advance,
>
>--
>Jonathan W. Cranford <jcranford at mitre.org>
>Senior Information Systems Engineer
>The MITRE Corporation (http://www.mitre.org)


