[DFDL-WG] missing XPath functions?

Mike Beckerle mbeckerle.dfdl at gmail.com
Mon Jul 28 12:57:14 EDT 2014


Did we omit fn:codepoints-to-string and fn:string-to-codepoints (quoted
below) by mistake? They seem quite important, even though in DFDL they could
only be used for a single character (because we only allow single-node
return sequences).

I have a data format I am modeling where there is a sort of ad-hoc encoding
using 5 bits.

The codepoints are the values 0-31, assigned to the characters 0-7, A-H,
J-N, P-Z (that is, the digits 0-7 and the letters A-Z without I and O). The
best I can do without functions that create characters from codepoint
integers is a 32-leaf if-then-else tree in a dfdl:inputValueCalc expression.

This is ok, but it seems we should have a function to convert from a
character to its codepoint as an integer, and back.
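
To illustrate what such a function would buy, here is a minimal Python
sketch of the 5-bit mapping described above. Python's chr() and ord() stand
in for the requested XPath functions; the names ALPHABET, decode_5bit and
encode_5bit are hypothetical and exist only for this example. With a
codepoint-to-character function, the 32-leaf tree collapses to four range
checks plus arithmetic.

```python
# The 32 symbols in codepoint order: digits 0-7, then A-Z without I and O.
ALPHABET = "01234567ABCDEFGHJKLMNPQRSTUVWXYZ"


def decode_5bit(value: int) -> str:
    """Map a 5-bit value (0-31) to its character using range arithmetic."""
    if not 0 <= value <= 31:
        raise ValueError(f"not a 5-bit value: {value}")
    if value <= 7:                       # 0-7   -> '0'..'7'
        return chr(ord("0") + value)
    if value <= 15:                      # 8-15  -> 'A'..'H'
        return chr(ord("A") + value - 8)
    if value <= 20:                      # 16-20 -> 'J'..'N' (skips 'I')
        return chr(ord("J") + value - 16)
    return chr(ord("P") + value - 21)    # 21-31 -> 'P'..'Z' (skips 'O')


def encode_5bit(ch: str) -> int:
    """Inverse mapping; raises ValueError for characters outside the set."""
    return ALPHABET.index(ch)
```

The arithmetic form mirrors what a DFDL expression could do if the
codepoint functions were available; the lookup string is used only for the
inverse direction and as a cross-check.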


7.2 Functions to Assemble and Disassemble Strings

Function: fn:codepoints-to-string
<http://www.w3.org/TR/xpath-functions/#func-codepoints-to-string>
Meaning: Creates an xs:string from a sequence of Unicode code points.

Function: fn:string-to-codepoints
<http://www.w3.org/TR/xpath-functions/#func-string-to-codepoints>
Meaning: Returns the sequence of Unicode code points that constitute an
xs:string.

7.2.1 fn:codepoints-to-string
fn:codepoints-to-string($arg as xs:integer*) as xs:string

Summary: Creates an xs:string from a sequence of [The Unicode Standard]
<http://www.w3.org/TR/xpath-functions/#Unicode4> code points. Returns the
zero-length string if $arg is the empty sequence. If any of the code points
in $arg is not a legal XML character, an error is raised [err:FOCH0001
<http://www.w3.org/TR/xpath-functions/#ERRFOCH0001>].
7.2.1.1 Examples

   - fn:codepoints-to-string((2309, 2358, 2378, 2325)) returns "अशॊक"

7.2.2 fn:string-to-codepoints
fn:string-to-codepoints($arg as xs:string?) as xs:integer*

Summary: Returns the sequence of [The Unicode Standard]
<http://www.w3.org/TR/xpath-functions/#Unicode4> code points that
constitute an xs:string. If $arg is a zero-length string or the empty
sequence, the empty sequence is returned.
7.2.2.1 Examples

   - fn:string-to-codepoints("Thérèse") returns the sequence (84, 104, 233,
     114, 232, 115, 101)
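
For readers who want to experiment with the quoted semantics, the following
is a rough Python approximation, not an XPath implementation. The function
names are hypothetical, the empty sequence is modelled as None or an empty
list, and ValueError stands in for err:FOCH0001. The legal-character ranges
are those of the XML 1.0 Char production.

```python
def is_xml_char(cp: int) -> bool:
    """True if cp is a legal XML 1.0 character (the Char production)."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def codepoints_to_string(codepoints):
    """Approximates fn:codepoints-to-string; [] yields the empty string."""
    for cp in codepoints:
        if not is_xml_char(cp):
            # Stand-in for err:FOCH0001 in the quoted summary.
            raise ValueError(f"FOCH0001: {cp} is not a legal XML character")
    return "".join(chr(cp) for cp in codepoints)


def string_to_codepoints(text):
    """Approximates fn:string-to-codepoints; None or '' yields []."""
    if text is None:
        return []
    return [ord(ch) for ch in text]
```

Under the single-node restriction mentioned at the top of this message, a
DFDL version would presumably only ever see one-element inputs and outputs,
which corresponds to a single chr() or ord() call here.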



Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
<http://www.ogf.org/About/abt_policies.php>