[DFDL-WG] DFDL character entities in DFDL expressions

Mike Beckerle mbeckerle.dfdl at gmail.com
Wed Dec 5 08:17:48 EST 2012


Well, yes, I think we're discussing exactly how that should work.

No matter what, we do need to be clear that string values created in DFDL's
expression language can contain these XML-illegal characters, since they
are allowed in DFDL's infoset. This means a DFDL implementation can re-use
an existing XPath implementation for its expression language only to the
extent that the XPath implementation does NOT enforce the XML character
restrictions throughout.
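
To make "XML-illegal" concrete, here is a small Python sketch using the
standard library's XML parser. It is only an illustration of the XML 1.0
character restriction, not part of any DFDL implementation, and the helper
name xml_accepts is made up for this example.

```python
import xml.etree.ElementTree as ET


def xml_accepts(char_ref: str) -> bool:
    """Return True if the XML parser accepts this character reference as element content."""
    try:
        ET.fromstring(f"<s>{char_ref}</s>")
        return True
    except ET.ParseError:
        # The parser rejects references to characters outside the XML Char production.
        return False


print(xml_accepts("&#x41;"))  # an ordinary character
print(xml_accepts("&#x0;"))   # NUL, which XML 1.0 does not allow
```

So a NUL cannot be written into the schema document itself, even as a
numeric character reference, which is why the XML layer alone cannot supply
these characters to an expression.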

I am currently working with standard Saxon-B XPath and will report back.

But let's be optimistic. The question then is simply how to create a string
literal that includes these characters. That cannot be done without some
beyond-XML mechanism. DFDL has a string-literal notation for expressing
these characters, so we can either say that string literals in the
expression language may use the DFDL character and numeric entities, or do
something more 'library like': provide a function which interprets the
string-literal notation, which isolates the implementation concerns a bit.
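
As a rough illustration of the 'library like' option, the following Python
sketch substitutes DFDL entities in a string. Everything here is an
assumption for the example rather than spec text: the name dfdl_string, the
NAMED table (a small subset of the named entities), and the choice to reject
the raw byte entity because the result is a sequence of characters.

```python
import re

# Illustrative subset of the DFDL named character entities (assumption: not the full table).
NAMED = {"NUL": "\x00", "HT": "\t", "LF": "\n", "CR": "\r", "SP": " "}

# Alternatives: %% escape, %#xHH; hex, %#rHH; raw byte, %#DDD; decimal, %NAME; named.
_ENTITY = re.compile(
    r"%(?:(%)|#x([0-9A-Fa-f]+);|#r([0-9A-Fa-f]{2});|#([0-9]+);|([A-Z][A-Z0-9]*);)"
)


def dfdl_string(literal: str) -> str:
    """Replace DFDL character entities in `literal` with the characters they denote."""

    def substitute(match: re.Match) -> str:
        escaped, hex_digits, raw_byte, dec_digits, name = match.groups()
        if escaped:
            return "%"
        if hex_digits:
            return chr(int(hex_digits, 16))
        if raw_byte:
            # A raw byte is not a character, so it cannot appear in a character string.
            raise ValueError(f"raw byte entity not allowed: {match.group(0)}")
        if dec_digits:
            return chr(int(dec_digits))
        if name not in NAMED:
            raise ValueError(f"unknown entity: {match.group(0)}")
        return NAMED[name]

    # Text that does not match an entity pattern is left unchanged.
    return _ENTITY.sub(substitute, literal)


print(repr(dfdl_string("abc%NUL;")))
```

With a function like this, the expression engine itself never has to parse
the entity notation; only the one function does.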

As a language embedded in XML schema, we already straddle the fence of two
somewhat inconsistent language environments.

E.g., the literals one can use as the value of the default attribute on an
element declaration cannot use DFDL character entities, as this is a purely
XML Schema construct.

Similarly, the regular expressions one can use for the XML schema pattern
facet are more restrictive than the DFDL regular expressions one can use in
a dfdl:assert, or a dfdl:lengthKind='pattern'.

So, it's acceptable to me to say that expressions also have some split
where the DFDL-specific aspects, like the DFDL character and numeric
entities notation, are isolated in a sub-construct.



On Wed, Dec 5, 2012 at 6:37 AM, Andrew Coleman <andrew_coleman at uk.ibm.com> wrote:

> No, casting a hexBinary to a string will just write out the octets - i.e.
> the string will be '00'.
>
> XPath itself has no mechanism for interpreting entity references or
> character references.  Its hosting language (XQuery or XSLT/XML) provides
> this.  Since DFDL is XML, wouldn't that provide a mechanism?
>
> Regards,
> - Andy
>
> __________________________________________
> Andrew Coleman
> WebSphere Message Broker Development
> IBM Hursley Park
>
>
>
>
> From:        Steve Hanson/UK/IBM
> To:        Tim Kimber/UK/IBM at IBMGB,
> Cc:        dfdl-wg at ogf.org, dfdl-wg-bounces at ogf.org, Mike Beckerle <
> mbeckerle.dfdl at gmail.com>, Andrew Coleman/UK/IBM at IBMGB
> Date:        05/12/2012 11:06
> Subject:        Re: [DFDL-WG] DFDL character entities in DFDL expressions
> ------------------------------
>
>
> Aren't XPath facilities sufficient here?
>
> outputValueCalc="{   if (fn:string-length(../s) lt 64) then
> fn:concat(../s, xs:string(xs:hexBinary('00'))) else ../s   }"
>
> Regards
>
> Steve Hanson
> Architect, Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel:+44-1962-815848
>
>
>
>
> From:        Tim Kimber/UK/IBM at IBMGB
> To:        Mike Beckerle <mbeckerle.dfdl at gmail.com>,
> Cc:        dfdl-wg at ogf.org, dfdl-wg-bounces at ogf.org
> Date:        05/12/2012 10:51
> Subject:        Re: [DFDL-WG] DFDL character entities in DFDL expressions
> Sent by:        dfdl-wg-bounces at ogf.org
> ------------------------------
>
>
>
> I think the restriction was aimed at avoiding things like this:
>
> outputValueCalc="{   if (fn:string-length(../s) lt 64) then
> fn:concat(../s, '%#rFF;') else ../s   }"
>
> I agree that a total ban is too restrictive. My personal preference would
> be for the dfdl:string() function because it makes the usage of
> DFDL-specific features obvious in the DFDL expression. But what would be
> the return type of dfdl:string()? If it returned a sequence of characters
> then the raw byte entity ( %#rnn; ) would still need to be disallowed.
> regards,
>
> Tim Kimber, DFDL Team,
> Hursley, UK
> Internet:  kimbert at uk.ibm.com
> Tel. 01962-816742
> Internal tel. 37246742
>
>
>
>
> From:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
> To:        dfdl-wg at ogf.org,
> Date:        04/12/2012 23:36
> Subject:        [DFDL-WG] DFDL character entities in DFDL expressions
> Sent by:        dfdl-wg-bounces at ogf.org
>  ------------------------------
>
>
>
>
> We currently have this language in the spec:
>
> "Within an expression, a string is never interpreted as a DFDL string
> literal."
>
> To me this means one cannot use DFDL character entities in an expression.
>
> However, I need to do this:
>
>         outputValueCalc="{   if (fn:string-length(../s) lt 64) then
> fn:concat(../s, '%NUL;') else ../s   }"
>
> Basically, I need to append a NUL on the end of the string in the output
> value case.
>
> Unless I can put a %NUL; into an expression and have it interpreted as a
> DFDL String literal,  I am not sure how I can achieve this.
>
> At minimum I need a new DFDL function, perhaps an alternate string
> constructor such as dfdl:string('....'), which scans its argument for
> DFDL character entities and substitutes them, so that the resulting
> string can contain characters that are disallowed in XML (like NUL).
>
> --
> Mike Beckerle | OGF DFDL WG Co-Chair | Tresys Technologies
> Tel:  781-330-0412
>
> --
> dfdl-wg mailing list
> dfdl-wg at ogf.org
> https://www.ogf.org/mailman/listinfo/dfdl-wg
>
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU



-- 
Mike Beckerle | OGF DFDL WG Co-Chair | Tresys Technologies
Tel:  781-330-0412