[DFDL-WG] DFDL character entities in DFDL expressions

Mike Beckerle mbeckerle.dfdl at gmail.com
Wed Dec 5 09:13:28 EST 2012


Right. So I think we're in agreement that embedding something like DFDL
into the XML technologies like XML Schema, XPath, etc. should be done in a
manner that provides clarity when one is crossing the line from reusing an
XML technology directly, to something DFDL specific.

Using a standard DFDL function as a way to embed the DFDL-specific entities
notation into the expression strings sounds like a good way to do this.
Hence dfdl:string(...) as an additional string constructor function.
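
With such a function, the original example would read
fn:concat(../s, dfdl:string('%NUL;')). As a rough illustration of what the
function might do internally, here is a minimal Python sketch. It is not
spec text: the name dfdl_string, the small NAMED table (a subset of the
named entities), and the choice to reject the raw byte entity (following
Tim's point that the result is a sequence of characters) are all
assumptions made for illustration.

```python
import re

# Assumed subset of DFDL named character entities, for illustration only.
NAMED = {"NUL": "\x00", "SP": " ", "HT": "\t", "LF": "\n", "CR": "\r"}

# Alternatives, in order: %% escape, %NAME;, %#xHHHH;, %#rHH;, %#DDD;
ENTITY = re.compile(
    r"%(?:(%)|([A-Z0-9]+);|#x([0-9A-Fa-f]+);|#r([0-9A-Fa-f]{2});|#([0-9]+);)"
)


def dfdl_string(literal: str) -> str:
    """Hypothetical dfdl:string(): expand DFDL entities into characters."""

    def expand(match: re.Match) -> str:
        escaped, name, hex_digits, raw_byte, dec_digits = match.groups()
        if escaped:
            return "%"  # %% stands for a literal percent sign
        if name is not None:
            if name not in NAMED:
                raise ValueError(f"unknown entity %{name};")
            return NAMED[name]
        if hex_digits is not None:
            return chr(int(hex_digits, 16))
        if dec_digits is not None:
            return chr(int(dec_digits))
        # Only the raw byte form is left: it names a byte, not a character.
        raise ValueError(f"raw byte entity %#r{raw_byte}; not allowed here")

    return ENTITY.sub(expand, literal)
```

Calling dfdl_string('abc%NUL;') yields a four-character string ending in
U+0000, which is the character a plain XML string literal cannot carry.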

On Wed, Dec 5, 2012 at 8:28 AM, Tim Kimber <KIMBERT at uk.ibm.com> wrote:

> The key point for me is this: what happens when a DFDL expression (which
> looks exactly like an XPath 2.0 expression) gets lifted out of the DFDL
> xsd and used by a non-DFDL XPath processor?  If we allow the DFDL entities
> to be used like XML entities then the expression will appear to be a valid
> XPath expression, but it will fail in some unpredictable way. On the other
> hand, if DFDL entities can only be used in conjunction with a DFDL function
> then (presumably) the non-DFDL XPath engine will report that an unknown
> extension function is being used.
>
> regards,
>
> Tim Kimber, DFDL Team,
> Hursley, UK
> Internet:  kimbert at uk.ibm.com
> Tel. 01962-816742
> Internal tel. 37246742
>
>
>
>
> From:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
> To:        Andrew Coleman/UK/IBM at IBMGB,
> Cc:        Steve Hanson/UK/IBM at IBMGB, Tim Kimber/UK/IBM at IBMGB,
> dfdl-wg at ogf.org
> Date:        05/12/2012 13:17
> Subject:        Re: [DFDL-WG] DFDL character entities in DFDL expressions
> ------------------------------
>
>
>
> Well, yes, I think we're discussing exactly how that should work.
>
> No matter what, we do need to be clear that string values created in
> DFDL's expression language can contain these XML-illegal characters, since
> they are allowed in DFDL's infoset. This means that DFDL implementations
> can only re-use an existing XPath implementation to create their DFDL
> expression language implementation to the extent that it does NOT enforce
> the XML-illegal characters restrictions all over the place.
>
> I am currently working with standard Saxon-B XPath and will report back.
>
> But let's be optimistic. The question then is just what is the solution to
> creating a string-literal including these characters. That cannot be done
> without some beyond-XML mechanism. DFDL has a string-literal notation for
> expressing these characters, so we either say that string literals in the
> expression language can use the DFDL character and numeric entities, or we
> can do something more 'library like', and provide a function which
> interprets the string-literal notation, and isolate the implementation
> concerns a bit.
>
> As a language embedded in XML schema, we already straddle the fence of two
> somewhat inconsistent language environments.
>
> E.g., the literals one can use as the value of the default attribute on an
> element declaration cannot use DFDL character entities, as this is a purely
> XML Schema construct.
>
> Similarly, the regular expressions one can use for the XML schema pattern
> facet are more restrictive than the DFDL regular expressions one can use in
> a dfdl:assert, or a dfdl:lengthKind='pattern'.
>
> So, it's acceptable to me to say that expressions also have some split
> where the DFDL-specific aspects, like the DFDL character and numeric
> entities notation, are isolated in a sub-construct.
>
>
>
> On Wed, Dec 5, 2012 at 6:37 AM, Andrew Coleman <andrew_coleman at uk.ibm.com> wrote:
>
> No, casting a hexBinary to a string will just write out the octets - i.e.
> the string will be '00'.
>
> XPath itself has no mechanism for interpreting entity references or
> character references.  Its hosting language (XQuery or XSLT/XML) provides
> this.  Since DFDL is XML, wouldn't that provide a mechanism?
>
> Regards,
> - Andy
>
> __________________________________________
> Andrew Coleman
> WebSphere Message Broker Development
> IBM Hursley Park
>
>
>
>
> From:        Steve Hanson/UK/IBM
> To:        Tim Kimber/UK/IBM at IBMGB,
> Cc:        dfdl-wg at ogf.org, dfdl-wg-bounces at ogf.org,
> Mike Beckerle <mbeckerle.dfdl at gmail.com>,
> Andrew Coleman/UK/IBM at IBMGB
> Date:        05/12/2012 11:06
> Subject:        Re: [DFDL-WG] DFDL character entities in DFDL expressions
>  ------------------------------
>
>
>
> Aren't XPath facilities sufficient here?
>
> outputValueCalc="{   if (fn:string-length(../s) lt 64) then
> fn:concat(../s, xs:string(xs:hexBinary('00'))) else ../s   }"
>
> Regards
>
> Steve Hanson
> Architect, Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel: +44-1962-815848
>
>
>
>
> From:        Tim Kimber/UK/IBM at IBMGB
> To:        Mike Beckerle <mbeckerle.dfdl at gmail.com>,
> Cc:        dfdl-wg at ogf.org, dfdl-wg-bounces at ogf.org
> Date:        05/12/2012 10:51
> Subject:        Re: [DFDL-WG] DFDL character entities in DFDL expressions
> Sent by:        dfdl-wg-bounces at ogf.org
>  ------------------------------
>
>
>
> I think the restriction was aimed at avoiding things like this:
>
> outputValueCalc="{   if (fn:string-length(../s) lt 64) then
> fn:concat(../s, '%#rFF;') else ../s   }"
>
> I agree that a total ban is too restrictive. My personal preference would
> be for the dfdl:string() function because it makes the usage of
> DFDL-specific features obvious in the DFDL expression. But what would be
> the return type of dfdl:string()? If it returned a sequence of characters
> then the raw byte entity (%#rnn;) would still need to be disallowed.
>
> regards,
>
> Tim Kimber, DFDL Team,
> Hursley, UK
> Internet:  kimbert at uk.ibm.com
> Tel. 01962-816742
> Internal tel. 37246742
>
>
>
>
> From:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
> To:        dfdl-wg at ogf.org,
> Date:        04/12/2012 23:36
> Subject:        [DFDL-WG] DFDL character entities in DFDL expressions
> Sent by:        dfdl-wg-bounces at ogf.org
>  ------------------------------
>
>
>
>
> We currently have this language in the spec:
>
> "Within an expression, a string is never interpreted as a DFDL string
> literal."
>
> To me this means one cannot use DFDL character entities in an expression.
>
> However, I need to do this:
>
>         outputValueCalc="{   if (fn:string-length(../s) lt 64) then
> fn:concat(../s, '%NUL;') else ../s   }"
>
> Basically, I need to append a NUL on the end of the string in the output
> value case.
>
> Unless I can put a %NUL; into an expression and have it interpreted as a
> DFDL String literal,  I am not sure how I can achieve this.
>
> At minimum I need a new DFDL function which might be an alternate string
> constructor, such as dfdl:string('....'), which scans its argument for
> DFDL character entities and substitutes them, so that the resulting
> string can contain characters that are disallowed in XML (like NUL).
>
> --
> Mike Beckerle | OGF DFDL WG Co-Chair | Tresys Technologies
> Tel:  781-330-0412
>
> --
> dfdl-wg mailing list
> dfdl-wg at ogf.org
> https://www.ogf.org/mailman/listinfo/dfdl-wg
>
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>
>
>
> --
> Mike Beckerle | OGF DFDL WG Co-Chair | Tresys Technologies
> Tel:  781-330-0412
>
>
>
>



-- 
Mike Beckerle | OGF DFDL WG Co-Chair | Tresys Technologies
Tel:  781-330-0412