[DFDL-WG] Fw: Spec question about extraEscapedCharacters
Steve Hanson
smh at uk.ibm.com
Tue May 25 03:58:24 EDT 2021
Mike
Fyi, from 2009.
Regards
Steve Hanson
IBM Hybrid Integration, Hursley, UK
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
smh at uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890
Note: I work Tuesday to Friday
----- Forwarded by Steve Hanson/UK/IBM on 25/05/2021 08:56 -----
From: Steve Hanson/UK/IBM
To: Alan Powell/UK/IBM at IBMGB
Cc: Dragan Besevic/Boca Raton/IBM at IBMUS, Stephanie
Fetzer/Charlotte/IBM at IBMUS
Date: 04/11/2009 11:14
Subject: Re: Spec question about extraEscapedCharacters
Alan
On parsing, MRM just looks for the escape character (like DFDL). On
writing, MRM needs to know what to escape and the reserved characters list
is used for that (markup is not automatically included, as you say).
So why does DFDL need the extraEscapedCharacters property? Let's say I
serialise an infoset corresponding to a structure. That will escape any
in-scope markup characters in the data. But if I then include that
serialised data as a BLOB in some envelope structure that also has
terminating markup, I won't have escaped any of the envelope's markup
characters in my BLOB. To do that I need the extraEscapedCharacters property.
I think the spec needs clarifying to say that extraEscapedCharacters is an
output-only control.
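
A minimal Python sketch of the envelope scenario above, for illustration only. The function names, the '\' escape character, the '#' separator and the '^' envelope terminator are all assumptions made for the example, and only single-character markup is handled. Escaping the escape character itself is a simplification here, not a statement of what DFDL or MRM does.

```python
def escape_on_output(text, escape_char, markup, extra_escaped=""):
    """Prefix every character that needs protecting with escape_char.

    markup        -- single-character delimiters in scope for this element
    extra_escaped -- additional characters to escape (the role played by
                     extraEscapedCharacters in this sketch)
    """
    # Simplification: the escape character itself is escaped the same way.
    to_escape = set(markup) | set(extra_escaped) | {escape_char}
    return "".join(escape_char + ch if ch in to_escape else ch for ch in text)


def unescape_on_parse(text, escape_char):
    """Parsing only looks for the escape character and drops it."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == escape_char:
            out.append(next(chars, ""))  # keep the escaped character literally
        else:
            out.append(ch)
    return "".join(out)


# Inner structure: fields separated by '#', escape character '\'.
fields = ["NAME", "ADDR#1", "PHONE^NUMBER"]

# Only in-scope markup ('#') is escaped; the '^' is written out bare.
inner = "#".join(escape_on_output(f, "\\", markup="#") for f in fields)

# Hypothetical envelope that terminates the BLOB with '^'. Listing '^' as an
# extra escaped character protects it when the inner data is serialised.
inner_for_envelope = "#".join(
    escape_on_output(f, "\\", markup="#", extra_escaped="^") for f in fields
)

# The parse side needs no list of characters: it just removes the escapes.
roundtrip = unescape_on_parse("PHONE\\^NUMBER", "\\")
```

Here `inner` still contains a bare '^', which the hypothetical envelope would take as its terminator, while `inner_for_envelope` carries it as '\^'. The parse side is unchanged in both cases, which is the sense in which the property only matters on output.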
Regards
Steve Hanson
Programming Model Architect, WebSphere Message Brokers,
OGF DFDL WG Co-Chair,
Hursley, UK,
Internet: smh at uk.ibm.com,
Phone (+44)/(0) 1962-815848
From: Alan Powell/UK/IBM
To: Stephanie Fetzer/Charlotte/IBM at IBMUS, Steve Hanson/UK/IBM at IBMGB
Cc: Dragan Besevic/Boca Raton/IBM at IBMUS
Date: 03/11/2009 10:48
Subject: Re: Spec question about extraEscapedCharacters
Stephanie
extraEscapedCharacters comes from the WMB Reserved Characters property. It
isn't exactly the same, as WMB doesn't automatically look for markup.
Also, the documentation says that Reserved Characters aren't used on
parsing, which doesn't seem consistent. Steve, do you know why?
As currently specified, your example wouldn't be a parsing error, but it may
cause the data to be interpreted incorrectly if ^ is in fact markup.
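
As a rough illustration of that last point, here is a small Python sketch using the example string from Stephanie's message, "NAME#ADDRESS#PHONENUMB^ER". The `split_fields` helper, the '\' escape character and the choice of '^' as a separator in the second call are assumptions for the example, not anything defined by DFDL or WMB.

```python
def split_fields(data, separator, escape_char):
    """Split data on separator, honouring escape_char; nothing else is checked."""
    fields, current = [], []
    chars = iter(data)
    for ch in chars:
        if ch == escape_char:
            current.append(next(chars, ""))  # escaped character is plain data
        elif ch == separator:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


data = "NAME#ADDRESS#PHONENUMB^ER"

# '^' is not markup: it is accepted as ordinary data and no error is raised.
as_data = split_fields(data, separator="#", escape_char="\\")

# If '^' were in fact markup (here, hypothetically, the separator), the same
# string still parses without error but yields a different, unintended result.
as_markup = split_fields(data, separator="^", escape_char="\\")
```

In the first call the unescaped '^' simply ends up inside the third field. In the second call the parser cannot tell that the '^' was meant as data, so the string is split in the wrong place.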
Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM email: alan_powell at uk.ibm.com
Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898
From: Stephanie Fetzer/Charlotte/IBM at IBMUS
To: Alan Powell/UK/IBM at IBMGB
Cc: Dragan Besevic/Boca Raton/IBM at IBMUS
Date: 02/11/2009 21:05
Subject: Spec question about extraEscapedCharacters
Alan:
Real quick question about extraEscapedCharacters. The current definition
of the term is:
String
A space separated list of single characters that must be escaped in
addition to markup.
Annotation: dfdl:escapeScheme
So - my interpretation is that if someone wants to enforce... even though
the ^ is not an initiator, terminator, delimiter, or any other type of
markup in this document - I want to force the sender of the data to always
escape it. So
extraEscapedCharacters="^"
and the data to be parsed looks like: "NAME#ADDRESS#PHONENUMB^ER"
That would be an error (unless the B is an escape char). We would expect
the ^ to always be escaped in the data.
So even though we don't need it to be escaped from a parsing standpoint,
would we expect it to be a parsing error?
Or is the property merely a suggestion on parsing?
Any hints that you can provide on the reasoning for this property would be
appreciated.
Cheers,
-Steph
WebSphere Transformation Extender
Industry Packs - Software Engineer