[DFDL-WG] Fw: Spec question about extraEscapedCharacters

Steve Hanson smh at uk.ibm.com
Tue May 25 03:58:24 EDT 2021


Mike

Fyi, from 2009.

Regards
 
Steve Hanson
IBM Hybrid Integration, Hursley, UK
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
smh at uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890
Note: I work Tuesday to Friday 
----- Forwarded by Steve Hanson/UK/IBM on 25/05/2021 08:56 -----

From:   Steve Hanson/UK/IBM
To:     Alan Powell/UK/IBM at IBMGB
Cc:     Dragan Besevic/Boca Raton/IBM at IBMUS, Stephanie 
Fetzer/Charlotte/IBM at IBMUS
Date:   04/11/2009 11:14
Subject:        Re: Spec question about extraEscapedCharacters


Alan

On parsing, MRM just looks for the escape character (like DFDL). On 
writing, MRM needs to know what to escape and the reserved characters list 
is used for that (markup is not automatically included, as you say).

So why does DFDL need the extraEscapedCharacters property?  Let's say I 
serialise an infoset corresponding to a structure. That will escape any 
in-scope markup characters in the data. But if I then include that 
serialised data as a BLOB in some envelope structure that also has 
terminating markup, I won't have escaped any of the envelope's markup 
characters in my BLOB.  To do that I need the extraEscapedCharacters 
property.
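
A minimal Python sketch of the envelope scenario described above. It is 
only an illustration, not how MRM or any DFDL processor is implemented: 
the backslash escape character, the "#" separator, the "^" envelope 
terminator, and the helper names escape, serialise and find_terminator 
are all assumptions made for the example.

```python
ESC = "\\"  # hypothetical escape character; the thread does not name one

def escape(text, chars, esc=ESC):
    """Prefix esc itself and every character in chars with esc."""
    return "".join(esc + c if c == esc or c in chars else c for c in text)

def serialise(fields, separator, extra_escaped=""):
    """Join fields, escaping the in-scope separator plus any extra characters."""
    to_escape = set(separator) | set(extra_escaped)
    return separator.join(escape(f, to_escape) for f in fields)

def find_terminator(data, terminator, esc=ESC):
    """Index of the first unescaped terminator, or -1 if none is found."""
    i = 0
    while i < len(data):
        if data[i] == esc:
            i += 2  # skip the escape character and the character it protects
        elif data[i] == terminator:
            return i
        else:
            i += 1
    return -1

fields = ["NAME", "ADDRESS", "PHONE^HOME"]

plain = serialise(fields, "#")                    # only "#" is in scope
safe = serialise(fields, "#", extra_escaped="^")  # also escape envelope markup

# The (assumed) envelope terminates the BLOB with "^".
print(find_terminator(plain + "^", "^") == len(plain))  # False: BLOB cut short
print(find_terminator(safe + "^", "^") == len(safe))    # True: BLOB intact
```

Without the extra character, the "^" inside the phone field is taken as 
the envelope terminator; with it, the terminator is found only at the 
true end of the BLOB.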


I think the spec needs clarifying to say that extraEscapedCharacters is an 
output-only control.

Regards

Steve Hanson
Programming Model Architect, WebSphere Message  Brokers,
OGF DFDL WG Co-Chair,
Hursley, UK,
Internet: smh at uk.ibm.com,
Phone (+44)/(0) 1962-815848



From:
Alan Powell/UK/IBM
To:
Stephanie Fetzer/Charlotte/IBM at IBMUS, Steve Hanson/UK/IBM at IBMGB
Cc:
Dragan Besevic/Boca Raton/IBM at IBMUS
Date:
03/11/2009 10:48
Subject:
Re: Spec question about extraEscapedCharacters


Stephanie

extraEscapedCharacters comes from the WMB Reserved Characters property. It 
isn't exactly the same, as WMB doesn't automatically look for markup.
Also, the documentation says that Reserved Characters aren't used on 
parsing, which doesn't seem consistent. Steve, do you know why?

As currently specified, your example wouldn't be a parsing error, but it 
may cause the data to be interpreted incorrectly if ^ is in fact markup. 



Alan Powell

 MP 211, IBM UK Labs, Hursley,  Winchester, SO21 2JN, England
 Notes Id: Alan Powell/UK/IBM     email: alan_powell at uk.ibm.com 
 Tel: +44 (0)1962 815073                  Fax: +44 (0)1962 816898





From:
Stephanie Fetzer/Charlotte/IBM at IBMUS
To:
Alan Powell/UK/IBM at IBMGB
Cc:
Dragan Besevic/Boca Raton/IBM at IBMUS
Date:
02/11/2009 21:05
Subject:
Spec question about extraEscapedCharacters


Alan:

Real quick question about extraEscapedCharacters.  The current definition 
of the term is:

String
A space separated list of single characters that must be escaped in 
addition to markup.
Annotation: dfdl:escapeScheme

So my interpretation is that someone may want to enforce escaping: even 
though the ^ is not an initiator, terminator, delimiter, or any other type 
of markup in this document, I want to force the sender of the data to 
always escape it.  So 

extraEscapedCharacters="^"
and the data to be parsed looks like:   "NAME#ADDRESS#PHONENUMB^ER"
That would be an error (unless the B is an escape char). We would expect 
the ^ to always be escaped in the data.
So even though we don't need it to be escaped from a parsing standpoint we 
would expect it to be a parsing error?
Or is the property merely a suggestion on parsing?
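
A minimal Python sketch of the parsing side of this question, assuming 
the behaviour given in the replies earlier in this thread: the parser 
looks only for the escape character and the in-scope markup, and does 
not consult extraEscapedCharacters. The split_fields helper and the 
backslash escape character are hypothetical, chosen for illustration.

```python
def split_fields(data, separator, esc):
    """Split on unescaped separators; only esc and separator matter here."""
    fields, current, i = [], [], 0
    while i < len(data):
        ch = data[i]
        if ch == esc and i + 1 < len(data):
            current.append(data[i + 1])  # keep the escaped character literally
            i += 2
        elif ch == separator:
            fields.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))
    return fields

# extraEscapedCharacters="^" is deliberately not an input to the parser.
print(split_fields("NAME#ADDRESS#PHONENUMB^ER", "#", esc="\\"))
# -> ['NAME', 'ADDRESS', 'PHONENUMB^ER']

# If "B" were the escape character, the "^" would count as escaped.
print(split_fields("NAME#ADDRESS#PHONENUMB^ER", "#", esc="B"))
# -> ['NAME', 'ADDRESS', 'PHONENUM^ER']
```

Under that assumption the unescaped ^ is ordinary data and no parsing 
error is raised.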

Any hints that you can provide on the reasoning for this property would be 
appreciated.

Cheers,
-Steph

WebSphere Transformation Extender
Industry Packs - Software Engineer





Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU