[DFDL-WG] how to trim inside of escape block?

Steve Hanson smh at uk.ibm.com
Wed Nov 22 04:11:25 EST 2017


I don't think there is a way to achieve what you want. As you say, 
trimming pad characters takes precedence over applying the escape scheme.

I wondered if you could define escapeBlockStart as "%WSP*; and 
escapeBlockEnd as %WSP*;" but the whitespace entities are not allowed in 
the escape character or in the escape block start/end.

Regards
 
Steve Hanson
IBM Hybrid Integration, Hursley, UK
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
smh at uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890



From:   Mike Beckerle <mbeckerle.dfdl at gmail.com>
To:     "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>
Date:   22/11/2017 01:28
Subject:        [DFDL-WG] how to trim inside of escape block?
Sent by:        "dfdl-wg" <dfdl-wg-bounces at ogf.org>




I have a CSV file.

Some lines look like this:

a,b,"   started with spaces, appearing right after the escape block start   ",c,d,e

I reviewed the spec, and I see that pad characters appear outside of the 
quotation marks (escape block start/end).

What I'm trying to do is remove the whitespace after the escape block 
start and before the escape block end. This is just spurious whitespace, 
which appears because some of these CSV files were edited by hand.

In my data the quoting characters are not always present. They are only 
there if a comma appears in the data string.

Is there a technique for getting rid of the leading/trailing whitespace 
inside the escape block start/end that I have forgotten?
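
To make the desired result concrete, here is a minimal Python sketch. It 
is not DFDL and is not taken from the spec; it only illustrates the 
outcome being asked for. It assumes standard CSV quoting with double 
quotes, and parse_and_trim is a hypothetical helper name. Note that it 
strips whitespace from every field, quoted or not, after the quotes have 
been removed.

```python
import csv
import io


def parse_and_trim(line):
    """Parse one CSV line, then strip leading/trailing whitespace from each field.

    Hypothetical helper for illustration only. The csv module removes the
    surrounding quotes first, so strip() then removes the whitespace that
    was inside the quotes.
    """
    reader = csv.reader(io.StringIO(line))
    return [field.strip() for field in next(reader)]


sample = 'a,b,"   started with spaces, appearing right after the escape block start   ",c,d,e'
fields = parse_and_trim(sample)
print(fields)
```

The comma inside the quoted field is kept as data, and only the spaces 
next to the quotes are dropped.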

...mikeb

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | 
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are 
subject to the OGF Intellectual Property Policy
--
  dfdl-wg mailing list
  dfdl-wg at ogf.org
  
  https://www.ogf.org/mailman/listinfo/dfdl-wg
