[DFDL-WG] Ignore extraneous CRLF w/ space?

Garriss Jr., James P. jgarriss at mitre.org
Wed Jun 5 11:24:47 EDT 2013


> Is the problem that the dfdl:terminator '%CR;%LF;' for the end of the header record is firing prematurely when it encounters the CRLF in the data?

Exactly.

> I would model the data as unbounded repeating records, and use a discriminator to distinguish the repeats from the next header.

Uh, could you repeat that in English?  Maybe with a small example?  I freely admit that I don’t understand what you just said.  Thanks!

From: Steve Hanson [mailto:smh at uk.ibm.com]
Sent: Wednesday, June 05, 2013 5:21 AM
To: Garriss Jr., James P.
Cc: dfdl-wg at ogf.org; dfdl-wg-bounces at ogf.org
Subject: Re: [DFDL-WG] Ignore extraneous CRLF w/ space?

James

Is the problem that the dfdl:terminator '%CR;%LF;' for the end of the header record is firing prematurely when it encounters the CRLF in the data?

If so then I'm not sure that DFDL can ignore the extra %CR;%LF; without using an escape scheme - but there isn't an escape scheme to use.

I would model the data as unbounded repeating records, and use a discriminator to distinguish the repeats from the next header.
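
As a rough illustration of that idea outside DFDL (a Python sketch only, not a DFDL schema and not checked against any DFDL processor): each header is treated as one first line plus an unbounded run of continuation lines, and a test on the first character of each line decides whether the line is another repeat or the start of the next header. The function name `group_header_records` and the sample addresses are hypothetical. In DFDL terms the test would presumably be expressed as a dfdl:discriminator on the repeating element.

```python
from typing import List


def group_header_records(lines: List[str]) -> List[List[str]]:
    """Group physical lines into headers: a first line plus continuation lines."""
    headers: List[List[str]] = []
    for line in lines:
        # The "discriminator": a line that starts with a space or tab is a
        # repeat (continuation) of the current header; anything else is
        # taken to be the start of the next header.
        if line[:1] in (" ", "\t") and headers:
            headers[-1].append(line)
        else:
            headers.append([line])
    return headers


# Hypothetical sample data, already split on CRLF.
lines = [
    "Received: from a.example (a.example [192.0.2.1])",
    " by b.example with SMTP id 1234; Tue,",
    "  4 Jun 2013 14:03:24 -0400 (EDT)",
    "Subject: test",
]
records = group_header_records(lines)
```

Here `records` holds two headers: the Received header with its three physical lines, and the single-line Subject header.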

Regards

Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848



From:        "Garriss Jr., James P." <jgarriss at mitre.org>
To:        "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>,
Date:        04/06/2013 19:56
Subject:        [DFDL-WG] Ignore extraneous CRLF w/ space?
Sent by:        dfdl-wg-bounces at ogf.org
________________________________



Long IMF headers, such as Received, can be wrapped onto the next line by using a CRLF and then a space.  This example has 3 such wrappings:

Received: from smtpksrv1.mitre.org (localhost.localdomain [127.0.0.1])
 by localhost (Postfix) via Exchange Front-End Server webmail.afmc.af.mil
 ([131.28.34.85]) with SMTP id 0A8791F116E for <jgarriss at mitre.org>; Tue,
  4 Jun 2013 14:03:24 -0400 (EDT)

How do I get DFDL to ignore these wrappings?  For most of the header, it’s not an issue, because I can use a lengthPattern to look ahead to the ; before the date starts.  But once the date starts, I have no way of knowing where it ends, so I need to ignore any CRLF that is followed by a space.
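
For reference, the unfolding being asked about can be sketched outside DFDL as follows. This is a minimal Python illustration, assuming the data uses CRLF line ends and that a wrapped line always begins with a space or tab; it shows the intended result, not a DFDL solution. The helper name `unfold` and the trailing Subject header are made up for the example.

```python
import re

# A CRLF counts as a wrapping only when the next character is a space or tab.
# The lookahead leaves that whitespace in place; only the CRLF is removed.
FOLD = re.compile(r"\r\n(?=[ \t])")


def unfold(header_block: str) -> str:
    """Remove every CRLF that is immediately followed by a space or tab."""
    return FOLD.sub("", header_block)


raw = (
    "Received: from smtpksrv1.mitre.org (localhost.localdomain [127.0.0.1])\r\n"
    " by localhost (Postfix) via Exchange Front-End Server webmail.afmc.af.mil\r\n"
    " ([131.28.34.85]) with SMTP id 0A8791F116E for <jgarriss at mitre.org>; Tue,\r\n"
    "  4 Jun 2013 14:03:24 -0400 (EDT)\r\n"
    "Subject: test\r\n"  # hypothetical next header, added for the example
)

unfolded = unfold(raw)
header_lines = unfolded.split("\r\n")
```

After unfolding, the only CRLFs left are the real header terminators, so the Received header is a single line ending with the date.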

TIA

