[DFDL-WG] delimited binary data - clarifications
Steve Hanson
smh at uk.ibm.com
Wed Nov 8 12:35:47 EST 2017
1. Yes, binary data can have all of initiators, terminators, and separators.
2. There is no restriction on the encoding of such delimiters.
3. No, escape schemes apply only to text representation.
The dfdl:escapeSchemeRef property appears only in section 13.2, "Properties
Common to All Simple Types with Text representation".
When creating a DFDL schema that involves delimited binary data, you have
to be careful that your data can't contain any bytes that match any
in-scope delimiter.
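As a rough illustration of the risk, here is a minimal Python sketch (not taken from any DFDL implementation). The function name delimiter_collisions and the sample delimiters are hypothetical. It assumes IBM-style packed decimal, where the last nibble is the sign (0xC for positive), so +12 is stored as the bytes 0x01 0x2C, and 0x2C is also an ASCII comma.

```python
def delimiter_collisions(field: bytes, delimiters: list[str], encoding: str) -> list[tuple[str, int]]:
    """Return (delimiter, offset) for each delimiter whose encoded bytes occur in field."""
    hits = []
    for delim in delimiters:
        pattern = delim.encode(encoding)   # delimiter as it would appear in the data stream
        offset = field.find(pattern)       # first occurrence inside the binary field, or -1
        if offset != -1:
            hits.append((delim, offset))
    return hits


# Packed decimal +12: nibbles 0, 1, 2, C (sign) -> bytes 0x01 0x2C (assumed layout).
packed_plus_12 = bytes([0x01, 0x2C])
print(delimiter_collisions(packed_plus_12, [",", "|"], "ascii"))  # → [(',', 1)]
```

With a comma separator in scope, a scanner would stop inside this value, so a delimiter that cannot occur in the packed bytes is the safer choice.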
I believe that IBM DFDL's byte scanner converts in-scope delimiters into
the equivalent bytes using the dfdl:encoding of the object, then matches
the bytes.
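The approach described above can be sketched roughly as follows. This is only an illustration in Python under my own assumptions, not IBM DFDL's actual code: the function name scan_to_delimiter is hypothetical, and the tie-break rule (earliest match wins, then the longest delimiter) is a simplification chosen for the example.

```python
from typing import Optional


def scan_to_delimiter(data: bytes, delimiters: list[str], encoding: str) -> tuple[bytes, Optional[str], bytes]:
    """Split data at the first in-scope delimiter, matching on encoded bytes.

    Returns (content, matched delimiter or None, remaining bytes).
    """
    best = None  # (offset, negative pattern length, delimiter, pattern)
    for delim in delimiters:
        pattern = delim.encode(encoding)   # convert the delimiter to bytes first
        offset = data.find(pattern)        # then match on raw bytes
        if offset == -1:
            continue
        candidate = (offset, -len(pattern), delim, pattern)
        # Assumed tie-break: earliest offset wins, then the longest pattern.
        if best is None or candidate[:2] < best[:2]:
            best = candidate
    if best is None:
        return data, None, b""
    offset, _, delim, pattern = best
    return data[:offset], delim, data[offset + len(pattern):]


# Two packed values separated by an EBCDIC comma (0x6B in code page 037).
stream = bytes([0x12, 0x3C, 0x6B, 0x04, 0x5C])
content, delim, rest = scan_to_delimiter(stream, [","], "cp037")
print(content.hex(), delim, rest.hex())  # → 123c , 045c
```

Because the match is done on bytes, the same "," delimiter becomes 0x2C under ASCII and 0x6B under cp037, which is why the encoding of the object matters even for binary content.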
Regards
Steve Hanson
IBM Hybrid Integration, Hursley, UK
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
smh at uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890
From: Mike Beckerle <mbeckerle.dfdl at gmail.com>
To: "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>, Josh Adams
<jadams at tresys.com>
Date: 08/11/2017 13:49
Subject: [DFDL-WG] delimited binary data - clarifications
Sent by: "dfdl-wg" <dfdl-wg-bounces at ogf.org>
The Daffodil project is implementing various packed formats, and looking at
the TLOG schema on the DFDL Schemas site.
The DFDL spec is clear that lengthKind delimited is allowed for packed
formats (all variants of packed) and hexBinary.
My question is whether there is any restriction on the generality of this
that was intended, but not stated in the spec, where we should be issuing
a clarification.
E.g.,
1. Can binary data have all of initiators, terminators, and
separators?
2. Is there a restriction on the charset encoding used to specify
these, e.g., SBCS? Or do the byte patterns being used to scan for these
require conversion of the specified delimiter to bytes from any supported
encoding?
3. Do escape schemes apply to delimited binary?
If, in fact, all these things are allowed, then I believe we should add a
one-liner to section 12.3.2.2 specifying that all aspects of delimited
parsing, including the above, are specifically allowed.