[DFDL-WG] Unparsing - Pad to "big enough length to fit..." - can it be achieved in DFDL?
Steve Hanson
smh at uk.ibm.com
Fri Nov 16 11:03:29 EST 2012
Mike
A while back we took the following erratum (in my under-construction v011
doc), which effectively treats lengthKind 'explicit' with an expression as
variable-length data, the same as 'prefixed'. In other words, the expression
would not be used when unparsing. I think that prevents your scenario from
working even with pre-processed lengths.
2.100. Section 12.3.1. State that when unparsing an element with
lengthKind ‘explicit’ and where length is an expression, then the data in
the Infoset is treated as variable length and not fixed length. The
behaviour is the same as lengthKind ‘prefixed’.
At least, that was what I minuted. I'll forward the email thread that
discussed this, as it has some examples.
Regards
Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848
From: Mike Beckerle <mbeckerle.dfdl at gmail.com>
To: dfdl-wg at ogf.org,
Date: 14/11/2012 18:23
Subject: [DFDL-WG] Unparsing - Pad to "big enough length to fit..."
- can it be achieved in DFDL?
Sent by: dfdl-wg-bounces at ogf.org
I have a format where fields are supposed to be padded so they all have
the length of the longest one.
I.e., something like this with fields terminated by | chars
FIELD1|FIELD2  |ST|PH          |
ABC   |CAPE COD|MA|888-999-0000|
DEFG  |BOSTON  |MA|-           |
I believe I can warn if the entries in a column of this table don't all
have the same length. I would set a variable to the length of the field in
the first row, then on subsequent rows (parsed with a different element in
the schema) verify by way of an assertion that the length equals what is
stored in the variable. I believe this is actually what the
recoverableError kind is for: redundant information that you want to warn
about when it is inconsistent.
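To make the parse-side check concrete, here is a rough sketch of the same logic outside DFDL. It is plain Python, not a DFDL schema; the function name `column_width_warnings` is made up for illustration, and it assumes every field, including the last one in a row, is terminated by `|`.

```python
def column_width_warnings(lines, terminator="|"):
    """Return warnings for fields whose width differs from the first row's.

    Hypothetical helper: the first row plays the role of the DFDL variable
    holding the expected lengths; later rows are checked against it.
    """
    warnings = []
    expected = None
    for row_no, line in enumerate(lines, start=1):
        # Every field is terminated, so the piece after the last '|' is empty.
        fields = line.split(terminator)[:-1]
        widths = [len(f) for f in fields]
        if expected is None:
            expected = widths  # remember the first row's widths
            continue
        if len(widths) != len(expected):
            warnings.append(
                f"row {row_no}: {len(widths)} fields, expected {len(expected)}"
            )
            continue
        for col, (got, want) in enumerate(zip(widths, expected), start=1):
            if got != want:
                warnings.append(
                    f"row {row_no}, column {col}: width {got}, expected {want}"
                )
    return warnings
```

A mismatch is reported as a warning rather than raised as an error, which mirrors the recoverable-error intent: the data can still be read, but the redundancy does not agree.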
The problem I have is unparsing: how, in DFDL, could I determine how wide
to pad each field before unparsing it? The format's spec requires me to
measure how long each field is across all rows (including that
quasi-header row, which comes first), then pad each one to the maximum
for its column.
I believe there is no way to do this currently in DFDL, because there is
no way that an expression can refer to more than one record, nor any way
to iterate or operate over sets of data.
Instead an application must be written which can use DFDL v1.0 but the
application has to provide some of the smarts.
In this case, the application would have to walk the infoset and determine
the maximum length for each such field, then provide these length maximums
to the DFDL schema by way of externally set variables.
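As a rough sketch of what that application-side pre-pass might look like, assuming the infoset has already been flattened into a list of rows, each a list of field strings (the names `max_field_widths` and `pad_rows` are hypothetical, and the way the widths are handed to a DFDL processor as external variables is processor-specific and not shown):

```python
def max_field_widths(rows):
    """Return the length of the longest value in each column."""
    widths = []
    for row in rows:
        for i, value in enumerate(row):
            if i == len(widths):
                widths.append(0)  # first time this column is seen
            widths[i] = max(widths[i], len(value))
    return widths


def pad_rows(rows, terminator="|", pad=" "):
    """Right-pad every field to its column's maximum, then terminate it."""
    widths = max_field_widths(rows)
    return [
        "".join(value.ljust(width, pad) + terminator
                for value, width in zip(row, widths))
        for row in rows
    ]


rows = [
    ["FIELD1", "FIELD2", "ST", "PH"],
    ["ABC", "CAPE COD", "MA", "888-999-0000"],
    ["DEFG", "BOSTON", "MA", "-"],
]
widths = max_field_widths(rows)   # one width per column
padded = pad_rows(rows)           # what the unparsed output should look like
```

Only `max_field_widths` would be needed if the padding itself is left to the schema; `pad_rows` is included just to show the intended output.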
Is there any other way?
Second issue: the spec of my format also requires each row, with all of
its fields, to be at most 72 characters long. When unparsing, we don't
evaluate assertions, so how can I test this?
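If the application already computes the per-column widths, the 72-character limit could be checked there before unparsing starts, since every row is padded to the same length. A minimal sketch follows; the function names are hypothetical, and whether the `|` terminators count toward the 72 is an assumption, so it is left as a switch:

```python
MAX_ROW_CHARS = 72  # limit taken from the format's spec


def padded_row_length(widths, terminator_len=1, count_terminators=True):
    """Length of any row once every field is padded to its column width."""
    total = sum(widths)
    if count_terminators:
        # Assumption: one terminator follows every field, including the last.
        total += terminator_len * len(widths)
    return total


def fits_row_limit(widths, limit=MAX_ROW_CHARS, **kwargs):
    """True if a padded row stays within the limit."""
    return padded_row_length(widths, **kwargs) <= limit
```

Because the row length depends only on the column widths, one check covers all rows.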
--
Mike Beckerle | OGF DFDL WG Co-Chair
Tel: 781-330-0412
--
dfdl-wg mailing list
dfdl-wg at ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg