[DFDL-WG] Unparsing and outputValueCalc
Alan Powell
alan_powell at uk.ibm.com
Wed Jun 25 06:14:53 CDT 2008
Mike
The augmented infoset should include hidden elements. Although these almost
certainly have an outputValueCalc, including hidden elements means we could
use the same concept when parsing to describe the infoset used by the
expression language.
When unparsing, an element declaration and the infoset are considered as
follows:
0) If the element declaration has a dfdl:inputValueCalc
property, then the infoset value is ignored and nothing is output.
a) If the element declaration has a dfdl:outputValueCalc property
then the expression which is the dfdl:outputValueCalc property value is
evaluated and the resulting value becomes the value of the element item in
the augmented infoset. Any pre-existing value for the infoset item is
superseded by this new value.
References to other augmented infoset items from within the
outputValueCalc expression must obtain their values from the augmented
infoset directly (when the value is already present) or by recursively
using these methods (a) and (b) as needed.
b) If the element declaration has no corresponding value in the
augmented infoset, and the element declaration is for a required item, and
it has a default value specified, then an element item having the default
value is created in the augmented infoset.
c) If any infoset item's value is requested recursively as a part of
(a) above and (a) does not apply, and the corresponding value is not
present, and (b) does not apply then it is a processing error.
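
To make the ordering of rules (0), (a), (b) and (c) concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not an implementation of any DFDL processor: ElementDecl, augment, lookup and ProcessingError are hypothetical names, elements are identified by a flat name rather than a path, and an outputValueCalc expression is modelled as a function that receives a lookup callback. Cyclic references between expressions are not detected.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


class ProcessingError(Exception):
    """Raised when a requested value cannot be obtained (rule c)."""


@dataclass
class ElementDecl:
    # Hypothetical stand-in for a DFDL element declaration.
    name: str
    required: bool = True
    default: Optional[object] = None
    has_input_value_calc: bool = False
    # outputValueCalc modelled as a function that receives a lookup callback.
    output_value_calc: Optional[Callable[[Callable[[str], object]], object]] = None


def augment(decls: List[ElementDecl], infoset: Dict[str, object]):
    """Return (augmented infoset, names of items eligible for output)."""
    by_name = {d.name: d for d in decls}
    augmented = dict(infoset)
    computed = set()

    def apply_rules(decl: ElementDecl) -> None:
        if decl.output_value_calc is not None:
            if decl.name not in computed:
                # Rule (a): the computed value supersedes any existing value.
                augmented[decl.name] = decl.output_value_calc(lookup)
                computed.add(decl.name)
        elif (decl.name not in augmented and decl.required
              and decl.default is not None):
            # Rule (b): required, no value present, default specified.
            augmented[decl.name] = decl.default

    def lookup(name: str) -> object:
        # Recursive request made from inside an outputValueCalc expression.
        apply_rules(by_name[name])
        if name not in augmented:
            # Rule (c): neither (a) nor (b) produced a value.
            raise ProcessingError(f"no value available for {name!r}")
        return augmented[name]

    output = []
    for decl in decls:
        if decl.has_input_value_calc:
            # Rule (0): the infoset value is ignored and nothing is output.
            continue
        apply_rules(decl)
        if decl.name in augmented:
            output.append(decl.name)
    return augmented, output
```

With a "len" element whose expression is the length of "body", an incoming value for "len" is replaced by the computed one, which is the superseding behaviour described in (a).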
Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM email: alan_powell at uk.ibm.com
Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898
From: "Mike Beckerle" <mbeckerle.dfdl at gmail.com>
To: <mbeckerle.dfdl at gmail.com>, <dfdl-wg at ogf.org>
Date: 21/06/2008 13:18
Subject: Re: [DFDL-WG] Unparsing and outputValueCalc
Revised per our discussions last week on the call.
Unparsing
Definition: augmented infoset. When unparsing one begins with the DFDL
schema and conceptually with the logical infoset. As the values of items
are filled in by defaulting, and by use of the DFDL outputValueCalc
property, these new item values augment the infoset. The resulting infoset
is called the augmented infoset.
Definition: an element declaration in the schema describes a potentially
represented item if that element declaration does not have an
inputValueCalc property. Whether the element declaration describes an item
that is actually represented or not depends on whether the element
declaration is for a required or optional element, and whether the element
has a corresponding value in the augmented infoset.
When unparsing, an element declaration and the infoset are considered as
follows:
a) If the element declaration has a dfdl:outputValueCalc property
then the expression which is the dfdl:outputValueCalc property value is
evaluated and the resulting value becomes the value of the element item in
the augmented infoset. Any pre-existing value for the infoset item is
superseded by this new value.
References to other augmented infoset items from within the
outputValueCalc expression must obtain their values from the augmented
infoset directly (when the value is already present) or by recursively
using these methods (a) and (b) as needed.
b) If the element declaration has no corresponding value in the
augmented infoset, and the element declaration is for a required item, and
it has a default value specified, then an element item having the default
value is created in the augmented infoset.
c) If any infoset item's value is requested recursively as a part of
(a) above and (a) does not apply, and the corresponding value is not
present, and (b) does not apply then it is a processing error.
Given this augmented infoset, then if the potentially represented element
declaration has a corresponding infoset item then that item is serialized
according to its DFDL properties. If the element declaration is for a
required item, and there is no value in the augmented infoset then it is a
processing error.
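
The serialization step just described can be sketched as follows. This is a rough illustration only: the tuple shape used for declarations, the serialize callback and the ProcessingError class are all hypothetical and are not taken from the DFDL specification.

```python
class ProcessingError(Exception):
    """Raised when a required item has no value in the augmented infoset."""


def unparse(decls, augmented, serialize):
    """Serialize each potentially represented item that has a value.

    decls is a list of (name, required, has_input_value_calc) tuples,
    a hypothetical shape chosen for this sketch.
    """
    out = []
    for name, required, has_input_value_calc in decls:
        if has_input_value_calc:
            # Not potentially represented, so nothing is written.
            continue
        if name in augmented:
            out.append(serialize(name, augmented[name]))
        elif required:
            raise ProcessingError(f"required item {name!r} has no value")
        # An optional item with no value is simply absent from the output.
    return out
```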
Because rule (a) above is used even if the augmented infoset item already
exists and has a value, it is possible for an outputValueCalc expression
to be evaluated multiple times. DFDL implementations are free to cache
values and avoid this repeated evaluation for efficiency, as the semantics
of DFDL require that the outputValueCalc expression return the same value
every time it is evaluated.
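
As an illustration of the caching that implementations are free to do, here is a small Python sketch. CachedCalc is a hypothetical wrapper, not part of any DFDL implementation. It relies on the requirement stated above that the expression returns the same value on every evaluation, so the first result can safely be reused.

```python
class CachedCalc:
    """Wraps an outputValueCalc-style expression so it runs at most once."""

    def __init__(self, expr):
        self._expr = expr          # callable taking the augmented infoset
        self._has_value = False
        self._value = None
        self.evaluations = 0       # counts real evaluations, for illustration

    def value(self, infoset):
        if not self._has_value:
            self._value = self._expr(infoset)
            self._has_value = True
            self.evaluations += 1
        return self._value
```

Calling value() repeatedly then costs one evaluation, however many other expressions refer to the item.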
In expressions, the function dfdl:length() can be called to determine the
representation length of an item. If an element declaration is not
potentially represented, then dfdl:length() is defined to return 0.
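
A minimal sketch of that rule, assuming the representation of an item is available as a byte string and that only the "bytes" unit is supported. The function name dfdl_length and its parameters are hypothetical and only mirror the behaviour described above.

```python
def dfdl_length(has_input_value_calc: bool, representation: bytes,
                units: str = "bytes") -> int:
    """Return the representation length, or 0 if the item is not
    potentially represented (its declaration has inputValueCalc)."""
    if has_input_value_calc:
        return 0
    if units != "bytes":
        # Other units are left out of this sketch.
        raise ValueError(f"unsupported units: {units!r}")
    return len(representation)
```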
Mike Beckerle | OGF DFDL WG Co-Chair | CTO | Oco, Inc.
Tel: 781-810-2100 | 504 Totten Pond Road, Waltham MA 02451 |
mbeckerle.dfdl at gmail.com
From: Mike Beckerle [mailto:mbeckerle.dfdl at gmail.com]
Sent: Thursday, June 12, 2008 12:01 AM
To: dfdl-wg at ogf.org
Subject: Unparsing and outputValueCalc
Based on our discussions on the call today, I was thinking about the
definition of output "unparsing" and how outputValueCalc is dealt with.
I believe the following language is sufficient to explain how such
expressions are evaluated.
When unparsing, an element declaration in the schema must have a
corresponding value in the infoset.
If one exists then that value is serialized based on its properties. If
there is no corresponding value in the infoset then a value is computed as
follows:
a) If the element declaration is required, and has a default value
specified, then an element item having the default value is created in the
infoset.
b) If the element declaration has an outputValueCalc property then
the expression which is the property value is evaluated and the resulting
value becomes the value of the element item in the infoset. References to
other infoset elements from within the outputValueCalc expression must
obtain their values from the infoset directly (when the value is already
present) or by recursively using these methods (a) and (b) as needed.
c) If any infoset element's value is requested and neither (a) nor
(b) applies, then it is a processing error.
Seems ok to me. This is the ordinary stuff of language specification.
The function dfdl:length() needs some additional discussion.
I think we can restrict dfdl:length() to accept only paths to element info
items. I.e., the first argument must be an explicit path.
(Alternatively, we can make dfdl:length() a member of an info item,
as in ../x.dfdl:length("bytes"), or however we want to notate obtaining
the length from a path, instead of dfdl:length(../x, "bytes"). Either
notation style is ok with me.)
The path for the dfdl:length must be to an element which has
representation. That is, it cannot have the inputValueCalc property. This
ensures that it is meaningful to ask for the dfdl:length, that is the
representation length of the item measured in the requested units.
There is already an XPath function count() which returns the number of
occurrences of an item. Both count() and dfdl:length() potentially imply
buffering in the unparser implementation.
Comments?