[DFDL-WG] DFDL Statement Evaluation Timing (Assert, Discriminator, SetVariable, NewVariableInstance)
Steve Hanson
smh at uk.ibm.com
Wed Oct 31 06:20:32 EDT 2012
Mike
IBM DFDL already supports asserts and discriminators on complex elements,
so that must remain. There's a clear use case for this - a choice with
branches that are element refs to complex global elements. Discrimination
is possible at the time the choice is processed. You would put a
discriminator on the element refs. Also, asserts and discriminators are
intended to be the equivalent of WTX component rules which are allowed on
complex elements.
The last point about WTX made me realise why we had disallowed asserts and
discriminators on global elements. In XSD terms they are associated with a
particle. If we are going to allow them on global elements then we need to
be clear that this is no longer the case.
Agree that only one setVariable annotation for a given variable can exist
when annotations are combined from multiple objects. Same for
newVariableInstance.
Regards
Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848
From: Mike Beckerle <mbeckerle.dfdl at gmail.com>
To: dfdl-wg at ogf.org,
Date: 30/10/2012 18:26
Subject: Re: [DFDL-WG] DFDL Statement Evaluation Timing (Assert,
Discriminator, SetVariable, NewVariableInstance)
Sent by: dfdl-wg-bounces at ogf.org
Revision 1 (based on discussion on WG call 2012-10-30)
Principles: disallow statements except where their scope and timing are
clear and where the timing is easy to understand from the way it appears
textually in the schema document
See changes in RED.
On Mon, Oct 29, 2012 at 6:54 PM, Mike Beckerle <mbeckerle.dfdl at gmail.com>
wrote:
I'll write this up like an errata, but this is for discussion of whether
we believe this is clear and complete.
-------------------------------------
Glossary: DFDL Statements are the annotation elements dfdl:assert,
dfdl:discriminator, dfdl:setVariable, and dfdl:newVariableInstance.
Errata: Locations where dfdl:assert, dfdl:discriminator and
dfdl:setVariable are allowed to appear are extended to also include Global
Element Declarations for elements of simple type, and on Simple Type
Definitions.
Errata: dfdl:newVariableInstance may appear only as an annotation on a
sequence or choice.
Errata: dfdl:setVariable, dfdl:assert, dfdl:discriminator may appear only
as an annotation on a sequence, choice, or a simpleType definition, or an
element declaration/reference having simpleType. Note that this removes the
ability of complex-typed element decls/refs to carry any DFDL Statement
annotations, including asserts/discriminators.
(I'm trying to get really minimal here. Not even assert on complexType
elements.)
Errata: Clarification about discriminators: Discriminators exclude
Assertions even when combining across references.
Beyond the stipulation that there can be only one dfdl:discriminator at
any annotation point of the DFDL schema, there are further constraints.
A single dfdl:discriminator annotation may appear on an element reference,
or on the global element declaration it refers to, or on the simple type
appearing immediately within or referenced from the global element
declaration. But only one of those places. In addition, if a discriminator
occupies one of those places, then no dfdl:assert annotations may appear
in any of those locations.
A dfdl:discriminator annotation may appear on a group reference or on the
model group within the global group definition it refers to. But only one
of those places, and similarly, if a discriminator appears in any of those
places, then no dfdl:assert annotations may appear in any of those
locations.
(TBD: constraints that you can't have multiple setVariable statements for
the same variable across these places either, just as you can't have
multiple setVariables for the same variable at one annotation point.)
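As a rough illustration of the combining constraints above, here is a minimal Python sketch of a checker. The data model (a dict from location name to a list of (kind, variable) pairs) and the function name check_combined_statements are hypothetical, invented only for this illustration. The setVariable check reflects the TBD item, so treat it as an assumption rather than an agreed rule.

```python
from collections import Counter

# Hypothetical model: each location (element reference, global element
# declaration, simple type) maps to a list of (kind, variable_name) pairs.
# variable_name is only meaningful for setVariable statements.
def check_combined_statements(locations):
    """Return a list of error strings for one set of combined locations."""
    statements = [s for stmts in locations.values() for s in stmts]
    kinds = Counter(kind for kind, _ in statements)
    errors = []
    # Only one discriminator may appear across all the combined locations.
    if kinds["discriminator"] > 1:
        errors.append("more than one dfdl:discriminator across the combined locations")
    # A discriminator in any location excludes asserts in all of them.
    if kinds["discriminator"] >= 1 and kinds["assert"] >= 1:
        errors.append("dfdl:assert not allowed alongside a dfdl:discriminator")
    # Assumed from the TBD item: one setVariable per variable across locations.
    set_counts = Counter(var for kind, var in statements if kind == "setVariable")
    for var, count in sorted(set_counts.items()):
        if count > 1:
            errors.append(f"multiple dfdl:setVariable statements for variable {var}")
    return errors
```

For example, a discriminator on the element reference combined with an assert on the global element declaration would be reported, even though each annotation point is fine when looked at on its own.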
Errata: Clarification about the execution order of DFDL Statements when
they appear on an element reference or element declaration.
DFDL Statement annotations for a given element are executed as follows:
(Keep in mind that this element will have simpleType, as complexType
elements cannot carry statement annotations at all.)
1) all relevant DFDL statement annotations are gathered to form a single
list which preserves schema-definition order.
For a simpleType definition, the DFDL statement annotations found
immediately on it are kept in schema-definition order, and are appended to
the end of a list of those from any base simpleType definition.
For an element declaration having simple type, the DFDL statement
annotations found immediately on the declaration are appended to the end
of the list of those from its simple type.
For an element reference, DFDL statement annotations found immediately on
the element reference are appended to the end of the list of those from
the global element declaration it references.
2) given the combined list, the annotations are executed as follows:
   1. before any parsing of the element, a dfdl:discriminator with
      testKind="pattern" is executed.
   2. if there is no discriminator, then all dfdl:asserts (there could
      be several) with testKind="pattern" are executed in the order they
      appear in the list of DFDL statements.
   3. Any properties having runtime evaluation are evaluated. (e.g.,
      delimiters with expressions)
   4. The element itself is parsed, or its inputValueCalc property is
      evaluated to create its value.
   5. REMOVED: no longer allowed on elements at all: all
      newVariableInstance annotations are executed and new variables are
      placed into scope for the duration of these remaining steps. The
      statements are executed in the order they appear in the list of DFDL
      statements.
   6. all setVariable annotations are executed. The statements are
      executed in the order they appear in the list of DFDL statements.
   7. if a discriminator is present it is executed
   8. if no discriminator is present, then assert annotations can be
      present, and they are executed. If there are multiple assert
      annotations the statements are executed in the order they appear in
      the list of DFDL statements.
If the element reference or local element declaration is an array, then
this evaluation is repeated for each occurrence of the array.
A DFDL implementation that wishes to optimize is free to analyze the
expressions used, and evaluate them sooner so long as the behavior is
equivalent to the above description.
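The gathering and ordering rules above can be sketched in Python as follows. This is only a model under stated assumptions, not an implementation: the Stmt class, gather, and execution_plan are hypothetical names, and the sketch assumes that steps 7 and 8 apply to statements whose testKind is not "pattern" (the pattern ones having already run in steps 1 and 2), with "expression" taken as the default testKind.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Stmt:
    kind: str                      # "assert", "discriminator", or "setVariable"
    label: str                     # hypothetical name used to show the order
    test_kind: str = "expression"  # assumed default; "pattern" is the other value

def gather(*sources):
    """Step 1: append each source's statements, keeping schema-definition order.

    Pass the sources in the order base simple type, simple type,
    element declaration, element reference.
    """
    combined = []
    for statements in sources:
        combined.extend(statements)
    return combined

def execution_plan(statements):
    """Step 2: return the labels in the order they would be executed."""
    def pick(kind, test_kind=None):
        return [s.label for s in statements
                if s.kind == kind and (test_kind is None or s.test_kind == test_kind)]

    has_discriminator = any(s.kind == "discriminator" for s in statements)
    plan = []
    plan += pick("discriminator", "pattern")             # step 1
    if not has_discriminator:
        plan += pick("assert", "pattern")                # step 2
    plan.append("<evaluate runtime properties>")         # step 3
    plan.append("<parse element or evaluate inputValueCalc>")  # step 4
    plan += pick("setVariable")                          # step 6 (step 5 removed)
    plan += pick("discriminator", "expression")          # step 7 (assumed non-pattern)
    if not has_discriminator:
        plan += pick("assert", "expression")             # step 8 (assumed non-pattern)
    return plan
```

With a pattern assert on the simple type, a setVariable on the global declaration, and an expression assert on the element reference, the plan comes out as: the pattern assert, runtime properties, the parse, the setVariable, then the expression assert. For an array, this plan would be repeated once per occurrence.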
Discussion/Illustration: (this is all revised, so I'm switching back to
black ink)
Suppose you have this situation:
<sequence>
  ...
  <element ref="foo"/> <!-- I want to add DFDL statement annotations before and after this -->
  ...
</sequence>
To add them before, so they are scoped over or visible to the parsing of
the entire foo element:
<sequence>
  ...
  <sequence> <!-- inserted sequence -->
    <annotation><appinfo ...>
      <dfdl:newVariableInstance ref="myVar" default="{...}"/>
      <dfdl:setVariable ref="myOtherVar" value="{...}"/>
    </appinfo></annotation>
    <element ref="foo"/> <!-- I want to add DFDL statement annotations before this -->
  </sequence>
  ...
</sequence>
To add them after so they can reference downward into that element:
<sequence>
  ...
  <element ref="foo"/> <!-- I want to add DFDL statement annotations after this -->
  <sequence> <!-- inserted sequence -->
    <annotation><appinfo ...>
      <dfdl:assert>{ foo/bar/baz.... }</dfdl:assert>
      <dfdl:setVariable ref="yetAnotherVar" value="{ foo/bar[2]/baz + 1 }"/>
    </appinfo></annotation>
  </sequence>
  ...
</sequence>
The above illustration is about complex type elements, but note that the
timing issue really doesn't care whether the element/element-ref had
simple or complex type. You can perfectly control when evaluation occurs
relative to the element itself.
The timing is then clear (to me anyway). The optimization opportunity to
hoist the assert for earlier evaluation is clear, but it's also clear that
the assert CAN look into foo however it wishes to make its decision.
If you put the annotation directly on the element (which then must be of
simple type), then it is exactly equivalent to putting it in a sequence
AFTER. (Modulo replacing "." in expressions with "foo")