[DFDL-WG] DFDL Statement Evaluation Timing (Assert, Discriminator, SetVariable, NewVariableInstance)

Mike Beckerle mbeckerle.dfdl at gmail.com
Tue Oct 30 14:10:56 EDT 2012


Revision 1 (based on discussion on WG call 2012-10-30)

Principle: disallow statements except where their scope and timing are
clear, and where the timing is easy to understand from where the statement
appears textually in the schema document.

See changes in RED.

On Mon, Oct 29, 2012 at 6:54 PM, Mike Beckerle <mbeckerle.dfdl at gmail.com> wrote:

> I'll write this up like an errata, but this is for discussion of whether
> we believe this is clear and complete.
>
> -------------------------------------
>
> *Glossary*: DFDL Statements are the annotation elements dfdl:assert,
> dfdl:discriminator, dfdl:setVariable, and dfdl:newVariableInstance.
>
> *Errata*: Locations where dfdl:assert, dfdl:discriminator and
> dfdl:setVariable are allowed to appear are extended to also include Global
> Element Declarations for elements of simple type, and on Simple Type
> Definitions.


Errata: dfdl:newVariableInstance may appear only as an annotation on a
sequence or choice.

Errata: dfdl:setVariable, dfdl:assert, and dfdl:discriminator may appear only
as an annotation on a sequence, a choice, a simpleType definition, or an
element declaration/reference of simple type. Note that this removes the
ability of element declarations/references of complex type to carry any DFDL
statement annotations, including asserts and discriminators.

(I'm trying to get really minimal here. Not even assert on complexType
elements.)
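
As an illustrative sketch of the placements this revision would permit
(the element and variable names are made up, the DFDL format properties a
real schema needs are omitted, and the namespace URIs are the ones I
believe DFDL 1.0 uses):

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <!-- Hypothetical variable, declared only so the reference below resolves. -->
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:defineVariable name="depth" type="xs:int" defaultValue="0"/>
    </xs:appinfo>
  </xs:annotation>

  <!-- "record" has complex type, so under this revision it carries
       no DFDL statement annotations of its own. -->
  <xs:element name="record">
    <xs:complexType>
      <!-- Permitted: newVariableInstance on a sequence. -->
      <xs:sequence>
        <xs:annotation>
          <xs:appinfo source="http://www.ogf.org/dfdl/">
            <dfdl:newVariableInstance ref="depth" defaultValue="1"/>
          </xs:appinfo>
        </xs:annotation>
        <!-- Permitted: assert on an element of simple type. -->
        <xs:element name="count" type="xs:int">
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:assert test="{ . ge 0 }"/>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```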


> *Errata*: Clarification about discriminators: Discriminators exclude
> Assertions even when combining across references.
>
> Beyond the stipulation that there can be only one dfdl:discriminator at
> any annotation point of the DFDL schema, there are further constraints.
>
> A single dfdl:discriminator annotation may appear on an element reference,
> or on the global element declaration it refers to, or on the simple type
> appearing immediately within or referenced from the global element
> declaration. But only one of those places. In addition, if a discriminator
> occupies one of those places, then no dfdl:assert annotations may appear in
> any of those locations.
>
> A dfdl:discriminator annotation may appear on a group reference or on the
> model group within the global group definition it refers to. But only one
> of those places, and similarly, if a discriminator appears in any of those
> places, then no dfdl:assert annotations may appear in any of those
> locations.
>
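
A sketch of the exclusion rule for the element reference / global
declaration / simple type chain, as I read it. The names are hypothetical
and format properties are omitted; the placements that would be rejected
are shown only as comments:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <xs:simpleType name="tagType">
    <!-- Would be an error: a dfdl:assert or a second dfdl:discriminator
         here, because the declaration of "tag" below already holds the
         discriminator for this chain. -->
    <xs:restriction base="xs:string"/>
  </xs:simpleType>

  <!-- The single discriminator for the chain lives on the global declaration. -->
  <xs:element name="tag" type="tagType">
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:discriminator test="{ . eq 'A' }"/>
      </xs:appinfo>
    </xs:annotation>
  </xs:element>

  <xs:element name="branch">
    <xs:complexType>
      <xs:sequence>
        <!-- Would be an error: a dfdl:assert or dfdl:discriminator on
             this reference, for the same reason. -->
        <xs:element ref="tag"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```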

(TBD: a constraint that you can't have multiple setVariable statements for
the same variable across these places either, just as you can't have multiple
setVariables for the same variable at one annotation point.)


> *Errata*: Clarification about the execution order of DFDL Statements when
> they appear on an element reference or element declaration.
>
> DFDL Statement annotations for a given element are executed as follows: (Keep
> in mind that this element will have simpleType, as complexType elements
> cannot carry statement annotations at all.)
>
> 1) all relevant DFDL statement annotations are gathered to form a single
> list which preserves schema-definition order.
>
>    - For a simpleType definition, the DFDL statement annotations found
>    immediately on it are kept in schema-definition order, and are appended to
>    the end of a list of those from any base simpleType definition.
>    - For an element declaration having simple type, the DFDL statement
>    annotations found immediately on the declaration are appended to the end of
>    the list of those from its simple type.
>    - For an element reference, DFDL statement annotations found
>    immediately on the element reference are appended to the end of the list of
>    those from the global element declaration it references.
>
> 2) given the combined list, the annotations are executed as follows:
>
>    1. before any parsing of the element, a dfdl:discriminator with
>    testKind="pattern" is executed.
>    2. if there is no discriminator, then all dfdl:asserts (there could be
>    several) with testKind="pattern" are executed in the order they appear in
>    the list of DFDL statements.
>    3. Any properties having runtime evaluation are evaluated. (e.g.,
>    delimiters with expressions)
>    4. The element itself is parsed, or its inputValueCalc property is
>    evaluated to create its value.
>    5. *REMOVED: no longer allowed on elements at all: all
>    newVariableInstance annotations are executed and new variables are placed
>    into scope for the duration of these remaining steps. The statements are
>    executed in the order they appear in the list of DFDL statements.*
>    6. all setVariable annotations are executed. The statements are
>    executed in the order they appear in the list of DFDL statements.
>    7. if a discriminator with testKind="expression" is present, it is
>    executed.
>    8. if no discriminator is present, then assert annotations can be
>    present, and those with testKind="expression" are executed. If there are
>    multiple assert annotations the statements are executed in the order they
>    appear in the list of DFDL statements.
>
> If the element reference or local element declaration is an array, then
> this evaluation is repeated for each occurrence of the array.


A DFDL implementation that wishes to optimize is free to analyze the
expressions used, and evaluate them sooner so long as the behavior is
equivalent to the above description.
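
To make the gathering and execution order concrete, here is a sketch with
one statement at each level of a simple-type element's chain. All names
are hypothetical, format properties are omitted, and the numbered comments
give each statement's position in the combined list as I understand the
rules above:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <!-- Hypothetical variable used by the setVariable below. -->
      <dfdl:defineVariable name="lastLen" type="xs:int"/>
    </xs:appinfo>
  </xs:annotation>

  <xs:simpleType name="baseLen">
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:assert test="{ . ge 0 }"/>            <!-- list position 1 -->
      </xs:appinfo>
    </xs:annotation>
    <xs:restriction base="xs:int"/>
  </xs:simpleType>

  <xs:simpleType name="smallLen">
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:assert test="{ . le 1000 }"/>         <!-- list position 2 -->
      </xs:appinfo>
    </xs:annotation>
    <xs:restriction base="baseLen"/>
  </xs:simpleType>

  <xs:element name="len" type="smallLen">
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:setVariable ref="lastLen" value="{ . }"/>  <!-- list position 3 -->
        <dfdl:assert test="{ . ne 13 }"/>                <!-- list position 4 -->
      </xs:appinfo>
    </xs:annotation>
  </xs:element>

  <xs:element name="msg">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="len">
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <!-- list position 5 -->
              <dfdl:assert testKind="pattern" testPattern="[0-9]+"/>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

If I have the steps right, parsing one occurrence of len runs statement 5
first (a pattern assert, so before the element is parsed), then parses len,
then runs the setVariable (3), and finally the expression asserts in list
order: 1, 2, 4. List position decides order only within each step, not
which step a statement belongs to.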

*Discussion/Illustration:* (this is all revised, so I'm switching back to
black ink)

Suppose you have this situation:


<sequence>
  ...
  <element ref="foo"/> <!-- I want to add DFDL statement annotations before and after this -->
  ...
</sequence>

To add them before, so they are scoped over or visible to the parsing of
the entire foo element:

<sequence>
  ...
  <sequence> <!-- inserted sequence -->
     <annotation><appinfo ...>
        <dfdl:newVariableInstance ref="myVar" defaultValue="{...}"/>
        <dfdl:setVariable ref="myOtherVar" value="{...}"/>
     </appinfo></annotation>
     <element ref="foo"/> <!-- I want to add DFDL statement annotations before this -->
  </sequence>
  ...
</sequence>

To add them after so they can reference downward into that element:

<sequence>
  ...

  <element ref="foo"/> <!-- I want to add DFDL statement annotations after this -->
  <sequence> <!-- inserted sequence -->
     <annotation><appinfo ...>
        <dfdl:assert>{ foo/bar/baz.... }</dfdl:assert>
        <dfdl:setVariable ref="yetAnotherVar" value="{ foo/bar[2]/baz + 1 }"/>
     </appinfo></annotation>
  </sequence>
  ...
</sequence>

The above illustration is about complex type elements, but note that the
technique does not depend on whether the element/element-ref has simple
or complex type. Either way, you can control exactly when evaluation occurs
relative to the element itself.

The timing is then clear (to me anyway). The optimization opportunity to
hoist the assert for earlier evaluation is clear, but it's also clear that
the assert CAN look into foo however it wishes to make its decision.

If you put the annotation directly on the element (which then must be of
simple type), then it is exactly equivalent to putting it in a sequence
AFTER. (Modulo replacing "." in expressions with "foo")
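
A sketch of that equivalence, with hypothetical element names and format
properties omitted. Under the rules above, the assert in formA and the
assert in formB should be evaluated at the same point, after foo has been
parsed:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <!-- Form A: the assert sits directly on the simple-type element,
       so "." refers to foo. -->
  <xs:element name="formA">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="foo" type="xs:int">
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:assert test="{ . gt 0 }"/>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- Form B: the assert sits on an inserted sequence after foo,
       so the expression names foo instead of using ".". -->
  <xs:element name="formB">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="foo" type="xs:int"/>
        <xs:sequence>
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:assert test="{ foo gt 0 }"/>
            </xs:appinfo>
          </xs:annotation>
        </xs:sequence>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```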