[DFDL-WG] DFDL Statement Evaluation Timing (Assert, Discriminator, SetVariable, NewVariableInstance)

Mike Beckerle mbeckerle.dfdl at gmail.com
Mon Oct 29 18:54:12 EDT 2012


I'll write this up as errata, but it is meant for discussion of whether we
believe it is clear and complete.

-------------------------------------

*Glossary*: DFDL Statements are the annotation elements dfdl:assert,
dfdl:discriminator, dfdl:setVariable, and dfdl:newVariableInstance.

*Errata*: Locations where DFDL Statements are allowed to appear are
extended to also include Global Element Declarations and Simple Type
Definitions.

*Errata*: Clarification about discriminators: Discriminators exclude
Assertions even when combining across references.

Beyond the stipulation that there can be only one dfdl:discriminator at any
annotation point of the DFDL schema, there are further constraints.

A single dfdl:discriminator annotation may appear on an element reference,
or on the global element declaration it refers to, or on the simple type
appearing immediately within or referenced from the global element
declaration, but in only one of those places. In addition, if a discriminator
occupies one of those places, then no dfdl:assert annotations may appear in
any of those locations.

A dfdl:discriminator annotation may appear on a group reference or on the
model group within the global group definition it refers to, but in only one
of those places. Similarly, if a discriminator appears in any of those
places, then no dfdl:assert annotations may appear in any of those
locations.
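
To make the element case concrete, here is a small sketch of a schema that
would violate the rule as stated above. The namespace urn:example, the
element names, and the test expressions are all hypothetical, and the
representation properties (dfdl:format and so on) are omitted for brevity.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
           xmlns:fn="http://www.w3.org/2005/xpath-functions"
           xmlns:ex="urn:example"
           targetNamespace="urn:example">

  <!-- Hypothetical global element declaration carrying the discriminator. -->
  <xs:element name="recordType" type="xs:string">
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:discriminator test="{ . eq 'A' }"/>
      </xs:appinfo>
    </xs:annotation>
  </xs:element>

  <xs:element name="record">
    <xs:complexType>
      <xs:sequence>
        <!-- Schema definition error under the rule above: the referenced
             declaration already has a discriminator, so the reference may
             carry neither a second discriminator nor any assert. -->
        <xs:element ref="ex:recordType">
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:assert test="{ fn:string-length(.) eq 1 }"/>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Removing the dfdl:assert from the reference, or folding its condition into
the discriminator's test expression, would bring the sketch back within the
rule.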

*Errata*: Clarification about the execution order of DFDL Statements when
they appear on an element reference or element declaration.

DFDL Statement annotations for a given schema component are executed as
follows:

1) all relevant DFDL statement annotations are gathered to form a single
list which preserves schema-definition order.

   - For a simpleType definition, the DFDL statement annotations found
   immediately on it are kept in schema-definition order, and are appended to
   the end of a list of those from any base simpleType definition.
   - For an element declaration having simple type, the DFDL statement
   annotations found immediately on the declaration are appended to the end of
   the list of those from its simple type.
   - For an element reference, DFDL statement annotations found immediately
   on the element reference are appended to the end of the list of those from
   the global element declaration it references.
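
As an illustration of the gathering rule, the following sketch places one
assert at each of the four levels and numbers them in the order I would
expect the combined list to hold them. The names, namespace, and test
expressions are hypothetical, and representation properties are omitted.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
           xmlns:ex="urn:example"
           targetNamespace="urn:example">

  <xs:simpleType name="baseCode">
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <!-- 1st in the combined list: from the base simpleType. -->
        <dfdl:assert test="{ . ne '' }" message="1: base simpleType"/>
      </xs:appinfo>
    </xs:annotation>
    <xs:restriction base="xs:string"/>
  </xs:simpleType>

  <xs:simpleType name="code">
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <!-- 2nd: appended after those of the base simpleType. -->
        <dfdl:assert test="{ . ne 'X' }" message="2: derived simpleType"/>
      </xs:appinfo>
    </xs:annotation>
    <xs:restriction base="ex:baseCode"/>
  </xs:simpleType>

  <xs:element name="code" type="ex:code">
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <!-- 3rd: appended after those of the simple type. -->
        <dfdl:assert test="{ . ne 'Y' }" message="3: element declaration"/>
      </xs:appinfo>
    </xs:annotation>
  </xs:element>

  <xs:element name="message">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="ex:code">
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <!-- 4th: appended after those of the global declaration. -->
              <dfdl:assert test="{ . ne 'Z' }" message="4: element reference"/>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```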

2) given the combined list, the annotations are executed as follows:

   1. Before any parsing of the element, a dfdl:discriminator with
   testKind="pattern" is executed.
   2. If there is no discriminator, then before any parsing of the element,
   all dfdl:asserts (there could be several) with testKind="pattern" are
   executed in the order they appear in the list of DFDL statements.
   3. The element itself is parsed, or its inputValueCalc property is
   evaluated to create its value.
   4. All newVariableInstance annotations are executed and new variables
   are placed into scope for the duration of these remaining steps. The
   statements are executed in the order they appear in the list of DFDL
   statements.
   5. All setVariable annotations are executed. The statements are executed
   in the order they appear in the list of DFDL statements.
   6. If a discriminator is present, it is executed.
   7. If no discriminator is present, then assert annotations can be
   present, and they are executed. If there are multiple assert annotations,
   the statements are executed in the order they appear in the list of DFDL
   statements.

If the element reference or local element declaration is an array, then
this evaluation is repeated for each occurrence of the array.
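
The following sketch shows how the steps above would reorder statements that
are written in a deliberately shuffled textual order on one simple element.
The variable ex:count, the element name, and the expressions are
hypothetical; the attribute name defaultValue is the one I understand the
specification to use; and I am reading step 7 as covering the asserts that
are not testKind="pattern", which is an interpretation rather than something
the steps state outright.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
           xmlns:ex="urn:example"
           targetNamespace="urn:example">

  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <!-- Hypothetical variable used only for this illustration. -->
      <dfdl:defineVariable name="count" type="xs:int" defaultValue="0"/>
    </xs:appinfo>
  </xs:annotation>

  <xs:element name="n" type="xs:int">
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <!-- Step 7 (assumed): runs last, so it can see both "." and the
             value that setVariable stored in ex:count. -->
        <dfdl:assert test="{ $ex:count gt 0 }"/>
        <!-- Step 5: runs after the element is parsed, so "." is available. -->
        <dfdl:setVariable ref="ex:count" value="{ . }"/>
        <!-- Step 4: runs after parsing but before any setVariable. -->
        <dfdl:newVariableInstance ref="ex:count" defaultValue="0"/>
        <!-- Step 2: runs before the element is parsed, against the data. -->
        <dfdl:assert testKind="pattern" testPattern="[0-9]+"/>
      </xs:appinfo>
    </xs:annotation>
  </xs:element>

</xs:schema>
```

Step 3 (parsing the element itself) falls between the pattern assert and the
newVariableInstance. Within each kind of statement, the order of the combined
list still applies.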

*Discussion*:

The above allows the default expressions associated with any statement to
refer to the value of the element itself as ".".

However, there's this anomaly of syntax where things don't seem right:

<dfdl:newVariableInstance ref="foo:bar" default="{....some expression ...}"/>

creates a new variable for the scope of the entity it annotates.

It's clear what this means if this annotation is placed on a sequence or
choice. For the children of that sequence/choice the new variable instance
is in effect.

On a simpleType or element having simple type, it is similarly clear (in
that case it's a very local variable, visible only to the expressions in
other newVariableInstance statements, setVariable statements,
discriminators, and assertions).

The rub: on an element declaration (or reference) when there is a complex
type, it's not clear.

<element name="foo">
  <annotation><appinfo ....>
    <dfdl:newVariableInstance......> <!-- what can refer to this? -->
  </appinfo></annotation>
  <complexType>
    <sequence>
      <annotation><appinfo ....>
        <dfdl:assert>... according to rules above this cannot refer to
          the newVariableInstance...</dfdl:assert>
      </appinfo></annotation>
      ....
    </sequence>
  </complexType>
</element>

The above rules say the newVariableInstance isn't evaluated until AFTER the
element is parsed, so the assert down inside on the sequence will NOT see
this new variable, even though textually the newVariableInstance annotation
looks like it would be in scope over the sequence.

Possible ways to avoid this oddity without messing up evaluation order:
allow newVariableInstance only on simpleType, element declarations and
element references having simpleType, group references, and
sequence/choice. Disallow them on element declarations of complex type or
on element references to those. The schema author can always introduce an
extra tier of sequence to provide the exact behavior they need, and this
otherwise error-prone issue can be avoided.
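
A sketch of the "extra tier of sequence" workaround follows, assuming the
sequence scoping described earlier in this message (a new variable instance
on a sequence is in effect for that sequence's children). The variable
ex:bar, the element names, and the expressions are hypothetical; the
attribute name defaultValue is the one I understand the specification to
use; and representation properties are omitted.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
           xmlns:ex="urn:example"
           targetNamespace="urn:example">

  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <!-- Hypothetical variable used only for this illustration. -->
      <dfdl:defineVariable name="bar" type="xs:int" defaultValue="0"/>
    </xs:appinfo>
  </xs:annotation>

  <xs:element name="foo">
    <xs:complexType>
      <!-- Extra outer tier: the new variable instance is created here
           instead of on the element declaration. -->
      <xs:sequence>
        <xs:annotation>
          <xs:appinfo source="http://www.ogf.org/dfdl/">
            <dfdl:newVariableInstance ref="ex:bar" defaultValue="1"/>
          </xs:appinfo>
        </xs:annotation>
        <!-- Inner tier: the original content of foo. Its assert is inside
             the scope of the outer sequence, so it can refer to ex:bar. -->
        <xs:sequence>
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:assert test="{ $ex:bar eq 1 }"/>
            </xs:appinfo>
          </xs:annotation>
          <xs:element name="child" type="xs:string"/>
        </xs:sequence>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

With this shape the textual scope and the evaluation scope line up, which is
the property the restriction above is meant to guarantee.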