[DFDL-WG] DFDL Statement Evaluation Timing (Assert, Discriminator, SetVariable, NewVariableInstance)

Steve Hanson smh at uk.ibm.com
Wed Oct 31 06:20:32 EDT 2012


Mike

IBM DFDL already supports asserts and discriminators on complex elements, 
so that must remain.  There's a clear use case for this - a choice with 
branches that are element refs to complex global elements. Discrimination 
is possible at the time the choice is processed. You would put a 
discriminator on the element refs.  Also, asserts and discriminators are 
intended to be the equivalent of WTX component rules which are allowed on 
complex elements.
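
As a rough sketch of that use case (not taken from any real schema: the names header, detail, recType and record are hypothetical, the test expressions are only illustrative, and dfdl:format properties are left out), a discriminator on each element ref lets the choice be resolved while it is being processed:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <!-- Hypothetical complex global elements used as choice branches. -->
  <xs:element name="header">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="recType" type="xs:string"/>
        <xs:element name="batchId" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="detail">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="recType" type="xs:string"/>
        <xs:element name="amount" type="xs:decimal"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="record">
    <xs:complexType>
      <xs:choice>
        <!-- The discriminator sits on the element ref, not on the global element. -->
        <xs:element ref="header">
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:discriminator test="{ ./recType eq 'H' }"/>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>
        <xs:element ref="detail">
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:discriminator test="{ ./recType eq 'D' }"/>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```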

The last point about WTX made me realise why we had disallowed asserts and 
discriminators on global elements. In xsd terms they are associated with a 
particle. If we are going to allow them on global elements then we need to 
be clear that this is no longer the case.

Agree that only one setVariable annotation for a given variable can exist 
when annotations are combined from multiple objects. Same for 
newVariableInstance.

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848



From:   Mike Beckerle <mbeckerle.dfdl at gmail.com>
To:     dfdl-wg at ogf.org
Date:   30/10/2012 18:26
Subject:        Re: [DFDL-WG] DFDL Statement Evaluation Timing (Assert, 
Discriminator, SetVariable, NewVariableInstance)
Sent by:        dfdl-wg-bounces at ogf.org



Revision 1 (based on discussion on WG call 2012-10-30)

Principles: disallow statements except where their scope and timing are 
clear and where the timing is easy to understand from the way it appears 
textually in the schema document
See changes in RED.

On Mon, Oct 29, 2012 at 6:54 PM, Mike Beckerle <mbeckerle.dfdl at gmail.com> 
wrote:
I'll write this up like an errata, but this is for discussion of whether 
we believe this is clear and complete.

-------------------------------------

Glossary: DFDL Statements are the annotation elements dfdl:assert, 
dfdl:discriminator, dfdl:setVariable, and dfdl:newVariableInstance.

Errata: Locations where dfdl:assert, dfdl:discriminator and 
dfdl:setVariable are allowed to appear are extended to also include Global 
Element Declarations for elements of simple type, and on Simple Type 
Definitions. 

Errata: dfdl:newVariableInstance may appear only as an annotation on a 
sequence or choice. 

Errata: dfdl:setVariable, dfdl:assert, dfdl:discriminator may appear only 
as an annotation on a sequence, choice, or a simpleType definition, or an 
element declaration/reference having simpleType. Note this removes the 
ability for complex-typed element decls/refs to carry any DFDL Statement 
annotations, including asserts/discriminators. 

(I'm trying to get really minimal here. Not even assert on complexType 
elements.)


Errata: Clarification about discriminators: Discriminators exclude 
Assertions even when combining across references.

Beyond the stipulation that there can be only one dfdl:discriminator at 
any annotation point of the DFDL schema, there are further constraints.

A single dfdl:discriminator annotation may appear on an element reference, 
or on the global element declaration it refers to, or on the simple type 
appearing immediately within or referenced from the global element 
declaration. But only one of those places. In addition, if a discriminator 
occupies one of those places, then no dfdl:assert annotations may appear 
in any of those locations. 

A dfdl:discriminator annotation may appear on a group reference or on the 
model group within the global group definition it refers to. But only one 
of those places, and similarly, if a discriminator appears in any of those 
places, then no dfdl:assert annotations may appear in any of those 
locations.
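
To make the element case concrete, here is a minimal sketch of the proposed constraint. The names tagType, tag, msg and other are hypothetical and dfdl:format properties are omitted. The single discriminator is placed on the element reference, so the global element declaration and its simple type carry neither a discriminator nor any assert:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <!-- No dfdl:discriminator or dfdl:assert may appear on this simple type,
       because the element reference below already carries the discriminator. -->
  <xs:simpleType name="tagType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>

  <!-- Same restriction applies to the global element declaration. -->
  <xs:element name="tag" type="tagType"/>

  <xs:element name="msg">
    <xs:complexType>
      <xs:choice>
        <!-- The one permitted discriminator for ref / declaration / simple type. -->
        <xs:element ref="tag">
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:discriminator test="{ . eq 'A' }"/>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>
        <xs:element name="other" type="xs:string"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Moving the discriminator to the declaration of tag or onto tagType would be equally acceptable under the wording above; what the proposal rules out is having it in more than one of the three places, or mixing it with asserts in any of them.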

(TBD: constraints that you can't have multiple setVariable statements of 
the same variable in these places either, just as you can't have multiple 
setVariables of the same variable at one annotation point.) 


Errata: Clarification about the execution order of DFDL Statements when 
they appear on an element reference or element declaration.

DFDL Statement annotations for a given element are executed as follows: 
(Keep in mind that this element will have simpleType, as complexType 
elements cannot carry statement annotations at all.)

1) all relevant DFDL statement annotations are gathered to form a single 
list which preserves schema-definition order. 

     For a simpleType definition, the DFDL statement annotations found 
     immediately on it are kept in schema-definition order, and are 
     appended to the end of a list of those from any base simpleType 
     definition. 

     For an element declaration having simple type, the DFDL statement 
     annotations found immediately on the declaration are appended to the 
     end of the list of those from its simple type.

     For an element reference, DFDL statement annotations found 
     immediately on the element reference are appended to the end of the 
     list of those from the global element declaration it references.

2) given the combined list, the annotations are executed as follows:

     1.  before any parsing of the element, a dfdl:discriminator with 
         testKind="pattern" is executed. 
     2.  if there is no discriminator, then all dfdl:asserts (there could 
         be several) with testKind="pattern" are executed in the order 
         they appear in the list of DFDL statements.
     3.  Any properties having runtime evaluation are evaluated. (e.g., 
         delimiters with expressions)
     4.  The element itself is parsed, or its inputValueCalc property is 
         evaluated to create its value.
     5.  REMOVED: no longer allowed on elements at all: all 
         newVariableInstance annotations are executed and new variables 
         are placed into scope for the duration of these remaining steps. 
         The statements are executed in the order they appear in the list 
         of DFDL statements.
     6.  all setVariable annotations are executed. The statements are 
         executed in the order they appear in the list of DFDL statements. 
     7.  if a discriminator is present it is executed
     8.  if no discriminator is present, then assert annotations can be 
         present, and they are executed. If there are multiple assert 
         annotations the statements are executed in the order they appear 
         in the list of DFDL statements.

If the element reference or local element declaration is an array, then 
this evaluation is repeated for each occurrence of the array.
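
As an illustration of the gathering and ordering described above, here is a sketch under stated assumptions. The names qtyType, qty, order and lastQty are hypothetical, the labels A1, A2, S1 and A3 exist only in the comments, and dfdl:format properties are omitted. Statements are spread over a simple type, the global element declaration that uses it, and a reference to that element:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <!-- Hypothetical variable set by S1 below. -->
      <dfdl:defineVariable name="lastQty" type="xs:int"/>
    </xs:appinfo>
  </xs:annotation>

  <xs:simpleType name="qtyType">
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <!-- A1: pattern assert, tested against the data before the element is parsed. -->
        <dfdl:assert testKind="pattern" testPattern="[0-9]+"/>
        <!-- A2: expression assert, evaluated after the element is parsed. -->
        <dfdl:assert test="{ . ge 0 }"/>
      </xs:appinfo>
    </xs:annotation>
    <xs:restriction base="xs:int"/>
  </xs:simpleType>

  <xs:element name="qty" type="qtyType">
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <!-- S1: appended after the simple type's statements. -->
        <dfdl:setVariable ref="lastQty" value="{ . }"/>
      </xs:appinfo>
    </xs:annotation>
  </xs:element>

  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <!-- Combined list for this reference: A1, A2, S1, A3.
             Execution under the steps above: A1, then runtime-valued properties,
             then qty is parsed, then S1, then A2 and A3 in list order. -->
        <xs:element ref="qty">
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <!-- A3: appended after the declaration's statements. -->
              <dfdl:assert test="{ . le 1000 }"/>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Note that list order and execution order differ here: A2 precedes S1 in the combined list, but S1 runs first because all setVariable statements execute before the expression asserts.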

A DFDL implementation that wishes to optimize is free to analyze the 
expressions used, and evaluate them sooner so long as the behavior is 
equivalent to the above description. 

Discussion/Illustration: (this is all revised, so I'm switching back to 
black ink)

Suppose you have this situation:


<sequence>
  ...
  <element ref="foo"/> <!-- I want to add DFDL statement annotations before and after this -->
  ...
</sequence>

To add them before, so they are scoped over or visible to the parsing of 
the entire foo element:

<sequence>
  ...
  <sequence> <!-- inserted sequence -->
     <annotation><appinfo ...>
        <dfdl:newVariableInstance ref="myVar" default="{...}"/>
        <dfdl:setVariable ref="myOtherVar" value="{...}"/>
     </appinfo></annotation>
     <element ref="foo"/> <!-- I want to add DFDL statement annotations before this -->
  </sequence>
  ...
</sequence>

To add them after so they can reference downward into that element:

<sequence>
  ...

  <element ref="foo"/> <!-- I want to add DFDL statement annotations after this -->
  <sequence> <!-- inserted sequence -->
     <annotation><appinfo ...>
        <dfdl:assert>{ foo/bar/baz.... }</dfdl:assert>
        <dfdl:setVariable ref="yetAnotherVar" value="{ foo/bar[2]/baz + 1 }"/>
     </appinfo></annotation>
  </sequence>
  ...
</sequence>

The above illustration is about complex type elements, but note that the 
timing issue really doesn't care whether the element/element-ref had 
simple or complex type. You can perfectly control when evaluation occurs 
relative to the element itself.

The timing is then clear (to me anyway). The optimization opportunity to 
hoist the assert for earlier evaluation is clear, but it's also clear that 
the assert CAN look into foo however it wishes in order to make its 
decision. 

If you put the annotation directly on the element (which then must be of 
simple type), then it is exactly equivalent to putting it in a sequence 
AFTER. (Modulo replacing "." in expressions with "foo")
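
A small sketch of that equivalence, assuming a hypothetical simple-typed element foo inside two otherwise identical hypothetical parents form1 and form2 (dfdl:format properties omitted):

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <!-- Form 1: the assert sits directly on the element; "." is the element itself. -->
  <xs:element name="form1">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="foo" type="xs:int">
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:assert test="{ . gt 0 }"/>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- Form 2: the same assert on an inserted sequence after the element;
       "." in the expression has been replaced by "foo". -->
  <xs:element name="form2">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="foo" type="xs:int"/>
        <xs:sequence>
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:assert test="{ foo gt 0 }"/>
            </xs:appinfo>
          </xs:annotation>
        </xs:sequence>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```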
