[DFDL-WG] DFDL Statement Evaluation Timing (Assert, Discriminator, SetVariable, NewVariableInstance)

Steve Hanson smh at uk.ibm.com
Thu Nov 1 12:28:51 EDT 2012


Mike, thanks for writing this up. I think we are close. Comments in-line.

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848



From:   Mike Beckerle <mbeckerle.dfdl at gmail.com>
To:     dfdl-wg at ogf.org
Date:   31/10/2012 19:08
Subject:        Re: [DFDL-WG] DFDL Statement Evaluation Timing (Assert, Discriminator, SetVariable, NewVariableInstance)
Sent by:        dfdl-wg-bounces at ogf.org



Revision 2 per workgroup call on 2012-10-31

This is a rewrite, not a set of edits.

---------------------------------------------


Clarification: At any single annotation point of the schema, there can be 
only one format annotation (dfdl:format, dfdl:element, dfdl:sequence, 
dfdl:choice, dfdl:group, dfdl:simpleType).  

Glossary: DFDL Statement annotations, or just DFDL Statements, are the 
annotation elements dfdl:assert, dfdl:discriminator, dfdl:setVariable, and 
dfdl:newVariableInstance.
SMH: Nice idea. What about dfdl:defineVariable, is that a statement 
annotation too?  Where does that leave dfdl:defineFormat and 
dfdl:defineEscapeScheme - they are not format annotations (that's their 
content). Do we have 'global', 'statement' and 'format' annotations?

Glossary: Combined annotations: When annotations are combined between a 
group reference and the sequence or choice of the referenced global group, 
or among an element reference, an element declaration, and its type 
definition, the resulting set is referred to as the combined annotations.

DFDL Statement Annotation Placement

dfdl:assert and dfdl:discriminator can be placed as annotations on 
sequence, choice, group references, local and global element declarations, 
element references, and simple type definitions.

dfdl:setVariable may be placed as an annotation on sequence, choice, group 
references, local and global element declarations for elements of simple 
type, element references to elements of simple type, and simple type 
definitions.

dfdl:newVariableInstance can be placed as an annotation on sequence, 
choice, and group references.

The combined annotations for any schema component can contain either a 
single dfdl:discriminator or any number of dfdl:assert statements, but 
not both asserts and a discriminator. It is a schema definition error 
otherwise.

The combined annotations for any schema component can contain multiple 
dfdl:setVariable annotations, but they must each refer to a different 
variable. It is a schema definition error otherwise.

The combined annotations for any schema component can contain multiple 
dfdl:newVariableInstance annotations, but they must each refer to a 
different variable. It is a schema definition error otherwise.
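The three placement rules above can be sketched as a check over one 
combined annotation set. This is only an illustration: the Statement class 
and check_combined function are hypothetical names, not part of DFDL or of 
any implementation.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Statement:
    # kind is one of: "assert", "discriminator", "setVariable",
    # "newVariableInstance" (hypothetical model of a DFDL statement)
    kind: str
    # variable name, used only by the two variable statements
    ref: Optional[str] = None


def check_combined(statements: List[Statement]) -> List[str]:
    """Return the schema definition errors for one combined annotation set."""
    errors = []
    kinds = Counter(s.kind for s in statements)
    # At most one discriminator, and never mixed with asserts.
    if kinds["discriminator"] > 1:
        errors.append("more than one dfdl:discriminator")
    if kinds["discriminator"] and kinds["assert"]:
        errors.append("dfdl:assert and dfdl:discriminator on the same component")
    # Each setVariable / newVariableInstance must name a different variable.
    for kind in ("setVariable", "newVariableInstance"):
        refs = Counter(s.ref for s in statements if s.kind == kind)
        for ref, count in refs.items():
            if count > 1:
                errors.append(f"dfdl:{kind} refers to {ref} {count} times")
    return errors
```

Note that the same variable may appear once in a dfdl:newVariableInstance 
and once in a dfdl:setVariable without error; the uniqueness rule applies 
per statement kind.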

Evaluation Order for Statement Annotations

Assertions Before:

dfdl:discriminator or dfdl:assert with testKind='pattern' are executed 
before parsing the annotated construct. 
SMH: Wording needs to cater for combined annotations.

Note that the pattern is matched against the entire representation of the 
component; hence, all of the framing (initiators, etc.) is visible to the 
pattern. The dfdl:encoding property is used when decoding the data to 
characters before matching.
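A minimal sketch of what "framing is visible to the pattern" means, 
assuming Python's re module as a stand-in for the DFDL regular expression 
dialect. The function name pattern_test and the sample data are 
hypothetical, and the treatment of zero-length matches is not addressed 
here.

```python
import re


def pattern_test(data: bytes, start: int, encoding: str, pattern: str) -> bool:
    """Decode from the start of the component and match the pattern there."""
    # Decoding begins at the first byte of the representation, so any
    # initiator is part of the text the pattern sees.
    text = data[start:].decode(encoding, errors="replace")
    return re.match(pattern, text) is not None


data = b"HDR:123;rest"          # "HDR:" plays the role of an initiator
with_framing = pattern_test(data, 0, "ascii", r"HDR:\d+;")
content_only = pattern_test(data, 0, "ascii", r"\d+")
```

Here with_framing is True and content_only is False, because the match 
starts at the initiator and not at the element content.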

It is a schema definition error if alignment is not 1 and a 
dfdl:discriminator or dfdl:assert with testKind='pattern' is used.
(TBD: restrictions on lengthKind='prefixed' as well? Are there any other 
framing-based cases where assertions with testKind='pattern' are really 
incompatible?)
SMH: If alignment <> 1 is a schema definition error then leadingSkip <> 0 
should be too.  I'd leave it there though.
Also a schema definition error if encoding is not set.
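The checks under discussion could be sketched as follows. Only the 
alignment rule is in the draft text; the leadingSkip and encoding checks 
are the additions proposed in the comment above. The function name and the 
dictionary of resolved property values are hypothetical, and treating a 
missing alignment or leadingSkip entry as 1 or 0 is an assumption made for 
brevity.

```python
from typing import Dict, List


def check_pattern_compat(props: Dict[str, object],
                         has_pattern_test: bool) -> List[str]:
    """Schema definition errors for a component carrying a pattern test."""
    errors: List[str] = []
    if not has_pattern_test:
        return errors
    # Rule from the draft text.
    if props.get("alignment", 1) != 1:
        errors.append("testKind='pattern' requires alignment 1")
    # Rules proposed in the reply (not yet agreed).
    if props.get("leadingSkip", 0) != 0:
        errors.append("testKind='pattern' requires leadingSkip 0")
    if props.get("encoding") is None:
        errors.append("testKind='pattern' requires encoding to be set")
    return errors
```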

If there are multiple dfdl:assert statements with testKind='pattern' the 
order of execution among them is not specified. 
Schema authors can insert sequences to control the timing of evaluation of 
statements more precisely.

Assertions After:

dfdl:discriminator or dfdl:assert with testKind='expression' (the default) 
are executed after parsing the annotated construct.
SMH: Wording needs to cater for combined annotations.

Furthermore, an attempt to evaluate a discriminator must be made even if 
the parse of the annotated construct ended in a parse error. This is 
because a discriminator could evaluate to true thereby resolving a point 
of uncertainty even if the complete parsing of the construct ultimately 
caused a parse error. Such discriminator evaluation has access to the DFDL 
Infoset of the attempted parse as it existed immediately before detecting 
the parse failure. 
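The behavior described above might look like this in outline. It is a 
sketch only: ParseError, parse_branch, and the dictionary used as an 
infoset are hypothetical, and treating a discriminator that cannot be 
evaluated against the partial infoset as "not resolved" is an assumption, 
since the text only requires that the attempt be made.

```python
class ParseError(Exception):
    """Hypothetical stand-in for a DFDL parse error."""


def parse_branch(parse, discriminator, infoset):
    """Parse one branch at a point of uncertainty.

    Returns (parsed_ok, resolved), where resolved says whether the
    discriminator evaluated to true.
    """
    try:
        parse(infoset)          # fills the infoset as it goes
        parsed_ok = True
    except ParseError:
        parsed_ok = False       # infoset keeps what existed before the failure
    # The discriminator is attempted even when the parse failed.
    try:
        resolved = bool(discriminator(infoset))
    except KeyError:
        resolved = False        # assumption: referenced item was never parsed
    return parsed_ok, resolved


def parse_record(infoset):
    infoset["tag"] = "A"                 # parsed before the failure
    raise ParseError("body malformed")   # rest of the record fails


result = parse_branch(parse_record, lambda i: i["tag"] == "A", {})
```

In this example result is (False, True): the branch failed to parse, but 
the discriminator still saw the tag and resolved the point of uncertainty.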

Implementations are free to optimize by recognizing and executing 
discriminators or assertions earlier so long as the resulting behavior is 
consistent with what results from the above description.  

If there are multiple dfdl:assert statements with testKind='expression', 
then the order of execution among them is not specified. Schema authors 
can insert sequences to control the timing of evaluation of statements 
more precisely.

The dfdl:newVariableInstance Statement

These statements are evaluated before the parsing of the annotated 
construct. When there is more than one newVariableInstance statement the 
order of execution among them is not specified.  Schema authors can insert 
sequences to control the timing of evaluation of statements more 
precisely.

All dfdl:newVariableInstance statements are executed before any 
dfdl:setVariable statements on the same annotated construct.
SMH: Wording needs to cater for combined annotations.

The dfdl:setVariable Statement

When a dfdl:setVariable annotation is found on an element reference, 
element declaration, or simple type definition, then it is executed after 
the parsing of the element, which implies after the evaluation of 
expressions corresponding to any computed format properties. That is, if 
an expression is used to provide the value of a format property such as 
dfdl:terminator, the evaluation of that expression occurs before any 
dfdl:setVariable annotation is executed; hence, the expression providing 
the value of the format property may not reference the variable.

When a dfdl:setVariable annotation is found in the combined set of 
annotations for a sequence, choice, or group reference, then it is 
executed after any dfdl:newVariableInstance statements in that same 
combined set, but it is executed before the parsing of the sequence, 
choice, or group reference. 

If there are multiple dfdl:setVariable statements in one combined set of 
annotations, then the order of evaluation among them is not specified. 
Schema authors can insert sequences to control the timing of evaluation of 
statements more precisely.
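Putting the timing rules of this note together, a statement can be 
assigned to a "before" or "after" phase relative to parsing the annotated 
construct. The sketch below uses hypothetical names (phase, schedule) and 
plain tuples of (statement kind, testKind). The only ordering it enforces 
within a phase is the one stated here, newVariableInstance before 
setVariable; every other relative order in the output is arbitrary, since 
the text leaves it unspecified.

```python
GROUP_KINDS = {"sequence", "choice", "group reference"}


def phase(stmt_kind: str, test_kind: str, component: str) -> str:
    """Return 'before' or 'after' relative to parsing the construct."""
    if stmt_kind == "newVariableInstance":
        return "before"
    if stmt_kind == "setVariable":
        # Before parsing on groups, after parsing on elements and simple types.
        return "before" if component in GROUP_KINDS else "after"
    # dfdl:assert and dfdl:discriminator depend on testKind.
    return "before" if test_kind == "pattern" else "after"


def schedule(stmts, component):
    """Split (stmt_kind, test_kind) tuples into (before, after) lists."""
    before = [s for s in stmts if phase(s[0], s[1], component) == "before"]
    after = [s for s in stmts if phase(s[0], s[1], component) == "after"]
    # Stable sort: newVariableInstance first, everything else keeps its
    # input order (an arbitrary choice, not something the text requires).
    before.sort(key=lambda s: 0 if s[0] == "newVariableInstance" else 1)
    return before, after


stmts = [("setVariable", ""), ("assert", "expression"),
         ("newVariableInstance", ""), ("assert", "pattern")]
before, after = schedule(stmts, "sequence")
```

For the sequence above, the newVariableInstance runs first, then the 
setVariable and the pattern assert, then the parse, then the expression 
assert. On an element, the same setVariable would move to the "after" list.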

SMH: Wording needs to cater for combined annotations.

--
  dfdl-wg mailing list
  dfdl-wg at ogf.org
  https://www.ogf.org/mailman/listinfo/dfdl-wg

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU