[DFDL-WG] Fw: questions on setVariable

Steve Hanson smh at uk.ibm.com
Tue Sep 24 09:11:32 EDT 2013


For discussion on today's WG call...

Regards

Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 24/09/2013 14:09 -----

From:   Steve Hanson/UK/IBM
To:     Mike Beckerle <mbeckerle.dfdl at gmail.com>, 
Cc:     "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>
Date:   11/09/2013 15:36
Subject:        Re: [DFDL-WG] questions on setVariable


Mike, I have several email threads that track the original design 
discussion for action 186. 

You originally wrote down an algorithm for combining statement annotations 
into an ordered list, but then the next version scrapped that and proposed 
the current scheme. I think there must have been some reason for this?

Are we just talking setVariable here, or also newVariableInstance and 
assert as well?  If so would the ordering rules be the same within, and 
across, annotation points?

A couple of users have asked about short form versions of statement 
annotations. Imposing an execution order is not compatible with this. This 
was discussed in one of the action 186 emails. 

Regards

Steve Hanson



From:   Mike Beckerle <mbeckerle.dfdl at gmail.com>
To:     Steve Hanson/UK/IBM at IBMGB, 
Cc:     "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>
Date:   11/09/2013 14:48
Subject:        Re: [DFDL-WG] questions on setVariable




It is when there is more than one setVariable at one single annotation 
point that I believe they should definitely execute in lexical order. I am 
much less worried about across different annotation points that get 
resolved together which is what the Errata was clarifying. 

My preference would be this: If an element ref has 2 setVariable 
statements for variables A, B, and the element declaration has setVariable 
statements for variables X, Y, then ABXY, AXBY, AXYB, XYAB are all 
possible interleavings, but A is always before B, and X is always before Y 
so execution is always consistent with lexical order.
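The constraint described above (A before B, X before Y, otherwise free) can be enumerated mechanically. This is only an illustrative sketch: the function name `interleavings` is made up here, and nothing in it comes from the DFDL spec. Note that a full enumeration gives six orders, the four listed above plus XABY and XAYB.

```python
from itertools import combinations


def interleavings(first, second):
    """Yield every merge of two lists that keeps each list's internal order."""
    total = len(first) + len(second)
    # Choose which positions of the merged list are taken by items of `first`.
    for slots in combinations(range(total), len(first)):
        slot_set = set(slots)
        a, b = iter(first), iter(second)
        yield "".join(next(a) if i in slot_set else next(b) for i in range(total))


# A, B come from the element ref; X, Y from the element declaration.
orders = sorted(interleavings(["A", "B"], ["X", "Y"]))
print(orders)
```

Every order produced keeps A ahead of B and X ahead of Y, which is the only property the proposal relies on.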

I want to just clarify the hole we've left by not being tight enough about 
specifying the order of this evaluation of setVariables within or across 
annotation points in a resolved set.

I can build a completely non-deterministic schema:

<defineVariable name="x" defaultValue="1"/>
<defineVariable name="y" defaultValue="2"/>
<defineVariable name="z" defaultValue="0"/>

<group name="coinFlip">
  <choice>
    <sequence>
      <annotation><appinfo>
        <!-- div by zero to create a proc error -->
        <dfdl:setVariable ref="x" value="{ if ($y = 2) then 5 div $z else 3 }"/>
        <dfdl:setVariable ref="y" value="1"/>
      </appinfo></annotation>
      <element name="nonLexicalOrder" type="string" dfdl:inputValueCalc="{ '' }"/>
    </sequence>
    <element name="lexicalOrder" type="string" dfdl:inputValueCalc="{ '' }"/>
  </choice>
</group>

The above group definition illustrates how we can test what order a 
particular implementation executes these setVariable statements. 

You get <lexicalOrder/> or <nonLexicalOrder/> depending on what the 
implementation does. It could even vary from run to run for one 
implementation if the implementation is using a hash implementation that 
is using different random seeds at different times. It could even vary 
within a run, if this group is reused several times in the schema. 

Now, why would somebody write this, I don't know, but it does illustrate 
the hole in the spec. Certainly it would be better if DFDL could not 
express things like this.





Computing x reads the value of y. If the assignment to y has not yet 
happened, then we get a processing error when we divide by zero. This can 
cause backtracking, so the schema effectively dodges the SDE that is 
supposed to prevent this sort of dependence on assignment timing.
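To make the two outcomes concrete, here is a rough Python model of the coinFlip group. It is a sketch under stated assumptions, not how any DFDL processor works: the function name `parse_coin_flip` is hypothetical, variables are treated as plain integers, a Python ZeroDivisionError stands in for the processing error, and the read-before-set SDE check is left out because the processing error fires first.

```python
def parse_coin_flip(statement_order):
    """Return the element name the choice yields for a given execution order."""
    # Defaults taken from the three defineVariable lines.
    variables = {"x": 1, "y": 2, "z": 0}
    # One entry per setVariable in the first choice branch.
    statements = {
        "x": lambda v: 5 / v["z"] if v["y"] == 2 else 3,
        "y": lambda v: 1,
    }
    try:
        for name in statement_order:
            variables[name] = statements[name](variables)
    except ZeroDivisionError:
        # Processing error in the first branch: backtrack to the second branch.
        return "lexicalOrder"
    return "nonLexicalOrder"


print(parse_coin_flip(["x", "y"]))  # x evaluated first (lexical order)
print(parse_coin_flip(["y", "x"]))  # y evaluated first
```

With x first, y is still 2, the division by zero fires and the choice falls through to lexicalOrder. With y first, x is simply set to 3 and the first branch succeeds.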







This doesn't fix the problem entirely. I think our current 
'non-specification' of the precise order here even across annotation 
points was a bit of a cop-out.






Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | 
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are 
subject to the OGF Intellectual Property Policy



On Wed, Sep 11, 2013 at 6:26 AM, Steve Hanson <smh at uk.ibm.com> wrote:
"I think the execution order needs to be specific within the annotations 
at one annotation point." 

The problem is that action 186 allowed scenarios like setVariable 
annotations simultaneously on an element ref, its global element and the 
element's simple type (*). So there is no longer 'one annotation point', 
there are multiple. That's when we formalised the concept of a 'resolved 
set of annotations' for a component, and stated that for all annotations 
of the same kind (**) in that set, the order of execution is not defined. 
We also state schema authors can insert sequences to provide more precise 
control over when annotations are executed. 

Several reasons for this. Is the lexical order of elements in appinfos, or 
of multiple appinfos, guaranteed in XSDL? If not then an editor could 
serialize appinfos or their content in any order. We also wanted to 
reserve the right to add explicit timing control in the future. And it 
made implementations easier. 

IBM DFDL 1.0.3 has been out in the field with this support since 1Q 2013. 
Tim, can you say what the execution order is for a resolved set of 
setVariables please? Or can we not tell because of the model? 

(*) Must be different variables, else SDE. 
(**) Asserts and discriminators with testKind 'pattern' are a different 
'kind' from those with 'expression', for this purpose. 

Regards

Steve Hanson



From:        Suman Kalia <kalia at ca.ibm.com> 
To:        Mike Beckerle <mbeckerle.dfdl at gmail.com>, 
Cc:        "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>, "dfdl-wg-bounces at ogf.org" 
<dfdl-wg-bounces at ogf.org>, Steve Hanson/UK/IBM at IBMGB 
Date:        10/09/2013 19:23 
Subject:        Re: [DFDL-WG] questions on setVariable 



Mike - In the use case highlighted by Jonathan, where 2 annotations are 
defined at the same point, I would suggest that the execution order be 
based on the lexical order, i.e. the order in which the annotations appear. 

Suman Kalia 
IBM Canada Lab 
WMB Toolkit Architect and Development Lead 
Tel: 905-413-3923 T/L 313-3923 
Email: kalia at ca.ibm.com 

For info on Message broker 
http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.html 






From:        Mike Beckerle <mbeckerle.dfdl at gmail.com> 
To:        Steve Hanson <smh at uk.ibm.com>, 
Cc:        "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>, "dfdl-wg-bounces at ogf.org" 
<dfdl-wg-bounces at ogf.org> 
Date:        09/10/2013 02:14 PM 
Subject:        Re: [DFDL-WG] questions on setVariable 
Sent by:        dfdl-wg-bounces at ogf.org 



If the first setVariable executes first, that will read variable baz, 
which will cause the second setVariable to SDE because you cannot set 
after the variable has been read. I believe this is the right semantics 
for this schema, because both these setVariable annotations are written at 
the same annotation point.

If the second setVariable executes first, that will assign variable baz, 
which is OK, then when the first setVariable executes, it will get the 
value that was just assigned. 
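The two cases above can be modelled in a few lines of Python. This is a sketch of the reading given in this message only (a variable cannot be set once it has been read); elsewhere in the thread Steve reads the same example as giving no SDE, with 'bar' ending up as either 'biff' or 'fooey'. The names `VariableStore`, `SchemaDefinitionError` and `run` are invented for illustration.

```python
class SchemaDefinitionError(Exception):
    """Stands in for a DFDL schema definition error (SDE)."""


class VariableStore:
    def __init__(self, defaults):
        self.values = dict(defaults)
        self.already_read = set()

    def read(self, name):
        self.already_read.add(name)
        return self.values[name]

    def assign(self, name, value):
        # Assumed rule: setting a variable after it has been read is an SDE.
        if name in self.already_read:
            raise SchemaDefinitionError(f"{name} was set after being read")
        self.values[name] = value


def run(order):
    """Execute the two setVariable statements from Jonathan's schema in `order`."""
    store = VariableStore({"bar": None, "baz": "biff"})
    statements = {
        "bar": lambda: store.assign("bar", store.read("baz")),
        "baz": lambda: store.assign("baz", "fooey"),
    }
    try:
        for name in order:
            statements[name]()
    except SchemaDefinitionError:
        return "SDE"
    return store.values["bar"]


print(run(["bar", "baz"]))  # first setVariable first
print(run(["baz", "bar"]))  # second setVariable first
```

Under this model the lexical order ends in an SDE, while the reverse order leaves bar holding the value just assigned to baz.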

While the spec says you shouldn't depend on this kind of ordering stuff, I 
think it is pretty unsatisfactory if the order is unspecified. Section 9.5 
lays out the order of the different kinds of statement annotations 
relative to each other, but doesn't say what happens between two 
setVariable statements that appear at the same annotation point. I think 
the execution order needs to be specific within the annotations at one 
annotation point. 


Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | 
www.tresys.com 



On Tue, Sep 10, 2013 at 1:50 PM, Steve Hanson <smh at uk.ibm.com> wrote: 
1) I'll let Tim comment more on this but a relative expression uses as its 
context the current element on the stack. So using '.' on a simple element 
or its type resolves to the simple element, and using '.' on a complex 
element or a group child (recursively) of a complex element resolves to 
the complex element. Same deal when using the '..' parent axis, so beware 
- as this example shows: 

<xs:element name="root">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="delim" type="xs:string"/>
      <xs:element name="body">
        <xs:complexType>
          <xs:sequence dfdl:separator="{../delim}">

...and not {../../delim} !  This is all implied by the way XPath works. 
XPath only knows about elements.   
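A small Python model may help show why one '..' is enough here. It is only a sketch: the `Element` class and `resolve` function are invented for illustration and handle nothing beyond '.', '..' and child names. The key assumption, taken from the explanation above, is that the sequence inside 'body' is not an element, so an expression on it is evaluated with 'body' as its context.

```python
class Element:
    """Minimal infoset node: elements only, no sequences or groups."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        if parent is not None:
            parent.children[name] = self


def resolve(context, path):
    """Walk a relative path made of '.', '..' and child-name steps."""
    node = context
    for step in path.split("/"):
        if step == ".":
            continue
        node = node.parent if step == ".." else node.children.get(step)
        if node is None:
            return None
    return node


root = Element("root")
delim = Element("delim", root)
body = Element("body", root)

# The dfdl:separator sits on the sequence inside 'body', so the context is 'body'.
print(resolve(body, "../delim") is delim)
print(resolve(body, "../../delim"))
```

From 'body', one step up reaches 'root' and 'delim' is its child; a second step up runs off the top of the tree, so the longer path resolves to nothing.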

2) I don't see why Jonathan's example could give an SDE.  The 
defineVariables will always be executed first. However the result is 
ambiguous, depending on the order each setVariable is executed, as 'bar' 
could end up as 'biff' or 'fooey'. "Schema authors can insert sequences to 
provide more precise control over when variables are set" - quote from 
section 9.5.3 (errata 3.25). 

3) To Mike's comment, the order of execution of setVariable and 
newVariableInstance on the same annotation is strictly defined, see 
section 9.5 (errata 3.25) 

Regards

Steve Hanson



From:        Mike Beckerle <mbeckerle.dfdl at gmail.com> 
To:        "Cranford, Jonathan W." <jcranford at mitre.org>, 
Cc:        "dfdl-wg at ogf.org" <dfdl-wg at ogf.org> 
Date:        10/09/2013 18:23 
Subject:        Re: [DFDL-WG] questions on setVariable 
Sent by:        dfdl-wg-bounces at ogf.org 



Those are both public-comment-class issues. 

The issue of what does "." mean in various places is interesting. It is 
referring to an infoset node, but how to explain exactly which one it 
is....

The out-of-order issue with evaluation that you mention is definitely an 
issue. The schema you gave will work or SDE depending on the order of 
evaluation of the statements and the language is not precise enough to say 
what the order is. I think the setVariable and newVariableInstance 
statements at any one annotation point must execute in linear order, 
first to last. Statements that are combined into the resolved set of 
annotations from multiple annotation points (e.g., an element and a 
global type definition it references) can be interleaved in any order. 


Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | 
www.tresys.com 



On Tue, Sep 10, 2013 at 12:53 PM, Cranford, Jonathan W. <
jcranford at mitre.org> wrote: 
These questions may go towards public comment if the answers aren't 
straight-forward.  Section references are against latest editor's draft. 
  
·        Section 7.9 - 5th paragraph after first example - "A 
dfdl:setVariable value expression may refer to the value of this element 
using a relative path value '.'. " 
  
Which element?   
  
If dfdl:setVariable is attached to a simpleType, then the "element" 
referred to here is the element of that type, correct? 
  
But when it's attached to group reference, sequence, or choice, is it the 
parent element which contains the group reference, sequence, or choice? 
  
  
·        Section 7.9 - next-to-last paragraph:  "However, the order of 
execution among them is not specified." 
  
Wouldn't this be ambiguous, then? 
  
<xs:schema>
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:defineVariable name="bar" />
        <dfdl:defineVariable name="baz" defaultValue="biff"/>
      </xs:appinfo>
    </xs:annotation>
...
    <xs:element name="foo">
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:setVariable ref="bar" value="{$tns:baz}"/>
          <dfdl:setVariable ref="baz" value="fooey"/>
        </xs:appinfo>
      </xs:annotation>
    </xs:element>
...
</xs:schema>
  
If the order of execution isn't specified, isn't it ambiguous whether 
tns:bar gets the value of "biff" (defaultValue of tns:baz) or "fooey"? 
  
  
Respectfully, 
  
-- 
Jonathan W. Cranford 
Senior Information Systems Engineer 
The MITRE Corporation (http://www.mitre.org) 
  

--
 dfdl-wg mailing list
 dfdl-wg at ogf.org
 https://www.ogf.org/mailman/listinfo/dfdl-wg 

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU 
