[DFDL-WG] assert and discriminator - no more before/after

Mike Beckerle mbeckerle.dfdl at gmail.com
Mon Sep 17 09:58:34 EDT 2012


Thanks for this.

So, a discriminator expression needs to be placed syntactically where it
is clear that it applies to a specific point of uncertainty. Yet it must
then reference forward into structure that would normally not yet have been
created/processed at that point, for example to look at a tag field.
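
A rough sketch of that situation, in Python rather than DFDL: the
discriminator belongs to the branch root, but the tag it tests only becomes
available after part of the branch has been parsed. The names `Branch`,
`expected_tag`, `parse_branch` and `parse_choice`, and the two-character tag
layout, are all made up for illustration; this is not how any DFDL
processor is implemented.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    name: str
    expected_tag: str  # hypothetical: tag value the discriminator tests for


def parse_branch(data: str, branch: Branch):
    """Parse one branch; return its infoset dict, or None to backtrack."""
    infoset = {}
    # The discriminator is declared on the branch root, but the field it
    # tests lives inside the branch, so the tag has to be parsed first.
    infoset["tag"] = data[:2]
    if infoset["tag"] != branch.expected_tag:  # discriminator is false
        return None
    infoset["body"] = data[2:]
    return {branch.name: infoset}


def parse_choice(data: str, branches):
    """Try branches in order; the first true discriminator resolves the choice."""
    for branch in branches:
        result = parse_branch(data, branch)
        if result is not None:
            return result
    raise ValueError("no branch of the choice matched")
```

Calling `parse_choice("C2hello", [Branch("a", "C1"), Branch("b", "C2")])`
backtracks out of branch `a` and resolves the choice to branch `b`.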

I see the issue now. The problem is that all these forward references must
have their points of uncertainty resolved first; only then can we evaluate
the expression without being fooled by something that merely doesn't exist
yet.
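
To make the "fooled" case concrete, here is a minimal sketch using Python's
standard `xml.etree.ElementTree` as a stand-in for the infoset (only an
analogy to the JDOM/Saxon-B setup discussed in the quoted thread). The
element names `record` and `detail` and the function `discriminator` are
hypothetical. The same test gives a different answer depending on when it
runs:

```python
import xml.etree.ElementTree as ET


def discriminator(element: ET.Element) -> bool:
    """True when the element has no 'detail' child (hypothetical name)."""
    return element.find("detail") is None


record = ET.Element("record")

# Evaluated at the start of the element: the child is not there yet,
# so the "does not exist" test succeeds.
early = discriminator(record)

# The parser later recurses into the children and adds the sub-tree.
ET.SubElement(record, "detail").text = "x"

# Evaluated after the element is complete: the same test now fails.
late = discriminator(record)

print(early, late)  # True False
```

Nothing in the early evaluation signals that its answer was provisional,
which is why the evaluation has to wait until the referenced structure is
known to exist or known not to exist.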

...mikeb



On Mon, Sep 17, 2012 at 9:15 AM, Steve Hanson <smh at uk.ibm.com> wrote:

> Some thoughts so far...added to agenda for next WG call.
>
> I think the IBM implementation was done that way to handle descendent
> references and their implications, and to compensate for the dropping of
> the 'timing' attribute. It uses a notify mechanism to bring forward the
> point where a discriminator can be evaluated, and hence backtrack earlier.
> I believe it could instead have been implemented by evaluating at fixed
> points: evaluate straight away if there is no reference to self or a
> descendent; evaluate when the component is finished if there is a
> reference to self or a descendent; evaluate if a processing error occurs
> within the component.
>
> I have been back through spec revisions, draft documents on speculative
> parsing, emails and WG minutes, to see what we have discussed previously:
>
> - The 'timing' attribute on asserts and discriminators existed in spec
> 038. This was dropped in spec 039 when the following was added: "The
> expression is evaluated when the referenced elements are known to exist or
> known not to exist.". This was Tim's suggestion.
>
> - Later on in spec 042 the 'timing' attribute was dropped from asserts
> too, and the wording for both changed to "Any element referred to by the
> expression must have already been processed or is a descendent of this
> element. The expression must have been evaluated by the time this element
> and it descendents have been processed." plus additionally for
> discriminators "or when a processing error occurs when processing this
> element or its descendents".
>
> - The motivating use case for supporting forward references to descendents
> is as follows:  "For the most common use case (choice resolution) the most
> natural place for a DFDL author to place the discriminators is on the root
> of the branch. This may well involve a forward reference into the branch
> content.  If we disallow this, you are forcing the author to place the
> discriminator in the branch content, which might be in a separate global
> element, perhaps in another schema".  And the problem with doing the latter
> is the rule that says a discriminator resolves the nearest active point of
> uncertainty. If you place a discriminator inside a global element then it
> will get evaluated at all points of use - with potentially undesirable
> results. So you place your discriminator as close to the point of
> uncertainty as possible.
>
> - Your dead-simple suggestion leaves the above choice scenario open to the
> problem stated.
>
> Regards
>
> Steve Hanson
> Architect, Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel:+44-1962-815848
>
>
>
> From:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
> To:        Steve Hanson/UK/IBM at IBMGB
> Cc:        dfdl-wg at ogf.org
> Date:        15/09/2012 20:09
> Subject:        Re: [DFDL-WG] assert and discriminator - no more
> before/after
> ------------------------------
>
>
>
> That is a remarkably complex implementation.
>
> I am a bit concerned.... is it correct? Here's the scenario I'm worried
> about: When an expression is evaluated, the node sets produced by some
> internal sub-expressions might be empty. But there's no positive way to
> tell that this happened because some part of the infoset had 'not yet'
> been filled in, because we cannot predict the future. So an expression
> could successfully evaluate, yet produce a different value than it would
> if evaluated later.
>
> It seems to me some of this is inherent in XPath and the
> node-set-as-result model. An example of this might be a discriminator with
> an expression that evaluates to true if some subtree does NOT exist. At the
> start of the associated element, that tree doesn't exist, so you get an
> empty node set back when you examine it, and the discriminator expression
> successfully evaluates to true. Later, when recursing into children of the
> element in question, you add that sub-tree. After the element is complete,
> the answer to the discriminator expression would have been false.
>
> Anyway, in the daffodil project we're trying to reuse an XPath
> implementation (Saxon-B), and to create our infoset as trees in the JDOM
> object model.
>
> So the above strategy, even if it works, isn't available to us anyway
> because we're trying to use an expression evaluator that does call us back
> for variable access, but expects us to hand it a JDOM tree as the infoset,
> and it accesses that tree at will while evaluating the expression. We have
> no way to intercept its inquiries on the infoset/JDOM tree.
>
> Other implementations might also prefer simpler implementation techniques,
> even at the expense of efficiency.
>
> So the question becomes: are there simpler strategies for implementing the
> DFDL expressions on asserts/discriminators?
>
> I have a dead-simple suggestion that might work.
>
> 1. An assert/discrim which annotates an element behaves as if evaluated
>    after the element's value is computed. This holds whether the element
>    is of simple or complex type.
> 2. An assert/discrim which annotates a sequence or choice behaves as if
>    evaluated before any child of the sequence/choice.
>
> Seems very simple. (Almost too simple?) I'm not sure it is correct, but it
> seems workable so far to me, in that you can control before/after by
> syntactic placement. This might have some impact on the model structure,
> but there are other things in DFDL that do as well.
>
> Is there some obvious flaw I'm missing?
>
> ...mike
>
>
> On Fri, Sep 14, 2012 at 5:14 AM, Steve Hanson <smh at uk.ibm.com> wrote:
> Hi Mike
>
> I believe the rule is that an assert/discriminator expression is evaluated
> "as soon as it can be". In IBM DFDL, we try to evaluate an expression when
> it is encountered, and if it can't be evaluated at that time because it
> references element(s) that do not appear in the infoset, the parser
> continues but the expression manager registers an interest in the
> element(s). The expression manager gets notified when the element(s) appear
> in the infoset and the expression is re-evaluated at that time. If the
> parser gets to the end of the scope for the assert/discriminator and it is
> still not evaluated, it is an error. (Tim, please correct any
> misinformation.)
>
> Regards
>
> Steve Hanson
>
>
>
> From:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
> To:        dfdl-wg at ogf.org
> Date:        13/09/2012 16:32
> Subject:        [DFDL-WG] assert and discriminator - no more before/after
> Sent by:        dfdl-wg-bounces at ogf.org
> ------------------------------
>
>
>
>
>
> I am looking in the spec for guidance about the evaluation order of assert
> statements.
>
> We used to have before/after control properties, but eliminated them.
>
> If I annotate a simpleType'd element with an assert that says { . eq 'x'
> }, that of necessity references the current value, so must execute after
> the value has been computed.
>
> If on the other hand I annotate a complexType element with a discriminator
> that says { ../flag eq 'C1' } then this of necessity must execute before I
> go after the contents because the whole point is to evaluate the
> discriminator first.
>
> Did we ever articulate exactly what the rules are here about order of
> evaluation?
>
> Thanks for reminders
>
> ..mikeb
>
> --
> Mike Beckerle | OGF DFDL WG Co-Chair
> Tel:  781-330-0412
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>
>
>
> --
> Mike Beckerle | OGF DFDL WG Co-Chair
> Tel:  781-330-0412
>
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>
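
For reference, the notify mechanism Steve describes in the quoted thread
above might look roughly like the toy model below. This is only a guess at
the general shape, not the IBM implementation: the class
`ExpressionManager` and its methods `encounter`, `element_added` and
`end_of_scope` are invented names, and the infoset is reduced to a flat
dictionary.

```python
class ExpressionManager:
    """Toy model: defer an expression until the elements it names exist."""

    def __init__(self):
        self.infoset = {}   # element name -> value parsed so far
        self.pending = []   # (label, needed names, expression) still waiting
        self.results = {}   # label -> result of the evaluated expression

    def encounter(self, label, needs, expr):
        """Evaluate now if possible, otherwise register interest."""
        if all(name in self.infoset for name in needs):
            self.results[label] = expr(self.infoset)
        else:
            self.pending.append((label, needs, expr))

    def element_added(self, name, value):
        """Called by the parser; re-checks every waiting expression."""
        self.infoset[name] = value
        still_waiting = []
        for label, needs, expr in self.pending:
            if all(n in self.infoset for n in needs):
                self.results[label] = expr(self.infoset)
            else:
                still_waiting.append((label, needs, expr))
        self.pending = still_waiting

    def end_of_scope(self):
        """An expression still unevaluated at end of scope is an error."""
        if self.pending:
            labels = [label for label, _, _ in self.pending]
            raise RuntimeError(f"unevaluated at end of scope: {labels}")


mgr = ExpressionManager()
mgr.encounter("d1", ["flag"], lambda info: info["flag"] == "C1")
deferred = "d1" not in mgr.results   # 'flag' not parsed yet, so deferred
mgr.element_added("flag", "C1")      # notification triggers evaluation
mgr.end_of_scope()                   # nothing pending, so no error
```

Note that this model only waits for elements to appear. It does nothing
about an expression that evaluates successfully against a subtree that is
merely absent so far.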



-- 
Mike Beckerle | OGF DFDL WG Co-Chair
Tel:  781-330-0412