[DFDL-WG] Fw: What are the consequences of a failed assert?

Mike Beckerle mbeckerle.dfdl at gmail.com
Thu Jul 11 12:53:14 EDT 2013


DFDL's current behavior for assert with failureType="recoverableError" is
clearly specified. It says that an error evaluating the expression causes a
processing error, not a recoverable error.

Unless we want to revisit that, I think fatalError has to be consistent
with it.

The implications of this are that the fn:error function that we're
considering adding to DFDL is not adequate to issue recoverable or fatal
errors. We would probably change to dfdl:error(...) that takes a
failureType argument.

However, I think there is an alternative to this fn:error function. If we
allow the message property of dfdl:assert and dfdl:discriminator to be an
expression that returns a string, not just a string literal, then there is
no need for fn:error. This message expression would not be evaluated unless
the assert fails. (I will present this as an alternative proposal to the
fn:error proposal in a separate email thread.)
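
To make the "not evaluated unless the assert fails" point concrete, here is a minimal Java sketch. It is only a model of the idea, not DFDL or any processor's API; the class LazyMessageAssert and the methods check and describe are hypothetical names chosen for illustration.

```java
import java.util.function.Supplier;

// Hypothetical model (not a DFDL API): the assert message is an expression
// that is evaluated only when the test fails.
final class LazyMessageAssert {
    // Counts how often the message expression was actually evaluated.
    static int messageEvaluations = 0;

    /** Returns null when the test passes, otherwise the computed message. */
    static String check(boolean testResult, Supplier<String> message) {
        if (testResult) {
            return null;          // message expression is never evaluated
        }
        return message.get();     // evaluated only on failure
    }

    /** Stand-in for a message expression that inspects the data. */
    static String describe(String value) {
        messageEvaluations++;
        return "Unexpected type: " + value;
    }

    public static void main(String[] args) {
        System.out.println(check(true, () -> describe("text")));
        System.out.println(check(false, () -> describe("bogus")));
        System.out.println("message evaluations: " + messageEvaluations);
    }
}
```

Because the message is passed as a supplier, a passing assert pays no cost for building the message, and an error inside the message expression can only occur on the failure path.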

Below is a discussion of the pros/cons of assert semantics vis a vis the
evaluation of the test expression. My view is the above (consistent with
current spec for recoverable error) is the right choice, but I had to think
long and hard about it, so I think it is worth it to share the pro/con
ideas I had to weigh below.

There are two options here. Either

   - dfdl:assert is a control construct that effectively wraps a
   try-catch-like behavior around the expression evaluation, or
   - dfdl:assert is like a procedure call, which is passed a boolean
   argument after the expression is evaluated.

The latter, assert-as-procedure, has simpler semantics. Picking up the
expression and putting it into a setVariable that precedes the assert
wouldn't change the meaning. If you want the expression to be 'protected'
so that if anything goes wrong in evaluating the expression you still get
the assert failure then you can build up that behavior in a verbose way
with a <choice> construct. A <choice> is DFDL's try-catch construct in
effect. Here's an example of what that might look like:

<sequence>
    <annotation><appinfo ...>
        <newVariableInstance ref="x"/>
    </appinfo></annotation>
<choice>
    <sequence><annotation><appinfo ...>
        <setVariable ref="x">....expr ...</setVariable>
    </appinfo></annotation></sequence>

    <sequence><annotation><appinfo ...>
        <setVariable ref="x">fn:false()</setVariable>
    </appinfo></annotation></sequence>
</choice>
<dfdl:assert failureType="recoverableError" test="{ $x }"/>
</sequence>

The assert-as-control-construct semantics can create behaviors where the
user believes the assert is failing, but really they've just written an
unreliable expression which sometimes raises errors when the user expects
it not to.
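
The difference between the two options can be sketched in Java as follows. This is only an illustrative model under my own naming (AssertSemantics, Outcome, asControlConstruct and asProcedure are all hypothetical), not how any DFDL processor is implemented.

```java
import java.util.function.BooleanSupplier;

// Hypothetical model of the two candidate semantics for dfdl:assert.
final class AssertSemantics {
    enum Outcome { PASSED, ASSERT_FAILED, PROCESSING_ERROR }

    // Option 1: the assert wraps try-catch-like behavior around the
    // expression, so an error in the expression looks like an assert failure.
    static Outcome asControlConstruct(BooleanSupplier test) {
        try {
            return test.getAsBoolean() ? Outcome.PASSED : Outcome.ASSERT_FAILED;
        } catch (RuntimeException e) {
            return Outcome.ASSERT_FAILED;
        }
    }

    // Option 2: the expression is evaluated first and the assert only sees
    // the resulting boolean; an error in the expression stays a processing error.
    static Outcome asProcedure(BooleanSupplier test) {
        boolean value;
        try {
            value = test.getAsBoolean();
        } catch (RuntimeException e) {
            return Outcome.PROCESSING_ERROR;
        }
        return value ? Outcome.PASSED : Outcome.ASSERT_FAILED;
    }

    public static void main(String[] args) {
        // An "unreliable" expression: it throws instead of returning a boolean.
        BooleanSupplier unreliable = () -> Integer.parseInt("x") > 0;
        System.out.println(asControlConstruct(unreliable));
        System.out.println(asProcedure(unreliable));
    }
}
```

With the unreliable expression, the control-construct version reports an assert failure and hides the real problem, while the procedure version surfaces it as a processing error.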

(Note: This issue of unreliable expressions generally needs attention paid
to it. I have found it hard to craft reliable expressions and I have found
I must use the fn:trace function - not currently part of DFDL - to figure
them out. But I believe this is beyond the scope of discussion of the
assert behavior because it comes up equally in every other use of
expressions in DFDL. I may propose fn:trace in a separate email thread.).

In the assert-as-procedure semantics, fn:error cannot be used to issue a
recoverable or fatal error. Since it is genuinely useful for a user to be
able to issue these conditionally from inside expressions, we either add
more functions (dfdl:recoverableError, dfdl:fatalError), or an argument to
a new dfdl:error to indicate the failureType we want.

The assert-as-procedure semantics is much easier to implement.

The assert-as-control-construct is more familiar in the following sense:
Programming language assert statements tend to be control constructs, not
just functions. An example is the Java assert:

int x = z / y;   // z and y are ints
assert x != 1;

Suppose y is zero. Then the computation of the int x will cause a divide by
zero error, and that will happen regardless of whether assertion checking
is turned on or off. But if I write:

assert (z / y) != 1;

Now turning off assertion checking will suppress the divide by zero error.
However, if assertion checking is enabled, then the divide by zero error is
not converted into an assertion failure (I believe), but is an ordinary
thrown divide by zero error (an ArithmeticException) which can be caught
via catch of the right exception type.
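
The behavior described above can be checked with a small self-contained program. This is a sketch with hypothetical names (AssertDivideDemo, run, assertionsEnabled); it uses integer operands, because integer division by zero is what throws in Java.

```java
// Demonstrates what happens when the expression inside a Java assert
// divides by zero, with assertion checking on or off.
final class AssertDivideDemo {
    /** True when this class runs with assertions enabled (java -ea). */
    static boolean assertionsEnabled() {
        boolean enabled = false;
        assert enabled = true;   // the assignment only runs when assertions are on
        return enabled;
    }

    /** Reports which kind of error, if any, escapes from the assert. */
    static String run(int z, int y) {
        try {
            assert (z / y) != 1;
            return "no error";
        } catch (ArithmeticException e) {
            return "ArithmeticException";   // divide by zero inside the assert
        } catch (AssertionError e) {
            return "AssertionError";        // the asserted condition was false
        }
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + assertionsEnabled());
        System.out.println("z=4, y=0: " + run(4, 0));
        System.out.println("z=4, y=4: " + run(4, 4));
    }
}
```

Run with "java -ea AssertDivideDemo" to enable assertion checking and without -ea to disable it. With checking disabled, the expression is never evaluated, so neither error appears.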

The analogy of java asserts to DFDL asserts is imperfect because in DFDL
they are not something one can "turn off". Parsing must evaluate them.
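
For reference while reading the quoted thread below, here is a rough Java model of how the error types discussed there affect the parse, including the proposed fatal error. It is a simplification with hypothetical names (ErrorHandlingSketch, ErrorType, Action), and it assumes that a discriminator evaluating to true resolves the innermost active point of uncertainty.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of parser reactions to the error types in this thread.
final class ErrorHandlingSketch {
    enum ErrorType { SCHEMA_DEFINITION, PROCESSING, RECOVERABLE, VALIDATION, FATAL }
    enum Action { HALT, BACKTRACK, CONTINUE }

    // Nested points of uncertainty, outermost first; TRUE means still active.
    private final List<Boolean> pous = new ArrayList<>();

    void enterPou() {
        pous.add(Boolean.TRUE);
    }

    // Assumption: a discriminator that evaluates to true resolves the
    // innermost point of uncertainty that is still active.
    void discriminatorTrue() {
        int i = pous.lastIndexOf(Boolean.TRUE);
        if (i >= 0) {
            pous.set(i, Boolean.FALSE);
        }
    }

    Action onError(ErrorType type) {
        switch (type) {
            case SCHEMA_DEFINITION:
            case FATAL:        // proposed: same effect as a schema definition error
                return Action.HALT;
            case PROCESSING:   // backtrack if any point of uncertainty is active
                return pous.contains(Boolean.TRUE) ? Action.BACKTRACK : Action.HALT;
            default:           // recoverable and validation errors: report, carry on
                return Action.CONTINUE;
        }
    }

    public static void main(String[] args) {
        ErrorHandlingSketch parser = new ErrorHandlingSketch();
        parser.enterPou();
        System.out.println(parser.onError(ErrorType.PROCESSING));   // BACKTRACK
        parser.discriminatorTrue();
        System.out.println(parser.onError(ErrorType.PROCESSING));   // HALT
        System.out.println(parser.onError(ErrorType.RECOVERABLE));  // CONTINUE
    }
}
```

The second call shows the case raised in the thread: once a discriminator has resolved the only active point of uncertainty, a later processing error halts the parse instead of backtracking.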


On Thu, Jul 11, 2013 at 5:06 AM, Steve Hanson <smh at uk.ibm.com> wrote:

> Let's say I have an assert with 'fatalError'. What happens if the assert
> has a processing error when it is being evaluated? Is this still a
> processing error, or does it get converted into a fatal error?  I include
> the use of fn:error() here as well.
>
> Regards
>
> Steve Hanson
> Architect, IBM Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel:+44-1962-815848
> ----- Forwarded by Steve Hanson/UK/IBM on 11/07/2013 09:59 -----
>
> From:        "Garriss Jr., James P." <jgarriss at mitre.org>
> To:        Steve Hanson/UK/IBM at IBMGB, Tim Kimber/UK/IBM at IBMGB,
> Cc:        "Cranford, Jonathan W." <jcranford at mitre.org>, "dfdl-wg at ogf.org"
> <dfdl-wg at ogf.org>
> Date:        10/07/2013 18:06
> Subject:        RE: [DFDL-WG] What are the consequences of a failed
> assert?
> ------------------------------
>
>
>
> > What is being asked for here sounds like the ability to throw an error
> and stop the parser regardless of what points of uncertainty are in scope
>
> Exactly!
>
> > James, would this 'fatal error' give you the capability you are looking
> for?
>
> Yes, I think it would.  I would like to encourage the WG to add this
> feature if at all possible.
>
> From: Steve Hanson [mailto:smh at uk.ibm.com]
> Sent: Monday, July 08, 2013 6:11 AM
> To: Tim Kimber
> Cc: Cranford, Jonathan W.; dfdl-wg at ogf.org; Garriss Jr., James P.
> Subject: RE: [DFDL-WG] What are the consequences of a failed assert?
>
> Errata 3.4 added the failureType attribute to dfdl:assert so that an
> assert could report the equivalent of a validation error and carry on
> ('recoverable error') instead of throwing a 'processing error'. What is
> being asked for here sounds like the ability to throw an error and stop the
> parser regardless of what points of uncertainty are in scope. That does
> sound like a genuinely useful thing to be able to do, and gives dfdl:assert
> the full range of failure actions. We'd have to create a new error type,
> say 'fatal error'. It would have the same effect as a 'schema definition
> error'.
>
> James, would this 'fatal error' give you the capability you are looking
> for?
>
> Regards
>
> Steve Hanson
> Architect, IBM Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel:+44-1962-815848
>
>
>
> From:        Tim Kimber/UK/IBM
> To:        "Cranford, Jonathan W." <jcranford at mitre.org>,
> Cc:        "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>,
> "dfdl-wg-bounces at ogf.org" <dfdl-wg-bounces at ogf.org>, "Garriss Jr., James
> P." <jgarriss at mitre.org>, Steve Hanson/UK/IBM at IBMGB
> Date:        08/07/2013 09:29
> Subject:        RE: [DFDL-WG] What are the consequences of a failed
> assert?
> ------------------------------
>
>
>
> It depends on whether there was exactly one point of uncertainty active
> when the discriminator expression evaluated to 'true'. Points of
> uncertainty can be nested within one another. A POU can be resolved ( or
> made 'inactive' as I was trying to hint at ) by a discriminator that
> evaluates to 'true'. When a processing error occurs, it is caught by the
> nearest ( most deeply nested ) *active* POU. If there is no active POU then
> the parser halts and reports the error.
>
> regards,
>
> Tim Kimber, DFDL Team,
> Hursley, UK
> Internet:  kimbert at uk.ibm.com
> Tel. 01962-816742
> Internal tel. 37246742
>
>
>
>
>
> From:        "Cranford, Jonathan W." <jcranford at mitre.org>
> To:        "Garriss Jr., James P." <jgarriss at mitre.org>,
> Steve Hanson/UK/IBM at IBMGB, Tim Kimber/UK/IBM at IBMGB,
> Cc:        "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>,
> "dfdl-wg-bounces at ogf.org" <dfdl-wg-bounces at ogf.org>
> Date:        05/07/2013 20:31
> Subject:        RE: [DFDL-WG] What are the consequences of a failed
> assert?
> ------------------------------
>
>
>
>
> Wouldn’t the use of a discriminator suppress the backtracking behavior in
> the face of a processing error?
>
> That is, if a discriminator is used to confirm that the Content-Type
> header branch is the correct one, wouldn’t a subsequent assert that causes
> a processing error force the parser to halt with an error?
>
> -Jonathan Cranford
>
> From: dfdl-wg-bounces at ogf.org [mailto:dfdl-wg-bounces at ogf.org] On Behalf Of Garriss Jr., James P.
> Sent: Friday, July 05, 2013 12:09 PM
> To: Steve Hanson; Tim Kimber
> Cc: dfdl-wg at ogf.org; dfdl-wg-bounces at ogf.org
> Subject: Re: [DFDL-WG] What are the consequences of a failed assert?
>
> Very helpful, guys, as always.  Thank you.
>
> Question:  Is it possible to force the failure of an assert to be a
> validation error instead of a processing error?
>
> Here’s why:  All the email headers are modeled as an unordered list.  The
> last item on the list is a catch-all, designed to scoop up all the
> unexpected headers and hide them from the infoset (via hiddenGroupRef).
>
> So if I have a Content-Type header that is obviously invalid:
>
> Content-Type: bogus/data
>
> The assert fails as expected.  But since there is always another choice,
> the catch-all, the Content-Type header quietly disappears.  What I really
> want is for there to be a validation error for “bogus/data,” so that
> processing stops!
>
> Is that possible?  Is that reasonable?
>
> From: dfdl-wg-bounces at ogf.org [mailto:dfdl-wg-bounces at ogf.org] On Behalf Of Steve Hanson
> Sent: Thursday, July 04, 2013 4:26 AM
> To: Tim Kimber
> Cc: dfdl-wg at ogf.org; dfdl-wg-bounces at ogf.org
> Subject: Re: [DFDL-WG] What are the consequences of a failed assert?
>
> Almost.  A Recoverable Error does not cause backtracking. It was added to
> enable an assert to perform validation that checked the data stream rather
> than just the infoset. For example, checking the max physical length of a
> non-string value. So it has the same effect on the parse as a Validation
> Error. I've corrected your summary.
>
> The definition and description of the different errors is covered in the
> spec by section 2. There have been some errata in this section to clarify
> the behaviour. Some of the wording has been around since early drafts of
> the spec, so any suggestions for improvement are welcome.
>
> Regards
>
> Steve Hanson
> Architect, IBM Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel:+44-1962-815848
>
>
>
> From:        Tim Kimber/UK/IBM at IBMGB
> To:        dfdl-wg at ogf.org,
> Date:        03/07/2013 21:51
> Subject:        Re: [DFDL-WG] What are the consequences of a failed
> assert?
> Sent by:        dfdl-wg-bounces at ogf.org
> ------------------------------
>
>
>
>
>
> There are four types of error in DFDL:
> - A Schema Definition Error : The schema itself is not valid. ( at least
> three kinds: xsd not valid, xsd not in the DFDL subset, DFDL annotations
> not following the rules )
> - A Processing Error : The data cannot be parsed. Or if unparsing, the
> info set cannot be unparsed.
> - A Recoverable Error : ( see the Errata ). This is effectively a
> user-defined form of Validation Error, raised while executing a DFDL
> assert.
> - A Validation Error : The info set does not conform to the XSD.
>
> A Schema Definition Error is immediately fatal. Most of these can be
> detected by the processor before parsing/unparsing begins.
> A Processing Error will cause the parser to suppress the error and
> backtrack to the nearest point of uncertainty. So it will only stop the
> parse if there are no points of uncertainty currently active.
> A Recoverable Error does not cause backtracking - the parser continues to
> parse after reporting the error.
> A Validation Error is only reported if validation is enabled in the DFDL
> processor. It does not cause backtracking - the parser continues to parse
> after reporting the error.
>
> That's the gist of it. Further details from other WG members may follow
> shortly, depending on how accurate I have managed to be.
>
> regards,
>
> Tim Kimber, DFDL Team,
> Hursley, UK
> Internet:  kimbert at uk.ibm.com
> Tel. 01962-816742
> Internal tel. 37246742
>
>
>
>
> From:        "Garriss Jr., James P." <jgarriss at mitre.org>
> To:        "dfdl-wg at ogf.org" <dfdl-wg at ogf.org>,
> Date:        03/07/2013 19:06
> Subject:        [DFDL-WG] What are the consequences of a failed assert?
> Sent by:        dfdl-wg-bounces at ogf.org
> ------------------------------
>
>
>
>
>
> I have an element with an assert,
>
>         <xsd:element name="Type"
>                      dfdl:inputValueCalc="{ fn:lower-case(../MixedCaseType) }">
>             <xsd:annotation>
>                 <xsd:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
>                     <dfdl:assert test="{ dfdl:checkConstraints(.) }"
>                                  message="The type must match one of the values on the enumerated list."/>
>                 </xsd:appinfo>
>             </xsd:annotation>
>             <xsd:simpleType>
>                 <xsd:restriction base="xsd:string">
>                     <xsd:enumeration value="application"/>
>                     <xsd:enumeration value="multipart"/>
>                     <xsd:enumeration value="message"/>
>                     <xsd:enumeration value="text"/>
>                 </xsd:restriction>
>             </xsd:simpleType>
>         </xsd:element>
>
> and the assert is failing (as it should in this case!).
>
> Parse Error: Assertion failed. The type must match one of the values on
> the enumerated list.
>
> What are the consequences of a failed assert?  I have an old version of
> the spec—is there a place to a get a current, complete copy?—but it says
> “An unsuccessful dfdl:assert causes a processing error.”
>
> 1.     What does “processing error” mean in English?
> 2.     Does it mean the input is invalid?
> 3.     Does it mean the processor should stop here and go no further?
> 4.     Does it mean the process should simply ignore the problem and move
> on to the next item in the schema?
>
> TIA
> --
> dfdl-wg mailing list
> dfdl-wg at ogf.org
> https://www.ogf.org/mailman/listinfo/dfdl-wg
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>



-- 
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com