[DFDL-WG] simpleType cannot contain assert?

Mike Beckerle mbeckerle.dfdl at gmail.com
Fri Oct 26 08:24:13 EDT 2012


The example is pretty simple. I have strings that need asserts to verify
proper format. Nothing like working on a real format to put the small but
critical issues into focus.

I overlooked that you cannot put an assert on a global element. I read that
too quickly, but there it is.

I'm not sure what we were thinking here. Perhaps it was as simplistic as
avoiding global constructs with ".." in the paths of expressions. But this
flies in the face of creating tidy schemas that avoid repeating asserts or
big complex expressions all over the place. Asserts are among the untidiest
things in a schema because they contain regular expressions, which are
unwieldy, difficult to maintain, and badly need to be centralized in any
reasonable schema.

So I'll amend my proposal: allow asserts on global element decls, and on
simple type defs. That is, they are orthogonal in placement to whether the
site is local or global.

Here's the example:

If I can put these as asserts on a simpleType, then I can abstract over
them in a schema, using the same regex assert for many fields with
different names.

If I cannot, then I have no choice but to nest them inside a complex type,
and all these simple string fields become complex types, which is adding a
whole tier of elements to the schema.

E.g.

<simpleType name="dField" dfdl:ref="ex:dFieldDefaults">
   <restriction base="xs:string"/>
</simpleType>

then all over the schema.....

<sequence dfdl:ref="dFieldListFormat">
   <element name="Foo" type="ex:dField">
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
          <dfdl:assert testKind="pattern"

testPattern="((\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&|\*|=|_|\+|\[|\]|\{|\}|\\|"|'|;|<|>|~|\|)|\-)(((\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&|\*|=|_|\+|\[|\]|\{|\}|\\|"|'|;|<|>|~|\|)|\-)|&#x20;)*((\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&|\*|=|_|\+|\[|\]|\{|\}|\\|"|'|;|<|>|~|\|)|\-)|(\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&|\*|=|_|\+|\[|\]|\{|\}|\\|"|'|;|<|>|~|\|)"
            message="Assertion failed for data_code" />
        </xs:appinfo>
      </xs:annotation>
   </element>

   .... repeat ad nauseam for all other elements of type dField, which is
practically EVERYTHING.

This absolutely flies in the face of any good principles of abstraction. I
want that regex, and the characteristics of data that have to obey it,
captured in one place. I also really do not want to have to turn every
element like Foo, which is logically just a string, into a complex type.

What I need to be able to do is rotate that assertion over into the
definition of dField.
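
As a rough sketch of the placement being proposed (the current spec does not
allow it), the assert would move onto the simpleType and every element of
that type would pick it up. The xs: prefix is used throughout for
consistency, the testPattern is a shortened stand-in for the full regex
above, and the Bar element is a made-up second field:

```xml
<!-- PROPOSED placement, not allowed by the current spec -->
<xs:simpleType name="dField" dfdl:ref="ex:dFieldDefaults">
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
      <!-- stand-in pattern; the real regex from the Foo example goes here, once -->
      <dfdl:assert testKind="pattern"
                   testPattern="\S(.*\S)?"
                   message="Assertion failed for dField" />
    </xs:appinfo>
  </xs:annotation>
  <xs:restriction base="xs:string"/>
</xs:simpleType>

<xs:sequence dfdl:ref="dFieldListFormat">
  <!-- each field stays a plain string element and carries no assert of its own -->
  <xs:element name="Foo" type="ex:dField"/>
  <xs:element name="Bar" type="ex:dField"/>
</xs:sequence>
```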

I can't do it with a pattern facet, because that doesn't affect parsing,
and this assertion needs to guide speculative parsing.

(Note: XML Schema allows patterns on simpleType defs and so allows proper
abstraction of simple types that use pattern validation. DFDL is currently
not consistent with this way of abstracting.)
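
For comparison, this is roughly what the plain XML Schema abstraction looks
like. The type name and the shortened pattern are placeholders, and as noted
above a facet like this is used for validation only and does not steer the
parser:

```xml
<!-- pattern facet centralized on a named simple type (validation only) -->
<xs:simpleType name="dFieldValidated">
  <xs:restriction base="xs:string">
    <!-- stand-in for the full regex -->
    <xs:pattern value="\S(.*\S)?"/>
  </xs:restriction>
</xs:simpleType>

<!-- any number of elements can reuse the type without repeating the pattern -->
<xs:element name="Foo" type="ex:dFieldValidated"/>
```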

Now, currently the spec says I can't put an assert on a global element
either, so I can't fix the above issue by doing this:

<element name="dValue" type="xs:string">
....big mombo assert with regex ... NOT ALLOWED HERE EITHER
</element>

<complexType name="dField">
    <sequence>
        <element ref="ex:dValue"/>
   </sequence>
</complexType>

<sequence dfdl:ref="dFieldListDefaults">
   <element name="Foo" type="dField"/>
....

But I could push the assert down inside the complex type definition, where
it would be on a local element decl, and if I share that complex type
everywhere, then I can centralize the regex.
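
A sketch of that workaround, assuming the xs: prefix and a shortened stand-in
for the regex. The assert is legal here only because dValue is a local
element declaration, and every field gains an extra dValue child as a result:

```xml
<xs:complexType name="dField">
  <xs:sequence>
    <!-- local element decl, so the assert is allowed at this point -->
    <xs:element name="dValue" type="xs:string">
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
          <!-- stand-in pattern; the real regex would appear here, once -->
          <dfdl:assert testKind="pattern"
                       testPattern="\S(.*\S)?"
                       message="Assertion failed for dValue" />
        </xs:appinfo>
      </xs:annotation>
    </xs:element>
  </xs:sequence>
</xs:complexType>

<!-- Foo is now a complex element wrapping dValue, not a plain string -->
<xs:element name="Foo" type="ex:dField"/>
```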

What is this restriction possibly achieving? The assertion is still "trying
to be on a global definition", we've just prevented it from being in a
convenient place for the modeler.

Also, as soon as you force me to use a complex type, now I'm stuck with the
complex type restrictions on nillability, which are insufficient for my
needs, so now I can't take advantage of mapping data containing non-empty
nil indicators to xsi:nil nillability. I have to create a <element
name="noValue" ....> and get into using choices, etc.

I should note that precedent in the spec already allows asserts on a
sequence/choice inside a global group definition and on global group
references. I believe this is just a happy accident of the XML Schema
object model, i.e., the fact that the sequence/choice object isn't the same
object as the global definition object; it is contained inside the global
group def object.

The point: there is already a precedent for having to combine the lists of
assertions from both defines and references to them. I think the rule is
simple, asserts on references go after the asserts on the defines, but
otherwise it is just merging the lists onto the reference schema component
to be executed at the reference.
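
To make the precedent concrete, here is a small sketch with one assert on the
sequence of a global group definition and another on a reference to that
group. The group, the element names, and the test expressions are invented
for illustration. Under the ordering I suggest above, the definition's assert
would run first and the reference's assert second:

```xml
<!-- global group definition: the assert sits on the sequence inside it -->
<xs:group name="record">
  <xs:sequence>
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
        <dfdl:assert test="{ ./len ge 0 }"
                     message="len must not be negative" />
      </xs:appinfo>
    </xs:annotation>
    <xs:element name="len" type="xs:int"/>
    <xs:element name="body" type="xs:string"/>
  </xs:sequence>
</xs:group>

<!-- reference site: a second assert on the group reference -->
<xs:element name="rec">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="ex:record">
        <xs:annotation>
          <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
            <dfdl:assert test="{ ./len le 1024 }"
                         message="len exceeds the limit for rec" />
          </xs:appinfo>
        </xs:annotation>
      </xs:group>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```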

The concern about global objects, that is, about things having
expressions/properties that can't be resolved, isn't one that keeping
assertions off global defs/decls will help. A global def/decl can contain
a subset of DFDL properties such that it is useless in DFDL outside of some
complementary referencing context. This is inherent in DFDL, and in any
system that allows separation of concerns when describing something complex,
including XML Schema itself, which has global types, groups, etc., all of
which are useless without their referencing contexts.

I would close by noting that this change (allowing these additional
locations for asserts) is backward compatible with our current spec,
because it only adds new places these assertions are allowed.

The right fix here: DFDL 'statements' (setVariable, assert, discriminator,
even newVariableInstance) should be allowed on simpleType and on global
element decls.

Even newVariableInstance is useful there. A new variable is created, used
in expressions in asserts/discriminators/setVariable/ other
newVariableInstance, and then it goes out of scope immediately at the end
of the element. The newVariableInstance is in fact our only way of really
controlling the complexity of an expression. It lets you create a big
complicated expression out of several variables each of which is bound to a
sub-calculation, and so it is useful even when the scope is just the
asserts/discriminators/setVariables/newVariableInstance statements of a
single simple-typed element. newVariableInstance basically lets us create
local variables for use in our expression language. Normal XPath doesn't
have this.

...mike

On Fri, Oct 26, 2012 at 4:11 AM, Steve Hanson <smh at uk.ibm.com> wrote:

> Mike, it's not an oversight, I'm sure it is for the same reason that you
> can't put an assert on a global element. I think the rationale is that
> asserts (and discriminators) are only allowed at annotation points where
> all properties can be resolved.
>
> Please can you show an example of what you want to achieve?
>
> Regards
>
> Steve Hanson
> Architect, Data Format Description Language (DFDL)
> Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
> IBM SWG, Hursley, UK
> smh at uk.ibm.com
> tel:+44-1962-815848
>
>
>
> From:        Mike Beckerle <mbeckerle.dfdl at gmail.com>
> To:        dfdl-wg at ogf.org,
> Date:        25/10/2012 17:47
> Subject:        [DFDL-WG] simpleType cannot contain assert?
> Sent by:        dfdl-wg-bounces at ogf.org
> ------------------------------
>
>
>
> Is this just an oversight?
>
> I find it very tedious to model if I can't put the asserts on the
> simpleTypes so that I don't have to repeat them over and over in the model.
>
> The composition rule here is simple. If an element has asserts, and its
> simple type has asserts, both are executed, with the element's asserts run
> after the simpleType's asserts.
>
> ...mike
>
> --
> Mike Beckerle | OGF DFDL WG Co-Chair
> Tel:  781-330-0412



-- 
Mike Beckerle | OGF DFDL WG Co-Chair
Tel:  781-330-0412