[DFDL-WG] simpleType cannot contain assert?

Steve Hanson smh at uk.ibm.com
Mon Oct 29 09:06:18 EDT 2012


Suman

The XML Schema pattern facet does not always achieve the same objective. In 
DFDL it is allowed only on xs:string, because it matches against the 
lexical value of the infoset value. An assert with testKind 'pattern' 
applies directly to the physical data and is allowed for all types. But 
that's not really the point - Mike wants to keep the assert with the type 
from an encapsulation perspective.

The relative path issue just means that the path can't be validated when 
the simple type is validated. But that's true about any property on a 
simple type that can take an expression. IBM DFDL avoids this by 
validating simple types at their point of use.
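
As a rough illustration (all names here are hypothetical, and this is only 
a sketch), a global simple type can already carry an expression-valued 
property whose relative path only means something at the point of use:

```xml
<!-- Hypothetical type: the path ../len cannot be checked until the type is
     used by an element that actually has a sibling named len. -->
<xs:simpleType name="countedString"
               dfdl:lengthKind="explicit"
               dfdl:length="{ ../len }">
  <xs:restriction base="xs:string"/>
</xs:simpleType>

<xs:sequence>
  <xs:element name="len" type="xs:int"/>
  <!-- The expression on countedString resolves here, relative to payload. -->
  <xs:element name="payload" type="ex:countedString"/>
</xs:sequence>
```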

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848



From:   Suman Kalia <kalia at ca.ibm.com>
To:     Mike Beckerle <mbeckerle.dfdl at gmail.com>, 
Cc:     dfdl-wg at ogf.org, dfdl-wg-bounces at ogf.org, Steve 
Hanson/UK/IBM at IBMGB
Date:   29/10/2012 12:59
Subject:        Re: [DFDL-WG] simpleType cannot contain assert?



Mike - I see your point about asserts being specified on simple types; 
this is like a glorified pattern facet. The question I have is: "Can the 
XML Schema pattern facet on a simple type achieve the same objective?" If 
you start allowing relative paths in asserts on simple types, it becomes a 
lot more complex: from a validation perspective you cannot validate the 
global simple type on its own, and relative paths would also restrict 
reuse, as the type could only be used where the preceding elements in the 
structure are identical.


Suman Kalia 
IBM Canada Lab 
WMB Toolkit Architect and Development Lead 
Tel: 905-413-3923 T/L 313-3923 
Email: kalia at ca.ibm.com 

For info on Message broker 
http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.html 






From:        Mike Beckerle <mbeckerle.dfdl at gmail.com> 
To:        Steve Hanson <smh at uk.ibm.com>, 
Cc:        dfdl-wg at ogf.org, dfdl-wg-bounces at ogf.org 
Date:        10/28/2012 11:29 PM 
Subject:        Re: [DFDL-WG] simpleType cannot contain assert? 
Sent by:        dfdl-wg-bounces at ogf.org 



The example is pretty simple. I have strings that need asserts to verify 
proper format. Nothing like working on a real format to put the small but 
critical issues into focus.

I overlooked that you cannot put an assert on a global element. I read 
that too quickly, but there it is.

I'm not sure what we were thinking here. Perhaps it was as simplistic as 
avoiding global constructs with ".." in the paths of expressions. But this 
just flies in the face of creating tidy schemas that avoid repetition of 
asserts or big complex expressions all over the place. And asserts are one 
of the very untidy things because they contain regular expressions, which 
are very unwieldy and difficult to maintain and badly need to be 
centralized in any reasonable schema.

So I'll amend my proposal: allow asserts on global element decls, and on 
simple type defs. That is, they are orthogonal in placement to whether the 
site is local or global.

Here's the example:

If I can put these as asserts on a simpleType, then I can abstract over 
them in a schema, using the same regex assert for many fields with 
different names.

If I cannot, then I have no choice but to nest them inside a complex type, 
and all these simple string fields become complex types, which is adding a 
whole tier of elements to the schema. 

E.g.

<simpleType name="dField" dfdl:ref="ex:dFieldDefaults">
   <restriction base="xs:string"/>
</simpleType>

then all over the schema.....

<sequence dfdl:ref="dFieldListFormat">
   <element name="Foo" type="ex:dField">
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
          <dfdl:assert testKind="pattern"
 
testPattern="((\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&|\*|=|_|\+|\[|\]|\{|\}|\\|"|'|;|<|>|~|\|)|\-)(((\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&|\*|=|_|\+|\[|\]|\{|\}|\\|"|'|;|<|>|~|\|)|\-)|&#x20;)*((\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&|\*|=|_|\+|\[|\]|\{|\}|\\|"|'|;|<|>|~|\|)|\-)|(\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&|\*|=|_|\+|\[|\]|\{|\}|\\|"|'|;|<|>|~|\|)"
            message="Assertion failed for data_code" />
        </xs:appinfo>
      </xs:annotation>
   </element>

   .... repeat ad nauseam for all other elements of type dField, which is 
EVERYTHING practically. 

This absolutely flies in the face of any good principles of abstraction. I 
want that regex, and the characteristics of data that have to obey it, 
captured in one place. I also really do not want to have to turn every 
element like Foo, which is logically just a string, into a complex type.

What I need to be able to do is rotate that assertion over into the 
definition of dField.
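
A minimal sketch of the proposed form, assuming asserts were allowed on 
simpleType (the thread establishes that the current spec does not allow 
this). The short testPattern is a hypothetical stand-in for the full regex 
above, and Bar is a made-up second field:

```xml
<!-- Proposed placement, not legal under the spec as discussed here:
     the assert lives on the type, so the regex is written once. -->
<xs:simpleType name="dField" dfdl:ref="ex:dFieldDefaults">
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
      <!-- Hypothetical stand-in pattern; the real regex would go here. -->
      <dfdl:assert testKind="pattern"
                   testPattern="\S(.*\S)?"
                   message="Assertion failed for dField"/>
    </xs:appinfo>
  </xs:annotation>
  <xs:restriction base="xs:string"/>
</xs:simpleType>

<!-- Each use site shrinks to a plain simple-typed element. -->
<xs:element name="Foo" type="ex:dField"/>
<xs:element name="Bar" type="ex:dField"/>
```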

I can't do it with a pattern facet, because that doesn't affect parsing, 
and this assertion needs to guide speculative parsing. 

(Note: XML Schema allows patterns on simpleType defs and so allows proper 
abstraction of simple types that use pattern validation. DFDL is currently 
not consistent with this way of abstracting.)

Now, currently the spec says I can't put an assert on a global element 
either, so I can't fix the above issue by doing this:

<element name="dValue" type="xs:string">
....big mombo assert with regex ... NOT ALLOWED HERE EITHER
</element>

<complexType name="dField">
    <sequence>
        <element ref="ex:dValue"/>
   </sequence>
</complexType>

<sequence dfdl:ref="dFieldListDefaults">
   <element name="Foo" type="dField"/>
....

But I could push the assert down inside the complex type definition, where 
it would be on a local element decl, and if I share that complex type 
everywhere, then I can centralize the regex. 

What is this restriction possibly achieving? The assertion is still 
"trying to be on a global definition", we've just prevented it from being 
in a convenient place for the modeler. 

Also, as soon as you force me to use a complex type, now I'm stuck with 
the complex type restrictions on nillability, which are insufficient for 
my needs, so now I can't take advantage of mapping data containing 
non-empty nil indicators to xsi:nil nillability. I have to create a 
<element name="noValue" ....> and get into using choices, etc.

I should note that precedent in the spec already allows asserts on a 
sequence/choice inside a global group definition and on global group 
references. I believe this is just a happy accident of the XML Schema 
object model, i.e., the fact that the sequence/choice object isn't the 
same object as the global definition object; it is contained inside the 
global group def object.

The point: there is already a precedent for having to combine the lists of 
assertions from both defines and references to them. I think the rule is 
simple, asserts on references go after the asserts on the defines, but 
otherwise it is just merging the lists onto the reference schema component 
to be executed at the reference.
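
To illustrate that precedent, here is a sketch with hypothetical names 
(range, lo, hi), assuming unqualified local element names in the paths. 
One assert sits on the sequence inside the global group definition and 
another on the reference to it:

```xml
<xs:group name="range">
  <xs:sequence>
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
        <!-- Assert on the definition side. -->
        <dfdl:assert test="{ ./lo le ./hi }"
                     message="lo must not exceed hi"/>
      </xs:appinfo>
    </xs:annotation>
    <xs:element name="lo" type="xs:int"/>
    <xs:element name="hi" type="xs:int"/>
  </xs:sequence>
</xs:group>

<xs:complexType name="boundedRange">
  <xs:sequence>
    <xs:group ref="ex:range">
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
          <!-- Assert on the reference side; under the ordering suggested
               above it would run after the definition's assert. -->
          <dfdl:assert test="{ ./hi lt 1000 }"
                       message="hi out of range for this use"/>
        </xs:appinfo>
      </xs:annotation>
    </xs:group>
  </xs:sequence>
</xs:complexType>
```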

The concern about global objects, that is, about things having 
expressions/properties that can't be resolved, isn't one that keeping 
assertions off global defs/decls will help. A global def/decl can contain 
a subset of DFDL properties such that it is useless in DFDL outside of 
some complementary referencing context. This is inherent in DFDL, and in 
any system that allows separation of concerns when describing something 
complex, including XML Schema itself, which has global types, groups, 
etc., all of which are useless without their referencing contexts. 

I would close by noting that this change (allowing these additional 
locations for asserts) is backward compatible with our current spec, 
because it only adds new places these assertions are allowed.

The right fix here: DFDL 'statements' (setVariable, assert, discriminator, 
even newVariableInstance) should be allowed on simpleType and on global 
element decls. 

Even newVariableInstance is useful there. A new variable is created, used 
in expressions in asserts/discriminators/setVariable/ other 
newVariableInstance, and then it goes out of scope immediately at the end 
of the element. The newVariableInstance is in fact our only way of really 
controlling the complexity of an expression. It lets you create a big 
complicated expression out of several variables each of which is bound to 
a sub-calculation, and so it is useful even when the scope is just the 
asserts/discriminators/setVariables/newVariableInstance statements of a 
single simple-typed element. newVariableInstance basically lets us create 
local variables for use in our expression language. Normal XPath doesn't 
have this.
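
A sketch of that idea under the proposed placement (not legal under the 
spec as discussed here). The variable maxLen, the type boundedField, and 
the sibling element limit are all hypothetical:

```xml
<!-- Top-level declaration of the variable. -->
<xs:annotation>
  <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
    <dfdl:defineVariable name="maxLen" type="xs:int"/>
  </xs:appinfo>
</xs:annotation>

<!-- Proposed: a new instance scoped to one simple-typed element. -->
<xs:simpleType name="boundedField">
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
      <!-- Bind a sub-calculation to a local name... -->
      <dfdl:newVariableInstance ref="ex:maxLen"
                                defaultValue="{ ../limit }"/>
      <!-- ...then use it in the assert; the instance would go out of
           scope at the end of the element. -->
      <dfdl:assert test="{ fn:string-length(.) le $ex:maxLen }"
                   message="Value longer than limit"/>
    </xs:appinfo>
  </xs:annotation>
  <xs:restriction base="xs:string"/>
</xs:simpleType>
```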

...mike

On Fri, Oct 26, 2012 at 4:11 AM, Steve Hanson <smh at uk.ibm.com> wrote: 
Mike, it's not an oversight, I'm sure it is for the same reason that you 
can't put an assert on a global element. I think the rationale is that 
asserts (and discriminators) are only allowed at annotation points where 
all properties can be resolved. 

Please can you show an example of what you want to achieve? 

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848 



From:        Mike Beckerle <mbeckerle.dfdl at gmail.com> 
To:        dfdl-wg at ogf.org, 
Date:        25/10/2012 17:47 
Subject:        [DFDL-WG] simpleType cannot contain assert? 
Sent by:        dfdl-wg-bounces at ogf.org 




Is this just an oversight? 

I find it very tedious to model if I can't put the asserts on the 
simpleTypes so that I don't have to repeat them over and over in the 
model. 

The composition rule here is simple. If an element has asserts, and its 
simple type has asserts, both are executed, with the element's asserts run 
after the simpleType's asserts.

...mike

-- 
Mike Beckerle | OGF DFDL WG Co-Chair 
Tel:  781-330-0412 
--
 dfdl-wg mailing list
 dfdl-wg at ogf.org
 https://www.ogf.org/mailman/listinfo/dfdl-wg 

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU 



