[Nml-wg] XML syntax for NML relations

Tue Aug 30 12:00:06 CDT 2011

All;

See inline for a follow-up to an earlier promise I made:

On 8/26/11 9:46 AM, thus spake Jason Zurawski:
> Hi Jeroen/All;
>
> On 8/25/11 10:42 AM, thus spake Jeroen van der Ham:
>> Hi all,
>>
>> I discussed this with Freek yesterday, and I think the main issue
>> here is that there are different positions regarding validation and
>> parsing of XML files.
>>
>> Jason has the position of a programmer using some XML library to
>> parse XML files, create objects and general data out of it. The idea
>> is that you take an XML file handed to the program, process it and
>> make the best of it. There is no explicit validation. This seems
>> reasonably similar to how browsers process HTML files for example.
>>
>> Freek on the other hand was thinking of how XML is handled in the
>> SOAP/WSDL/Webservices world. There you have strict typing, explicit
>> validation, code generation, et cetera. Everything has to adhere to
>> reasonably strict schemas, otherwise most WS stacks refuse to work.
>>
>> While I understand that most of the NML files for monitoring will be
>> processed by PerfSONAR and similar tools, I would prefer that the
>> eventual schema would also be useful for WSDL style operations. I
>> know that the datatypes used there are reasonably strict. Does anyone
>> know whether the current proposed XML schema is compatible with that
>> context as well?
>
> To some extent.  Certain fields can be typed; for others it won't make
> sense.  The major objection (at least in prior mails) appears to be
> 'relation' being a generic chunk of XML that is hard to decode simply by
> syntactic validation.  The only way you can dereference this is via a
> self-enumerated list of possible 'types'.
>
> The type string would dictate what/how many of specific elements could
> be in there.  I can imagine a situation where you can get _limited_
> syntactic checking, but the tradeoff is that you would need to
> pre-define lots of these beforehand.  Let me self-assign an action to
> send a schema that describes this in some way.  I am not sure it will
> calm the entire discussion, but it will be a good exercise.

See the following minor chunk of (unverified/unchecked) schema:

> namespace nml = "http://ogf.org/schema/nml/base/20110830/"
>
> NMLRelation = element nml:relation {
>     (
>         attribute type { "specifictype1" } &
>         NMLLink &
>         NMLLink
>     ) |
>     (
>         attribute type { "specifictype2" } &
>         NMLLink &
>         NMLPort
>     ) |
>     (
>         attribute type { xsd:string } &
>         # content
>     )
> }
>
> NMLLink = element nml:link {
>     # content
> }
>
> NMLPort = element nml:port {
>     # content
> }
>

The basic idea is that I have defined 2 'well known' relationships, but 
I have this 'anything else' sort of relationship to accept things I 
*don't* know about.  See the following minor chunk of XML:

> <!-- what the schema had in mind ... -->
> <nml:relation type="specifictype1" xmlns:nml="http://ogf.org/schema/nml/base/20110830/">
>   <nml:link />
>   <nml:link />
> </nml:relation>
>
> <!-- also what the schema had in mind ... -->
> <nml:relation type="specifictype2" xmlns:nml="http://ogf.org/schema/nml/base/20110830/">
>   <nml:link />
>   <nml:port />
> </nml:relation>
>
> <!-- not what the schema had in mind, but works ... -->
> <nml:relation type="specifictype1" xmlns:nml="http://ogf.org/schema/nml/base/20110830/">
>   <nml:link />
>   <nml:port />
> </nml:relation>
>
> <!-- something the schema shouldn't care about, works -->
> <nml:relation type="garbage" xmlns:nml="http://ogf.org/schema/nml/base/20110830/">
>   <nml:link />
>   <nml:link />
>   <nml:link />
>   <nml:link />
> </nml:relation>

#1 and #2 give the strict checking that appears to be desired.  But note
that #3 is not what we want (it still 'works' due to the backdoor of the
'anything' check), and #4 is exercising our extension mechanism.  With
one other change we can replace the catch-all branch of the schema above
with this:

>     (
>         attribute type { "unclassified" } &
>         # content
>     )

This forces anything we don't know about to use a specific string
in the type field.  Re-sending the XML through a verification step
would mean that now #3 gets kicked out (as it should), and we would need
to modify #4 to do the following:

> <!-- modified to pass the schema check ... we can do anything we want inside ... -->
> <nml:relation type="unclassified" xmlns:nml="http://ogf.org/schema/nml/base/20110830/">
>   <nml:link />
>   <nml:link />
>   <nml:link />
>   <nml:link />
> </nml:relation>

Personally I think this is a lot of work just to get semantic
verification at the schema validation level; as a schema designer I
would not want to do this.  Semantic validation is still better handled
at the service level in my opinion, but I won't re-hash that argument
now.
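
To make the service-level alternative concrete, here is a minimal Python
sketch (unverified against any real NML tooling, and not taken from
perfSONAR).  The table KNOWN_RELATIONS, the helper q(), the function
check_relation(), and the choice to let unknown types through as
extensions are all assumptions made for illustration; the type strings
and the namespace are the placeholders from the schema above.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Placeholder namespace taken from the schema sketch in this mail.
NML_NS = "http://ogf.org/schema/nml/base/20110830/"


def q(name):
    """Return the namespace-qualified tag ElementTree uses for an nml element."""
    return "{%s}%s" % (NML_NS, name)


# Hypothetical table: relation type -> required child elements and counts.
KNOWN_RELATIONS = {
    "specifictype1": Counter({q("link"): 2}),
    "specifictype2": Counter({q("link"): 1, q("port"): 1}),
}


def check_relation(xml_text):
    """Check one nml:relation document; return (ok, message)."""
    root = ET.fromstring(xml_text)
    if root.tag != q("relation"):
        return False, "root element is not nml:relation"
    rel_type = root.get("type")
    if rel_type is None:
        return False, "missing 'type' attribute"
    expected = KNOWN_RELATIONS.get(rel_type)
    if expected is None:
        # Assumption: unknown types are extensions and pass unchecked.
        return True, "unknown type %r accepted as an extension" % rel_type
    # Count the direct children by qualified tag and compare with the table.
    actual = Counter(child.tag for child in root)
    if actual != expected:
        return False, "children do not match type %r" % rel_type
    return True, "ok"
```

Run against the four examples above, a check like this would accept #1,
#2 and #4 and reject #3, without needing the "unclassified" string in
the schema.  The cost is that the table lives in the processing code
rather than in the schema.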

Thanks;

-jason

> Regarding WSDL, recall that there are different 'styles' of
> communication in the web services world.  Here is a good intro:
>
> http://www.ibm.com/developerworks/webservices/library/ws-whichwsdl/
>
> perfSONAR/NMC/NM are 'document literal', i.e. basically a complete XML
> document that contains meaning that we will decode on our own (in the
> processing code).  This makes it very hard to strongly type things to
> the same level that the RPC varieties do.  Typically the RPC varieties
> lend themselves to automatic stub generation and the like.
>
> Thanks;
>
> -jason