[Nml-wg] XML syntax for NML relations

Tue Aug 16 10:35:17 CDT 2011

Hi Freek;

I really believe we need to take a step back here.  A lot of emails were 
exchanged this morning, and we are talking past each other on many 
of these key issues.  If there is a fundamental misunderstanding on 
either end, I would suggest a call instead of continuing via email. 
This is a very long and involved email, and if we are not on the same 
page I fear this is really not a constructive exercise.

That being said, I will try to answer your further concerns below:

On 8/16/11 10:19 AM, thus spake Freek Dijkstra:
> Jason Zurawski wrote:
>
>>>       <nml:link id="urn:ogf:network:example.net:link_A-to-C">
>>>         <nmlserialcompound:relation>
>>>            ...
>>>         </nmlserialcompound:relation>
>>>       </nml:link>
>>
>> I would prefer there to be just 'nml:relation' because it is still not
>> clear to me what you intend to put in the 'nmlserialcompound'
>> relationship that is special enough to be in a different namespace.
>
> I meant it as a generic example of an extension to the base relations.
> It could be just as well:
>    <nmlversion2:some_future_relation_we_havent_thought_of_yet>
>
> or in your preferred syntax:
>    <nml:relation type="some_future_relation_we_havent_thought_of_yet">
>
> I meant nothing special by choosing this namespace in the example.

If there is a 'new' item that needs to be modeled in the base, then you 
make a new version of the schema - this is perfectly normal.  Extension 
to different namespaces implies something different.  This usually means 
that the base is insufficient for some reason, and the new namespace 
will include special elements that are too specific for the base or 
model a behavior of a service that will be using the dialect.

>> In any event, what I think doesn't matter in this case, because your
>> sub-namespace (nmlserialcompound) would derive from the base namespace
>> (nml).  Which is the entire point of doing things in this manner.  Even
>> though its a special namespace, it can be reduced to the base which
>> should make services happy.
>
> What do you mean with "special" namespace?
> I'm currently not making any assumptions about the namespace.
> (I only make the assumption that there is a schema that defines the
> element nmlserialcompound:relation as a subclass of nml:relation)

To review XML for one second:

   <prefix:element attribute="something"
                   xmlns:prefix="http://something.net" />

  - prefix = shorthand notation for a namespace specified in the 'xmlns:' 
definition

  - element = xml element.  When 'prefix' precedes the element name, the 
element is assumed to live in (and take its definition from) that 
namespace.  Lack of a prefix implies the element is in the default 
namespace for a given instance document

  - attribute = attribute of the xml element.  An unprefixed attribute is 
not in any namespace itself; it is interpreted in the context of its 
element.

  - schema = formal syntactic definition of the above
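
To make the prefix/namespace point concrete, here is a minimal sketch 
using Python's standard xml.etree.ElementTree.  The namespace URIs are 
made up for illustration and are not real NML identifiers.  It shows 
that the prefix is only shorthand: two different prefixes bound to the 
same URI name the same element, while a different URI names a different 
one.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, chosen only for this illustration.
DOC = """
<root xmlns:a="http://example.org/nml/base"
      xmlns:b="http://example.org/nml/base"
      xmlns:ext="http://example.org/nml/ext">
  <a:relation type="serialcompound"/>
  <b:relation/>
  <ext:relation/>
</root>
"""

root = ET.fromstring(DOC)

# ElementTree expands every prefix into "{namespace-uri}localname",
# so the prefix text itself never reaches the application.
tags = [child.tag for child in root]
for tag in tags:
    print(tag)

# The unprefixed attribute keeps a plain, unqualified key.
print(root[0].attrib)
```

An application therefore compares namespace URIs, never prefixes, when 
deciding whether it knows an element.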

Each time you start your examples with 'nmlserialcompound:relation', 
this to me means you are proposing a new namespace that will include 
elements that go above and beyond what is included in 'nml:relation'. 
If you do not intend to do this, then you should use different notation 
to be clear about what you are trying to do.

In our experience, schema definitions are produced to describe the 
denizens of a specific namespace.  It is possible to use multiple 
namespaces in a single schema, but so far your examples are born out of 
instance documents, not actual schema.  I apologize if I am taking your 
examples the wrong way, but you seem to be confusing several different 
concepts at once.

>>> How should the parser know that nmlserialcompound:relation inherits from
>>> the base nml:relation? I can think of two things:
>>>
>>> - Because the parser has knowledge of the schema definition
>>
>> Yes, its written in the schema implicitly.  If the parser for some
>> service was loaded with the 'nmlserialcompound' version of the schema,
>> that schema has to have in it somewhere that
>> 'nmlserialcompound:relation' is really just an 'nml:relation' that has
>> been extended.  This service is intelligent enough to parse both 'nml'
>> and 'nmlserialcompound' versions.
>
> What I'm talking about is a service which does understand the generic
> nml:relation concept, and some nml:relation subclasses, but not all
> nml:relation subclasses.
>
> I'm thinking about some future extension. Let's say we create a NML
> version 2 in 5 years time. How should a parser which knows about NML 1
> handle a NML version 2 message? Fail completely, or is there a way to
> make NML extensible from the start. E.g. by allowing someone to create a
> new nml:relation subclass, without completely rewriting the NML base.

You are still confusing two very different concepts: versioning and 
extension are not the same thing and really should not be treated as 
such.  I would not expect there to be explicit backwards compatibility 
in either method (new schema vs extension into a new version); backwards 
compatibility at the semantic level can easily be done, but it is very 
hard to attempt this with syntactic constructs.  Consider these poorly 
drawn pictures in tree form:

  v1
     -> sub namespace 1
                        -> sub sub namespace 1
     -> sub namespace 2
     -> sub namespace 3
     -> v2
           -> sub namespace 4
           ...

Note that adding in v2 as an extension of the first separates all of the 
sub-namespaces that are already in use.  I would claim the following is 
cleaner, but it still doesn't completely offer backward compatibility:

  v1
     -> sub namespace 1
                        -> sub sub namespace 1
     -> sub namespace 2
     -> sub namespace 3

  v2
     -> sub namespace 4

Authors of the old subnamespaces would have to adapt and re-package in 
either case.  As an example, NM v1 is not compatible at all with NM v2. 
To further that example, NM v2 has been in place for nearly 7 years.  Do 
you have a large concern that NML will be creating new versions quite 
frequently?  I really don't believe this will be the case considering 
how long it has taken to get the first version 'right'.

> Thus: would it be possible to create a new schema that extends NML base,
> and let old parsers (who are not aware of this new schema) know that the
> message contains some additional relations based on this new extension?
> It is my hope that a parser would be able to read the NML it does know,
> and ignore the new schema additions.

I am very confused as to why you want this behavior.  Rarely will it be 
the case that an entire service can simply be given a new schema and 
work flawlessly.  The introduction of new semantic concepts via the 
syntax will force underlying logical changes in old services.  It is 
unrealistic to assume that all old services will continue to function as 
they have before.  They may still 'work', in that they can downcast to 
the last known schema they happen to know about, but I would not expect 
a service author to implement this sort of behavior anyway.  It's an 
additional layer of 'permissiveness' that does not get you much 
functionality.

>> If you don't know the schema, you fail.  This is how it has to be.
>
> You mean in the above scenario the parser should fail completely? Or
> only ignore the unknown extensions?

If you are relying solely on syntactic validation, which is what you 
seem to be proposing, yes.  If the instance doesn't match the schema, it 
is rejected at the parser level immediately.

This is why in our experience we avoid syntactic validation and rely on 
semantic rules (outside of the parser, and inside of the application) 
instead.  It is much more flexible to make a semantic decision which can 
allow for a richer set of rules than simply relying on a 'yes or no' 
from the parser.

>
>>> - Because the parser assumes that all elements named "relation" are
>>> subclasses of nml:relation.
>>
>> I dont think you can make this assumption.
>
> I'm pleased to hear that -- it was not clear to me from Roman's email if
> he made that assumption or not, so I was polling for that.
>
>
>
>>     <nmlserialcompound:relation  freeks_special_attribute="true" />
>>
>> If this was passed to a service that doesn't understand
>> 'nmlserialcompound', the service could simply reject the entire element.
>
> That sounds good to me.
>
> It should reject the element, not the entire message. I think we agree
> on that (good!).

Again, it depends on the nature of your validation.  Parser syntactical 
validation will toss the entire instance.  Semantic validation can be 
constructed to skip unknown things.
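
As a rough sketch of the difference (Python standard library only; the 
namespace URIs and the function names strict/lenient are my own 
assumptions, not part of any NML or perfSONAR code), compare a check 
that rejects the whole instance with an application-level rule that 
keeps what it understands and skips the rest:

```python
import xml.etree.ElementTree as ET

NML = "http://example.org/nml/base"   # hypothetical URI for illustration
KNOWN = {f"{{{NML}}}relation"}        # elements this service understands

MESSAGE = """
<nml:link xmlns:nml="http://example.org/nml/base"
          xmlns:ext="http://example.org/nml/ext"
          id="urn:ogf:network:example.net:link_A-to-C">
  <nml:relation type="serialcompound"/>
  <ext:relation freeks_special_attribute="true"/>
</nml:link>
"""

def strict(link):
    """Mimic a yes/no schema check: one unknown child rejects everything."""
    for child in link:
        if child.tag not in KNOWN:
            raise ValueError(f"unknown element {child.tag}")
    return list(link)

def lenient(link):
    """Application-level rule: keep what is understood, report the rest."""
    kept = [c for c in link if c.tag in KNOWN]
    skipped = [c.tag for c in link if c.tag not in KNOWN]
    return kept, skipped

link = ET.fromstring(MESSAGE)
kept, skipped = lenient(link)

try:
    strict(link)
    strict_ok = True
except ValueError:
    strict_ok = False

print(len(kept), skipped, strict_ok)
```

The lenient path is where a service can apply richer rules, for 
example logging the skipped elements or refusing only when a skipped 
element was marked as mandatory by some convention of its own.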

>>    It also could accept the foreign namespace, and attempt to parse the
>> element with the only version of relation it happens to know about
>> (nml).  This would mean that the foreign attribute would most likely be
>> ignored, but it still allows the message to be parsed.
>
> I think we're on the right track here.
>
> I like the above idea, that a service may attempt to parse the element
> using some basic knowledge about the base. However, I don't like the
> idea of a parser that comes across any unknown element and simply tries
> to parse it using a base element, _without knowing if this unknown
> element is actually a subclass of this base element_. So I want to
> include some clue in the XML telling the parser just that ("hey, here is
> a nmlserialcompound:relation, and if you don't know what that is, simply
> treat it as a generic nml:relation")
>
> The original syntax:
>    <nml:relation type="serialcompound">
>
> did this perfectly.
>
> However, as I tried to explain, this particular syntax has a drawback if
> you want to do meaningful syntax validation.
>
> Hence my proposal.

And I will re-iterate that you will never be able to have all of the 
"meaningful syntax validation" you seem to want, and get full 
extensibility.  It's a trade-off that needs to be made - either extremely 
expressive syntax that can be fully validated by the parser (only), or 
the ability to extend to additional use cases by using general concepts.

perfSONAR/NMC has gone with the latter, and I believe this has been very 
successful in allowing extensibility to other use cases beyond the base 
schema.  You have pointed out that you dislike that a human can encode 
something wrong and things 'fail'.  I stand by my statement from Salt 
Lake City that this is a necessary evil - if you encode things wrong, it 
will fail.

I prefer <nml:relation type="serialcompound">, and based on the results 
of the meeting the audience seemed to agree that this was all that was 
needed.  I find it strange that we must keep going around on this, 
because the minimal method gives the system and the human everything 
they need to encode the data.
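
A sketch of how a service might consume the type attribute follows 
(Python standard library; the namespace URI, the handler functions and 
their return values are hypothetical and only show the shape of the 
idea).  Known types get a specific handler; anything else is still an 
nml:relation and falls back to generic handling.

```python
import xml.etree.ElementTree as ET

NML = "http://example.org/nml/base"  # hypothetical URI for illustration

def handle_serialcompound(rel):
    # Hypothetical specific handling: report the number of child elements.
    return ("serialcompound", len(rel))

def handle_generic(rel):
    # Fallback for relation types this service has never heard of.
    return ("generic", len(rel))

HANDLERS = {"serialcompound": handle_serialcompound}

def interpret(rel):
    """Dispatch on the 'type' attribute of an nml:relation element."""
    if rel.tag != f"{{{NML}}}relation":
        raise ValueError(f"not an nml:relation: {rel.tag}")
    handler = HANDLERS.get(rel.get("type"), handle_generic)
    return handler(rel)

known = ET.fromstring(
    f'<nml:relation xmlns:nml="{NML}" type="serialcompound">'
    "<nml:link/><nml:link/></nml:relation>"
)
future = ET.fromstring(
    f'<nml:relation xmlns:nml="{NML}" '
    'type="some_future_relation_we_havent_thought_of_yet"/>'
)

print(interpret(known))
print(interpret(future))
```

The element name never changes, so an old service always recognises 
the element as a relation; whether the type value is acceptable is 
decided in the application rather than by the parser.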

>>> For this reason, including the extra nesting with <nml:relations> seems
>>> to me a relative simple solution to solve these problems.
>>
>> After your entire mail, I am not exactly sure how you reached this
>> conclusion.
>
> Sorry, I left out a crucial bit in the email (no point in having this
> clear in my head if I don't make it clear here):
>
> All child elements of nml:relations MUST be a subclass of nml:relation.
>
>> Lets say I had this:
>>
>> <nml:relations>
>>     <jason:relation>
>>       <!-- other things ... -->
>>     </jason:relation>
>> </nml:relations>
>>
>> I am still in the situation as you described above, and now I have an
>> extra element that really doesn't help me solve any problem.
>
> So <jason:relation> must be a nml:relation subclass because of the above
> requirement.
>
>> If you are still unclear on the concept of the namespaces I am happy to
>> try to explain them more, but the exact things you are trying to solve
>> with the addition of more elements can be done through namespaces and
>> inheritance, NMC has done this for a while and it has worked well in
>> practice in perfSONAR.
>
> I understand it, and I agree for most part that it works well, with the
> exception of the syntax validation requirement.

I guess we are going to have to agree to disagree on how useful this is. 
  At this point, the conversation is between us, and I doubt we will 
convince each other to move from the current positions.  If you wish to 
call a vote so we can move forward, this may be the best option.

> I was trying to come up with a solution that has all the benefits that
> the NMC gives, but also has this benefit. My first attempt at that
> (<nmlserialcompound:relation>) failed, well, miserably (sorry, I'm not
> that smart). I think (hope?) my second attempt
> (<nml:relations><nmlserialcompound:relation>) does a better job at that.
>
>
> Let me also try to clarify a point you made in your follow up email:
>
>> It is a mistake to assume that a parser by itself is capable of rich
>> semantic interpretation.
>
> I agree, that is not possible.
> However, I do hope that a parser is -by only knowing a base schema and a
> few (but not necessarily all!) schema extensions- able to validate the
> SYNTAX as good as it can.

You are confusing semantics and syntax again.  Parsers only know what 
they are told, and to my knowledge a parser can only verify against a 
single schema at any given time.  A parser must be told to verify an 
instance against a schema, and the schema itself has to have the tooling 
to 'include' the other possibilities that may be derived from other 
sources.  This is what I have been pushing all along - use the 
derivative namespaces when possible, but place most of the trust that 
the service will simply 'do the right thing' at the semantic level.

Thanks;

-jason

> That is currently not the case in the current perfSONAR implementations,
> and I regret seeing that.
>
> I hope that by at least allowing easy and meaningful SYNTAX validation,
> the parser code can concentrate on what is important: the SEMANTICS.
> Thus the application specific stuff. I think that is where the real fun
> is, and I hope to use some library for the boring SYNTAX checking stuff.
>
> I DO think that SYNTAX checking is important (albeit boring); if we do
> it well, we can specify how parsers should treat syntactically
> invalid messages, and avoid a slurry of implementation incompatibility
> problems later.
>
> Regards,
> Freek