[Nml-wg] RNC questions

Jason Zurawski zurawski at internet2.edu
Mon Jun 20 11:00:24 CDT 2011


Hi Freek;

Answers inline.  If you need a faster reference to RELAX NG, consider 
reading the online documentation:

http://books.xmlschemata.org/relaxng/page2.html

On 6/20/11 11:36 AM, thus spake Freek Dijkstra:
> Hi,
>
> Today, I was trying to create and improve an example topology file based
> on the RNC schema.
>
> Unfortunately, the current RNC schemata do not validate when used with a
> stricter parser. We tried last week with Jing-Trang, and that gave no
> errors. Today, I tried with http://validator.nu/ and got a few more errors.

Look into using MSV (works for many different schema languages): 
http://msv.java.net/

We use this along with Trang/Jing.  I have never used the website you 
speak of, so I can't comment on whether it's useful or not.

Typically I have found that it's best to use Trang to convert the RNC 
schema into different forms (RNG, and then XSD) and then use one of the 
other schema languages for instance verification.  I believe the workflow 
looks like this:

   Trang -> RNC to RNG
   Trang -> RNG to XSD
   MSV -> validate XML against RNG or XSD

Validating against the RNC can sometimes produce ambiguous parse errors 
for some of the items you note below (e.g. anyElement); converting can 
strengthen the meaning of the schema to remove ambiguous paths in the 
grammar.

> Could someone answer my noob questions on RNC? (Either on-list or off-list).
>
> 1) What is the difference between
>      Lifetime =
>        element lifetime {
>          StartTime,
>          (EndTime | Duration)?
>        }
> and
>      Lifetime =
>        element lifetime {
>          StartTime
>          &  (EndTime | Duration)?
>        }
> and which one should I use? The goal is a lifetime element with a start
> element (defined in the StartTime rule) and optionally an end OR
> duration element (respectively defined in the EndTime and Duration rules).

'&' (interleave) joins things without caring about the order.  ',' 
(sequence) joins things and enforces the ordering.  In your 1st example 
above the start time has to come first, so you would only be able to do 
something like:

<lifetime>
   <startTime></startTime>
   <endTime></endTime>
</lifetime>

The second would also allow something like this (which the first would 
reject as out of order):

<lifetime>
   <endTime></endTime>
   <startTime></startTime>
</lifetime>

In our experience we try to avoid the use of the comma when we can.  
Enforcing ordering in an XML document places a lot of emphasis on the 
tools (or humans) creating the XML exactly in the order the schema 
mandates, instead of allowing the XML to be 'structured' via nesting 
elements and not caring about the specific order in which sibling 
elements appear.

My personal opinion would be to use '&' always, and avoid the ordering 
attempt since I still do not believe a 'list' element is required.

> 2) I read on http://relaxng.org/compact-tutorial-20030326.html that the
> order is relevant in RNC. Thus
> <location><latitude>51.5155</latitude><longitude>-0.0922</longitude></location>
> is different from
> <location><longitude>-0.0922</longitude><latitude>51.5155</latitude></location>.
> Is there a way to specify in the RNC schema that this order is
> irrelevant in the XML?

See above: use '&' instead of ',' where the order should not matter.
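
As a minimal sketch of what that looks like for the location case, 
assuming latitude and longitude are both plain xsd:float elements (the 
rule name Location and the datatypes are my assumptions, not taken from 
the NML schema):

```rnc
Location =
  element location {
    element latitude { xsd:float }
    & element longitude { xsd:float }
  }
```

With '&' both of the documents you quote should validate; replacing it 
with ',' would accept only the latitude-first form.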

> 3) The NML Group is -by it's current definition- recursive: A group is a
> NML NetworkObject, and a Group can contain NML NetworkObjects, thus
> including other groups. I have a problem with such recursive definitions
> in RNC. At least the validator complains about patterns defined later on
> in the document. Can't I do that, or am I just doing something wrong
> (I'm happy to provide offlist the URLs of RNC schema and example
> topology file I'm currently working on, so you can see the errors for
> yourself)

I don't understand this question/problem.  Is this a problem of not being 
able to validate something, or is this just a perception problem where a 
recursive definition personally bothers you?  Messages from the parser 
(and which parser is being used/how it was invoked) would be helpful.
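
For what it's worth, recursion by itself should not be a problem: as far 
as I know, RELAX NG allows a pattern to refer to itself as long as the 
reference sits inside an element, and the order of the definitions in 
the file does not matter.  A minimal sketch, with hypothetical rule and 
element names rather than the real NML ones:

```rnc
start = Group

# A group may contain ports and other groups, to any depth.
Group =
  element group {
    attribute id { xsd:string }?,
    (Group | Port)*
  }

# Defined after its first use, which is fine in a grammar.
Port =
  element port {
    attribute id { xsd:string }?
  }
```

What is not allowed is a rule that refers to itself without passing 
through an element (e.g. Group = Group | Port), so that may be worth 
checking for.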

> 4) In the current RNC schema, extensibility was ensured using the
> "anyElement" rule. E.g.
>      BasePortContent =
>        NetworkObject
>        &  element capacity { xsd:float }?
>        &  anyElement*
> Unfortunately, the validator complained about this.

Was it a 'warning' or an 'error'?  The two have different implications.  
For example, a common warning that we have seen is 'choice between 
attributes and children cannot be represented; approximating', which is 
frequently caused by the use of anyElement.  This will sometimes result 
in an ambiguous XSD being generated, but it can still be used to 
validate instance documents.

> When checking a
> document, it is unclear if a "location" element should be parsed
> according the second rule (element capacity { xsd:float }?) or third
> rule (anyElement*). When reading about this, it was suggested to remove
> the anyElement* from the BasePortContent, since it is possible to still
> add new allowed element in the following method:
>      BasePortContent =
>        NetworkObject
>        &  element capacity { xsd:float }?
>    # later extension:
>      BasePortContent&=
>        element my_extension { xsd:string }?

This is one of the dangers/benefits of using anyElement.  In practice, 
define it as 'late' as possible in any ruleset, and the parser is smart 
enough to choose the longest match (e.g. location) first.  Without 
seeing what you have done, in terms of calling the tools and the error 
messages displayed, I won't be able to comment further.
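
If the validator does treat the overlap as an error, one thing to try is 
excluding the already-defined names from the wildcard.  My understanding 
of the RELAX NG spec (section 7.4) is that the operands of an interleave 
may not contain element patterns with overlapping names.  A sketch, 
assuming anyElement is defined along the lines of the tutorial and that 
NetworkObject is defined elsewhere as in your schema; the rule name 
anyOtherElement is made up here:

```rnc
# Wildcard as in the RELAX NG tutorial: any element, any content.
anyElement =
  element * {
    (attribute * { text }
     | text
     | anyElement)*
  }

# Same wildcard at the top level, minus the names BasePortContent
# already defines, so the branches of '&' no longer overlap.
anyOtherElement =
  element * - capacity {
    (attribute * { text }
     | text
     | anyElement)*
  }

BasePortContent =
  NetworkObject
  & element capacity { xsd:float }?
  & anyOtherElement*
```

Any elements contributed by NetworkObject would need to be excluded in 
the same way.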

Thanks;

-jason

> I have some more questions, but these were the most important ones. If
> some RNC expert could help me out or point me in the right way, GREAT!
>
> Freek

