[Nml-wg] RNC questions
Jason Zurawski
zurawski at internet2.edu
Mon Jun 20 11:00:24 CDT 2011
Hi Freek;
Answers inline. If you need a faster reference to RELAX NG, consider
reading the online documentation:
http://books.xmlschemata.org/relaxng/page2.html
On 6/20/11 11:36 AM, thus spake Freek Dijkstra:
> Hi,
>
> Today, I was trying to create and improve an example topology file based
> on the RNC schema.
>
> Unfortunately, the current RNC schemata do not validate when used with a
> stricter parser. We tried last week with Jing-Trang, and that gave no
> errors. Today, I tried with http://validator.nu/ and got a few more errors.
Look into using MSV (works for many different schema languages):
http://msv.java.net/
We use this along with Trang/Jing. I have never used the website you
mention, so I can't comment on whether it is useful.
Typically I have found that it is best to use Trang to convert the RNC
schema into different forms (RNG, and then XSD) and then use one of the
other schema languages for instance verification. I believe the workflow
looks like this:
Trang -> RNC to RNG
Trang -> RNG to XSD
MSV -> validate XML against RNG or XSD
Validating against the RNC can sometimes produce ambiguous parse errors
for some of the items you note below (e.g. anyElement); converting can
strengthen the meaning of the schema to remove ambiguous paths in the
grammar.
> Could someone answer my noob questions on RNC? (Either on-list or off-list).
>
> 1) What is the difference between
> Lifetime =
> element lifetime {
> StartTime,
> (EndTime | Duration)?
> }
> and
> Lifetime =
> element lifetime {
> StartTime
> & (EndTime | Duration)?
> }
> and which one should I use? The goal is a lifetime element with a start
> element (defined in the StartTime rule) and optionally an end OR
> duration element (respectively defined in the EndTime and Duration rules).
& = joining things without caring about the order. , = joining things
and enforcing the order. In your 1st example above the start element
has to come first, for example:
<lifetime>
<startTime></startTime>
<endTime></endTime>
</lifetime>
The second would also allow something like this (whereas the first would
reject it as out of order):
<lifetime>
<endTime></endTime>
<startTime></startTime>
</lifetime>
In our experience we try to avoid the comma when we can. Enforcing
ordering in an XML document places a heavy burden on the tools (or
humans) creating the XML, which must follow exactly the order the schema
mandates. We would rather let the XML be 'structured' via nested
elements and not care about the specific order in which sibling elements
appear.
My personal opinion would be to use '&' always, and avoid the ordering
attempt since I still do not believe a 'list' element is required.
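To make the difference concrete, here is a minimal sketch of the two variants as a complete grammar. The element names come from the XML above; the datatypes (xsd:dateTime, xsd:duration) and the start rule are assumptions for illustration, not taken from the NML schema.

```rnc
# Assumed leaf definitions; the datatypes are guesses.
StartTime = element startTime { xsd:dateTime }
EndTime   = element endTime   { xsd:dateTime }
Duration  = element duration  { xsd:duration }

# Variant A: ',' is an ordered sequence.
# Accepts startTime alone, startTime then endTime, or startTime then
# duration. startTime must always come first.
# Lifetime = element lifetime { StartTime, (EndTime | Duration)? }

# Variant B: '&' is interleave, so siblings may appear in any order.
# Accepts everything variant A does, plus endTime (or duration)
# appearing before startTime.
Lifetime = element lifetime { StartTime & (EndTime | Duration)? }

start = Lifetime
```

Swapping the commented-out line for the active one switches between the two behaviours, which makes it easy to check both against the same instance documents.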
> 2) I read on http://relaxng.org/compact-tutorial-20030326.html that the
> order is relevant in RNC. Thus
> <location><latitude>51.5155</latitude><longitude>-0.0922</longitude></location>
> is different from
> <location><longitude>-0.0922</longitude><latitude>51.5155</latitude></location>.
> Is there a way to specify in the RNC schema that this order is
> irrelevant in the XML?
See above: use & instead of , to make the order irrelevant.
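Applied to your location example, an order-independent rule might look like the sketch below. The rule name Location and the xsd:float datatype are assumptions; adjust them to whatever the schema really uses.

```rnc
# Hypothetical rule: latitude and longitude may appear in either order.
Location =
  element location {
    element latitude { xsd:float }
    & element longitude { xsd:float }
  }

start = Location
```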
> 3) The NML Group is -by its current definition- recursive: A group is a
> NML NetworkObject, and a Group can contain NML NetworkObjects, thus
> including other groups. I have a problem with such recursive definitions
> in RNC. At least the validator complains about patterns defined later on
> in the document. Can't I do that, or am I just doing something wrong
> (I'm happy to provide offlist the URLs of RNC schema and example
> topology file I'm currently working on, so you can see the errors for
> yourself)
I don't understand this question/problem. Is this a problem of not being
able to validate something or is this just a perception problem where a
recursive definition personally bothers you? Messages from the parser
(and which parser is being used/how it was invoked) would be helpful.
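For reference, a recursive definition on its own should be acceptable: as far as I know, RELAX NG permits a pattern to refer back to itself as long as the reference sits inside an element pattern, and definitions may be used before the point where they appear in the file. A minimal sketch, with hypothetical names that may not match the real NML schema:

```rnc
# All names below are placeholders for illustration only.
start = Group

NetworkObject = Node | Port | Group

Node = element node { attribute id { text } }
Port = element port { attribute id { text } }

# Group refers to NetworkObject, and therefore to itself, but only
# inside 'element group { ... }'. NetworkObject is also used here
# after being defined above, though the order of definitions does
# not matter.
Group =
  element group {
    attribute id { text },
    NetworkObject*
  }
```

If something shaped like this still fails, the parser's message should point at the real cause.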
> 4) In the current RNC schema, extensibility was ensured using the
> "anyElement" rule. E.g.
> BasePortContent =
> NetworkObject
> & element capacity { xsd:float }?
> & anyElement*
> Unfortunately, the validator complained about this.
Was it a 'warning' or an 'error'? The two have different implications. For
example, a common warning that we have seen is 'choice between
attributes and children cannot be represented; approximating', which is
frequently caused by the use of anyElement. This will sometimes result in an
ambiguous XSD being generated, but it can still be used to validate
instance documents.
> When checking a
> document, it is unclear if a "location" element should be parsed
> according the second rule (element capacity { xsd:float }?) or third
> rule (anyElement*). When reading about this, it was suggested to remove
> the anyElement* from the BasePortContent, since it is possible to still
> add new allowed element in the following method:
> BasePortContent =
> NetworkObject
> & element capacity { xsd:float }?
> # later extension:
> BasePortContent&=
> element my_extension { xsd:string }?
This is one of the dangers/benefits of using anyElement. In practice,
define it as 'late' as possible in any ruleset and the parser is smart
enough to choose the longest match (e.g. location) first. Without
seeing what you have done, in terms of how you called the tools and what
error messages they displayed, I won't be able to comment further.
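If a stricter parser does reject the combination, one possible workaround is to narrow the wildcard so it cannot match the explicitly named element. As far as I recall, RELAX NG does not allow the two sides of an interleave to contain element patterns whose names overlap, which may be what is being reported. In the sketch below, anyOtherElement, the placeholder NetworkObject, and the port wrapper are assumptions for illustration; any elements that the real NetworkObject defines would need to be excluded in the same way.

```rnc
# Fully open wildcard, in the form usually given for anyElement.
anyElement =
  element * { (attribute * { text } | text | anyElement)* }

# Hypothetical narrowed wildcard: any element except 'capacity'.
# Its content is still fully open.
anyOtherElement =
  element * - capacity { (attribute * { text } | text | anyElement)* }

# Placeholder only; the real NetworkObject rule is not shown here.
NetworkObject = attribute id { text }

BasePortContent =
  NetworkObject
  & element capacity { xsd:float }?
  & anyOtherElement*

# Hypothetical wrapper so the fragment is a complete grammar.
start = element port { BasePortContent }
```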
Thanks;
-jason
> I have some more questions, but these were the most important ones. If
> some RNC expert could help me out or point me in the right way, GREAT!
>
> Freek