[Nml-wg] Identifiers

Jason Zurawski zurawski at internet2.edu
Mon Nov 8 08:57:50 CST 2010


Freek/All;

On 11/7/10 2:16 PM, Freek Dijkstra wrote:
> Hi all,
>
> After writing our ideas on identifiers down, I still have five smaller
> questions.
>
> Quotes are from the meeting notes
> (http://forge.gridforum.org/sf/docman/do/downloadDocument/projects.nml-wg/docman.root.meeting_materials.ogf30/doc16105):
>
>> Rough consensus on:
>> - http://schemas.ogf.org/nml/base/2013/10/ (Jason's proposal)
>
> Question 1. Should the schema end with a / or #?
> a) http://schemas.ogf.org/nml/base/2013/10   (common for XML)
> b) http://schemas.ogf.org/nml/base/2013/10/  (current proposal)
> c) http://schemas.ogf.org/nml/base/2013/10#  (common for RDF)
>
> For XML I don't think it makes any difference; for RDF, I think it
> should be b or c. (We may decide on a different namespace for XML and
> RDF, but I propose not to do that unless there are compelling reasons to
> do so).


b) is what we use currently in the perfSONAR/NMC world.  Ex:

https://svn.internet2.edu/svn/perfSONAR-PS/trunk/perfSONAR_PS-LookupService/etc/requests/hLS/LSQueryRequest.xml
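
To illustrate why the ending matters for RDF but not for XML: RDF tools
form a term URI by concatenating the namespace and the local name, so
option a) would run the two together. A minimal sketch, assuming a
hypothetical term name "Link" that is not taken from any NML schema:

```python
# Sketch: how an RDF-style term URI is formed from each candidate namespace.
# "Link" is a hypothetical term name used only for illustration.
BASE = "http://schemas.ogf.org/nml/base/2013/10"


def term_uri(namespace, local_name):
    """Form a term URI by plain concatenation, as RDF tools do."""
    return namespace + local_name


candidates = {"a": BASE, "b": BASE + "/", "c": BASE + "#"}
results = {opt: term_uri(ns, "Link") for opt, ns in candidates.items()}

for opt, uri in results.items():
    print(opt, uri)
```

With a) the result ends in "10Link", which is why only b) and c) give a
readable term URI on the RDF side.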


> Further recap from the meeting notes:
>
>> In Catania NML decided on Instance identifiers format: urn:ogf:network:<domain part>:<local part>
>> <local>  is opaque only processed by end parts
>> GLIF also agreed to use this format.
>>
>> Richard & Freek did put together a doc for the IETF RFC to define the URI
>> Freek has translated to xml but he needs to consult Joel on web site details
>>
>> Case insensitive:
>> RFC says have to specify case sensitive/insensitive
>> So need to define urn:ogf:  at OGF level
>> Then :network: and the rest case insensitive.
>> i.e. have to define the lexical equivalence.
>>
>> Rough consensus on:
>> - Different objects eg link and port MUST have different identifiers
>> - instance identifiers are case insensitive
>> - instance identifiers are non-international (thus an URI instead of IRI)
>> - URI are not restricted by length, other than possible restrictions
>>    by RFC 2141 (the current GLIF recommendation is max 48 to 80 bytes)
>
> I forgot to mention in the notes that we had a discussion about how to
> refer to identifiers.
>
> (see slides 14-18 in http://forge.gridforum.org/sf/go/doc16081)
>
> - RDF uses the attributes rdf:about and rdf:resource
> - NM-WG uses the attributes id and idref
> - The BUILT-IN XML ID and IDREF attributes can not be used,
>    since they only work within a document.
>
> We had a discussion if we should re-use the id and idref from the NM
> working group (formally: re-use the attributes in the
> http://ggf.org/ns/nmwg/base/2.0/ namespace) or are to redefine these
> attributes again.
> I forgot what the consensus was.
>
> Question 2. What attributes to use for references in XML?
> a) existing id and idref in NM-WG namespace
> b) redefine id and idref in NML namespace
> c) create dedicated namespace for just id and idref


As I stated in person (but will restate for this list), it's uncommon to
try to associate attributes with a namespace other than the one
associated with the parent element.  E.g.:

   <ns:element attribute="something" />

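implies that 'attribute' belongs to the vocabulary of the 'ns' element
(strictly, an unprefixed attribute has no namespace of its own and is
interpreted by its parent element).  It is uncommon to see this: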

   <ns:element ns2:attribute="something" />

But it is possible.

I think b) makes the most sense; we do this in NM/NMC now.
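
For reference, this is how a namespace-aware parser reports the two
forms. An unprefixed attribute is reported with no namespace URI and is
tied to its element only by containment, while a prefixed attribute
carries its own namespace. A small sketch using Python's standard
library; the example.org namespace URIs are placeholders, not real NML
or NM-WG namespaces:

```python
import xml.etree.ElementTree as ET

# Placeholder namespaces, assumed only for this illustration.
DOC = (
    '<ns:element xmlns:ns="http://example.org/ns" '
    'xmlns:ns2="http://example.org/ns2" '
    'attribute="plain" ns2:attribute="qualified"/>'
)

elem = ET.fromstring(DOC)

# ElementTree writes qualified names as "{namespace-uri}local-name".
attr_names = sorted(elem.attrib)

print(elem.tag)
for name in attr_names:
    print(name, "=", elem.attrib[name])
```

Under option b) the id and idref attributes would normally be written
unprefixed on NML elements, which matches the first form.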


> We decided on the urn:ogf:network:example.net:opaque-identifier syntax.
>
> We have not yet defined what characters should be allowed in the opaque
> identifier part. We have the following options:
>
> Allowed characters:
> GLIF:       A-Z a-z 0-9 - .
> RFC2141:    A-Z a-z 0-9 - . _ ( ) + , : = @ ; $ ! * ' %hex
> pchar:      A-Z a-z 0-9 - . _ ~ ( ) + , : = @ ; $ ! * ' & %hex
> unreserved: A-Z a-z 0-9 - . _ ~
>
> where %hex is a percentage-encoding. E.g. %2E.
>
> - unreserved and pchar are definitions from RFC 3986, which defines URIs
> - GLIF is what is defined in the GLIF working group. This is extremely
> limited (: and _ are not allowed).
> - RFC 2141 is what is currently allowed in a URN. (this list excludes 4
> "reserved" characters which are in the definition for future use.)
> - RFC 2141 is currently being revised. It is very likely that & and ~
> will be allowed, making the definition equal to that of pchar.
> - unreserved is similar to the current GLIF list.
> - Note that the following characters are NEVER allowed:
>    % / ? # [ ] \ " < > [ ] ^ ` { | }
>
> Question 3. What characters are allowed in <opaque string>?
> a) GLIF:       A-Z a-z 0-9 - .
> b) unreserved: A-Z a-z 0-9 - . _ ~
> c) RFC2141:    A-Z a-z 0-9 - . _ ( ) + , : = @ ; $ ! * ' %hex
> d) pchar:      A-Z a-z 0-9 - . _ ~ ( ) + , : = @ ; $ ! * ' & %hex
>


I believe we should use the option that is most widely supported by
parsing tools and libraries, and that most closely matches GLIF and
other standards bodies.
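
To make the four options easier to compare, here is a rough sketch that
reports which of them accept a given opaque string. The character
classes are transcribed from the list quoted above; the function name
and dictionary keys are made up for this illustration, and "%hex" is
taken to mean a percent sign followed by two hex digits:

```python
import re

# "%hex": a percent sign followed by exactly two hexadecimal digits.
PCT = r"%[0-9A-Fa-f]{2}"

# One pattern per option from Question 3, each matching a single character
# (or a single percent-encoded triplet where the option allows it).
CHARSETS = {
    "glif": r"[A-Za-z0-9.\-]",
    "unreserved": r"[A-Za-z0-9._~\-]",
    "rfc2141": r"(?:[A-Za-z0-9._()+,:=@;$!*'\-]|" + PCT + ")",
    "pchar": r"(?:[A-Za-z0-9._~()+,:=@;$!*'&\-]|" + PCT + ")",
}


def allowed_by(opaque):
    """Return the names of the options that accept the whole opaque string."""
    return [
        name
        for name, charset in CHARSETS.items()
        if re.fullmatch(charset + "+", opaque)
    ]


for sample in ("link-A.B", "link_A_to_B", "a&b", "a%2Eb", "a/b"):
    print(sample, allowed_by(sample))
```

Note that identifiers such as link_A_to_B, used in the examples further
down, are already rejected by the GLIF set because of the underscore.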


>
> The current schema states that ALL Network Objects MUST have an identifier.
>
> This is very strict. For example, even a network object that is never
> referenced MUST still have an ID. Thus the following is NOT allowed:
>
> <nml:bidirectionallink id="urn:ogf:network:es.net:bilink_A-C">
>    <nml:link>
>      <nml:relation type="serialcompound">
>        <nml:link idRef="urn:ogf:network:es.net:link_A_to_B"/>
>        <nml:link idRef="urn:ogf:network:es.net:link_B_to_C"/>
>      </nml:relation>
>    </nml:link>
>    <nml:link>
>      <nml:relation type="serialcompound">
>        <nml:link idRef="urn:ogf:network:es.net:link_C_to_B"/>
>        <nml:link idRef="urn:ogf:network:es.net:link_B_to_A"/>
>      </nml:relation>
>    </nml:link>
> </nml:bidirectionallink>
>
> Instead, everything MUST be named, like so:
>
> <nml:bidirectionallink id="urn:ogf:network:es.net:bilink_A-C">
>    <nml:link id="urn:ogf:network:es.net:link_A_to_C">   <!-- ADDED id -->
>      <nml:relation type="serialcompound">
>        <nml:link idRef="urn:ogf:network:es.net:link_A_to_B"/>
>        <nml:link idRef="urn:ogf:network:es.net:link_B_to_C"/>
>      </nml:relation>
>    </nml:link>
>    <nml:link id="urn:ogf:network:es.net:link_C_to_A">   <!-- ADDED id -->
>      <nml:relation type="serialcompound">
>        <nml:link idRef="urn:ogf:network:es.net:link_C_to_B"/>
>        <nml:link idRef="urn:ogf:network:es.net:link_B_to_A"/>
>      </nml:relation>
>    </nml:link>
> </nml:bidirectionallink>
>
> Question 4. MUST all objects have an id?
> a) All Network Objects MUST have an identifier.
> b) All Network Objects SHOULD have an identifier.
>
> "SHOULD" means that an identifier may be left out, but only if it is
> clear what the consequences are (in this case: the result can not be
> referred to.)


As a parallel to the perfSONAR/NMC world - all first order objects have
an ID field (e.g. data, metadata, subject, parameters, key).  Some do
not (eventType, 'parameter' [lives inside of parameters], datum, time
formats).

I do not have a strong opinion on this, but I think that if you plan on
ever referencing an object (e.g. in your 2nd example above, creating the
serial compound A_C out of A_B and B_C) it should have an ID.  If the
relationship is temporary and will never be referenced it won't need the
ID, but it doesn't seem like a stretch to just give it one anyway.

I suppose I would prefer a) to be safe, but won't defend it to the death.
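
Whichever option is chosen, a check for unnamed objects is simple to
write. A minimal sketch, assuming the namespace URI from the current
proposal is bound to the nml prefix and that id/idRef are unprefixed
attributes as in the quoted example; the function name is hypothetical:

```python
import xml.etree.ElementTree as ET

# Assumed binding for the "nml" prefix (the namespace under discussion).
NML = "http://schemas.ogf.org/nml/base/2013/10/"

# Shortened version of the first quoted example: the inner link has no id.
DOC = f"""
<nml:bidirectionallink xmlns:nml="{NML}" id="urn:ogf:network:es.net:bilink_A-C">
  <nml:link>
    <nml:relation type="serialcompound">
      <nml:link idRef="urn:ogf:network:es.net:link_A_to_B"/>
      <nml:link idRef="urn:ogf:network:es.net:link_B_to_C"/>
    </nml:relation>
  </nml:link>
</nml:bidirectionallink>
""".strip()


def objects_without_id(root):
    """Return tags of link objects that neither define an id nor reference one."""
    object_tags = {f"{{{NML}}}link", f"{{{NML}}}bidirectionallink"}
    return [
        el.tag
        for el in root.iter()
        if el.tag in object_tags
        and "id" not in el.attrib
        and "idRef" not in el.attrib
    ]


root = ET.fromstring(DOC)
missing = objects_without_id(root)
print(missing)
```

Under a) a non-empty result would be an error; under b) it would only
be a warning that the listed objects cannot be referred to.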

>
> The current schema states that the Syntax of the identifier MUST follow
> the urn:ogf:network syntax.
>
> This might make future compatibility harder (e.g. when trying to combine
> it with other protocols; I can imagine that in the future other naming
> schemas may be developed).
>
> Question 5. MUST urn:ogf:network syntax be used?
> a) All identifiers MUST follow the urn:ogf:network syntax
> b) All identifiers MUST be a URI, and SHOULD follow the urn:ogf:network
> syntax
> c) All identifiers MUST be unique, and MAY follow the urn:ogf:network
> syntax
> (some more variants are possible)


No strong preference.  I think that using the urn syntax helps to 
guarantee uniqueness, but I would need to see examples of when it would 
be impossible to assign this type of ID to a given object.
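
As a rough illustration of what a consumer would have to do under
options a) through c), here is a sketch that splits an identifier into
its domain and local parts, classifies identifiers that do not follow
the syntax, and compares two identifiers case-insensitively per the
rough consensus quoted earlier. All function names are hypothetical,
and plain lowercasing is a simplification of whatever lexical
equivalence rule is finally defined:

```python
from urllib.parse import urlsplit

PREFIX = "urn:ogf:network:"


def parse_nml_urn(identifier):
    """Return (domain, local) for urn:ogf:network:<domain>:<local>, else None."""
    # Split at most 4 times so that any ':' inside the local part is preserved.
    parts = identifier.split(":", 4)
    if len(parts) != 5 or not identifier.lower().startswith(PREFIX):
        return None
    domain, local = parts[3], parts[4]
    if not domain or not local:
        return None
    return domain, local


def classify(identifier):
    """Sort an identifier into the three cases behind options a), b) and c)."""
    if parse_nml_urn(identifier) is not None:
        return "urn:ogf:network"
    if urlsplit(identifier).scheme:
        return "other URI"
    return "not a URI"


def equivalent(a, b):
    """Simplified lexical equivalence: compare the identifiers ignoring case."""
    return a.lower() == b.lower()


print(parse_nml_urn("urn:ogf:network:es.net:link_A_to_B"))
print(classify("http://example.net/links/1"))
print(classify("link_A_to_B"))
print(equivalent("urn:ogf:network:es.net:Link_A", "URN:OGF:NETWORK:ES.NET:link_a"))
```

Option a) accepts only the first class, b) accepts the first two, and
c) accepts all three as long as the value is unique.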

-jason

