[Nml-wg] Identifiers

Mon Nov 8 10:05:26 CST 2010

Jason Zurawski wrote:

>> Question 3. What characters are allowed in <opaque string>?
>> a) GLIF:       A-Z a-z 0-9 - .
>> b) unreserved: A-Z a-z 0-9 - . _ ~
>> c) RFC2141:    A-Z a-z 0-9 - . _ ( ) + , : = @ ; $ ! * ' %hex
>> d) pchar:      A-Z a-z 0-9 - . _ ~ ( ) + , : = @ ; $ ! * ' & %hex
> 
> I believe we should use the approach that is going to be supported the 
> most widely, in parsing tools/libraries and what is most closely matched 
> to GLIF and other standards bodies.

That's the issue. :(
- c or d is most widely supported in libraries.
- a is most closely matched to the GLIF recommendation.
- b plus the ":" character is what is mostly used in practice.

My current feeling is that it is easiest to require:
- The URN MAY use the characters of c or d (no further restrictions)
- For lexical equivalence the URN MUST be case normalized
- Other normalization MUST NOT be used (in particular, %-encoding MUST
NOT be expanded)
- For display purposes, the %-encoding MAY be expanded
- If %-encoding is used, the OCTETS SHOULD represent UTF-8 data in NFC form.

(Rationale: (1) no other restrictions besides what is already in URN; (2)
easy comparison - only case conversion; (3) still possible to display
non-ASCII characters.)
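
A minimal sketch of how the rules above could look in practice, in Python.
It assumes that "case normalized" means lowercasing the whole URN string,
which is only one possible reading. The function names and the
example.net URNs are made up for illustration and are not part of any
proposal.

```python
import unicodedata
from urllib.parse import quote, unquote


def normalize_for_comparison(urn: str) -> str:
    # Assumption: case normalization = lowercase the whole string.
    # No other normalization; %-encoding is deliberately NOT expanded.
    return urn.lower()


def lexically_equivalent(a: str, b: str) -> bool:
    # Two URNs are equivalent if they match after case normalization only.
    return normalize_for_comparison(a) == normalize_for_comparison(b)


def for_display(urn: str) -> str:
    # For display only: expand %-encoding, reading the octets as UTF-8.
    return unquote(urn, encoding="utf-8", errors="replace")


def encode_segment(text: str) -> str:
    # Hypothetical helper for producers: put the text in NFC form first,
    # then %-encode its UTF-8 octets.
    return quote(unicodedata.normalize("NFC", text), safe="")
```

With this sketch, "URN:OGF:network:Example.net:caf%c3%a9" and
"urn:ogf:network:example.net:caf%C3%A9" compare as equivalent, while
"...:a%2Db" and "...:a-b" do not, because the %-encoding is never expanded
for comparison.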

>> Question 4. MUST all objects have an id?
>> a) All Network Objects MUST have an identifier.
>> b) All Network Objects SHOULD have an identifier.
> 
[...]
> I suppose I would prefer a) to be safe, but won't defend it to the death.

You convinced me. I'll change my vote from b to a :)

Freek