[glue-wg] Updated version of appendix D.

Paul Millar paul.millar at desy.de
Thu Jan 24 08:48:12 CST 2008


On Thursday 24 January 2008 14:50:38 Burke, S (Stephen) wrote:
> One minor thing, "metrics" is not really right (they aren't necessarily
> measuring anything), I think "attributes" is a better term.

Thanks.  I've replaced all "metric"s with "attribute"s.

> >   Examples:
> >     /O=Grid/CN=UNDEFINEDUSE
> >     /O=Grid/CN=UNDEFINEDUSER/CN=Your Grid certificate DN here
> >     /O=Grid/CN=UNDEFINEDUSER/CN=Cannot access SE
>
> R missing for the first one.

Thanks.  That should be fixed now.

Also fixed the numbering and tidied up some loose wording.

Do we want to consider floating-point numbers, too?

Cheers,

Paul.



Appendix D : place-holder values for unknown data (v1.1)
----


Introduction
---

Whilst people endeavour to provide accurate information, there may be
situations where specific GLUE values may be assigned place-holder (or
dummy) values.  These place-holder values carry some additional
semantic meaning; specifically, that the correct value is currently
unknown and the presented value should be ignored.  This appendix
describes a recommended set of place-holder values to use.

Some attributes within the GLUE schema are required whilst others are
optional.  If the attribute is optional and the corresponding
information is unavailable, the information provider must either
publish a place-holder or not publish the attribute.  If the
attribute is required, then the information provider must either
publish a place-holder value or refrain from publishing the GLUE
object.

If a place-holder value is published, it must conform to the scheme
described in this appendix.  This is to increase the likelihood that
software will understand the nature of the information it receives.

This appendix describes place-holder values that have been chosen so
they are obviously "wrong" to humans, unlikely to occur under normal
operation and valid within the attribute type.  This also allows for
detection of failing information provider components.


Use-cases:
---

There are two principal use-cases for place-holder values, although
others may exist.

Scenario 1. a static value has no good default value and has not been
configured for a particular site.

Some provisions for GLUE Schema provide templates.  These templates
may contain static values that have no good default value; for
example, a value may require some detailed knowledge of a site.
Whilst there may be an expectation that the value be configured, it
is possible that this did not happen, so exposing the attribute's
default value.


Scenario 2. information provider is unable to obtain a dynamic value.

A dynamic value is provided by an information provider by querying the
underlying grid resources.  This query will use a number of ancillary
resources (e.g., DNS, network hardware) that might fail; the grid
services might also fail.  If an attribute is required and the current
value is unobtainable, a place-holder must be used.


Place-holder values:
---

This section describes a number of values that can be represented
within a given address space (e.g., Strings/UTF-8, Integers, FQDNs,
IPv4 address space).  Each of the different types is introduced along
with the place-holder value and a brief discussion on usage, rationale
and any other considerations.


1. Simple strings (ASCII/UTF-8) should use "UNDEFINEDVALUE" or should
   start "UNDEFINEDVALUE:"

  Upper-case letters make it easier to spot and a single word avoids
  any white-space issues.

  A short error message can be incorporated into the value by
  appending it after the colon.

  Examples:
    UNDEFINEDVALUE
    UNDEFINEDVALUE: Unable to contact torque daemon.

  Using UNDEFINEDVALUE is a default option for strings that have no
  widely-known structure.  If a value is of a more restrictive
  sub-type (e.g., FQDNs, URIs) described below, then the rules for
  the more restrictive form must be used.
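
  A minimal sketch of how software might build and recognise this
  place-holder.  The function names are hypothetical and not part of
  GLUE; the recognition rule (exact match, or the prefix followed by a
  colon) is one reading of the text above.

```python
PREFIX = "UNDEFINEDVALUE"


def make_undefined(message=None):
    """Build a place-holder string, optionally carrying a short message."""
    if message is None:
        return PREFIX
    return f"{PREFIX}: {message}"


def is_undefined(value):
    """True if value is the bare place-holder or starts 'UNDEFINEDVALUE:'."""
    return value == PREFIX or value.startswith(PREFIX + ":")


print(make_undefined("Unable to contact torque daemon."))
```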


2. Fully qualified domain names: must use a hostname ending either
	"example.org" for scenario 1, or "invalid" for scenario 2.

  RFC 2606 reserves three second-level domains: "example.com",
  "example.net" and "example.org".  These domains have the advantage
  of ending with a familiar TLD, so are recognisable as DNS names.
  Default configuration (scenario 1, above) must use DNS names that
  end "example.org".

  RFC 2606 also reserves the "invalid" Top-Level-Domain (TLD) as
  always invalid and clearly so.  For dynamic information gathering, a
  value ending "invalid" must be used.

  In both cases, additional information may be included by specifying
  a prefix to "example.org" or "invalid".  This may be used to specify
  the class of machine that should be present.  For dynamic
  information, if the class of machine is not published then the FQDN
  "unknown.invalid" must be used.

  Examples:
     www.example.org
     your-CE.example.org
     unknown.invalid
     site-local-BDII.invalid
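
  The two FQDN rules could be checked along the following lines.  This
  is only a sketch: the function name and the returned labels
  ("unconfigured", "unobtainable") are made up for illustration, and
  matching is done on whole DNS labels so that a name such as
  "notexample.org" is not mistaken for a place-holder.

```python
def fqdn_placeholder_kind(fqdn):
    """Classify an FQDN as a scenario 1 or scenario 2 place-holder.

    Returns "unconfigured" (ends example.org), "unobtainable"
    (ends invalid) or None for an ordinary name.
    """
    labels = fqdn.rstrip(".").lower().split(".")
    if labels[-2:] == ["example", "org"]:
        return "unconfigured"
    if labels[-1] == "invalid":
        return "unobtainable"
    return None


for name in ("your-CE.example.org", "unknown.invalid", "www.desy.de"):
    print(name, fqdn_placeholder_kind(name))
```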


3. IPv4 addr: must use 192.0.2.250

  There are several portions of the IPv4 address space that should not
  appear on a network, but none that is reserved specifically to
  denote a non-existent address.  Using any address carries the risk
  of side-effects, should this value be used.

  The best option is an IP address from the 192.0.2.0/24 subnet.  This
  subnet is defined in RFC 3330 as "TEST-NET" for use in documentation
  and example code.  For consistency, the value 192.0.2.250 must be
  used.


4. IPv6 addr: must use 2001:DB8::FFFF

  There is no documented undefined IPv6 address.  RFC 3849 reserves the
  address prefix 2001:DB8::/32 for documentation.  For consistency,
  the address 2001:DB8::FFFF must be used.
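
  A sketch of recognising the IPv4 and IPv6 place-holders with Python's
  standard ipaddress module.  Parsing the text first means that
  alternative spellings of the same IPv6 address (lower case, fully
  expanded) are also recognised; the function name is an assumption
  made for this example.

```python
import ipaddress

PLACEHOLDER_V4 = ipaddress.ip_address("192.0.2.250")
PLACEHOLDER_V6 = ipaddress.ip_address("2001:DB8::FFFF")


def is_placeholder_address(text):
    """True if text parses to one of the two place-holder addresses."""
    try:
        addr = ipaddress.ip_address(text)
    except ValueError:
        # Not an IP address at all, so not a place-holder address.
        return False
    return addr in (PLACEHOLDER_V4, PLACEHOLDER_V6)


# Both place-holders sit inside the ranges reserved for documentation.
print(PLACEHOLDER_V4 in ipaddress.ip_network("192.0.2.0/24"))
print(PLACEHOLDER_V6 in ipaddress.ip_network("2001:db8::/32"))
```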


5. Integers: must use "all nines"
		For uint32/int32 this is 999,999,999
		For uint64/int64 this is 999,999,999,999,999,999

  For integers, all numbers expressible within the encoding
  (int32/uint32/etc.) are valid so there is no safe choice.

  If an unsigned integer is encoded as a signed integer, it is
  possible to use negative numbers safely.  However, these numbers
  will be unrepresentable if the number is stored as an unsigned
  integer.  For this reason a negative number place-holder must not be
  used.

  The number was chosen for three reasons.  First, attribute scales
  are often chosen to reduce the likelihood of overflow: numbers
  towards MAXINT (the largest number representable in an integer
  domain) are less likely to appear.  Second, repeated numbers stand
  out more
  clearly to humans.  Finally, the statistical frequency of measured
  values often follows Benford's law, which indicates that numbers
  starting with "1" occur far more frequently than those starting with
  "9" (about six times more likely).  For these reasons, information
  providers must use all-nines to indicate an unknown value.
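
  The two values above can be derived rather than memorised.  The
  sketch below takes the longest run of nines that is guaranteed to fit
  in a signed integer of the given width, so the same value is safe for
  the unsigned type too.  It reproduces the 32-bit and 64-bit values
  listed above; applying it to other widths is an assumption, as is the
  function name.

```python
def all_nines(bits):
    """All-nines place-holder for a signed or unsigned integer of this width."""
    max_signed = 2 ** (bits - 1) - 1
    # One digit fewer than the maximum always fits below it.
    digits = len(str(max_signed)) - 1
    return int("9" * digits)


print(f"{all_nines(32):,}")
print(f"{all_nines(64):,}")
```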


6. Filepath: must start either "/UNDEFINEDPATH" or "\UNDEFINEDPATH".

  As with the simple string, a single upper-case word is recommended.
  The initial slash indicates that the value is a path.
  Implementations must use whichever slash is most appropriate for the
  underlying system (Unix-like systems use a forward-slash).  Software
  should accept either value as an unknown-value place-holder.

  Additional information can be encoded as data beyond the initial
  UNDEFINEDPATH, separated by the same slash as started the value.
  Additional comments should not use any of the following characters:
    \ [ ] ; = " \ : | , * .

  Examples:
    /UNDEFINEDPATH
    \UNDEFINEDPATH
    /UNDEFINEDPATH/Broker unavailable
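
  A sketch of a consumer that accepts either slash variant and pulls
  out any additional information.  The function names are hypothetical,
  and requiring the marker to be followed by the same separator (or by
  nothing) is a stricter reading than "must start", chosen here to
  avoid matching names such as "/UNDEFINEDPATHS".

```python
MARKER = "UNDEFINEDPATH"


def is_undefined_path(path):
    """True for /UNDEFINEDPATH or \\UNDEFINEDPATH, with optional comment."""
    for sep in ("/", "\\"):
        start = sep + MARKER
        if path == start or path.startswith(start + sep):
            return True
    return False


def undefined_path_comment(path):
    """Return the text after UNDEFINEDPATH, or None if there is none."""
    if not is_undefined_path(path):
        return None
    rest = path[len(MARKER) + 1:]   # drop the leading slash and the marker
    return rest[1:] or None         # drop the separating slash


print(undefined_path_comment("/UNDEFINEDPATH/Broker unavailable"))
```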


7. Email addresses: must use an undefined FQDN for the domain.

  RFC 2822 defines email addresses to have the form:
     <local-part> '@' <domain>

  The <domain> must be an undefined FQDN; see above for a complete
  description.  For email addresses, information providers should use
  "example.org" for scenario 1. and "unknown.invalid" for scenario 2.

  The <local-part> may be used to encode a small amount of additional
  information; for example, it may indicate the class of user to whom
  the email address should be delivered.  If no such information is to
  be encoded the value "user" must be used.

  Examples:
    user@example.org
    user@unknown.invalid
    site-local-contact@example.org
    local-admin@example.org
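
  A sketch of how a consumer might spot a place-holder email address,
  assuming the address is written with the '@' separator shown in the
  RFC 2822 form above.  The function name is made up for illustration
  and the domain test repeats the FQDN rules (ends "example.org" or
  "invalid").

```python
def is_placeholder_email(address):
    """True if the domain part of the address is an undefined FQDN."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local:
        # No '@' or an empty local-part: not a usable address form.
        return False
    labels = domain.rstrip(".").lower().split(".")
    return labels[-2:] == ["example", "org"] or labels[-1] == "invalid"


print(is_placeholder_email("site-local-contact@example.org"))
```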


8. Uniform Resource Identifier (URI): scheme-specific

  RFC 3986 defines URIs as a "federated and extensible naming system."
  All URIs start with a scheme name (e.g., "http") and no scheme name
  has been reserved for undefined or documenting example values.

  For any given URI scheme ("http", for example), it may be possible
  to define an unknown value within that name-space.  If a GLUE value
  has only one valid scheme, the undefined value must be taken from
  that scheme.  If several schemes are possible, one must be chosen
  from the available options.  This should be the most commonly used.

  Take care with the URI encoding.  All unknown URI values must be
  valid URIs.  If additional information is included, it must be
  encoded so the resulting URI is valid.

  For schemes that may include a FQDN (e.g., a reference to an
  Internet host), an undefined URI must use an undefined FQDN; see
  above for details on undefined FQDNs.

  For URI schemes that reference a remote file (e.g., "http", "ftp",
  "https"), additional information may be included as the path.  The
  FQDN already indicates that the value is a place-holder for an
  unknown value, so information providers should not specify
  "UNDEFINEDPATH".

  For "file" URIs, the path part must identify the value as unknown
  and must use the forward-slash variant; see above for details on
  undefined paths.

  A "mailto" URI [RFC 2368] encapsulates a valid email address with
  additional information (such as email headers and message body).
  Unknown mailto URIs must use an unknown email address (see above).
  Any additional information must be included in the email body.

  There may be other schemes in use that are not explicitly covered
  in this section.  A place-holder value should be agreed upon within
  whichever domain such schemes are used.  This place-holder value
  should be in the spirit of the place-holder values described so far.

  Examples:
    http://www.example.org/
    httpg://your-CE.example.org/path/to/end-point
    httpg://unknown.invalid/User%20certificate%20has%20expired
    mailto:site-admin@example.org
    mailto:user@maildomain.invalid?body=Problem%20connecting%20to%20WLMS
    file:///UNDEFINEDPATH
    file:///UNDEFINEDPATH/path%20to%20some%20directory
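
  A sketch of building and recognising place-holder URIs with Python's
  urllib.parse.  Percent-encoding the additional information keeps the
  result a valid URI, as required above.  The function names and the
  default host are assumptions made for this example, and only the URI
  types discussed in this section are handled.

```python
from urllib.parse import quote, urlsplit


def undefined_uri(message="", scheme="http", host="unknown.invalid"):
    """Build a place-holder URI, percent-encoding any message as the path."""
    return f"{scheme}://{host}/{quote(message)}"


def is_undefined_uri(uri):
    """True if the URI follows the place-holder rules for its type."""
    parts = urlsplit(uri)
    if parts.scheme == "file":
        path = parts.path
        return path == "/UNDEFINEDPATH" or path.startswith("/UNDEFINEDPATH/")
    if parts.scheme == "mailto":
        host = parts.path.rpartition("@")[2]   # domain of the address
    else:
        host = parts.hostname or ""
    labels = host.rstrip(".").lower().split(".")
    return labels[-2:] == ["example", "org"] or labels[-1] == "invalid"


print(undefined_uri("User certificate has expired", scheme="httpg"))
```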


9. X.509 Distinguished Names: must include an RDN of CN=UNDEFINEDUSER

  X.509 uses an X.500 namespace, represented as several Relative
  Distinguished Names (RDNs) concatenated by forward-slashes.  The
  final RDN is usually a single common name (CN), although multiple
  CNs are allowed.

  Unknown DN values must have at least two entries: an initial O=Grid
  followed immediately by CN=UNDEFINEDUSER.

  Additional information can be encoded using extra CN entries.  These
  must come after CN=UNDEFINEDUSER.

  Examples:
    /O=Grid/CN=UNDEFINEDUSER
    /O=Grid/CN=UNDEFINEDUSER/CN=Your Grid certificate DN here
    /O=Grid/CN=UNDEFINEDUSER/CN=Cannot access SE
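
  A sketch of checking a DN against the rule above (O=Grid followed
  immediately by CN=UNDEFINEDUSER) and collecting any extra CN entries.
  The parsing is deliberately naive: it splits on forward-slashes and
  would mishandle RDN values that themselves contain a slash.  All
  function names are hypothetical.

```python
def parse_dn(dn):
    """Split a slash-separated DN into (type, value) pairs."""
    pairs = []
    for rdn in dn.split("/"):
        if rdn:
            key, _, value = rdn.partition("=")
            pairs.append((key, value))
    return pairs


def is_undefined_dn(dn):
    """True if the DN starts with O=Grid then CN=UNDEFINEDUSER."""
    return parse_dn(dn)[:2] == [("O", "Grid"), ("CN", "UNDEFINEDUSER")]


def undefined_dn_comments(dn):
    """Return the values of any CN entries after CN=UNDEFINEDUSER."""
    if not is_undefined_dn(dn):
        return []
    return [value for key, value in parse_dn(dn)[2:] if key == "CN"]


print(undefined_dn_comments("/O=Grid/CN=UNDEFINEDUSER/CN=Cannot access SE"))
```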


Definition of words:
---

The key words "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT",
"RECOMMENDED", "MAY", and "OPTIONAL" in this document are used
deliberately and take their meaning from RFC 2119.  A brief summary is
given here.


1. MUST (or "REQUIRED") means that no deviation is allowed from
   conforming software.

2. MUST NOT means complete prohibition of this behaviour with
   conforming software.

3. SHOULD (or "RECOMMENDED") means that there may be reasons why
   conforming software does not adopt this behaviour, but all the
   effects of an alternative behaviour must be understood and
   considered before choosing a different course.

4. SHOULD NOT (or "NOT RECOMMENDED") means that there may be reasons
   why conforming software adopts this behaviour, but all the
   effects of adopting it must be understood and considered before
   doing so.

5. MAY (or "OPTIONAL") means an item is completely optional.


