[glue-wg] Draft version of "undefined values" appendix
Paul Millar
paul.millar at desy.de
Tue Jan 22 09:10:12 CST 2008
Hi all,
Here is the draft version of the appendix. I've tried to incorporate the
comments made at the last meeting, cut down the word-count slightly and
changed several mentions of "illegal" to "undefined".
Cheers,
Paul.
Appendix D : place-holder values for unknown data.
----
Introduction
---
Whilst people endeavor to provide accurate information, there may be
situations where specific GLUE values may be assigned place-holder (or
dummy) values. These place-holder values carry some additional
semantic meaning; specifically, that the correct value is currently
unknown and the presented value should be ignored. This appendix
describes a recommended set of place-holder values to use.
GLUE makes no requirement on how much data is published and renderings
(of GLUE Schema into concrete type, such as LDAP) may allow publishing
of partial information. However, there may be situations where a
value is provided that should not be considered valid. Two likely
scenarios are discussed below, although others may exist.
To avoid confusion, these values have be chosen so they are unlikely
to occur under normal operation, whilst still being valid values.
Wherever possible, best-practice has been followed. It is not a
requirement to use place-holder values in general or to use these
values in particular; however, if place-holder values are used, using
these values increases the likelihood that others will understand that
the value are to be ignored.
Use-cases:
---
Scenario 1. a static value has no good default value and has not been
configured for a particular site.
Some provisions for GLUE Schema provide templates. These templates
may contain static values that have no good default value; for
example, a value might require some detailed knowledge of a site.
Whilst there may be the expectation that value be configured it is
possible that this did not happen, so exposing the application's
default configuration.
Scenario 2. information provider is unable to obtain a dynamic value.
A dynamic value is provided by an information provider by querying the
underlying grid resources. This query will use a number of ancillary
resources (e.g., DNS, network hardware) that might fail; the grid
services might also fail. If the system collecting the information
requires a value, a place-holder will be required.
Place-holder values:
---
This section describes a number of values that can be represented
within a given address space (e.g., UTF-8, Integers, FQDNs, IPv4
address space). Each of the different types are introduced along with
the proposed value and a brief discussion on the rational and any
other considerations.
1. Simple strings (ASCII/UTF-8) should use "UNDEFINEDVALUE".
Upper-case letters make it easier to spot and a single word avoids
any white-space issues.
This is a default option for strings that have no widely-known
structure. If a value is of a more restrictive sub-type
(e.g. FQDNs, FQANs), then the rules for more restrictive form should
be used.
2. Fully qualified domain names: should use a hostname ending either
"example.org" or "example.com" for scenario 1, or "invalid"
for scenario 2.
RFC 2606 reserves the ".invalid" Top-Level-Domain (TLD) as always
invalid and clearly so. For dynamic information a value of
"unknown.invalid." may be used. If an alternative is used, it
should use the TLD ".invalid".
RFC 2606 also defines the ".example" TLD and the two second-level
domains: "example.org" and "example.com". These domains have the
advantage of ending with a recognisable TLD, so are more immediately
recognisable as a DNS name. Default configuration should use DNS
names that end ".example.com" or ".example.org"
3. IPv4 addr: should use 192.0.2.250
There are several portions of IPv4 addresses that should not appear
on a network, but none that are reserved for a non-existent address.
Using an arbitrary address leads to the risk of side-effects.
The best option is an IP address from the 192.0.2.0/24 subnet. This
subnet is defined in RFC 3330 as "TEST-NET" for use in documentation
and example code. Although any IPv4 address from 192.0.2.0/24 may
be used, the recommended address above should use for consistency.
5. IPv6 addr: should use 2001:DB8::FFFF
There is no documented undefined IPv6 address. RFC 3849 reserves the
address prefix 2001:DB8::/32 for documentation. For consistency,
the address SHOULD BE the one noted above.
6. Counting/Natural numbers: should use 0
Counting- (also known as Natural-) numbers exclude the number zero,
so information provider may use this value to indicate an undefined
value.
Some counting-number metrics include 0 as a valid value. If so, the
metric should be considered an Integer and zero not be used.
7. Integers: should use MAXINT (maximum value representable in the
domain).
For int32 this is 2,147,483,647
" uint32 this is 4,294,967,295
" int64 this is 9,223,372,036,854,775,807
" uint64 this is 18,446,744,073,709,551,615
For non-negative integers, all numbers expressible within the
encoding (int32/uint32/etc...) are valid so there is no safe choice.
Although any value may be chosen, perhaps the value least likely to
be encountered in any given domain is that domain's maximum value.
If an unsigned integer is encoded as a signed integer, it is
possible to use negative numbers safely. However, these numbers
will be unrepresentable if the number is stored as an unsigned
integer.
For values that might be a positive, zero or a negative integer, all
numbers expressible within the encoding (unsigned int) are valid so
there is no safe choice. For consistency, the MAXINT value should
be used.
Information providers should not attempt to conveying further
semantic distinction, for example by using more than one "undefined"
number.
8. Filenames: should be "UNDEFINEDPATH".
As with the simple string, a single upper-case word is recommended.
9. Uniform Resource Identifier (URI): schema-specific
RFC 3986 defines URIs as a "federated and extensible naming system."
All URIs start with a schema-name part and no schema-name has been
reserved for undefined or example values.
For any given URI schema ("http", for example), it may be possible
to define an undefined value within that name-space. If a GLUE
value has only one valid schema, the undefined value should be taken
from that schema. If several schemata are possible, one should be
chosen from the available options, which should be the most commonly
used.
For schemata that include a FQDN (e.g., a reference to an Internet
host), an undefined value should be derived by using an undefined
FQDN in an otherwise valid URI. See above for details on undefined
FQDNs. Examples of such schemata include: http, httpg and mailto.
For other schemata, some element of the naming system should
indicate that the value is undefined. This is subject to further
work.
10. X509 Distinguished Names: should include a RDN of CN=UNDEFINEDUSER
Rational:
X509 uses a X500 namespace, represented as several RDNs concatenated
by commas. The final RDN is usually a single common name (CN).
It is possible for more than one CN to be present, allowing
inclusion of additional semantic meaning. However, this is outwith
the scope of the document.
Definition of words:
---
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are used deliberately and take their meaning from RFC 2119.
A brief summary is given here.
1. MUST (or "REQUIRED" or "SHALL") means that no deviation is allowed
from conforming software.
2. MUST NOT (or "SHALL NOT") means complete prohibition of this
behaviour with conforming software.
3. SHOULD (or "RECOMMENDED") means that there may be reasons why
conforming software does not to adopt this behaviour, but all the
effects of an alternative behaviour must be understood and
considered before choosing a different course.
4. SHOULD NOT (or "NOT RECOMMENDED") means that there may be reasons
why conforming software adopts this behaviour, but all the
effects of an alternative behaviour must be understood and
considered before choosing a different course.
5. MAY (or "OPTIONAL") means an item is completely optional.
More information about the glue-wg
mailing list