[glue-wg] Updated version of appendix D.
Paul Millar
paul.millar at desy.de
Mon Jan 28 07:43:12 CST 2008
On Saturday 26 January 2008 11:48:28 Burke, S (Stephen) wrote:
> I think Gidon Moont (real time monitor) ticketed sites which were
> publishing 0,0 - so it is useful to be able to see clearly which sites
> are wrong
Yes, that's a good example for scenario-1.
Hopefully standardising these values would help Gstat check for them in their
validation tests.
> (although that doesn't help the ones with the wrong sign for
> the longitude!).
Ha! I remember him talking about the difficulty people seemed to have in
publishing the correct information. With the longitude, I believe (as a
work-around) he had to hard-code that certain locations were a particular
side of the primary meridian and flip the sign if they published a value with
the wrong sign.
> > 've also taken the liberty to include a FQAN section (should
> > people decide to publish these).
>
> We publish generic authz info so I'm not sure if we should tie this to
> VOMS - but we are still undecided on the policy representation anyway,
> so it may make sense to have a default for VOMS as for a specific URL
> scheme.
OK, I'll leave it in for now. We can always remove it or specify a
place-holder value for a more generic authz class.
As an idea, would identifying authz info as a URI make sense?
This would require specifying a schema-name part for FQAN. For example, this
could be "fqan", with "fqan:/vo.example.org/Role=An-example" as an example
FQAN. I believe no one currently writes FQANs as URIs, but doing so would
allow GLUE to support additional authz schemes without redefining the
data-type.
Since GLUE currently doesn't specify any authz info, this may be a little
premature.
> > 10. Fully Qualified Attribute Name (FQAN): must use a VO of
> > "vo.example.org" (for scenario 1.) or "unknown.illegal" (for
> > scenario 2).
>
> Don't you mean unknown.invalid?
Thanks, well spotted.
I've also added some minor changes to the introduction.
Cheers,
Paul.
---
Appendix D : place-holder values for unknown data (v1.3)
----
Introduction
---
Whilst people endeavour to provide accurate information, there may be
situations where specific GLUE attributes may be assigned place-holder
(or dummy) values. These place-holder values carry some additional
semantic meaning; specifically, that the correct value is currently
unknown and the presented value should be ignored. This appendix
describes a set of such place-holder values.
Some attributes within the GLUE schema are required whilst others are
optional. If the attribute is optional and the corresponding
information is unavailable, the information provider must either
publish a place-holder or not publish the attribute. If the
attribute is required, then the information provider must either
publish a place-holder value or refrain from publishing the GLUE
object.
If a place-holder value is published, it must conform to the scheme
described in this appendix. This is to increase the likelihood that
software will understand the nature of the information it receives.
This appendix describes place-holder values that have been chosen so
they are obviously "wrong" to humans, unlikely to occur under normal
operation and valid within the attribute type. This also allows for
detection of failing information provider components.
Use-cases:
---
There are two principal use-cases for place-holder values, although
others may exist.
Scenario 1. a static value has no good default value and has not been
configured for a particular site.
Some provisions for GLUE Schema provide templates. These templates
may contain attributes that have no good default value; for example,
supplying the correct value may require site-specific knowledge.
Whilst it is expected that these attributes be configured, it is
possible that this does not happen, so exposing the attributes'
default values.
Scenario 2. information provider is unable to obtain a dynamic value.
A dynamic value is provided by an information provider by querying the
underlying grid resources. This query will use a number of ancillary
resources (e.g., DNS, network hardware) that might fail; the grid
services might also fail. If an attribute is required and the current
value is unobtainable, a place-holder value must be used.
Place-holder values:
---
This section describes a number of values that can be represented
within a given address space (e.g., Strings/UTF-8, Integers, FQDNs,
IPv4 address space). Each of the different types is introduced along
with the place-holder value and a brief discussion on usage, rationale
and any other considerations.
1. Simple strings (ASCII/UTF-8) should use "UNDEFINEDVALUE" or should
start "UNDEFINEDVALUE:"
Upper-case letters make it easier to spot and a single word avoids
any white-space issues.
A short error message can be incorporated into the value by
appending it after the colon.
Examples:
UNDEFINEDVALUE
UNDEFINEDVALUE: unable to contact torque daemon.
Using UNDEFINEDVALUE is a default option for strings that have no
widely-known structure. If a value is of a more restrictive
sub-type (e.g., FQDNs, FQANs, URIs) described below, then the rules
for the more restrictive form must be used.
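As a rough illustration of rule 1, the following Python sketch recognises a
string place-holder and extracts any appended message. The helper name
parse_undefined_value and its return shape are assumptions made for this
example; they are not part of GLUE.

```python
from typing import Optional, Tuple

PLACEHOLDER = "UNDEFINEDVALUE"


def parse_undefined_value(value: str) -> Tuple[bool, Optional[str]]:
    """Return (is_placeholder, message) for a simple string attribute.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    if value == PLACEHOLDER:
        return True, None
    if value.startswith(PLACEHOLDER + ":"):
        # Everything after the colon is the optional short error message.
        message = value[len(PLACEHOLDER) + 1:].strip()
        return True, message or None
    return False, None
```

A consumer could call this before using any free-form string attribute and
log the message, if one is present, instead of treating it as real data.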
2. Fully qualified domain names: must use a hostname ending either
"example.org" for scenario 1, or "invalid" for scenario 2.
RFC 2606 reserves three second-level domains: "example.com",
"example.net" and "example.org". These domains have the advantage of
ending with a recognisable TLD, so are recognisable as a DNS name.
Default configuration (scenario 1, above) must use DNS names that end
"example.org".
RFC 2606 also reserves the "invalid" Top-Level-Domain (TLD) as
always invalid and clearly so. For dynamic information gathering, a
value ending "invalid" must be used.
In both cases, additional information may be included by specifying
a prefix to "example.org" or "invalid". This may be used to specify
the class of machine that should be present. For dynamic
information, if the class of machine is not published then the FQDN
"unknown.invalid" must be used.
Examples:
www.example.org
your-CE.example.org
unknown.invalid
site-local-BDII.invalid
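A minimal sketch of how software might tell the two scenarios apart from a
published FQDN. The function name fqdn_placeholder_scenario is hypothetical,
and lower-casing the name and ignoring a trailing root dot are assumptions of
this example rather than requirements of the appendix.

```python
from typing import Optional


def fqdn_placeholder_scenario(fqdn: str) -> Optional[int]:
    """Return 1 or 2 for a place-holder FQDN, or None for an ordinary name."""
    # Assumption: compare case-insensitively and ignore a trailing root dot.
    name = fqdn.rstrip(".").lower()
    labels = name.split(".")
    if labels[-2:] == ["example", "org"]:
        return 1  # scenario 1: static value was never configured
    if labels[-1] == "invalid":
        return 2  # scenario 2: dynamic value could not be obtained
    return None
```

Comparing whole DNS labels, rather than using a plain string suffix test,
avoids mistaking a name such as "notexample.org" for a place-holder.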
3. IPv4 addr: must use 192.0.2.250
There are several portions of the IPv4 address space that should not
appear on a network, but none that is reserved to specify a
non-existent address. Using any address leads to the risk of
side-effects, should this value be used.
The best option is an IP address from the 192.0.2.0/24 subnet. This
subnet is defined in RFC 3330 as "TEST-NET" for use in documentation
and example code. For consistency, the value 192.0.2.250 must be
used.
4. IPv6 addr: must use 2001:DB8::FFFF
There is no documented undefined IPv6 address. RFC 3849 reserves the
address prefix 2001:DB8::/32 for documentation. For consistency,
the address 2001:DB8::FFFF must be used.
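The IPv4 and IPv6 place-holders can be checked with Python's standard
ipaddress module, as sketched below. Parsing the text first means that
differently written forms of the same address (for example lower-case or
uncompressed IPv6) still match. The function name is_placeholder_ip is an
assumption of this example.

```python
import ipaddress

IPV4_PLACEHOLDER = ipaddress.ip_address("192.0.2.250")
IPV6_PLACEHOLDER = ipaddress.ip_address("2001:DB8::FFFF")


def is_placeholder_ip(text: str) -> bool:
    """True if text parses to the IPv4 or IPv6 place-holder address."""
    try:
        addr = ipaddress.ip_address(text)
    except ValueError:
        # Not a syntactically valid IP address at all.
        return False
    return addr in (IPV4_PLACEHOLDER, IPV6_PLACEHOLDER)
```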
5. Integers: must use "all nines"
For uint32/int32 this is 999,999,999
For uint64/int64 this is 999,999,999,999,999,999
For integers, all numbers expressible within the encoding
(int32/uint32/etc.) are valid so there is no safe choice.
If an unsigned integer is encoded as a signed integer, it is
possible to use negative numbers safely. However, these numbers
will be unrepresentable if the number is stored as an unsigned
integer. For this reason a negative number place-holder must not be
used.
The number was chosen for three reasons. First, attribute scales
are often chosen to reduce the likelihood of overflow: numbers
towards MAXINT (the largest number representable in an integer domain)
are less likely to appear. Second, repeated numbers stand out more
clearly to humans. Finally, the statistical frequency of measured
values often follows Benford's law, which indicates that numbers
starting with "1" occur far more frequently than those starting with
"9" (about six times more likely). For these reasons, information
providers must use all-nines to indicate an unknown value.
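The all-nines values and the Benford's-law ratio quoted above can be
reproduced with a short Python sketch. The helper names all_nines and benford
are hypothetical. The sketch assumes the place-holder must fit the signed
range of a given width, so that the same value is usable whether the
attribute is stored as signed or unsigned.

```python
import math


def all_nines(bits: int) -> int:
    """Largest all-nines decimal number that fits a signed integer of this width.

    Assumption: the signed limit is used so the value is representable
    in both the signed and the unsigned type of the same width.
    """
    signed_max = 2 ** (bits - 1) - 1
    value = 9
    # Keep appending a nine while the result still fits.
    while value * 10 + 9 <= signed_max:
        value = value * 10 + 9
    return value


def benford(digit: int) -> float:
    """Benford's-law probability that a number's leading digit is `digit`."""
    return math.log10(1 + 1 / digit)


ratio = benford(1) / benford(9)  # how much more often "1" leads than "9"
```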
6. Filepath: must start either "/UNDEFINEDPATH" or "\UNDEFINEDPATH".
As with the simple string, a single upper-case word is recommended.
The initial slash indicates that the value is a path.
Implementations must use whichever slash is most appropriate for the
underlying system (Unix-like systems use a forward-slash). Software
should accept either value as an unknown-value place-holder.
Additional information can be encoded as data beyond the initial
UNDEFINEDPATH, separated by the same slash as started the value.
Additional comments should not use any of the following characters:
\ [ ] ; = " : | , * .
Examples:
/UNDEFINEDPATH
\UNDEFINEDPATH
/UNDEFINEDPATH/Path to storage area
/UNDEFINEDPATH/Broker unavailable
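A consumer that accepts either slash variant might look like the following
Python sketch. The name parse_undefined_path is hypothetical, and the sketch
does not check the comment against the list of discouraged characters.

```python
from typing import Optional, Tuple

MARKER = "UNDEFINEDPATH"


def parse_undefined_path(path: str) -> Tuple[bool, Optional[str]]:
    """Return (is_placeholder, comment) for a file-path attribute."""
    if not path or path[0] not in "/\\":
        return False, None
    sep = path[0]
    rest = path[1:]
    if rest == MARKER:
        return True, None
    if rest.startswith(MARKER + sep):
        # Additional information follows, separated by the same slash
        # that started the value.
        comment = rest[len(MARKER) + 1:]
        return True, comment or None
    return False, None
```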
7. Email addresses: must use an undefined FQDN for the domain.
RFC 2822 defines email addresses to have the form:
<local-part> '@' <domain>
The <domain> must be an undefined FQDN; see above for a complete
description. For email addresses, information providers should use
"example.org" for scenario 1. and "unknown.invalid" for scenario 2.
The <local-part> may be used to encode a small amount of additional
information; for example, it may indicate the class of user to whom
the email address should be delivered. If no such information is to
be encoded the value "user" must be used.
Examples:
user@example.org
user@unknown.invalid
site-local-contact@example.org
local-admin@example.org
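For an information provider, building such an address is a matter of picking
the domain for the scenario and defaulting the local part to "user". A
minimal Python sketch follows; the function name placeholder_email is an
assumption of this example, and addresses are written with a literal "@".

```python
def placeholder_email(scenario: int, local_part: str = "user") -> str:
    """Build a place-holder email address for scenario 1 or 2."""
    domains = {1: "example.org", 2: "unknown.invalid"}
    if scenario not in domains:
        raise ValueError("scenario must be 1 or 2")
    # The local part may carry a small hint, e.g. the class of recipient.
    return f"{local_part}@{domains[scenario]}"
```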
8. Uniform Resource Identifier (URI): schema-specific
RFC 3986 defines URIs as a "federated and extensible naming system."
All URIs start with a schema-name part (e.g., "http") and no
schema-name has been reserved for undefined or documenting example
values.
For any given URI schema ("http", for example), it may be possible
to define an unknown value within that name-space. If a GLUE value
has only one valid schema, the undefined value must be taken from
that schema. If several schemata are possible, one must be chosen
from the available options. This should be the most commonly used.
Take care with the URI encoding. All unknown URI values must be
valid URIs. If additional information is included, it must be
encoded so the resulting URI is valid.
For schemata that may include a FQDN (e.g., a reference to an
Internet host), an undefined URI must use an undefined FQDN; see
above for details on undefined FQDNs.
For URI schemata that reference a remote file (e.g., "http", "ftp",
"https"), additional information may be included as the path. The
FQDN indicates that the value is a place-holder, indicating an
unknown value, so information providers should not specify
"UNDEFINEDPATH".
For "file" URIs, the path part must identify the value as unknown
and must use the forward-slash variant; see above for details on
undefined paths.
"mailto" URIs [RFC 2368] encapsulate valid email addresses with
additional information (such as email headers and message body).
Unknown mailto URIs must use an unknown email address (see above).
Any additional information must be included in the email body.
There may be other schemata in use that are not explicitly covered
in this section. A place-holder value should be agreed upon within
whichever domain such schemata are used. This place-holder value
should be in the spirit of the place-holder values described so far.
Examples:
http://www.example.org/
httpg://your-CE.example.org/path/to/end-point
httpg://unknown.invalid/User%20certificate%20has%20expired
mailto:site-admin@example.org
mailto:user@maildomain.invalid?body=Problem%20connecting%20to%20WLMS
file:///UNDEFINEDPATH
file:///UNDEFINEDPATH/path%20to%20some%20directory
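The encoding requirement for host-based schemata can be met with the
percent-encoding routines in Python's urllib.parse, as in the sketch below.
The names placeholder_uri and uri_is_placeholder are hypothetical. The
detection function only covers schemata that carry a host, so "file" and
"mailto" URIs would need their own checks.

```python
from urllib.parse import quote, urlsplit


def placeholder_uri(scheme: str, host: str, message: str = "") -> str:
    """Build a place-holder URI whose path carries a percent-encoded message."""
    # safe="" also encodes "/", so the message stays a single path segment.
    return f"{scheme}://{host}/{quote(message, safe='')}"


def uri_is_placeholder(uri: str) -> bool:
    """True if the URI's host is an undefined FQDN (example.org or invalid)."""
    host = (urlsplit(uri).hostname or "").rstrip(".")
    labels = host.split(".")
    return labels[-2:] == ["example", "org"] or labels[-1] == "invalid"
```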
9. X509 Distinguished Names: must start /O=Grid/CN=UNDEFINEDUSER
X509 uses an X500 namespace, represented as several Relative
Distinguished Names (RDNs) concatenated by forward-slashes. The final RDN
is usually a single common name (CN), although multiple CNs are
allowed.
Unknown DN values must have at least two entries: an initial O=Grid
followed immediately by CN=UNDEFINEDUSER.
Additional information can be encoded using extra CN entries. These
must come after CN=UNDEFINEDUSER.
Examples:
/O=Grid/CN=UNDEFINEDUSER
/O=Grid/CN=UNDEFINEDUSER/CN=Your Grid certificate DN here
/O=Grid/CN=UNDEFINEDUSER/CN=Cannot access SE
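A sketch of how the DN rule could be checked in Python. The function name
undefined_dn_comments is hypothetical, and the sketch assumes that comment
text never contains a forward-slash, since it splits the DN on that
character.

```python
from typing import List, Optional

PREFIX = "/O=Grid/CN=UNDEFINEDUSER"


def undefined_dn_comments(dn: str) -> Optional[List[str]]:
    """Return the extra CN comments of a place-holder DN, or None if not one."""
    if dn == PREFIX:
        return []
    if not dn.startswith(PREFIX + "/"):
        return None
    # Assumption: comments contain no "/" so a plain split is sufficient.
    extras = dn[len(PREFIX) + 1:].split("/")
    # Only additional CN entries are allowed after CN=UNDEFINEDUSER.
    if not all(rdn.startswith("CN=") for rdn in extras):
        return None
    return [rdn[3:] for rdn in extras]
```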
10. Fully Qualified Attribute Name (FQAN): must use a VO of
"vo.example.org" (for scenario 1.) or "unknown.invalid" (for
scenario 2).
The "VOMS Credential Format" document,
http://edg-wp2.web.cern.ch/edg-wp2/security/voms/edg-voms-credential.pdf
states that FQANs must have the form:
/VO[/group[/subgroup(s)]][/Role=role][/Capability=cap]
Where VO is a well-formed DNS name. Unlike DNS names, VO names must
be lower-case. The unknown place-holder value for FQAN is derived
from the unknown DNS name (see above). It must have no subgroup(s)
or Capability specified.
Any additional information must be encoded within a single Role
name. Care should be taken that only valid characters (A-Z, a-z,
0-9 and dash) are included.
Examples:
/vo.example.org
/vo.example.org/Role=Replace-this-example-with-your-FQAN
/unknown.invalid
/unknown.invalid/Role=Unable-to-contact-CE-Error-42
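The FQAN constraints (a fixed VO, no groups, no Capability, at most one Role
drawn from a restricted character set) fit naturally into a regular
expression. The Python sketch below is one possible reading of the rule; the
function name fqan_placeholder_scenario is an assumption of this example.

```python
import re
from typing import Optional

# VO, then an optional single Role; groups and Capability are not permitted.
_FQAN = re.compile(r"/(vo\.example\.org|unknown\.invalid)(?:/Role=([A-Za-z0-9-]+))?")


def fqan_placeholder_scenario(fqan: str) -> Optional[int]:
    """Return 1 or 2 for a place-holder FQAN, or None otherwise."""
    m = _FQAN.fullmatch(fqan)
    if m is None:
        return None
    return 1 if m.group(1) == "vo.example.org" else 2
```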
11. Geographic locations: must use longitude 0 degrees,
latitude 0 degrees.
Meridians of longitude are taken from (-180,180] degrees, whilst
parallels of latitude are taken from [-90,90] degrees. For a
place-holder value to be a valid location, it must also be taken
from these ranges.
By a happy coincidence, the (0,0) location is within the Atlantic
Ocean, some 380 miles (611 kilometers) south of the nearest country
(Ghana). Since this location is unlikely to be used and repeated
numbers are easier for humans to spot, (0,0) must be used to specify
an unknown location.
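A validation test could separate the unknown location from values that fall
outside the permitted ranges, as in this Python sketch. The function name
classify_location and its three return labels are assumptions made for
illustration.

```python
def classify_location(longitude: float, latitude: float) -> str:
    """Classify a published location as 'unknown', 'invalid' or 'ok'."""
    if longitude == 0 and latitude == 0:
        return "unknown"  # the (0,0) place-holder
    # Longitude is taken from (-180,180], latitude from [-90,90].
    if not (-180 < longitude <= 180 and -90 <= latitude <= 90):
        return "invalid"
    return "ok"
```

Note that a range check of this kind cannot detect a longitude published
with the wrong sign, since the flipped value is still a valid location.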
Definition of words:
---
The key words "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT",
"RECOMMENDED", "MAY", and "OPTIONAL" in this document are used
deliberately and take their meaning from RFC 2119. A brief summary is
given here.
1. MUST (or "REQUIRED") means that no deviation is allowed from
conforming software.
2. MUST NOT means complete prohibition of this behaviour with
conforming software.
3. SHOULD (or "RECOMMENDED") means that there may be reasons why
conforming software does not adopt this behaviour, but all the
effects of an alternative behaviour must be understood and
considered before choosing a different course.
4. SHOULD NOT (or "NOT RECOMMENDED") means that there may be reasons
why conforming software adopts this behaviour, but all the
effects of an alternative behaviour must be understood and
considered before choosing a different course.
5. MAY (or "OPTIONAL") means an item is completely optional.