[glue-wg] Publishing annotations
Paul Millar
paul.millar at desy.de
Mon Apr 7 10:14:21 CDT 2008
Hi Stephen,
I don't think "dCache" has a strong opinion on this.
I'm not sure I do, either, but my initial thoughts are that annotations are
not the right solution here as they don't fit in with how the rest of GLUE is
structured.
More comments interleaved below:
On Monday 07 April 2008 14:32:20 Burke, S (Stephen) wrote:
[...]
> The specific thing I was considering was the EGEE "freedom of choice" tool,
> which currently edits the published data by removing
> AccessControlBaseRule attributes for CEs which are failing defined tests.
[...]
Yes, removing information definitely sounds like a hack.
Personally, I'd say the correct approach is that, should ATLAS wish to publish
information (I didn't realise FoC was published into GLUE like this), it
should do so as well-defined objects rather than as annotations to existing
objects.
The AccessControlBaseRule is set by a site, saying that they believe that they
support a UserDomain (ATLAS, for example). The site is authoritative for that
statement; no one else should create or alter that information.
If the UserDomain (e.g., ATLAS) wants to record that a service is broken for
them, that should be recorded elsewhere---somewhere where the UserDomain is
authoritative. Alternatively, the UserDomain could publish the services it
has checked as working (e.g., within EGEE, via SAM).
For example, there could be a ServiceBroken (or similarly named) object that
records that some UserDomain (e.g. ATLAS) found the service unusable.
Alternatively, the UserDomain could publish a CertifiedSane object that links
a UserDomain (ATLAS, ops (for SAM tests), etc.) to a Service. In effect,
publishing this object saying the service "works for me".
CertifiedSane objects assume a default-broken model (a service must prove
itself) whereas ServiceBroken objects assume a default-working model. One
would hope that publishing ServiceBroken objects would take less storage,
assuming most services work most of the time.
Clients can choose to ignore these ServiceBroken / CertifiedSane objects;
e.g., the ATLAS testing framework could ignore it whilst normal ATLAS jobs
look for it.
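
To make the two models concrete, here is a rough sketch in Python. The class
and field names are made up for illustration (nothing here is taken from the
GLUE schema); it only shows how a client might filter services under each
model, and how a client can ignore the objects simply by not applying the
filter.

```python
from dataclasses import dataclass


# Hypothetical objects: each is published by the UserDomain, which is
# authoritative for the statement. The site's own data is never altered.
@dataclass(frozen=True)
class ServiceBroken:
    user_domain: str
    service_id: str


@dataclass(frozen=True)
class CertifiedSane:
    user_domain: str
    service_id: str


def usable_default_working(services, broken, user_domain):
    """Default-working: keep every service unless this domain flagged it."""
    bad = {b.service_id for b in broken if b.user_domain == user_domain}
    return [s for s in services if s not in bad]


def usable_default_broken(services, sane, user_domain):
    """Default-broken: keep only services this domain has certified."""
    good = {c.service_id for c in sane if c.user_domain == user_domain}
    return [s for s in services if s in good]
```

With three services and one ServiceBroken object, the default-working model
needs one published object; the default-broken model would need a
CertifiedSane object for each of the other two.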
> Another thing that occurs to me on similar lines is the downtime
> information. I see that this is now embedded in the Endpoints, but for
> EGEE that wouldn't be usable because our downtime information is in a
> central database (GOC DB), so if we wanted to publish it it would again
> need to be done by modifying/annotating published information from a
> different place.
I believe this is a long-standing issue that arose for two reasons: first, it
was "difficult" for a site to publish down-time through GLUE; second, if a
site is going into emergency down-time (a JCB has just cut the fibre-optic
link), then publishing information through GLUE would be difficult.
The first one is (I think) purely an implementation issue. If the information
were readily available, then GOC could switch from being definitive to acting
as a cache for the information published by the site ... but this is really
an EGEE decision.
For the second issue, I'm not sure what the correct solution is here.
Perhaps, with most technologies, the site should drop out of the info.
system, which would be "good enough".
> This could maybe also feed into our CESEBind discussion, e.g. rather
> than the CE and SE publishing disconnected pieces of information the CE
> could in effect add/modify things published by the SE, i.e. we could
> separate the schema structure from the question of who publishes it.
Hmmm, not completely sure about this.
I believe we currently assume that an object is published by a single
component. Maybe this needn't be an assumption, but I'm not sure how well
the different underlying storage technologies would handle this.
I suspect, in most cases, we would need to split the information anyway into
the different subsets to allow partial results to validate correctly
(objectClass in LDAP, XSD types with XML, etc.). If this is so, then (in
effect) we've split the object into parts, whether we label them as separate
objects or not.
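
A rough Python sketch of what I mean, with invented part and attribute names
(they are assumptions, not GLUE attributes): each publisher's subset has its
own list of required attributes, so a partial result validates on its own,
and a client merges the parts if it wants the combined view.

```python
# Hypothetical part types and required attributes, for illustration only.
REQUIRED = {
    "StoragePart": {"id", "total_size"},   # imagined as published by the SE
    "BindPart": {"id", "mount_point"},     # imagined as published by the CE
}


def validate(part_type, attrs):
    """A part is valid if it carries every attribute its type requires."""
    return REQUIRED[part_type].issubset(attrs)


def merge(parts):
    """Combine (part_type, attrs) pairs describing the same object."""
    merged = {}
    for _part_type, attrs in parts:
        merged.update(attrs)
    return merged
```

Each part passes validation without the other being present, which is the
sense in which the object has already been split, whatever we call the
pieces.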
> What do people think? Maybe it's too complicated to fit in at this
> stage, but it might be worth a bit of thought - at the abstract schema
> level it wouldn't necessarily mean huge changes. Just thinking about it
> quickly I suspect the problems would come at the implementation level,
> e.g. in LDAP the annotations would still be in a different part of the
> tree so it would take some extra effort in queries to find them.
I'm not sure annotations really gain us much, but I'm also willing to be
convinced otherwise :-)
HTH,
Paul.