[glue-wg] How to Work with Glue! WRT wLCG Vo Views

Thu Apr 24 04:29:34 CDT 2008

Since my last emails could be misinterpreted without an understanding
of my paradigm, it does my reputation for trying to help no good to
leave them without a simple explanation of my view of a perfect
solution. At the end I explain how to get there in stages with no new
development commissioned.

For now, let's get to the use case that I worry Glue is forgetting:
its metadata services.

VO service objects are not published, hence their use cases go
unnoticed. I believe this is because people are unaware that the
delivery platform for Glue could be used by more schemas. I also believe
this has led people to focus on modeling the grid within Glue rather
than describing it with the intention of further, more detailed markup.

VO service listing structures could be used by many VOs. Obvious
examples include the service instances that a VO wants to use. More
complex examples include Storage Services which should only be used for
short-term storage. These sorts of configuration parameters are really
useful to publish, not just for the VO that publishes them but
potentially for other VOs as well. Another example is wLCG-derived data
that should be republished for the benefit of VOs running on wLCG
coordinated infrastructure, such as relative service uptime, which
should be republished from Glue monitoring data.
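
A minimal sketch of one entry in such a VO service listing follows. The
class and field names are hypothetical and are not taken from the Glue
schema, and "relative uptime" is assumed here to mean the fraction of
monitoring probes that succeeded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoServiceEntry:
    """One entry in a hypothetical VO service listing."""
    endpoint: str           # service identifier as published in Glue
    vo: str                 # VO that publishes this markup
    short_term_only: bool   # storage should only hold short-lived data
    relative_uptime: float  # assumed: fraction of successful probes


def relative_uptime(probe_results):
    """Fraction of successful probes; 0.0 when there are no probes."""
    results = list(probe_results)
    if not results:
        return 0.0
    return sum(1 for ok in results if ok) / len(results)


# Illustrative values only; the endpoint is a placeholder.
entry = VoServiceEntry(
    endpoint="httpg://se.example.org:8443/srm/managerv2",
    vo="atlas",
    short_term_only=True,
    relative_uptime=relative_uptime([True, True, False, True]),
)
```

Keying each entry on the endpoint alone keeps the listing independent
of how Glue models the service internally.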

Since I see "dGrid", "Atlas" and "dGrid Atlas" as three VOs, I see a
strong use case for some VOs to republish or reuse other VOs' markup,
particularly when, for example, dGrid may stipulate a quota for
non-dGrid members and Atlas wants to maximise resource utilisation.

A BDII infrastructure exists, is extensible to many schemas and is
already tested. VOs can therefore already publish from static files, or
even from scripts that perform queries and provide additional markup
upon Glue (in which case you have the equivalent of an RDBMS view). VO
boxes can easily publish such VO lists with previous and current
versions of Glue, just as services currently do, provided they talk to
the software developer (Laurence Field) and get his consent for support.
Since it is just for a VO, the limited number of parties involved and
the standardisation within a VO would help.
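
The "view" idea can be sketched as a script that joins Glue service
records with a VO's own markup and emits LDIF text. This is only a
sketch under stated assumptions: the input keys are modelled on the
Glue 1.x LDAP attribute names, while the object class VoServiceMarkup,
the Vo* attributes and the base DN are hypothetical. How the output is
fed to a BDII is not shown.

```python
def vo_view_ldif(glue_services, vo_markup, vo_name, base_dn="o=vo-markup"):
    """Render Glue services joined with VO markup (by unique ID) as LDIF.

    Assumes the IDs and values need no DN or LDIF escaping.
    """
    blocks = []
    for svc in glue_services:
        uid = svc["GlueServiceUniqueID"]
        markup = vo_markup.get(uid)
        if markup is None:
            continue  # the VO has nothing to say about this service
        lines = [
            f"dn: VoServiceID={uid},VoName={vo_name},{base_dn}",
            "objectClass: VoServiceMarkup",  # hypothetical object class
            f"VoServiceID: {uid}",
            f"VoServiceEndpoint: {svc['GlueServiceEndpoint']}",
        ]
        for key in sorted(markup):
            lines.append(f"{key}: {markup[key]}")
        blocks.append("\n".join(lines))
    # LDIF separates entries with a blank line.
    return ("\n\n".join(blocks) + "\n") if blocks else ""


# Placeholder input; in practice this would come from a Glue query.
glue = [
    {"GlueServiceUniqueID": "se1",
     "GlueServiceEndpoint": "httpg://se1.example.org:8443/srm"},
    {"GlueServiceUniqueID": "ce1",
     "GlueServiceEndpoint": "ce1.example.org:2119"},
]
out = vo_view_ldif(glue, {"se1": {"VoShortTermOnly": "TRUE"}}, "atlas")
```

Services the VO has not annotated are simply left out, so the output
stays a pure overlay on top of what Glue already publishes.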

I personally would like to see Glue decomposed into multiple VO-service
objects referenced by schemas containing the necessary information for
that VO's settings or service descriptions. This would simplify things
and reduce confusion over what we are all trying to achieve: useful
service descriptions for all user groups, without overburdening
standardisation efforts and large-scale coordination.

This is not essential, but probably desirable for Glue 3.0. For safety's
sake I would only reference service identifiers (endpoints) from Glue
in a derived schema. Unfortunately I think the VO communities do not
realise they could have a lot more useful information if they chose to
add their own markup. I think particularly of pilot jobs and of steering
data transfer requests between sites.
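
If a derived schema references Glue only through service identifiers,
keeping the two consistent reduces to a simple check for dangling
references. A minimal sketch, assuming derived entries are plain
dictionaries with a hypothetical "service_id" key and a made-up
pilot-job metric:

```python
def dangling_references(derived_entries, glue_service_ids):
    """Return service IDs used by the derived schema but absent from Glue."""
    known = set(glue_service_ids)
    referenced = {entry["service_id"] for entry in derived_entries}
    return sorted(referenced - known)


# Hypothetical markup derived from pilot-job output.
pilot_markup = [
    {"service_id": "ce1.example.org", "pilot_success_rate": 0.92},
    {"service_id": "ce9.example.org", "pilot_success_rate": 0.40},
]
missing = dangling_references(
    pilot_markup, ["ce1.example.org", "se1.example.org"]
)
```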

VOs could build on current infrastructure and easily see real benefit
from the publish/subscribe infrastructure with minimal effort.
Experimental VOs that are decomposed by nation could publish their
national VO schema on a VO box, containing whatever metadata they can
to aid resource utilisation and data management policies for the
international VO.

This sort of referencing schema is an important use case that I am
worried is being buried as Glue addresses services in unique ways, for
computing and storage for example, thus making derived schemas harder
than they should be. I assume this is because experimental VOs are
unaware of this useful service and of how they could use it, so they
are not pressing for the schema to be simpler to work with. I would
like Glue to be multiple schemas referencing a base schema that is used
for service discovery and further markup only.

Thank you for your patience.

Regards 

Owen Synge

Examples of Derived Schemas

wLCG Monitoring should publish Tier Status, Service Reliability
Metrics by Service, and maybe parametrised ranking.

HEP experiments should publish the services they want to use with
abstract markup such as usage policies and important information
derived from the output of pilot jobs, and data retention reliability
by service.

An SRM schema implementation with extended derived schemas for CASTOR,
StoRM, dCache and DPM, reflecting their unique features and sharing
their common features.
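
The SRM example can be sketched as a common base record with one
derived record per implementation. All class and field names below are
hypothetical and chosen only for illustration; the per-implementation
fields are not a statement of what those systems would actually
publish.

```python
from dataclasses import dataclass, field


@dataclass
class SrmService:
    """Features assumed common to every SRM implementation."""
    endpoint: str
    srm_version: str

    def common_markup(self):
        return {"endpoint": self.endpoint, "srm_version": self.srm_version}


@dataclass
class DcacheService(SrmService):
    """Derived record adding an illustrative dCache-specific field."""
    pool_groups: list = field(default_factory=list)

    def markup(self):
        data = self.common_markup()
        data["pool_groups"] = list(self.pool_groups)
        return data


@dataclass
class CastorService(SrmService):
    """Derived record adding an illustrative CASTOR-specific field."""
    service_classes: list = field(default_factory=list)

    def markup(self):
        data = self.common_markup()
        data["service_classes"] = list(self.service_classes)
        return data


# Placeholder endpoint and values.
dcache = DcacheService("httpg://se.example.org:8443/srm/managerv2",
                       "2.2", ["atlas-disk"])
```

A consumer that only understands the base record can still call
common_markup() on any derived record and ignore the rest.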