[glue-wg] Usage of GLUE2 XML realization

Maarten Litmaath Maarten.Litmaath at cern.ch
Fri Nov 13 12:41:06 CST 2009


Weijian Fang wrote:

> We are looking at GLUE2 XML realizations, i.e., the official but
> slightly outdated one
> (http://schemas.ogf.org/glue/2008/05/spec_2.0_d42_r01) and the
> NorduGrid one (http://svn.nordugrid.org/trac/nordugrid/browser/arc1/trunk/doc/tech_doc/infosys/GLUE2.xsd).
> (We are also aware of the TeraGrid GLUE2 XML schema.)
> 
> The official schema and the NorduGrid schema are similar. Both define
> only a single XML element, <Domains>; all the other entities are
> defined as XML types instead of elements. Thus they can be included in
> <Domains> but can never stand on their own. Therefore, under this
> design, in order to update a single piece of information, for
> instance, Domains/AdminDomain/Services/ComputingService/RunningJobs,
> one has to re-publish the whole AdminDomain.
> 
> Our observation is that the current design of the GLUE2 XML schema is
> not optimised for updating part of the information. Is this because
> updating part of the information was never an intended usage pattern of
> GLUE? A validity attribute is defined for each entity. We assume the
> intended usage pattern of the GLUE information model (including the XML
> realization) is to PERIODICALLY publish ALL the information once the
> validity period expires. Are we correct? Many thanks!
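
To make the point in the quoted message concrete, here is a minimal
Python sketch of what "re-publish the whole document" means. The element
names are taken from the path quoted above; the document contents and
the helper names (build_domains, republish) are made up for illustration
and are not part of either schema. It assumes a validator that accepts
only <Domains> as a document root.

```python
import xml.etree.ElementTree as ET

PATH = "AdminDomain/Services/ComputingService/RunningJobs"


def build_domains(running_jobs):
    """Build a tiny, hypothetical <Domains> document."""
    domains = ET.Element("Domains")
    admin = ET.SubElement(domains, "AdminDomain")
    services = ET.SubElement(admin, "Services")
    service = ET.SubElement(services, "ComputingService")
    ET.SubElement(service, "RunningJobs").text = str(running_jobs)
    return domains


def republish(domains, running_jobs):
    """Change one value, then serialize the complete document.

    If <Domains> is the only element a document may start with,
    this is the smallest unit that can be published.
    """
    domains.find(PATH).text = str(running_jobs)
    return ET.tostring(domains)


doc = build_domains(10)
full = republish(doc, 11)

# A stand-alone fragment is well-formed XML, but under the assumption
# above it could not be validated or published on its own.
fragment = ET.tostring(doc.find(PATH))

print(len(full), "bytes republished for a", len(fragment), "byte change")
```

In a real site document the ratio between the two sizes would be far
larger than in this toy example.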

I only have experience with the LDAP schema, but that may also be relevant
to this discussion: yes, we periodically recalculate and publish all the
information.

For the EGEE/WLCG information system the amount of information that the
resources of one site need to provide is not very large.
That information is collected and served by the site's information system
endpoint ("site BDII").
A grid-wide information service instance ("top BDII") collects and serves
the information from all such endpoints.
For EGEE/WLCG the combined information currently amounts to 62 MB from
354 sites and is updated every few minutes.
Middleware clients execute queries that are optimized to return as little
superfluous information as possible.
The information services have indices on popular attributes, to speed up
the processing of the client queries.
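
As a rough illustration of the last two points, the following Python
sketch builds an index on one attribute and answers a query that returns
only the requested attributes. It is not how the BDII or an LDAP server
is implemented; the entries, the attribute names (site, ce, running_jobs,
waiting_jobs) and the helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical published entries; real entries carry many more attributes.
entries = [
    {"site": "site-a", "ce": "ce1.site-a.example", "running_jobs": 12, "waiting_jobs": 3},
    {"site": "site-a", "ce": "ce2.site-a.example", "running_jobs": 0, "waiting_jobs": 0},
    {"site": "site-b", "ce": "ce1.site-b.example", "running_jobs": 40, "waiting_jobs": 7},
]


def build_index(entries, attribute):
    """Map each value of `attribute` to the positions of matching entries."""
    index = defaultdict(list)
    for position, entry in enumerate(entries):
        index[entry[attribute]].append(position)
    return index


def query(entries, index, value, wanted):
    """Look up matching entries via the index and return only `wanted` attributes."""
    return [
        {name: entries[position][name] for name in wanted}
        for position in index.get(value, [])
    ]


site_index = build_index(entries, "site")
result = query(entries, site_index, "site-a", ["ce", "running_jobs"])
print(result)
```

The index avoids scanning every entry, and the attribute list keeps the
reply small. Neither depends on how the data was published, only on how
it is stored and served.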

Does the XML schema hamper such a strategy?

