[glue-wg] XSD aligned to draft 27
Paul Millar
paul.millar at desy.de
Thu Mar 13 11:44:30 CDT 2008
Hi JP,
On Thursday 13 March 2008 15:52:56 JP Navarro wrote:
> I'm less familiar with the XML terminology you use, but I would
> second your suggestion using commoner terminology: we should be able to
> independently publish subsets of the GLUE schema hierarchy. The ability to
> develop and run independent info-providers for subsets of information is a
> very useful design. Did I understand your proposal correctly?
More or less.
What I would like to avoid is that storage service infoProviders publish
information like:
<Grid>
<AdminDomain>
<AdminDomain>
<StorageService>
<ImplementationName>foo</ImplementationName>
<ImplementationVersion>1.0</ImplementationVersion>
<!-- ...etc... -->
</StorageService>
</AdminDomain>
</AdminDomain>
</Grid>
as this requires the SE publisher to know it's part of a distributed Tier-2
site (hence the two levels of AdminDomain elements).
That said, the current XSD doesn't seem to support nested AdminDomain
elements, which would be needed to describe distributed sites.
An alternative would be for the SE to publish information like:
<StorageService>
<ImplementationName>foo</ImplementationName>
<ImplementationVersion>1.0</ImplementationVersion>
<!-- ...etc... -->
</StorageService>
and have the aggregation happen at the site level, which would publish
information like:
<AdminDomain>
<Name>Example Site</Name>
<Services>
<ComputingService>
<!-- CE information goes here -->
</ComputingService>
<StorageService>
<ImplementationName>foo</ImplementationName>
<ImplementationVersion>1.0</ImplementationVersion>
<!-- ...etc... -->
</StorageService>
</Services>
</AdminDomain>
The final aggregation would encapsulate multiple sites within a Grid element:
<Grid>
<AdminDomain>
<!-- Site 1 info -->
</AdminDomain>
<AdminDomain>
<!-- Site 2 info -->
</AdminDomain>
</Grid>
The disadvantage of this approach is that one cannot query the primary SE info
(the XML provided by the SE info provider) with exactly the same query one
would use when querying the top-level aggregation. For example, to extract all
StorageEndpoints for site Example Site, one could use the XPath:
/Grid//AdminDomain[Name='Example Site']/Services/StorageService/StorageEndpoint
Something like:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<!-- String values in select must be quoted XPath literals -->
<xsl:variable name="href" select="'http://glue.example.org/grid-glue'"/>
<xsl:variable name="site" select="'Example Site'"/>
<xsl:template match="/">
<!-- '//' matches AdminDomain at any depth below Grid -->
<xsl:copy-of
select="document($href)/Grid//AdminDomain[Name=$site]/Services/StorageService/StorageEndpoint"/>
</xsl:template>
</xsl:stylesheet>
But, if one substituted the URI of the primary information (via the href
variable), this particular query wouldn't work: the primary XML would not
have the Grid and all AdminDomain elements.
I don't think this is a big deal, though: it makes sense that that query
should return no results when querying the SE info-provider directly, and
there are other queries that would work (e.g., select all StorageEndpoints).
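
To make the difference concrete, here is a minimal sketch in Python (standard
library only) that runs both kinds of query against a primary document and an
aggregated one. The sample documents, the StorageEndpoint content and the
helper name count_matches are assumptions made up for illustration; they are
not taken from the GLUE XSD. ElementTree only supports a subset of XPath, so
the paths are written relative to the document's root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical primary document, as an SE info-provider might publish it.
PRIMARY = """
<StorageService>
  <ImplementationName>foo</ImplementationName>
  <ImplementationVersion>1.0</ImplementationVersion>
  <StorageEndpoint>srm://se.example.org/</StorageEndpoint>
</StorageService>
"""

# Hypothetical top-level aggregation containing the same StorageService.
AGGREGATED = """
<Grid>
  <AdminDomain>
    <Name>Example Site</Name>
    <Services>
      <StorageService>
        <ImplementationName>foo</ImplementationName>
        <ImplementationVersion>1.0</ImplementationVersion>
        <StorageEndpoint>srm://se.example.org/</StorageEndpoint>
      </StorageService>
    </Services>
  </AdminDomain>
  <AdminDomain>
    <Name>Other Site</Name>
    <Services/>
  </AdminDomain>
</Grid>
"""

# Site-specific query: depends on the AdminDomain wrapper being present.
SITE_QUERY = ".//AdminDomain[Name='Example Site']/Services/StorageService/StorageEndpoint"
# Hierarchy-independent query: select all StorageEndpoints, wherever they are.
ANY_QUERY = ".//StorageEndpoint"


def count_matches(xml_text, path):
    """Return how many elements below the document root match the path."""
    root = ET.fromstring(xml_text)
    return len(root.findall(path))


if __name__ == "__main__":
    for label, doc in (("primary", PRIMARY), ("aggregated", AGGREGATED)):
        print(label, count_matches(doc, SITE_QUERY), count_matches(doc, ANY_QUERY))
```

The site-specific query finds nothing in the primary document because there is
no AdminDomain element there, while the "all StorageEndpoints" query matches in
both documents.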
The advantage of publishing with StorageService as the top-level element is
that the SE info-provider need know nothing about the GLUE hierarchy above it.
This should simplify the info-provider and, at the same time, allow the same
information to be published easily under different GLUE hierarchies: for
example, if a site is a member of more than one Grid.
To me, this advantage outweighs the disadvantage.
> One question this raises is how one binds or links these separately
> published subset documents to each other? Would we need to introduce
> attributes in each subset that binds it to other related subsets?
I believe that, currently, how the documents are merged isn't defined.
One approach is to use XSLT to do the merging. There's a (working) toy
implementation that demonstrates that here:
http://www.ogf.org/pipermail/glue-wg/2007-December/000249.html
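
For a rough idea of what such a merge involves, here is a small sketch written
in Python rather than XSLT. It is not the implementation linked above. The
function names aggregate_site and aggregate_grid are hypothetical, and the
AdminDomain/Name/Services layout simply follows the examples earlier in this
message rather than anything the XSD is known to mandate.

```python
import xml.etree.ElementTree as ET


def aggregate_site(site_name, service_docs):
    """Wrap independently published service documents in one AdminDomain.

    service_docs is an iterable of XML strings, each with a service element
    (e.g. StorageService or ComputingService) as its root.
    """
    domain = ET.Element("AdminDomain")
    ET.SubElement(domain, "Name").text = site_name
    services = ET.SubElement(domain, "Services")
    for doc in service_docs:
        # The primary document is inserted unchanged; the info-provider
        # never needs to know which site or Grid it ends up under.
        services.append(ET.fromstring(doc))
    return domain


def aggregate_grid(domains):
    """Collect site-level AdminDomain elements under a single Grid element."""
    grid = ET.Element("Grid")
    grid.extend(domains)
    return grid


if __name__ == "__main__":
    se_doc = (
        "<StorageService>"
        "<ImplementationName>foo</ImplementationName>"
        "<ImplementationVersion>1.0</ImplementationVersion>"
        "</StorageService>"
    )
    site = aggregate_site("Example Site", [se_doc])
    grid = aggregate_grid([site])
    print(ET.tostring(grid, encoding="unicode"))
```

Because the same primary document can be passed to aggregate_site more than
once, the same SE information can appear under more than one Grid without any
change to the info-provider.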
HTH,
Paul.