[glue-wg] LDAP rendering document: new version as an outcome of Lund review

Sun Jun 17 06:43:42 EDT 2012

Florido Paganelli [mailto:florido.paganelli at hep.lu.se] said:
> the document clearly says draft, and we had no other traceable way of
> sharing with the rest of the group, I don't know how to do this better!

There must be other places on the web site ... the only reason the document still says that it's a draft is that the GLUE working group was inactive for something like two years, so there was no forum to discuss it. The intention was that it was a final version apart from any detailed textual comments, hence my wish for any more substantial changes to be in a new document.

> So we have a recommendation draft that nobody actually followed and you
> want us to comment on that? I don't understand.

You seem to think that I don't understand either, despite the fact that I worked on both the document and the implementation! I will say it yet again - the document records the *final* decisions that we made in 2009, and the implementation was of course intended to follow that. There may be mistakes, nothing is perfect, and if noticed they should be corrected, but that is very different from trying to change the decisions that were made - including "negative" decisions like allowing the DIT to be largely an implementation choice.

> We must make it usable for implementers, the previous version was not,

That's obviously not true since we have in fact now got a near-complete implementation in EMI 2. It may of course be that things are unclear and need to be explained better, but again that is not the same as changing the substance.

> > In terms of the document, my view is that it should specify something
> > which is consistent with the current BDII implementation, but on the
> > other hand should not constrain any implementation more than necessary
> > - we may well want to evolve the existing infrastructure in the medium
> > term, and of course other providers, e.g. Nordugrid, may want to
> > produce their own implementation.
> 
> I am already trying to fix that to fit in the bdii-top, at the moment
> I've been successful, I wouldn't be so pessimistic here.

Fix what? As far as I'm aware there is nothing to fix.

> And again, you
> are contradicting yourself if you say there is no mandatory DIT but the
> actual infrastructure is based on a specific DIT!

I don't understand why you find this a difficult concept. The intention is that the *specification* (the document) leaves some things open to be an implementation choice, including the details of the DIT. A given implementation, in this case the BDII, obviously has to do some particular thing. However, a different implementation, or the same implementation in future, could do it differently. If you specify the details in the document you remove that freedom to change - in particular, if a particular structure gets embedded in client queries it will probably be fixed forever since we don't in general control all clients.

> We should then make the infrastructure independent of the DIT. But my
> idea is that that is overkill on LDAP.

You keep saying that, but you still haven't given any practical examples. My argument is basically this:

1) I don't believe there is any *need* for queries to refer to the DIT. Queries using only objectclass and attribute filters are sufficient for all needs, and will find objects wherever they are in the tree. If suitable attributes are indexed in the underlying DB then experience says that the BDII performance is not impacted by querying from the root, and in any case the nature of many queries is that they need to cover the entire information system.

2) Queries do of course need to know the base DN, but there is no need for it to be hard-coded; it can e.g. be passed in an environment variable or derived from the information system itself. Hence for example we can have code which can query either a site BDII or a top BDII simply by passing a different base DN.

3) At the moment we have a BDII structure which is based on our traditional GLUE 1 model with a three level hierarchy (resource, site, top). However, we also know that we may want to evolve that structure, for example to separate different kinds of information. Since there is no need to specify either the DIT or the BDII hierarchy I think we should not do so - then we have the freedom to evolve the system gradually without breaking any clients. If we specify a structure and clients embed it in their queries then any change will be difficult and disruptive, perhaps to the point of being impossible. Note that GLUE 2 itself is the first backward-incompatible change we've made; we started discussing it in 2005, and 7 years later we are still trying to manage the migration! I suspect that we will never be able to do such a thing again.
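
To make points 1) and 2) concrete, here is a rough sketch in Python. It is only a toy in-memory model, not BDII or client code: the environment variable name GLUE2_BASE_DN, the DNs, the service type value and the helper functions are all made up for illustration. It shows a search that uses only objectclass and attribute filters finding the same object under two different tree layouts, with the base DN taken from the environment rather than hard-coded.

```python
import os

# Hypothetical variable name; the point is only that the base DN is
# supplied from outside rather than written into the query code.
BASE_DN_ENV = "GLUE2_BASE_DN"


def base_dn(default="o=glue"):
    """Return the search base from the environment, with a fallback."""
    return os.environ.get(BASE_DN_ENV, default)


def is_under(dn, base):
    """True if dn equals base or lies anywhere below it (naive string check)."""
    dn, base = dn.lower(), base.lower()
    return dn == base or dn.endswith("," + base)


def search(entries, base, objectclass, **attrs):
    """Subtree search using only objectclass and attribute equality filters.

    Where an entry sits below the base is never inspected.
    """
    hits = []
    for dn, entry in entries:
        if not is_under(dn, base):
            continue
        if objectclass not in entry.get("objectClass", []):
            continue
        if all(value in entry.get(name, []) for name, value in attrs.items()):
            hits.append(entry)
    return hits


# One illustrative object, published under two different (made-up) layouts.
SERVICE = {
    "objectClass": ["GLUE2Service"],
    "GLUE2ServiceID": ["svc1"],
    "GLUE2ServiceType": ["org.example.ce"],
}
FLAT = [("GLUE2ServiceID=svc1,o=glue", SERVICE)]
NESTED = [
    ("GLUE2ServiceID=svc1,GLUE2GroupID=resource,"
     "GLUE2DomainID=SITE-A,GLUE2GroupID=grid,o=glue", SERVICE),
]

for layout in (FLAT, NESTED):
    found = search(layout, base_dn(), "GLUE2Service",
                   GLUE2ServiceType="org.example.ce")
    print([entry["GLUE2ServiceID"][0] for entry in found])
```

With the variable unset, both layouts give the same result. Pointing the same code at a narrower base DN (a site rather than the whole tree, say) needs no change to the filter.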

> > I don't think there is any
> > necessity to specify the DIT since clients should not need to rely on
> > it. If you do regard it as necessary to specify aspects of the DIT I
> > would like to see the specific use cases which you think require it.
> 
> but you just said current clients rely on it!

No I didn't. Perhaps my terminology is confusing: by client I mean some piece of code which queries the information system to retrieve information. Obviously the BDIIs themselves do need to know the structure because they create it! However they are not what I mean by clients. Also, even the BDIIs only depend on the structure to the extent that they must. For example, a top BDII needs to know that site BDIIs exist but it doesn't need to know what information they contain; it can just copy it.

> isn't enough to change the schema? UTF8 strings will be there anyway,
> it's just a matter of encoding, isn't it?

Unfortunately it isn't that simple. The LDAP servers in the BDIIs validate the published information against the schema definition, and if an attribute is non-compliant the entire object is rejected. Any objects below that in the DIT will then also be rejected because their parent doesn't exist. Hence we basically have to change the schema in every BDII in the system before publishing any non-ASCII characters if we don't want to have objects randomly vanishing. (The case I debugged recently was a Spanish site which quite reasonably had a description string of "Centro de Supercomputación de Galicia (CESGA)", which resulted in their GlueSite object being absent.)
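
The cascade can be illustrated with a toy model in Python. This is not how the LDAP server actually performs validation; it just assumes, for illustration, that every attribute value must be pure ASCII and that an entry can only be added if its parent is already present. The DNs, the child entry and the helper names are invented.

```python
def is_ascii(value):
    """Stand-in for an ASCII-only attribute syntax check."""
    try:
        value.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


def parent_dn(dn):
    """Naive parent: everything after the first comma (ignores escaped commas)."""
    _, _, rest = dn.partition(",")
    return rest


def load(entries, root):
    """Add entries in order; reject invalid ones and anything below them."""
    accepted = {root}
    rejected = []
    for dn, attrs in entries:
        values_ok = all(is_ascii(v) for vals in attrs.values() for v in vals)
        if values_ok and parent_dn(dn) in accepted:
            accepted.add(dn)
        else:
            rejected.append(dn)
    accepted.discard(root)
    return accepted, rejected


# Illustrative DNs only; the child entry is hypothetical.
SITE_DN = "GlueSiteUniqueID=EXAMPLE,mds-vo-name=EXAMPLE,o=grid"
ENTRIES = [
    ("mds-vo-name=EXAMPLE,o=grid", {"objectClass": ["MDS"]}),
    (SITE_DN, {
        "objectClass": ["GlueSite"],
        "GlueSiteDescription": ["Centro de Supercomputación de Galicia (CESGA)"],
    }),
    ("cn=child," + SITE_DN, {"objectClass": ["example"]}),
]

accepted, rejected = load(ENTRIES, "o=grid")
print("accepted:", sorted(accepted))
print("rejected:", rejected)
```

In this model the site entry is dropped because of the single accented character, and the child entry is dropped purely because its parent is missing, even though the child itself is valid.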

Stephen