[glue-wg] ComputingService and Endpoints, a point of view

Mon Aug 27 08:05:54 EDT 2012

Florido Paganelli [mailto:florido.paganelli at hep.lu.se] said:
> Yes I understand what you mean. But what if I have a delegation
> endpoint that can be used both for computing and for storage?

Then it should clearly be a separate Service.

> I would rather call it Endpoint, add associations pointing to both the
> StorageService and ComputingService it serves, give it the same ID and
> place it in both Computing and Storage services.

You can't do that - an Endpoint is related to exactly one Service. Also an object with a given ID should only be published once.
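
As a minimal sketch of those two constraints (one owning Service per Endpoint, each ID published once), with the shared delegation endpoint modelled as its own Service. The class names, the `Publisher` registry and the `urn:` IDs are hypothetical illustrations, not the GLUE 2 rendering:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Endpoint:
    id: str
    service_id: str  # exactly one owning Service, never a list
    interface: str


@dataclass
class Service:
    id: str
    type: str
    endpoints: list = field(default_factory=list)


class Publisher:
    """Hypothetical in-memory index that refuses duplicate IDs."""

    def __init__(self):
        self.objects = {}

    def publish(self, obj):
        if obj.id in self.objects:
            raise ValueError(f"ID already published: {obj.id}")
        self.objects[obj.id] = obj

    def add_endpoint(self, endpoint):
        # Look up the single owning Service, then publish the Endpoint once.
        service = self.objects[endpoint.service_id]
        self.publish(endpoint)
        service.endpoints.append(endpoint)


index = Publisher()
index.publish(Service("urn:svc:computing", "computing"))
index.publish(Service("urn:svc:storage", "storage"))
# The shared delegation endpoint gets a Service of its own.
index.publish(Service("urn:svc:delegation", "delegation"))
index.add_endpoint(Endpoint("urn:ep:delegation", "urn:svc:delegation", "delegation"))
```

Trying to place the same Endpoint ID under the storage Service as well raises `ValueError` in this sketch, which is the behaviour described above.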

> A client retrieves all the 60 of them. Then, it might want to query
> information endpoints to scan for submission endpoints.

I don't see the point of that - if you have an index it should contain all the endpoints, so why would you need to scan again? In any case it doesn't take long: I can scan the entire GLUE 1 BDII for CEs which support the atlas VO in about 0.1 sec, which is tiny compared with anything which has a GSI overhead. For example, copying a small file into an SE takes 13 secs.

> I'll make it easy here. A single box might have more than one
> information/submission endpoint, that means deciding which
> information/submission endpoints belonging to the same box one doesn't
> want to query.

I still don't know why you're obsessed with boxes. Which physical machine something runs on should not be relevant to anything.

> We then have different Service.IDs for each endpoint, because
> information endpoints belong to different services than submission
> endpoints.
> 
> The client cannot know which relationship exists between services,

Yes it can, if you add it as a Service-Service association.
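
A rough sketch of how a client could use such an association, assuming it is published as a simple table from one Service ID to the IDs of related Services. The table layout, the field names and the IDs are hypothetical:

```python
# Service ID -> IDs of related Services (hypothetical association table).
related = {
    "urn:svc:computing": ["urn:svc:info"],
    "urn:svc:info": ["urn:svc:computing"],
}

# Each Endpoint belongs to exactly one Service.
endpoints = [
    {"id": "urn:ep:submit", "service": "urn:svc:computing", "kind": "submission"},
    {"id": "urn:ep:ldap", "service": "urn:svc:info", "kind": "information"},
]


def related_endpoints(endpoint_id, kind):
    """Follow the Service-Service association from one endpoint to its peers."""
    owner = next(e["service"] for e in endpoints if e["id"] == endpoint_id)
    wanted = set(related.get(owner, []))
    return [e["id"] for e in endpoints
            if e["service"] in wanted and e["kind"] == kind]


submit_for_info = related_endpoints("urn:ep:ldap", "submission")
```

Here the client goes from the information endpoint to the matching submission endpoint through the published association, without contacting either service.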

> then it must query information endpoints.

No, it should only need to query one index, whether BDII, EMIR or indeed GOC DB, which should contain everything. I can't see any point in having an index which is not complete.
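
To illustrate the single-index approach, here is a small sketch with the index stood in for by an in-memory list. The record fields and IDs are made up for the example and are not the BDII or EMIR schema:

```python
# Hypothetical contents of one complete index (BDII, EMIR, ...).
INDEX = [
    {"id": "urn:ep:a-submit", "kind": "submission", "vos": {"atlas", "cms"}},
    {"id": "urn:ep:a-info", "kind": "information", "vos": {"atlas", "cms"}},
    {"id": "urn:ep:b-submit", "kind": "submission", "vos": {"lhcb"}},
]


def submission_endpoints(index, vo):
    """One pass over the index; IDs are unique, so no cross-checking is needed."""
    return {e["id"]: e for e in index
            if e["kind"] == "submission" and vo in e["vos"]}


targets = submission_endpoints(INDEX, "atlas")
```

Because every endpoint appears once in the index under a unique ID, the result is already free of duplicates and there is no second set of results to compare against.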

> Hence I must check all the 40 submission endpoints in the index against
> the 200 retrieved from the information endpoints, in order not to
> submit twice to the same endpoint.

I think that would be a crazy way to do it - it certainly doesn't correspond to our current architecture.

> It is easy to see that as the number of job requests increases we might
> incur an incredible amount of work just to submit a single job.

It's easy to see that if you do things in an inefficient way it will be inefficient, but I don't think that's any kind of argument.

Stephen