[glue-wg] Speedy information [WAS: Comparison with CIM]

Fri May 9 04:58:06 CDT 2008

glue-wg-bounces at ogf.org 
> [mailto:glue-wg-bounces at ogf.org] On Behalf Of Paul Millar said:
> Does GLUE require an information system to respond quickly?  
> If so, on what time-scale?

Well, we use the CE information for scheduling, so it needs to be fast
enough to give a reasonable representation of e.g. the number of jobs in
a queue. As a rough guide, if you have 1000 job slots and jobs
typically run for 10 hours, you might have a transition every 30 seconds
or so. However, you don't need accuracy down to the level of a single
job, so a latency of a few minutes is OK and typically that's what we
have now. On the other hand, a latency of an hour would not be good
enough, because the queue state could change quite dramatically in that time.
It's also worth remembering that the batch schedulers themselves have
cycle times, typically something like 30 seconds I think, so there is no
point in the info system being faster than that.
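
For anyone who wants to redo the arithmetic with their own site's numbers,
here is a minimal sketch. It assumes all slots are busy and completions are
spread evenly over time; the function names are illustrative only and not
part of any GLUE or BDII tool.

```python
def mean_transition_interval(job_slots: int, mean_runtime_hours: float) -> float:
    """Average seconds between job completions, assuming every slot is busy
    and completions are spread evenly over time."""
    return mean_runtime_hours * 3600 / job_slots


def transitions_during(latency_seconds: float, interval_seconds: float) -> float:
    """Expected number of job transitions that happen while the published
    information is latency_seconds old."""
    return latency_seconds / interval_seconds


# 1000 slots and 10-hour jobs, as in the rough guide above.
interval = mean_transition_interval(1000, 10)
print(f"one transition every {interval:.0f} s")
print(f"transitions in 5 minutes: {transitions_during(300, interval):.1f}")
print(f"transitions in 1 hour:    {transitions_during(3600, interval):.0f}")
```

With these inputs the interval comes out at 36 seconds, so a five-minute
latency hides fewer than ten transitions, while an hour hides about a
hundred, i.e. a tenth of the slots.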

It has often been suggested that we should separate dynamic and static
data so we could get frequent updates of the small amount of dynamic
information and longer-lived caches for the static stuff. I think
Laurence has done some work along those lines for the BDII, but at the
moment everything is treated the same way.
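
To make the dynamic/static split concrete, here is a rough sketch of a cache
that applies a short lifetime to dynamic attributes and a long one to
everything else. This is not how the BDII works and not a description of
Laurence's work; the class name, the TTL values and the `fetch` callback are
all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Set, Tuple


class SplitTtlCache:
    """Caches attribute values with a short TTL for dynamic keys and a
    long TTL for static ones (hypothetical helper, for illustration)."""

    def __init__(
        self,
        fetch: Callable[[str], object],
        dynamic_keys: Set[str],
        dynamic_ttl: float = 60.0,      # assumed value, in seconds
        static_ttl: float = 3600.0,     # assumed value, in seconds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._dynamic_keys = dynamic_keys
        self._dynamic_ttl = dynamic_ttl
        self._static_ttl = static_ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[float, object]] = {}

    def get(self, key: str) -> object:
        """Return the cached value, refetching it once its TTL has expired."""
        ttl = self._dynamic_ttl if key in self._dynamic_keys else self._static_ttl
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < ttl:
            return cached[1]
        value = self._fetch(key)
        self._entries[key] = (now, value)
        return value
```

The point of the sketch is only that the frequently refreshed set stays
small: a key listed in `dynamic_keys` is refetched after a minute, and
anything else is served from the cache for an hour.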

> So, worst-case delay is the simple sum of these: 291s or the
> best part of five minutes.

That's probably about right. In the original design for the WMS it
queried the GRIS directly before submitting a job to get more recent
data than from the II, but we disabled that a long time ago because it
was inefficient and didn't make much practical difference - where we
have problems they tend to be more fundamental, e.g. poor algorithms for
EstimatedTraversalTime or black-hole WNs which fail jobs quickly and
hence give a small ERT.

Stephen