[glue-wg] When is data stale?
Paul Millar
paul.millar at desy.de
Tue Apr 21 14:07:20 EDT 2015
Hi Stephen,
First, I must apologise if you felt my emails were in any way abusive.
They were certainly not intended that way; rather, I would like the
effort we have all invested in GLUE and the grid infrastructure to be
used properly.
Currently, I see different groups developing their own information
systems, running in parallel with GLUE+BDII, because of problems (both
perceived and actual) with BDII. I would like these problems addressed
and find the very slow progress frustrating.
Onto the specific points...
On 21/04/15 13:00, stephen.burke at stfc.ac.uk wrote:
> Paul Millar [mailto:paul.millar at desy.de] said:
>> From your replies, you appear to have an internal definition of a
>> CreationTime that is to yourself clear, self-obvious and almost
>> axiomatic. Unfortunately, you cannot seem to express that idea in
>> the terms defined within GLUE-2.
>
> OK, let's have one more try. The concept which you seem to think is
> missing is "entity instance". That may not be explicitly defined but
> it's a general computing concept, and I find it hard to see that you
> could make much sense of the schema without it. The schema defines
> entities as collections of attributes with types and definitions; an
> instance of that entity has specific values for the attributes. One
> of those attributes is CreationTime. Instances are created in a way
> completely unspecified by the schema document, but whatever the
> method the CreationTime is the time at which that creation occurs
> (necessarily approximate since creation will take a finite time). If
> a new instance is created it gets a new CreationTime even if all the
> other attributes happen to be the same. However, if an instance is
> copied the copy preserves *all* the attribute values including
> CreationTime - if you change that it's a new instance and not a
> copy.
Thanks, that makes sense.
Just to confirm: you define two general mechanisms through which data is
acquired: creating an entity instance and copying an entity instance.
In concrete terms, a resource-level BDII+info-provider creates entity
instances, while site- and top-level BDIIs copy entity instances. This
breaks the symmetry between the levels: CreationTime is only ever
assigned at resource-level BDIIs.
Perhaps such a description is trivial or "well known", but it seems to
me that GLUE-2 when used in a hierarchy (like the WLCG info system)
would benefit from such a description. This could go in GLUE-2 itself,
or perhaps in a hierarchy profile document.
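To illustrate why the create/copy distinction matters for staleness, here is a minimal sketch (my own, not taken from BDII or the validator). Because copies preserve CreationTime, a client of a top-level BDII can compute the age of an entity instance relative to its creation at the resource level. The function names age_seconds and is_stale are hypothetical, the timestamps are assumed to be ISO 8601 values as published in CreationTime, the validity is assumed to be a number of seconds, and GNU date is assumed for the -d option.

```shell
# age_seconds CREATED [NOW]: seconds elapsed between two ISO 8601 timestamps.
# NOW defaults to the current time. Assumes GNU date (for the -d option).
age_seconds() {
    local created now
    created=$(date -u -d "$1" +%s) || return 2
    now=$(date -u -d "${2:-now}" +%s) || return 2
    echo $(( now - created ))
}

# is_stale CREATED VALIDITY [NOW]: succeeds (exit 0) if the entity instance
# is older than VALIDITY seconds; returns 1 if it is still fresh and 2 if a
# timestamp could not be parsed.
is_stale() {
    local age
    age=$(age_seconds "$1" "${3:-now}") || return 2
    [ "$age" -gt "$2" ]
}
```

For example, `is_stale 2015-04-21T14:00:00Z 300 2015-04-21T14:07:20Z` succeeds, since the instance is 440 seconds old. If a site- or top-level BDII rewrote CreationTime when copying, the same check would measure only the age of the most recent copy.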
>> The validator should expose bugs, not hide them. How else are
>> sites going to fix these bugs?
>
> The point is that sites can't fix middleware bugs [..]
What you say is correct. I would also say that only sites can deploy
the bug-fixes.
> and hence
> shouldn't get tickets for them. If tickets were raised for errors
> which would always occur and can't be fixed until a new middleware
> release is available the validator would have been rejected - sites
> must be able to clear alarms in a reasonably short time. That's also
> why only ERRORs generate alarms - ERRORs are always wrong, WARNINGs
> may be correct so a site may be unable to remove them. Of course, the
> validator can still be run outside the Nagios framework without the
> known issues mask.
Yes, it's always a bit fiddly dealing with a new test where the
production instance currently fails.
>> It would be good if we could check this: I think there's a bug in
>> BDII where stale data is not being flushed.
>
> Maria has been on maternity leave for several months, so all this has
> been on hold. I think she should be back fairly soon, but no doubt it
> will take a while to catch up. A couple of years ago there was a bug
> where old data wasn't being deleted, but it should be out of the
> system by now. Also bear in mind that top BDIIs can cache data for up
> to four days.
Sure, I knew Maria was away; but I was hoping there would be someone
covering for her, and that the process wasn't based on her heroic
efforts alone.
>> If the validator is hiding bugs, and the policy is to do so
>> whenever bugs are found, then it is useless.
>
> The policy is to submit a ticket to the middleware developers and
> keep track of it. There's no point in repeatedly finding the same
> bug.
Yes, that is certainly a sound policy.
>> AFAIK, there's no intrinsic reason why there should be anything
>> beyond a 2--3 minute delay: the time taken to fetch the updated
>> information from a site-level BDII.
>
> The top BDII has to fetch information from several hundred site BDIIs
> and the total data volume is large. It takes several minutes to do
> that. And site BDIIs themselves have to collect information from the
> resource BDIIs at the site. Back in 2012 Laurence did some tests to
> see if the top BDII could scale to read from the resource BDIIs
> directly, but the answer was no, it can cope with O(1000) sources but
> not O(10000). Also the resource BDII runs on the service and loads it
> to some extent so it can't update too often - a particular issue for
> the CE, which is the service with the fastest-changing data.
I'm not sure I agree here.
First, the site-level BDII should cache information from resource-level
BDIIs, as resource-level BDIIs cache information from info-providers.
This means that load from top-level BDIIs is only experienced by
site-level BDIIs.
A complete (top-level) dump only takes a few seconds:
paul@celebrimbor:~$ /usr/bin/time -f %e ldapsearch -LLL -x \
    -H ldap://lcg-bdii.cern.ch:2170 -b o=glue > /dev/null
4.49
paul@celebrimbor:~$ /usr/bin/time -f %e ldapsearch -LLL -x \
    -H ldap://lcg-bdii.cern.ch:2170 -b o=grid > /dev/null
5.15
Let's say it takes about 10--15 seconds in total.
A top-level BDII updates itself by the same process (invoking the
ldapsearch command), but against the site-level BDIIs. Assuming the
process is bandwidth limited, an update should also take ~10--15 seconds,
as the total amount of information sent over the network should be about
the same. (Note that this doesn't take into account TCP slow-start, so
it may be a slight underestimate, but see below for why I don't believe
this is a real problem.)
Let's assume instead that the problem isn't bandwidth limited, and that
the update frequency is limited by the latency of the individual requests
to site-level BDIIs.
I surveyed the currently registered site-level BDIIs:
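# Build an LDAP filter matching the endpoints of every site-level BDII.
# (perl joins folded LDIF lines; /g handles values folded more than once.)
filter=$(ldapsearch -LLL -x -H ldap://lcg-bdii.cern.ch:2170 -b o=glue \
    '(GLUE2ServiceType=bdii_site)' GLUE2ServiceID \
  | perl -p00e 's/\n //g' \
  | awk 'BEGIN{printf "(|"}
         /^GLUE2ServiceID/{printf "(GLUE2EndpointServiceForeignKey="$2")"}
         END{print ")"}')

# Time a complete o=glue dump from each site-level BDII endpoint,
# appending the elapsed seconds to times.dat.
for url in $(ldapsearch -LLL -x -H ldap://lcg-bdii.cern.ch:2170 -b o=glue \
    "$filter" GLUE2EndpointURL \
  | perl -p00e 's/\n //g' \
  | sed -n 's%^GLUE2EndpointURL: \(ldap://[^:]*:[0-9]*/\).*%\1%p'); do
  /usr/bin/time -a -o times.dat -f %e \
    ldapsearch -LLL -x -H "$url" -o nettimeout=30 -b o=glue > /dev/null
done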
This query covered 318 sites. The ldapsearch command failed for 5
endpoints and the query timed out for 3 endpoints.
Of the remaining 310 sites, the maximum time for ldapsearch to complete
was 19.21 seconds and the median was 0.44 seconds. For 82% of sites,
ldapsearch completed within a second; for 92% it completed within two
seconds.
Repeating this for GLUE-1.3 showed similar statistics.
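For anyone wanting to repeat the measurement, the figures above can be derived from times.dat with something like the following sketch. The function name summarise_times is hypothetical. I am assuming that GNU time writes an extra "Command exited with non-zero status" line to the file for failed commands, so only purely numeric lines are kept. Note that this does not by itself drop the elapsed times of failed or timed-out endpoints; those would have to be excluded separately.

```shell
# summarise_times FILE: print the sample count, median, maximum, and the
# percentage of samples that completed within 1 s and within 2 s.
summarise_times() {
    # Keep numeric lines only, then sort so that awk can pick the median.
    grep -E '^[0-9]+(\.[0-9]+)?$' "$1" | sort -n | awk '
        { t[NR] = $1; if ($1 <= 1) le1++; if ($1 <= 2) le2++ }
        END {
            if (NR == 0) { print "count 0"; exit }
            # Middle value for odd counts, mean of the middle pair otherwise.
            median = (NR % 2) ? t[(NR + 1) / 2] : (t[NR / 2] + t[NR / 2 + 1]) / 2
            printf "count %d\nmedian %.2f\nmax %.2f\n", NR, median, t[NR]
            printf "within_1s %.0f%%\nwithin_2s %.0f%%\n", 100 * le1 / NR, 100 * le2 / NR
        }'
}
```

Running `summarise_times times.dat` prints one "name value" pair per line, which makes it easy to compare the GLUE-2 and GLUE-1.3 runs side by side.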
This suggests to me that information from responsive sites could be
maintained with a lag of order 10 seconds to a minute (depending on the
site); information from sites with badly performing site-level BDIIs
would be updated less often.
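As a rough sanity check on that estimate, the same times.dat can be used to bound how long one refresh cycle would take in the latency-limited case. This is only a sketch under simple assumptions: the function name cycle_bounds is hypothetical, fetching the sites one after another costs the sum of the individual times, and with a given number of parallel fetches the cycle can take no less than the larger of (sum / workers) and the slowest single site. It ignores the cost of injecting the data into slapd.

```shell
# cycle_bounds FILE WORKERS: print the serial refresh time (sum of all
# samples) and a lower bound on the refresh time with WORKERS parallel fetches.
cycle_bounds() {
    grep -E '^[0-9]+(\.[0-9]+)?$' "$1" | awk -v w="$2" '
        { sum += $1; if ($1 > max) max = $1 }
        END {
            # Perfect load balancing gives sum / w, but no schedule can
            # finish before the slowest single fetch has completed.
            bound = sum / w
            if (max > bound) bound = max
            printf "serial %.2f\nparallel_lower_bound %.2f\n", sum, bound
        }'
}
```

The second figure is a lower bound rather than a prediction; a real scheduler would land somewhere between the two numbers.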
I haven't investigated injecting this information: BDII now generates an
LDIF diff which is injected into the slapd. This is distinct from the
original approach, which employed a "double-buffer" with two slapd
instances.
Still, I currently don't see why a top-level BDII must lag by some 30
minutes.
>> Yeah, typical grid middleware response: rewrite the software rather
>> than fix a bug.
>
> I could say that your response is typical: criticism without
> understanding.
Perhaps, but I have reviewed the BDII code-base in the past and I know
roughly how it works.
My simple investigation suggests maintaining a top-level BDII with
sub-minute latencies is possible with at least 80--90% of site-level BDIIs.
Of course I may be missing something here, but it certainly seems
feasible to achieve much better than is currently being done.
Cheers,
Paul.