[glue-wg] When is data stale?

Tue Apr 21 12:07:09 EDT 2015

Hi Paul,

Ahaha ok ok, touché :D I should not have replaced "generated" with
"collected", the usual last-minute change... let me fix it and see if
you like it...

On 2015-04-21 17:17, Paul Millar wrote:
> [...]
>> The CreationTime is the number of seconds elapsed since the Epoch
>> (00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970)
>> formatted as described in the GLUE2 document when BOTH these two are
>> true:
>> 1) the GLUE2 record for a GLUE2 entity is being generated
>> 2) the data contained in the record, that is, the data that describes
>> the entity the record refers to, is being collected.
> 
> Great, thanks for taking the time to define this.
> 

We might replace the word "collected" in 2) with "generated", as I did
in 1), but I'm not sure that will solve the issue... you will reuse the
argument that the BDII is actually generating records, and I will have
to say it's technically true but not theoretically. My reasoning is
below.

>> I see no fallacy nor circularity. It's a definition. It does
>> NOT require the knowledge of provider, resource- whatever-BDII
> 
> Yes, absolutely.
> 
>> Of course, if you want to be really picky there is a time drift between
>> 1) and 2) because a Turing machine is sequential. But we can avoid this
>> discussion I hope...
> 
> Certainly, despite evidence to the contrary, I don't want to nitpick.
> 
> Now, I believe your definition also applies to a site-level BDII.  When
> it refreshes information, it generates a new record and populates this
> with information it collects from the resource-level BDII.  Conditions
> 1) and 2) are satisfied, so the site-level BDII may set CreationTime.
> 

IF and ONLY IF the site- and top-level BDIIs generate a new record. But
AFAIK, and from what Stephen said, _they don't_. Or rather, the
site-level BDII generates only the site (AdminDomain) record. Otherwise
it just copies records from a source and puts them in another database
untouched.
It does change the key, though, but these are implementation
technicalities of LDAP's own hierarchical organization.
It just generates new indexes to reindex the data -- these are even
external to GLUE2, as a matter of fact; that was part of the LDAP
implementation work.
Formally the record from the resource level is _copied_, not generated.
But we're already describing how to handle aggregation in a
hierarchical world here... we're already outside GLUE2, IMHO.
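
To make the copy-versus-generate distinction concrete, here is a
minimal Python sketch. It is NOT how the BDII is implemented: the
function name copy_record, the dictionary layout and the example DNs
are made up for illustration, and I am assuming the LDAP attribute
names GLUE2EntityCreationTime and GLUE2EntityValidity. The only point
is that the key (the DN) changes while the attributes, CreationTime
included, are carried over untouched:

```python
import copy


def copy_record(record, old_suffix, new_suffix):
    """Relocate a record under a new DN suffix without touching its attributes.

    Hypothetical helper: a record is modelled as a dict holding a "dn"
    string and an "attrs" dict.
    """
    if not old_suffix or not record["dn"].endswith(old_suffix):
        raise ValueError("record is not located under %r" % old_suffix)
    relocated = copy.deepcopy(record)
    # Only the key changes: strip the old suffix and append the new one.
    relocated["dn"] = record["dn"][: -len(old_suffix)] + new_suffix
    return relocated


# Illustrative record as a resource-level source might publish it (made-up DN).
resource_record = {
    "dn": "GLUE2ServiceID=svc1,o=glue",
    "attrs": {
        "GLUE2EntityCreationTime": "2015-04-21T16:07:09Z",
        "GLUE2EntityValidity": "3600",
    },
}

site_record = copy_record(
    resource_record, "o=glue", "GLUE2DomainID=SITE,o=glue"
)
```

After the call, site_record has a different DN but exactly the same
attributes as resource_record, so its CreationTime still refers to the
moment the infoprovider produced the data.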

> There's a (translational?) symmetry between a site-level BDII fetching
> information from resource-level BDIIs, and a resource-level BDII
> fetching information from info-providers.

Not really: infoproviders can be seen as generating information from
the objects GLUE2 describes; the other components do not generate it.
The difference is that an infoprovider inspects the object to model,
while BDII collectors don't. They just repeat the information.

> [...]
>>
>> So if you want to know if the information publishing "got stuck" you'd
>> better be a good sysadmin and use a decent process monitoring tool, let
>> it be Nagios or a simple cronjob that sends emails...
> 
> As with all things: hindsight is 20-20 and failure modes oft choose the
> gaps in monitoring.
> 
> In this particular case, the "mechanical" refresh process was working
> correctly, with the site-level BDII fetching data correctly.  Direct
> monitoring of BDII/LDAP object creation time (the built-in
> 'createTimestamp' attribute) would not have revealed any problem.
> 
> Publishing CreationTime and Validity (with the semantics of
> now()>CreationTime+Validity => problem) would have allowed a script to
> detect the problem.
> 
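
Just to be sure we are talking about the same check, this is how I read
it, as a minimal Python sketch. The function names are hypothetical,
and I am assuming CreationTime is published as a UTC timestamp of the
form 2015-04-21T16:07:09Z and Validity as a number of seconds:

```python
from datetime import datetime, timedelta, timezone


def parse_creation_time(text):
    """Parse a UTC timestamp such as '2015-04-21T16:07:09Z' (assumed format)."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_stale(creation_time, validity_seconds, now=None):
    """Return True when now() > CreationTime + Validity."""
    if now is None:
        now = datetime.now(timezone.utc)
    created = parse_creation_time(creation_time)
    return now > created + timedelta(seconds=validity_seconds)
```

Passing "now" explicitly makes the check easy to try against a fixed
instant instead of the wall clock.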

Well, we go back to the concept that you cannot monitor using a site
BDII because it only *copies* data *by design*. It does NOT generate it.
To my understanding you're saying that we MIGHT use the hierarchical
structure of the BDII and the values of CreationTime for this purpose,
but in my opinion this is wrong - by design as well. Let me explain why.
My claim is that you're monitoring on the wrong side -- as a matter of
fact, this will not even tell you what is happening, just that
"something bad is happening somewhere, where I copied the data from".
In principle it does not even tell you where the data is from... unless
you look at the site-BDII config files :( where the actual URLs are.
But that is expected; BDII is not a monitoring infrastructure, and GLUE2
was not designed for monitoring... maybe you're right, it can be used
for that, but we would need to write other software that does that, and
maybe a MonitoringService entity...

> This isn't to say this is the only way of achieving this, nor that it is
> necessarily the best way; however, it did seem to fit with the idea of
> CreationTime and Validity.
> 

Not to me. As I said, hierarchical representation and aggregation of
GLUE2 data were never discussed. That's why I assume that intermediate
levels should not change the data passed up by the levels below, but at
most add records (e.g. the site-level record). It might be used as you
say if we define a hierarchical approach to aggregation of multiple
GLUE2 documents.

> Publishing just the CreationTime allows a script to detect the problem,
> provided it happens to know the refresh period.  Although this is less
> ideal, it's probably the best I can do, given everyone else feels
> Validity has a different meaning.
> 
> Cheers,
> 
> Paul.

Cheers,
Florido
-- 
==================================================
 Florido Paganelli
   ARC Middleware Developer - NorduGrid Collaboration
   System Administrator
 Lund University
 Department of Physics
 Division of Particle Physics
 BOX118
 221 00 Lund
 Office Location: Fysikum, Hus B, Rum B313
 Office Tel: 046-2220272
 Email: florido.paganelli at REMOVE_THIShep.lu.se
 Homepage: http://www.hep.lu.se/staff/paganelli
==================================================