[glue-wg] When is data stale?

JP Navarro navarro at mcs.anl.gov
Mon Apr 20 09:45:10 EDT 2015


Warren and I also recently discussed this topic as we’re expanding XSEDE’s GLUE2 implementation. Our two leading interpretations were 1) how soon the publisher is expected to re-publish/refresh the information, and 2) how long before consumers should consider ignoring the information. 1) is generally tied to the refresh interval (plus some propagation fudge factor), and 2) might be tied to how long to keep using the information if there are transient failures or planned outages.


So slow-changing information, such as ApplicationEnvironment, may have a long Validity, and we would want the information to be considered valid even during an extended outage, even though during normal operations it might be refreshed hourly.
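
To make the two interpretations concrete, here is a minimal Python sketch. It assumes a record carries the GLUE2 CreationTime and a Validity expressed in seconds; the function names, the hourly refresh interval and the one-week Validity are made up for illustration and are not something XSEDE has implemented.

```python
from datetime import datetime, timedelta, timezone


def refresh_due(creation_time, refresh_interval, now):
    """Interpretation 1: the publisher is expected to have re-published by now."""
    return now >= creation_time + refresh_interval


def is_stale(creation_time, validity_seconds, now):
    """Interpretation 2: consumers should consider ignoring the information."""
    return now > creation_time + timedelta(seconds=validity_seconds)


# Hypothetical slow-changing record: refreshed hourly in normal operation,
# but published with a one-week Validity so it survives an extended outage.
created = datetime(2015, 4, 20, 9, 0, tzinfo=timezone.utc)
now = created + timedelta(hours=6)      # six hours into an outage
hourly = timedelta(hours=1)             # assumed refresh interval
one_week = 7 * 24 * 3600                # assumed Validity, in seconds

overdue = refresh_due(created, hourly, now)           # refresh was missed
still_usable = not is_stale(created, one_week, now)   # but data is still usable
```

With these made-up numbers the record is overdue for a refresh yet not stale, which is the behavior described above for slow-changing information. Setting Validity equal to the refresh interval would instead make consumers drop the record one hour into the outage.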

The lack of clarity in the spec leads me to believe that we should produce a best practice on setting Validity based on an agreed-upon set of use cases (what behaviors we want to be able to support based on Validity values).

JP

> On Apr 20, 2015, at 8:00 AM, Florido Paganelli <florido.paganelli at hep.lu.se> wrote:
> 
> Hi Paul,
> 
> Thanks for the nice summary. Comments inline.
> 
> On 2015-04-20 12:38, Paul Millar wrote:
>> [...]
>> Validity:
>> 
>>  4.    Stephen: "an estimate of how long the information can
>>    be reasonably trusted, irrespective of how the
>>    system updates".
>> 
>>  5.    Jens: [I wasn't sure from his response]
>> 
>>  6.    Florido: "the expected time when the actual provider that
>>    _generates_ the information will run again and update
>>    that information."
>> 
>> So, Stephen and Florido seem to have opposite views of Validity.
>> 
> 
> They might look like different views, but in practice they are the same
> thing. If you expect the data not to be valid, you will know only at the
> next update. The difference in ARC is that our clients always contact
> and check the resource, hence my answer. ARC clients do not care about
> time drifts in the bdii-family hierarchy -- what counts for us is which
> endpoint to contact to get the freshest information. Of course this
> cannot apply to software that does not accept direct requests at the
> resource level.
> But Stephen's definition is better IMHO, because I got tickets from
> admins requesting me to set that data. For example, a ComputingService
> can have a Validity bound to its average uptime: even if the machine is
> down, the information will still report its existence if cached somewhere.
> 
>> It seemed that Jens had a similar view to Stephen, at least he suggested
>> Florido's concept be published as a nextUpdate attribute rather than
>> Validity.
>> 
>> My question to Stephen: different clients may tolerate different levels
>> of error/uncertainty (is 1% "good enough"?  how about 5%, 10%, or 20%?).
>> Given that "reasonably trusted" depends on the client, how to know what
>> value is to be published?
>> 
>> My question to Florido: in ARC, how exactly is Validity property set? Is
>> it hard-coded in the code, configured manually by the admin, passed to
>> the info-provider script, or overwritten by the cron/refresh job?
>> 
> 
> It is hard-coded at the moment, so you might as well think that
> Stephen's definition applies. We plan to change this in the future for
> ComputingActivities, but we have performance issues that we have not
> managed to overcome, so it will stay like that until we find another
> solution. In short, "static" data will always have a fixed timeout that
> might even be configured by the admin, but this is not implemented.
> Dynamic data might have a variable Validity that cannot be configured,
> as it depends on the object statuses.
> 
>> Just to add a little bit of "current usage", I did a quick survey using
>> lcg-bdii.cern.ch.  Some 13% (13865 of 104853) GLUE2 objects currently
>> published have a Validity attribute.  These objects have one of three
>> values: ~0% (34 objects) have Validity of 1 minute, 1% (1080 objects)
>> have Validity of 10 minutes, and 12% (12751 objects) have Validity of 1
>> hour.
>> 
>> So, to a good approximation, only one Validity value is set: 1 hour.
>> This narrow distribution suggests that, when set, Validity is hard-coded
>> to some value.
>> 
> 
> That is the case for ARC. I can tell you I had to modify this value
> because by the time BDII picked it up, Validity had already expired. We
> never had this problem before because in our old information system,
> which had only one hierarchy level, the information was so small that
> there was no time drift between creation and aggregation.
> 
> If I ever had to redefine Validity with aggregation in mind, it should
> be the time the data is to be considered valid, taking into account any
> overhead/drift caused by aggregation. This is why I say GLUE2 does not
> speak about it; these values must have a different definition when
> considering aggregation. This is also why Stephen's definition is the
> most correct.
> 
> Cheers,
> Florido
> -- 
> ==================================================
> Florido Paganelli
>   ARC Middleware Developer - NorduGrid Collaboration
>   System Administrator
> Lund University
> Department of Physics
> Division of Particle Physics
> BOX118
> 221 00 Lund
> Office Location: Fysikum, Hus B, Rum B313
> Office Tel: 046-2220272
> Email: florido.paganelli at REMOVE_THIShep.lu.se
> Homepage: http://www.hep.lu.se/staff/paganelli
> ==================================================
> _______________________________________________
> glue-wg mailing list
> glue-wg at ogf.org
> https://www.ogf.org/mailman/listinfo/glue-wg


