[glue-wg] Defining various Size metrics

Burke, S (Stephen) S.Burke at rl.ac.uk
Thu Jun 19 08:36:14 CDT 2008


glue-wg-bounces at ogf.org [mailto:glue-wg-bounces at ogf.org] On Behalf Of Paul Millar said:
> Comments are appreciated.

OK ... 

"Sometime after a successful upload of a file such that it uses online capacity, the UsedOnlineSize MUST increase."

That would only be true if you insist that all storage managed by the SE has to be reflected in these numbers, and we probably can't say that; if we did, it should be added as a constraint for all the attributes. Also, I'm not sure it would always be guaranteed to be true anyway: e.g. if you had the equivalent of a fragmented disk you might count deleted fragments as still used, but a new file might happen to fit within one (this is possibly more relevant for tape systems).

"Q/ is the size of the increase in UsedOnlineSize for a successful file upload necessarily equal to the size of the file?"

Modulo the comment above, I think the answer is mostly "yes". These are supposed to be high-level summary numbers, so they should reflect what the users think they've stored and not what the system does internally. If the system makes multiple copies, that should be reflected in what it "charges" per GB stored. The exception might be if the users explicitly ask for multiple copies to be stored in a storage class which wouldn't otherwise do that. Similarly, if the users do their own compression you would count the compressed size.
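
To make the accounting rule above concrete, here is a rough sketch in Python. It is only an illustration of the interpretation suggested in this reply, not anything defined by GLUE: the StoredFile class and its field names are hypothetical, and treating user-requested copies as counted is the "might be" exception mentioned above, assumed here for the sake of the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class StoredFile:
    # Hypothetical record; none of these names come from the GLUE schema.
    uploaded_bytes: int             # size as uploaded (the compressed size if the user compressed it)
    user_requested_copies: int = 1  # copies the user explicitly asked for
    system_replicas: int = 1        # internal copies made by the system, not counted


def used_online_size(files: Iterable[StoredFile]) -> int:
    """Sum what users think they have stored, ignoring internal replication."""
    return sum(f.uploaded_bytes * f.user_requested_copies for f in files)
```

With this reading, a 100-byte file that the system silently replicates three times still contributes 100 bytes, while a 50-byte file for which the user asked for two copies contributes 100 bytes.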

"Q/ Is the value of TotalOnlineSize - UsedOnlineSize meaningful? Particular, can one conclude that, if TotalOnlineSize - UsedOnlineSize ≲ file_size, then attempting to upload a file of file_size will likely fail? This is related to the previous question."

It's clearly somewhat meaningful: if a new file is much bigger than Total - Used then it's fairly likely to fail, but I don't think you can make a definite statement. In any case, tests of that kind shouldn't usually be made at this level; you should normally be looking at the SA when considering the storage of specific new files. At this level, if Total - Used is getting small it should be a flag (e.g. to people in a ROC or WLCG management) that the SE may be heading towards a problem.
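
As a sketch of the kind of high-level check meant here, assuming the two values have already been read from the published information: the function names and the 10% threshold are hypothetical choices for illustration, and nothing here predicts whether a particular upload will succeed.

```python
def online_headroom_fraction(total_online_size: float, used_online_size: float) -> float:
    """Fraction of online capacity not yet used, clamped at zero."""
    if total_online_size <= 0:
        raise ValueError("total_online_size must be positive")
    free = max(total_online_size - used_online_size, 0)
    return free / total_online_size


def needs_attention(total_online_size: float, used_online_size: float,
                    threshold: float = 0.10) -> bool:
    """Flag an SE whose Total - Used has fallen below an assumed threshold fraction."""
    return online_headroom_fraction(total_online_size, used_online_size) < threshold
```

This is a warning for operators rather than a test for individual transfers; decisions about specific files would still be made at the SA level.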

"Two access latencies are used: online and nearline"

In principle we also have offline storage, e.g. tapes which have been removed from a robot and put in a cupboard, and would need a human operator to re-insert them.

"In practice, online means the file is stored on a (spinning) magnetic hard-disks."

You imply that spun-down discs would count as nearline, but I'm not entirely sure if that's true. Similarly I'm not sure how we would categorise a system where tapes were permanently mounted in drives.

Stephen

