[glue-wg] Defining various Size metrics

Paul Millar paul.millar at desy.de
Thu Jun 19 10:35:39 CDT 2008


Hi Stephen,

Thanks for the valuable feedback.  Comments below:

On Thursday 19 June 2008 15:36:14 Burke, S (Stephen) wrote:
> "Sometime after a successful upload of a file such that it uses online
> capacity, the UsedOnlineSize MUST increase."
>
> That would only be true if you insist that all storage managed by the SE
> has to be reflected in these numbers, and probably we can't say that 

OK. 

I've added an extra assertion:

  * The UsedOnlineSize metric SHOULD consider all storage managed by the SE 
that has online access latency.

Although perhaps this should be "MAY" if "SHOULD" is too strong.  I've also 
added a qualifier phrase ("considered by UsedOnlineSize") to the other 
assertions.


> and if we did it should be added as a constraint for all the attributes.

OK, done.


> Also I'm not sure this would always be guaranteed to be true anyway, e.g. if
> you had the equivalent of a fragmented disk you might count deleted
> fragments as still used, but a new file might happen to fit within one (this
> is possibly more relevant for tape systems).

I've tried to adjust the words to allow for this; for example, "previously 
unused online capacity".  If the deleted fragments are considered 
still "used" and a file is uploaded into one of these fragmented "holes" then 
no additional "used" online capacity is used, so UsedOnlineSize should not 
increase.
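
To make the intended behaviour concrete, here is a minimal sketch of the accounting described above. The class and its names (OnlineAccounting, holes, reclaim) are purely illustrative and are not part of Glue. It also simplifies by treating all "holes" as a single pool rather than as individual fragments.

```python
class OnlineAccounting:
    """Toy model of UsedOnlineSize; illustrative only, not a Glue definition."""

    def __init__(self, total):
        self.total = total  # what TotalOnlineSize would report
        self.used = 0       # what UsedOnlineSize would report
        self.holes = 0      # deleted space that is still counted as "used"

    def upload(self, size):
        # Space taken from holes is already counted as used, so only the
        # remainder consumes previously unused online capacity.
        reused = min(size, self.holes)
        self.holes -= reused
        self.used += size - reused

    def delete(self, size, reclaim=True):
        if reclaim:
            self.used -= size   # space is returned immediately
        else:
            self.holes += size  # space stays "used" until it is refilled


acc = OnlineAccounting(total=100)
acc.upload(30)                 # used grows by the file size
acc.delete(10, reclaim=False)  # fragment is still counted as used
acc.upload(10)                 # fits in the hole: used does not increase
```

With this model, an upload into previously unused capacity increases the used value by exactly the file size, while an upload that lands in a hole leaves it unchanged.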

I've also added this as an open question.

Although I don't think any underlying online storage is likely to have this 
property.  AFAIK, all filesystems support file fragmentation, so this isn't 
an issue in practice.


> "Q/ is the size of the increase in UsedOnlineSize for a successful file
> upload necessarily equal to the size of the file?"
>
> Modulo the comment above I think the answer is mostly "yes". These are
> supposed to be high-level summary numbers, so it should reflect what the
> users think they've stored and not what the system does internally. If the
> system makes multiple copies that should be reflected in what it "charges"
> per Gb stored. The exception might be if the users explicitly ask for
> multiple copies to be stored in a storage class which wouldn't otherwise do
> that. Similarly if the users do their own compression you would count the
> compressed size.

OK, I've taken out the question and added it as an assertion.


> "Q/ Is the value of TotalOnlineSize - UsedOnlineSize meaningful?
> Particular, can one conclude that, if TotalOnlineSize - UsedOnlineSize ≲
> file_size, then attempting to upload a file of file_size will likely fail?
> This is related to the previous question."
>
> It's clearly somewhat meaningful - if a new file is much bigger than total
> - used then it's fairly likely to fail,

OK, added this as an assertion ...

> but I don't think you can make a definite statement.

... with a "likely" qualifier (which is pretty vague).  I think that's fair 
enough.

We could say, for uploads:
	file_size >> total - used  =>  transfer will likely fail
	file_size ~= total - used  =>  transfer at risk of failing
Would that be better?
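
As a rough sketch of how those two rules might be applied: the function below is hypothetical, and the thresholds (much_larger, margin) are assumptions chosen only to give ">>" and "~=" some concrete meaning. Nothing here is defined by Glue.

```python
def classify_upload(file_size, total_online, used_online,
                    much_larger=10.0, margin=0.1):
    """Return a rough risk label for uploading a file of file_size.

    much_larger and margin are illustrative thresholds (assumptions).
    All sizes must be in the same unit.
    """
    free = total_online - used_online
    # file_size >> total - used
    if free <= 0 or file_size >= much_larger * free:
        return "likely to fail"
    # file_size ~= total - used (or somewhat above it)
    if file_size >= (1.0 - margin) * free:
        return "at risk"
    return "no warning"


print(classify_upload(1000, total_online=1000, used_online=950))
print(classify_upload(48, total_online=1000, used_online=950))
print(classify_upload(10, total_online=1000, used_online=950))
```

The labels are deliberately non-committal: neither rule allows a definite statement about whether a given transfer succeeds.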

> Anyway tests of that kind shouldn't usually be made at 
> this level, you should normally be looking at the SA when considering the
> storage of specific new files.

Duly noted as a comment in the assertion.

> At this level, if Total - Used is getting small it should be a flag (e.g. to
> people in a ROC or WLCG management) that the SE may be heading towards a
> problem.  

Aye, although one has to define "small", which is probably a user- (or perhaps 
VO-) specific number, depending on the usage the VO makes of that SE.

I think the strongest statement is that file uploads of a certain size will 
likely fail.  As you say, if a ROC or WLCG know that their VO's workflow 
involves uploading files of a certain size, they can (using that assertion) 
deduce that "something's up" and flag (via SAM or GGUS, say) that there's a 
problem.
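
A minimal sketch of that kind of check, assuming the ROC keeps a table of typical upload sizes per VO. The function name, the table and the figures are all made up for illustration.

```python
def vos_to_flag(total_online, used_online, typical_upload_size):
    """Return the VOs whose typical upload no longer fits in total - used.

    typical_upload_size maps a VO name to a size in the same unit as the
    two metrics; the values are supplied by whoever runs the check.
    """
    free = total_online - used_online
    return sorted(vo for vo, size in typical_upload_size.items()
                  if size > free)


# Made-up numbers, all in GB.
print(vos_to_flag(1000, 900, {"atlas": 200, "cms": 50}))
```

What counts as "small" then becomes a per-VO input to the check rather than a property of the published numbers.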


> "Two access latencies are used: online and nearline"
>
> In principle we also have offline storage, e.g. tapes which have been
> removed from a robot and put in a cupboard, and would need a human operator
> to re-insert them.

Yes.  I left it out as Glue doesn't mention it (iirc), but the concept exists 
so leaving it out might cause more confusion.  I've now included a brief 
description of offline too.

> "In practice, online means the file is stored on a (spinning) magnetic
> hard-disks."
>
> You imply that spun-down discs would count as nearline, but I'm not
> entirely sure if that's true.

I believe the spin-up time is typically a few seconds; so, yes, that should be 
OK.  The "(spinning)" has been removed.

> Similarly I'm not sure how we would 
> categorise a system where tapes were permanently mounted in drives.

Me neither.

On the whole, I'd say tape-drives with a single tape are nearline as seeking 
to read a file might take a minute or so, depending on the average size of 
the files being stored.

But, I've added it as an open question.

Cheers,

Paul.

