[glue-wg] Datastore proposal

Burke, S (Stephen) S.Burke at rl.ac.uk
Fri Apr 11 10:42:29 CDT 2008


glue-wg-bounces at ogf.org [mailto:glue-wg-bounces at ogf.org]
On Behalf Of Paul Millar said:
> > Name: Human-readable name (maybe indicating the technology, e.g.
> > StorageTek)
> 
> Yes, in principle; although I'd be wary of suggesting that people
> put technology names into Name.

Yes, this is always difficult, and I think the schema document should
make a general statement that IDs and Names should never contain any
metadata to be parsed by a program. However, in practice
they always do contain things that humans will recognise, so it may as
well be something that helps a human understand what's going on -
especially for Names which are explicitly supposed to be for human
consumption, e.g. in monitoring displays. In practice sites probably
have some internal name for many of these things anyway.

> > Type (disk, tape, ... - open enumeration) (or maybe call this attribute
> > Medium?)
> 
> This is definitely nit-picking, but for many instances this would be 
> more "media".  Using "type" would avoid the singular / plural 
> issue (I think).

I think this is just because computer people misuse language - it's
correct to describe tape as a storage medium, singular, and the fact
that people refer to "media" meaning "a bunch of tapes" is incorrect.

> > Latency: Enumeration {Online, Nearline, Offline} (probably no need to
> > make this open?)
> 
> Do we define what online, nearline and offline mean somewhere?

SRM does - probably we should copy it. However, I think it isn't really
an SRM-specific term, it should be general enough to use with other
things, hence my suggestion that the enumeration can be closed. In
theory I suppose you could have levels of nearline-ness according to how
long the latency really is, but I doubt that we need to worry about it.
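
As a rough illustration of what a closed enumeration means for a client,
here is a minimal Python sketch. The names Latency and parse_latency are
hypothetical and not part of any GLUE schema or SRM library; only the
three values come from the proposal. With a closed enumeration a client
can reject anything else outright, whereas an open one (as suggested for
Type) would have to let unknown strings through.

```python
from enum import Enum


class Latency(Enum):
    """Closed set of latency values from the proposal (class name is illustrative)."""
    ONLINE = "Online"
    NEARLINE = "Nearline"
    OFFLINE = "Offline"


def parse_latency(published: str) -> Latency:
    """Map a published string to a Latency member, rejecting unknown values."""
    try:
        return Latency(published)
    except ValueError:
        # Closed enumeration: anything outside the three values is an error.
        raise ValueError(f"unknown Latency value: {published!r}") from None
```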

> Perhaps the "total amount of data that can be stored without operator 
> intervention and when operating correctly".  Would that be sufficient?

Something like that. In general we should spell things out as much as
possible; people can be very creative in misinterpreting definitions :)

> Aye.  We should record the storage actually available to end-users, the
> ten hot-spare disks shouldn't be recorded in that number.

Indeed.

> Stephen, your description seems to map to precious + precious&sticky +
> cache + cache&sticky.  However, for most systems this should be ~100% of
> totalSize most of the time, so I'm not sure how useful that number is.

I'm inclined to think that's correct here: if we're describing the real
hardware and it really is full of files then that's fine. The hardware
doesn't care what the SRM thinks the files are for (or who they belong
to). This kind of distinction is more of a problem for the other places
we use Capacities, and indeed is the reason we get so much argument
about what to publish.

> Perhaps we should publish two numbers?

No, I don't think so; if we want this information at all it should be
attached to the SRM-level objects like Share and Environment, because
the SRM is what knows that one file is in a cache and another is
precious.

> > FreeSize: TotalSize - UsedSize, i.e. the free space at the filesystem
> > level.
> 
> Is this an axiomatic relationship?  If so, it probably isn't 
> worth recording it.

That's another perennial debate - traditionally we've gone with
publishing the complete set even if you can derive one of them from the
others, rather than forcing clients to do the sums. You can see the same
kind of thing on the CE side, e.g. TotalJobs = RunningJobs +
WaitingJobs. As always it's optional so a given grid or info provider
may not in fact publish all of them.
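
For a client, the practical consequence is that any of the three values
may be missing and has to be derived from the other two. A minimal Python
sketch of that, assuming a hypothetical Capacity record whose field names
simply mirror the attribute names above (units are whatever the publisher
uses):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Capacity:
    """Hypothetical container; any attribute may be unpublished (None)."""
    total_size: Optional[int] = None
    used_size: Optional[int] = None
    free_size: Optional[int] = None


def fill_in(c: Capacity) -> Capacity:
    """Derive the one missing value from FreeSize = TotalSize - UsedSize.

    Published values are never overwritten. If two or more values are
    missing, nothing can be derived and the values are returned unchanged.
    """
    t, u, f = c.total_size, c.used_size, c.free_size
    if f is None and t is not None and u is not None:
        f = t - u
    elif u is None and t is not None and f is not None:
        u = t - f
    elif t is None and u is not None and f is not None:
        t = u + f
    return Capacity(t, u, f)
```

Publishing all three saves every client from carrying this kind of logic,
which is the argument for the redundancy.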

[Hardware compression]

> OK, I think this is a can-o-worms that we don't want to open. 

Well, we have to open it at least part way, otherwise we leave the
definition ambiguous, and you can bet that different people would make
different choices :)

>  2. for some tape systems, it would be very difficult to 
> obtain the actual storage usage (the "tape occupancy"?).

Maybe, but then you just wouldn't publish it. The reason I went for that
definition is that the alternative would create numbers which are hard
to interpret, e.g. UsedSize > TotalSize - the compression factor is
variable so it doesn't make sense to scale the TotalSize.

Also it seems to me that something in the system must know what the
occupancy is, otherwise how can it decide whether a new file can be
written to a given tape?
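
To make the UsedSize > TotalSize problem concrete, here is a small worked
example in Python. All the numbers are invented for illustration, and
physical_used is a hypothetical helper, not something any tape system
provides. It compares counting user-visible file sizes against counting
physical occupancy when the compression ratio varies per file.

```python
def physical_used(logical_sizes, ratios):
    """Sum physical occupancy from per-file logical size and compression ratio."""
    return sum(size / ratio for size, ratio in zip(logical_sizes, ratios))


total_size = 100.0       # raw tape capacity, arbitrary units (invented)
logical = [60.0, 90.0]   # user-visible file sizes (invented)
ratios = [3.0, 1.5]      # per-file compression ratios (invented, variable)

# Counting logical sizes gives a UsedSize larger than TotalSize.
logical_used = sum(logical)

# Counting physical occupancy keeps UsedSize + FreeSize = TotalSize.
used_size = physical_used(logical, ratios)
free_size = total_size - used_size
```

Here logical_used is 150 against a TotalSize of 100, while the physical
occupancy is 80 with 20 free. Since the ratio differs from file to file,
there is no single factor by which TotalSize could be scaled instead.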

>  3. sometimes a file store operation can fail.  If so, the tape software
> may retry, but some (potentially unknown) fraction of the file has been
> written to tape.  Does this count towards the actual occupancy?

In principle I'd argue that it should reduce the TotalSize, in practice
you'd probably just ignore it - you can never expect to fill your
storage 100%.

>  4.  I believe Castor had an issue when deleting files (leading 
> to "repacking"?)  If we're attempting to account for actual 
> yardage of tape used, how would this be accounted?

I think that's the same kind of thing, it may be that some of your
"free" space is not actually usable in practice. I think it would be
much too complicated to try to represent that explicitly - bear in mind
that this is just supposed to be a simple definition of an object we
probably don't need at all!

> I think the only thing we can publish is the (user-domain) file size that
> has been recorded to tape.  I believe this is the actual number people are
> interested in.

It's what they're interested in when they look at e.g. the Share, and
what they should find there. If they want to look at a hardware
description (which they may well not) they should see the hardware
numbers ...

Stephen

