[glue-wg] Datastore proposal

Burke, S (Stephen) S.Burke at rl.ac.uk
Wed Apr 16 07:58:59 CDT 2008


Paul Millar [mailto:paul.millar at desy.de] said:
> So, if some site offers some disks that provide a custodial-like
> service and others that provide replica-like service, they would be
> represented as separate Datastore objects, right?

Probably, but it depends how it's implemented. If the system decides
dynamically where to put files in a common set of hardware regardless of
the Retention Policy (RP), it would all be one Datastore. However, I
suspect it would be more common to have separate hardware. At some level
this relates to the discussion about whether the Share.RP can be
multivalued.

> Or is more the difference between disk used for caching and disk used
> as part of UserDomain's allocated space (D1T1, D1T0)?

Again I think it's a matter of implementation. Usually I'd expect cache
disks to be physically distinct, but maybe sometimes they aren't -
perhaps it could be useful to allow a cache to overflow into the free
space of permanent storage.

> OK, but it would be equally OK to publish all disk storage as one big 
> generic "disk" Datastore, right?

If it's all managed by a single Resource, yes. The difference you would
get from splitting a homogeneous Datastore into pieces would be from the
relation to the Share, so it would only matter at all if different
Shares mapped to distinct disk servers, and in any case you have the
size information at the Share level.

> > Type (disk, tape, ... - open enumeration)
> 
> ... which we would define the common types, right?

Yes - but do we have anything common apart from tape and disk? DVD?

> > one Datastore can
> > only be managed by one Resource - if there are e.g. multiple sets of
> > disk servers managed by several different software systems that
> > would constitute multiple Datastores.
> 
> OK, I think, but it might depend on the "management" aspect 
> of Datastores.

Can you elaborate? As I said earlier we have the situation at RAL that
all the different Castor instances (Resources) share a common tape
system, but I think that's too complicated to deal with; we should just
treat it as multiple tape Datastores.
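
To make the "one Datastore, one Resource" rule concrete, here is a rough
sketch of how I picture it. The class layout, the attribute names and
the instance names ("castor-a", "castor-b") are purely illustrative
assumptions and are not taken from the schema draft. Each Resource that
sits in front of the shared tape system simply gets its own tape
Datastore:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    # Hypothetical minimal Resource: just a name for the managing system.
    name: str


@dataclass
class Datastore:
    name: str
    type: str            # open enumeration, e.g. "disk" or "tape"
    latency: str         # e.g. "online", "nearline", "offline"
    resource: Resource   # exactly one managing Resource, never a list


def tape_datastores(resources, tape_system="tape"):
    """Publish one tape Datastore per Resource, even if the physical
    tape system behind them is shared."""
    return [
        Datastore(
            name=f"{r.name}-{tape_system}",
            type="tape",
            latency="nearline",
            resource=r,
        )
        for r in resources
    ]


# Illustrative instance names only.
instances = [Resource("castor-a"), Resource("castor-b")]
stores = tape_datastores(instances)
```

The shared hardware is deliberately invisible here: nothing in the model
says that the two tape Datastores are backed by the same robot.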

> How about, instead of associating the StorageShare with the Datastore,
> there was a link between the StorageCapacity (associated with a
> StorageShare) to the Datastore?

Well, partly that's back to the same discussion about Capacity
(currently called StorageShareState) being treated as a separate object
or not. In my view of how this works it doesn't make sense to have an
independent relation to a Capacity; it's the same thing as a relation to
the parent object. There is an implicit relationship, e.g. if a Share
has a Capacity/State with types online and nearline (and/or cache and
nearline) then you would expect the Share to be related to Datastores
with Latencies online and nearline. However, as discussed many times
there is no requirement to publish the Capacities, and especially not
for tape, so in principle you could have an online/custodial Share which
doesn't publish a nearline Capacity but is nevertheless related to a
nearline Datastore.
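
The implicit relationship could be written down as a one-way check,
roughly as follows. This is only a sketch under my own assumptions: the
function name is made up, and treating a "cache" Capacity as implying an
online Datastore is my reading of the example above rather than anything
agreed. The check only complains when a published Capacity has no
matching Datastore, never the other way round:

```python
# Assumed mapping from Capacity type to the Datastore Latency it implies.
CAPACITY_TO_LATENCY = {
    "online": "online",
    "cache": "online",
    "nearline": "nearline",
    "offline": "offline",
}


def missing_latencies(capacity_types, datastore_latencies):
    """Return the Latencies implied by the published Capacities that no
    related Datastore provides. Datastores without a matching Capacity
    are fine, since publishing Capacities is optional."""
    implied = {CAPACITY_TO_LATENCY[t] for t in capacity_types}
    return implied - set(datastore_latencies)


# An online/custodial Share that publishes no nearline Capacity but is
# still related to a nearline Datastore: nothing is flagged.
unpublished_tape = missing_latencies({"online"}, {"online", "nearline"})

# A Share publishing cache and nearline Capacities but related only to
# an online Datastore: the nearline Datastore is missing.
no_tape_store = missing_latencies({"cache", "nearline"}, {"online"})
```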

> This link would be optional one-to-many: each StorageShare is
> associated with a single Datastore (0..1 multiplicity) and each
> Datastore is associated with any number of StorageShares (0..*
> multiplicity).
> 
> Would that be acceptable?

Well ... rather than making Share <-> Datastore *..* you could consider
defining three separate Share <-> Datastore relations, one for each of
Online, Nearline and Offline, and each of which would be *..1. However,
that would imply that the storage for a Share for a given Latency can
only be provided by one Datastore, and I suspect that you can't be sure
of that in all cases. An example could be LCG-style Online/Custodial
D1T1 where you have three Datastores: tape, disk cache in front of the
tape, and permanent disk (I think this is what CNAF has?).
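
To illustrate why the one-Datastore-per-Latency variant may be too
strict, here is a small sketch. The Datastore names are invented, and
the Latency I assign to each (tape as nearline, both kinds of disk as
online) is my assumption about how such a site would publish. Grouping
the Datastores of the D1T1 example by Latency shows two of them
competing for the single Online slot:

```python
from collections import defaultdict


def datastores_by_latency(datastores):
    """Group (name, latency) pairs by latency, keeping input order."""
    grouped = defaultdict(list)
    for name, latency in datastores:
        grouped[latency].append(name)
    return dict(grouped)


def fits_one_per_latency(datastores):
    """True if a Share could use these Datastores with at most one
    Datastore per Latency (the three separate *..1 relations)."""
    grouped = datastores_by_latency(datastores)
    return all(len(names) <= 1 for names in grouped.values())


# Hypothetical D1T1 layout: names and latencies are assumptions.
d1t1 = [
    ("tape", "nearline"),
    ("disk-cache", "online"),
    ("permanent-disk", "online"),
]

d1t1_fits = fits_one_per_latency(d1t1)
online_stores = datastores_by_latency(d1t1)["online"]
```

With a plain *..* relation the same Share would just point at all three
Datastores and the question would not arise.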

Stephen

