[glue-wg] Datastore proposal
Paul Millar
paul.millar at desy.de
Wed Apr 9 13:24:22 CDT 2008
On Wednesday 09 April 2008 16:14:36 Burke, S (Stephen) wrote:
[snip!]
> A Datastore would represent some set of uniform managed storage hardware,
Sounds OK.
> e.g. a tape robot plus all the tapes, or a set of disk servers managed for
> the same purpose.
So, if some site offers some disks that provide a custodial-like service and
others that provide replica-like service, they would be represented as
separate Datastore objects, right? Or is it more the difference between disk
used for caching and disk used as part of a UserDomain's allocated space
(D1T1, D1T0)?
Playing devil's advocate here, what if a site had a set of disks that could
provide either a custodial- or replica-like service (e.g. resilient mode with
dCache)?
> For
> clarity, disk servers allocated e.g. to different VOs would still just
> constitute a single Datastore, but disk servers used for completely
> different purposes would be separate, e.g. the disk cache in front of
> the tape robot would be a different Datastore if managed independently
> of the Disk1 storage.
OK, but it would be equally OK to publish all disk storage as one big
generic "disk" Datastore, right?
> The Datastore would have fairly few attributes:
> [...]
> Type (disk, tape, ... - open enumeration)
... for which we would define the common types, right?
> [...]
> Capacity (NB this is in the schema as a separate object for technical
> reasons but is really just an attribute)
Would it make sense to embed the numbers as attributes, rather than have them
as a separate object? If totalSize, usedSize, etc. are all, at most,
single-valued for a Datastore, then simply having these numbers as attributes
might make more sense.
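
To make the comparison concrete, here is a rough sketch of the two layouts in
Python. The class names, the field names other than totalSize/usedSize, and the
choice of units are all illustrative assumptions, not part of the schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StorageCapacity:
    """Layout A: the capacity numbers live in a separate object."""
    total_size: Optional[int] = None  # assumed unit: GB; None = not published
    used_size: Optional[int] = None


@dataclass
class DatastoreWithObject:
    """Hypothetical Datastore that points at a separate capacity object."""
    store_type: str
    capacity: Optional[StorageCapacity] = None


@dataclass
class DatastoreWithAttributes:
    """Layout B: the same numbers embedded directly as attributes."""
    store_type: str
    total_size: Optional[int] = None
    used_size: Optional[int] = None


def free_size(total_size, used_size):
    """Return the free space, or None if either number is unpublished."""
    if total_size is None or used_size is None:
        return None
    return total_size - used_size
```

With layout B a consumer reads the numbers in one step; with layout A it has
to follow the link first, which only pays off if a Datastore can ever have
more than one capacity record.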
> OtherInfo (as usual)
>
> It might perhaps be useful to give the technology, e.g. RAIDn for disk
> systems, but I think that should go in OtherInfo as it's likely to be
> hard to standardise it.
Agreed. Within a given grid, people might agree on a standardised way of
expressing the technology. In fact, this might become something of a
folksonomy.
> This would be linked to the existing StorageResource object with a
> one-to-many relation, i.e. one Resource could manage many Datastores
> (Castor manages tape and disk) but not vice versa, one Datastore can
> only be managed by one Resource - if there are e.g. multiple sets of
> disk servers managed by several different software systems that would
> constitute multiple Datastores.
OK, I think, but it might depend on the "management" aspect of Datastores.
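
As a minimal sketch of the constraint described in the quoted text (one
Resource may manage many Datastores, but a Datastore has at most one managing
Resource), something like the following would do. The class layout and the
manage() method are hypothetical and only illustrate the multiplicity.

```python
class Datastore:
    def __init__(self, name, store_type):
        self.name = name
        self.store_type = store_type
        self.resource = None  # at most one managing StorageResource


class StorageResource:
    def __init__(self, name):
        self.name = name
        self.datastores = []  # any number of managed Datastores

    def manage(self, datastore):
        """Attach a Datastore, refusing one already managed elsewhere."""
        if datastore.resource is not None and datastore.resource is not self:
            raise ValueError(
                f"{datastore.name} is already managed by "
                f"{datastore.resource.name}"
            )
        if datastore.resource is None:
            datastore.resource = self
            self.datastores.append(datastore)
```

Under this rule, two software systems sharing the same physical disk servers
would each have to publish their own Datastore.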
> The relation to StorageEnvironment and/or StorageShare remains open
> for discussion as there are other issues there (e.g. whether we want the
> Environment at all), but conceptually you would want a relation between
> the Share and whichever DataStore(s) store the data for that Share. That
> can be one to many, e.g. if Custodial/Online uses disk+tape, as in WLCG.
How about, instead of associating the StorageShare directly with the
Datastore, we have a link from the StorageCapacity (associated with a
StorageShare) to the Datastore? This link would be optional one-to-many: each
StorageShare is associated with a single Datastore (0..1 multiplicity) and
each Datastore is associated with any number of StorageShares (0..*
multiplicity).
Would that be acceptable?
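
One possible reading of this, sketched in Python: each StorageCapacity
optionally points at one Datastore, and a Datastore collects any number of
such links. Allowing a StorageShare to hold several StorageCapacity objects
(which would cover the disk+tape case) is an assumption of this sketch, as
are all the names and the shares_using() helper.

```python
class Datastore:
    def __init__(self, name):
        self.name = name
        self.capacities = []  # 0..*: capacities that point at this Datastore


class StorageShare:
    def __init__(self, name):
        self.name = name
        self.capacities = []


class StorageCapacity:
    def __init__(self, share, total_size, datastore=None):
        self.share = share
        self.total_size = total_size
        self.datastore = datastore  # 0..1: the link is optional
        share.capacities.append(self)
        if datastore is not None:
            datastore.capacities.append(self)


def shares_using(datastore):
    """Names of the StorageShares reachable from a Datastore."""
    return sorted({c.share.name for c in datastore.capacities})
```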
Paul.