[glue-wg] Datastore proposal

Paul Millar paul.millar at desy.de
Wed Apr 9 13:24:22 CDT 2008


On Wednesday 09 April 2008 16:14:36 Burke, S (Stephen) wrote:
[snip!]
> A Datastore would represent some set of uniform managed storage hardware,

Sounds OK.

> e.g. a tape robot plus all the tapes, or a set of disk servers managed for
> the same purpose.

So, if some site offers some disks that provide a custodial-like service and 
others that provide replica-like service, they would be represented as 
separate Datastore objects, right?  Or is it more the difference between disk 
used for caching and disk used as part of a UserDomain's allocated space (D1T1, 
D1T0)?

Playing devil's advocate here, what if a site had a set of disks that could 
provide either a custodial- or replica-like service (e.g. resilient mode with 
dCache)?


> For  
> clarity, disk servers allocated e.g. to different VOs would still just
> constitute a single Datastore, but disk servers used for completely
> different purposes would be separate, e.g. the disk cache in front of
> the tape robot would be a different Datastore if managed independently
> of the Disk1 storage.

OK, but it would be equally OK to publish all disk storage as one big 
generic "disk" Datastore, right?


>   The Datastore would have fairly few attributes:
> [...]
> Type (disk, tape, ... - open enumeration)

... for which we would define the common types, right?

> [...]
> Capacity (NB this is in the schema as a separate object for technical
> reasons but is really just an attribute)

Would it make sense to embed the numbers as attributes, rather than have them 
as a separate object?  If totalSize, usedSize, etc. are all, at most, 
single-valued for a Datastore, then simply having these numbers as attributes 
might make more sense.
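
A minimal sketch of the embedded-attribute option, in Python purely for 
illustration.  The class and field names are hypothetical (only totalSize and 
usedSize are mentioned above) and nothing here is taken from the GLUE schema 
itself:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Datastore:
    """Illustrative only: capacity numbers held directly as attributes."""
    type: str                          # open enumeration, e.g. "disk" or "tape"
    total_size: Optional[int] = None   # at most single-valued; None if unpublished
    used_size: Optional[int] = None    # same (unspecified) unit as total_size
    other_info: Dict[str, str] = field(default_factory=dict)

    def free_size(self) -> Optional[int]:
        """Derived value, defined only when both numbers are published."""
        if self.total_size is None or self.used_size is None:
            return None
        return self.total_size - self.used_size
```

Because each number is at most single-valued, an optional attribute is enough 
to express "not published" without needing a separate Capacity object.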


> OtherInfo (as usual)
>
> It might perhaps be useful to give the technology, e.g. RAIDn for disk
> systems, but I think that should go in OtherInfo as it's likely to be
> hard to standardise it.

Agreed.  A given grid might come up with its own standardised method of 
expressing the technology.  In fact, this might become something of a 
folksonomy.

>   This would be linked to the existing StorageResource object with a
> one-to-many relation, i.e. one Resource could manage many Datastores
> (Castor manages tape and disk) but not vice versa, one Datastore can
> only be managed by one Resource - if there are e.g. multiple sets of
> disk servers managed by several different software systems that would
> constitute multiple Datastores.

OK, I think, but it might depend on the "management" aspect of Datastores.

>   The relation to StorageEnvironment and/or StorageShare remains open
> for discussion as there are other issues there (e.g. whether we want the
> Environment at all), but conceptually you would want a relation between
> the Share and whichever DataStore(s) store the data for that Share. That
> can be one to many, e.g. if Custodial/Online uses disk+tape, as in WLCG.

How about, instead of associating the StorageShare with the Datastore, there 
was a link from the StorageCapacity (associated with a StorageShare) to 
the Datastore?  This link would be an optional one-to-many: each StorageShare is 
associated with at most one Datastore (0..1 multiplicity) and each Datastore is 
associated with any number of StorageShares (0..* multiplicity).
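
The multiplicities could be sketched roughly as follows, again in Python 
purely for illustration.  This assumes the link hangs off the StorageCapacity 
object; the field names and the link() helper are hypothetical and not part of 
any GLUE schema:

```python
from dataclasses import dataclass, field
from typing import List, Optional


# eq=False keeps identity-based comparison, which avoids recursive
# equality checks through the back-references.
@dataclass(eq=False)
class Datastore:
    name: str
    # 0..* side: any number of linked capacities (hypothetical back-reference)
    capacities: List["StorageCapacity"] = field(default_factory=list)


@dataclass(eq=False)
class StorageCapacity:
    share: str                             # name of the owning StorageShare
    datastore: Optional[Datastore] = None  # 0..1 side: the link is optional


def link(capacity: StorageCapacity, datastore: Datastore) -> None:
    """Attach a capacity to a Datastore, enforcing the 0..1 side."""
    if capacity.datastore is not None:
        raise ValueError("capacity is already linked to a Datastore")
    capacity.datastore = datastore
    datastore.capacities.append(capacity)
```

An unlinked StorageCapacity simply leaves its datastore unset, which is what 
makes the relation optional.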

Would that be acceptable?

Paul.

