[glue-wg] Some thoughts on storage objects

Maarten.Litmaath at cern.ch Maarten.Litmaath at cern.ch
Mon Mar 31 18:47:29 CDT 2008


Hi Paul,

> I've updated the document, but do people feel this is useful?
> 
> We could:
> 	fold the information into the GLUE 2.0 spec,

That would be ideal.  Note that it will need to be adapted if we
cannot reach agreement on some statement it currently makes.

> 	keep it as an informative (non-normative) document,
> 	drop it, as being too confusing?
> 
> [...]
> > >  A StorageEnvironment is a collection of one or more StorageCapacities
> > >  with a set of associated (enforced) storage management policies.
> > >  Examples of these policies are Type (Volatile, Durable, Permanent)
> > >  and RetentionPolicy (Custodial, Output, Replica).
> >
> > Note that we should get rid of the obsolete, confusing Type and Lifetime
> > attributes in favor of the ExpirationMode copied from SRM v3.
> 
> Should we do this with GLUE v2.0?

Yes!  The ExpirationMode was invented to clear up the confusion that
has arisen from the (mis)use of Type/Lifetime for other purposes.

> (I would be happy with that).
> 
> [...]
> > >  (SC_E) that is associated with some StorageEnvironment and which has
> > >  totalSize TS_E, let the sum of the totalSize attributes for all
> > >
> > |                  ^^^
> > |                  let TS_S be .....
> 
> Err,  I think that one should be TS_E.  totalSize of the StorageShare 
> associated with the Environment (the one "underneath" the 
> StorageEnvironment).  In this context the StorageShare represents all of the 
> physical medium.

Well, you wrote this before:

|     let the sum of the totalSize attributes for all
|     StorageCapacities that are:
|           1. associated with a StorageSpace, and
|           2. that are implicitly associated with SC_E
|     be TS_S.

I suggested this instead:

|     let TS_S be the sum of the totalSize attributes for all
|     StorageCapacities that are:
|           1. associated with a StorageShare, and
|           2. that are implicitly associated with SC_E
|     .

Spot the 3 differences!  It reads more easily...
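
As an illustration only, the definition of TS_S above can be sketched in
Python.  The class and field names (StorageCapacity, share,
environment_capacity) are hypothetical and chosen for this example; they
are not GLUE attribute names, and the sizes are made-up numbers.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class StorageCapacity:
    total_size: int                              # totalSize, arbitrary units
    share: Optional[str] = None                  # owning StorageShare, if any (hypothetical field)
    environment_capacity: Optional[str] = None   # SC_E it is implicitly associated with (hypothetical field)


def ts_s(capacities: Iterable[StorageCapacity], sc_e: str) -> int:
    """Sum totalSize over capacities that (1) belong to a StorageShare
    and (2) are implicitly associated with the given SC_E."""
    return sum(
        c.total_size
        for c in capacities
        if c.share is not None and c.environment_capacity == sc_e
    )


capacities = [
    StorageCapacity(4, share="share-a", environment_capacity="SC_E"),
    StorageCapacity(3, share="share-b", environment_capacity="SC_E"),
    StorageCapacity(10, share=None, environment_capacity="SC_E"),      # no share: excluded
    StorageCapacity(5, share="share-c", environment_capacity="other"), # other environment: excluded
]

TS_S = ts_s(capacities, "SC_E")
```

Nothing in this sketch forces TS_S to equal TS_E; how the two compare
depends on how the shares are laid out.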

> Somehow, accurately describing what overlapping and incomplete StorageShares 
> mean takes a lot of words!
> 
> 
> [...]
> > >  do not change as a result of file creation or deletion.  [Does GLUE need
> > > to stipulate this, or should we leave this vague?]
> >
> > Why mention it at all?  You do not make statements about the behavior
> > of the other sizes, and I think there is no need to go there...
> 
> Well, we could just not mention it; however, I'm a little concerned about 
> tacit assumptions, and how not everyone has the same set.  I'd hope we can 
> make all the assumptions explicit.
> 
> In this particular case, there are (at least) a couple of models for how a space 
> could work:
> 
>   a. partitioning: I'm allocated 10 TiB; I can store files up to that size of 
> data (*).  Once I've written 10 TiB of data, I can delete files to create 
> more space.  This is like having a 10 TiB hard disk to store data.
> 
> 	(*) real-life systems have some complications, but here we are
> 		considering an idealised storage system.
> 
>   b. consumable: I'm allocated 10 TiB storage.  I can store files up to that 
> size of data.  I can delete the files if I like, but deleting files doesn't 
> allow me to store more files.  Once I've used up that 10 TiB of storage, I 
> have to ask for more.
> 
> (perhaps option b. seems a little crazy, but it might be how StorageShares 
> would work with archival WORM media).

That example proves my point: one cannot predict/prescribe how a particular
size attribute will be affected when a file is stored or deleted!
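
To make the difference concrete, here is a minimal Python sketch of the
two idealised models (a. and b.) quoted above.  The class names and the
free() method are hypothetical, invented for this illustration; they do
not correspond to any GLUE attribute or SRM call.

```python
class PartitionedSpace:
    """Model a: deleting a file gives its space back."""

    def __init__(self, total: int) -> None:
        self.total = total
        self.used = 0

    def store(self, size: int) -> None:
        if size > self.free():
            raise ValueError("not enough free space")
        self.used += size

    def delete(self, size: int) -> None:
        self.used -= size  # space is reclaimed

    def free(self) -> int:
        return self.total - self.used


class ConsumableSpace:
    """Model b: every write consumes the allocation for good."""

    def __init__(self, total: int) -> None:
        self.total = total
        self.consumed = 0

    def store(self, size: int) -> None:
        if size > self.free():
            raise ValueError("allocation exhausted")
        self.consumed += size

    def delete(self, size: int) -> None:
        pass  # the file goes away, but nothing is reclaimed

    def free(self) -> int:
        return self.total - self.consumed


# Same operations on both models: write 10 units, then delete 4.
partitioned = PartitionedSpace(10)
consumable = ConsumableSpace(10)
for space in (partitioned, consumable):
    space.store(10)
    space.delete(4)
```

After the same store and delete, the partitioned space has room again
while the consumable one does not, so a consumer cannot tell from the
schema alone how a size will move.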

> It seems that everyone in HEP assumes a partitioning model but I don't think 
> I've seen it stated anywhere.  Other communities might assume a different 
> model.  If we want information to be shareable (or mergeable) I think we should 
> state clearly any assumptions we're making.  If we don't assume anything, 
> that should be noted too, so people consuming information know that either:
> 	1. they can't assume what happens, or
> 	2. if they assume something, that it's a WLCG convention that they must 
> revisit when combining data with other information sources.

We must explicitly mention that in general any of the various sizes may be
an approximation whose exact behavior depends on the underlying technology,
implementation, load, concurrent usage, ...
A grid may impose strict(er) requirements on the behavior.
GLUE just provides a "coat rack" for information.

Thanks,
	Maarten


