[UR-WG] Draft/suggestion for Storage Accounting Record (2010/10/07)

Wed Oct 13 03:26:39 CDT 2010

Hi Henrik,

I think this is a really nice step forward.

I have a few quick comments, and I hope to take a more detailed look at 
some point before the next OGF meeting (which I am sorry to say I 
cannot attend).

It would be nice to see how it fits with other (non-grid) storage systems.

Regarding creating a standalone storage accounting record: I think 
there are good and bad points here.

Good: It may allow you to get a prototype with less delay, which you 
could then use, feeding your experience back into the evolution of a 
full standard. Maybe it is even enough to do this for your EMI work.

Bad: Splitting away from the UR standard causes issues when you want to 
combine compute/storage accounting. The initial idea from OGF, dating 
back to OGF 21, was that the UR would be formed from different elements 
e.g. Compute/Storage/Network. The record would effectively aggregate the 
different parts of the accounting information. I attached the UR2 talk 
from Donal Fellows at OGF21, I think the UR2 Zoo slide is a nice vision 
to have.

Now maybe this is doable and maybe it's a pain but I think it'd be nice 
to try.
Much of the information regarding the users/vo(community)/site would be 
the same and would fit into the core.

One other (smaller) issue that I had was replicas. In some storage 
systems you can have the same file replicated multiple times to improve 
access; dCache can do this and I would be surprised if iRODS couldn't do 
it too. How do you think we should account for this: simply treat the 
files as separate, or flag this storage as a different type? It may not 
be so pressing quite yet, since I don't think it is done very frequently 
and so could be a very small fraction of the storage.

We should also start to think about what is required on the backend from 
the storage solution providers to allow us to gather/use the accounting 
info you want. I think many systems would be able to give an answer to 
"what is in your system now" but I am not sure how it is when we ask 
"what was in your system between X and Y 2007". I think you have good 
contact with several providers and can ask, right? This also opens up 
the question of what we account for: bytes vs. byte-mins? If you are 
really talking about an integration over a time period, this is what it 
would amount to. We had some discussion regarding snapshot vs. 
integration, and I think we may have some more :-)
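
To make the bytes vs. byte-mins point concrete, here is a rough sketch of how snapshots could be integrated into byte-minutes. It assumes usage stays constant between two consecutive snapshots (a step function); the function name and the sample numbers are made up for illustration and are not part of any proposal.

```python
from datetime import datetime

def byte_minutes(samples):
    """Integrate (timestamp, bytes_used) snapshots into byte-minutes.

    Assumption: usage is constant from one snapshot until the next,
    so the last snapshot only marks the end of the interval.
    """
    total = 0.0
    for (t0, used), (t1, _) in zip(samples, samples[1:]):
        minutes = (t1 - t0).total_seconds() / 60
        total += used * minutes
    return total

# Hypothetical snapshots taken every 30 minutes.
samples = [
    (datetime(2010, 10, 7, 0, 0), 1000),
    (datetime(2010, 10, 7, 0, 30), 4000),
    (datetime(2010, 10, 7, 1, 0), 4000),
]
total = byte_minutes(samples)  # 1000*30 + 4000*30 byte-minutes
```

A backend that can only answer "what is in your system now" could still feed something like this, provided the snapshots are taken and stored at regular intervals.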

As I said - just first quick thoughts.

cheers
johnk

On 10/07/2010 01:52 PM, Henrik Thostrup Jensen wrote:
> Hi
>
> (this is sent to both the ogf ur list and the emi sar list, sorry for
> duplicates).
>
> Jon and I happened to be at the same meeting this week, so we took a
> couple of hours out of the programme to create a draft suggestion for a storage
> accounting record. The suggestion is mainly meant as a way to kick off the
> meeting in Brussels.
>
>
> 1. Discrete vs continuous
>
> First we discussed how storage differs from jobs in the sense that a job is a
> discrete unit whereas storage has a more continuous nature, where the usage more
> or less constantly varies (bytes used), but typically in relatively small
> scale.
>
> A storage record can however only describe something discrete, so the
> continuous nature of storage will have to be split into something discrete
> (similar to integration). Of course the granularity of this process should not
> be dictated by the standard. The first suggestion is then to have a start and
> an end time for each storage measurement, i.e., the interval it covers. This
> allows per-day or per-hour resolution, or something else entirely, depending
> on the needs.
>
> Result:
> "StartTime" and "EndTime" element for describing the time interval for when the
> record is valid. Both of them are DateTime values.
>
>
> 2. What to measure
>
> The most obvious metric here is the amount of used space. Another metric is the
> amount of reserved space, i.e., space not taken by actual files/objects, but
> which cannot be used by other parties. A third metric is the amount of space
> which is not allocated, and can be used. Initially we called this "free space",
> but this term is not exactly accurate as it refers more to a file system
> metric, which can be very different from what can actually be used due to
> quotas. Instead we chose to have an element describing how much unallocated
> space there is, i.e., how much space one can expect to be able to use. The
> actual free space is a metric which can be very tricky to measure, and is
> typically only of interest to the storage system administrator.
>
> A secondary issue is how to report the numbers. We quickly decided on using
> bytes, as it is the fundamental unit, and saves us from deciding on whether to
> use 1000 or 1024 bytes as a base. Having multiple options (say KB, MB) for
> reporting would complicate the standard unnecessarily without any real benefits,
> so we suggest keeping it to bytes, and bytes only. This would also ensure that
> the number is always an integer (with KB and up, floats would be needed to
> stay exact) and saves silly conversion routines when parsing the record.
>
> We briefly considered whether reporting should be per file, but this was quite
> quickly shot down, as it would make the records unreasonably large, without
> providing any real value. We did end up with an element describing the number
> of files using the space reported.
>
> Result:
> "UsedSpace", "ReservedSpace", "UnallocatedSpace" metrics for describing how
> space is used, reserved, and available for use. Reserved and unallocated are
> probably not overlapping (as reserved space is technically used). The
> measurement is in bytes.
> "FileCount" for describing the number of files using the space.
>
>
> 3. Site and Storage Concepts
>
> The issue of how to describe the site and the storage on which the accounted
> data is stored is perhaps the most complex issue in defining this format. The
> discussion to achieve this was very non-linear, so I will just describe the
> result:
>
> A site is considered a top level container for storage. The site name should
> be globally unique (which probably means an FQDN).
> A storage system is an independent system on a site.
> A storage system partition is a part of a storage system (similar to dCache pools).
> A storage type describes the storage type where the data is stored (disk, tape).
>
> None of these elements are mandatory (though we couldn't find a valid use case
> for a record without a site). E.g., it is perfectly valid to just use site, site
> and storage type, or site and storage system. If several of the elements
> exist, they are considered to have a hierarchical tree-like structure, i.e.,
> site ->  storage system ->  storage system partition ->  storage type. How to
> structure this is described in the "Record Structure" section.
>
> This allows a site with a simple setup to report just for the site, whereas
> more complex installations can report per storage system, and how much is
> placed on tape and disk respectively for different storage systems.
>
> A single institution can easily report multiple sites, we do not interfere in
> this, but it is very likely that a site would have several storage elements /
> systems and would like to be able to aggregate them under a single site.
>
> Finally we also add the possibility for a storage class, which describes the
> class of storage accounted for. This could be "precious", "deletable", "pinned",
> etc. It is not considered a part of the hierarchy described above.
>
>
> 4. Identity
>
> To describe who is using the space, an identity block or group, similar to the
> one in usage record must be supported. As with usage record it must be possible
> to specify both a local user name and global user identity. However the common
> case is likely to be a VO or a VO group, so being able to describe this is
> extremely important. Another use case is a local group at the site who owns the
> data (not everyone uses grid). How exactly to describe the virtual organization
> is not quite clear, but the following elements are needed as a base: VO name,
> VO issuer (DN of the VO issuer, somewhat VOMS specific), VO group, and VO role.
> There might be use cases for being able to have multiple VO blocks (though I
> suspect that will be messy).
>
> Resulting elements:
> VO Name
> VO Issuer
> VO group
> VO role
> Global User Identity
> Local User Name
> Local User Group
>
> All elements are strings. The Global User Identity is to be considered globally
> unique. Furthermore the combination of VO issuer and VO name should be globally
> unique.
>
>
> 5. Record Structure
>
> Having identified the identity block and the site/storage structure, we tried
> to find a good way to structure the individual records. We considered the
> possibility of having a tree structure in the storage accounting record, with
> the site, storage system, storage system partition, and storage type as
> possible levels in the tree. The leaves would then each have an identity and
> space usage block. This however would be quite complicated to construct and
> parse, so a simpler, flatter structure was quickly preferred. Here each
> record would have at most one site, storage system, storage system partition,
> and storage type (but all still optional). If a system needs to report
> on multiple storage systems / storage system partitions / storage types,
> several records would have to be generated. This format should still be easy
> to aggregate by site or storage system to get complete numbers. The
> flat format is also much easier to explain, which probably means that it should
> be preferred. The two structures can describe exactly the same information, so
> there is no limitation in choosing the flat format.
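
A small sketch of how flat records could be aggregated back into per-site or per-storage-system totals. The dictionary keys follow the element names in the draft; the site names, system names, and byte counts are invented, and the sketch assumes all records cover the same time interval and do not overlap.

```python
from collections import defaultdict

# Hypothetical flat records: at most one site / storage system / storage type each.
records = [
    {"Site": "site-a.example.org", "StorageSystem": "dcache1", "StorageType": "disk", "UsedSpace": 500},
    {"Site": "site-a.example.org", "StorageSystem": "dcache1", "StorageType": "tape", "UsedSpace": 1500},
    {"Site": "site-a.example.org", "StorageSystem": "dcache2", "StorageType": "disk", "UsedSpace": 250},
    {"Site": "site-b.example.org", "UsedSpace": 100},  # simple site, no breakdown
]

def total_used(records, *keys):
    """Sum UsedSpace over flat records, grouped by the given element names."""
    totals = defaultdict(int)
    for rec in records:
        # Optional elements that are absent show up as None in the group key.
        group = tuple(rec.get(k) for k in keys)
        totals[group] += rec.get("UsedSpace", 0)
    return dict(totals)

per_site = total_used(records, "Site")
per_system = total_used(records, "Site", "StorageSystem")
```

Grouping by a different set of elements needs no change to the records themselves, which is the main practical argument for the flat layout.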
>
>
> 6. Record Overview
>
> Given the previous section, we can now present an overview of the elements in a
> storage accounting record. We reuse the recordId and createTime elements from
> the usage record standard (just the element names, not the namespaces).
>
>
> StorageAccountingRecord
>
>    recordId (considered globally unique)
>    createTime (timestamp when the record was created)
>    StartTime (datetime value)
>    EndTime (datetime value)
>    Site
>    StorageSystem
>    StorageSystemPartition
>    StorageType
>    StorageClass
>    IdentityBlock
>     VO Block
>      VO Name
>      VO Issuer
>      VO Group
>      VO Role
>     GlobalUserIdentity
>     LocalUserName
>     LocalGroupName
>    UsedSpace
>    ReservedSpace
>    UnallocatedSpace
>    FileCount
>
> The only enforced element is the recordId.
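
For illustration only, the overview could be modelled roughly as below. The element list is taken from the draft; the snake_case attribute names, the flattening of the identity and VO blocks into plain fields, and the example record id are assumptions, not part of the proposal.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class StorageAccountingRecord:
    # The only enforced element.
    record_id: str
    # Everything below is optional in the draft.
    create_time: Optional[datetime] = None
    start_time: Optional[datetime] = None
    end_time: Optional[datetime] = None
    site: Optional[str] = None
    storage_system: Optional[str] = None
    storage_system_partition: Optional[str] = None
    storage_type: Optional[str] = None
    storage_class: Optional[str] = None
    # Identity block (VO block flattened here for brevity).
    vo_name: Optional[str] = None
    vo_issuer: Optional[str] = None
    vo_group: Optional[str] = None
    vo_role: Optional[str] = None
    global_user_identity: Optional[str] = None
    local_user_name: Optional[str] = None
    local_group_name: Optional[str] = None
    # Space metrics, in bytes, and the number of files using the space.
    used_space: Optional[int] = None
    reserved_space: Optional[int] = None
    unallocated_space: Optional[int] = None
    file_count: Optional[int] = None

# A minimal record is valid with just an id (hypothetical value).
minimal = StorageAccountingRecord(record_id="example-record-0001")
```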
>
>
> 7. Name Issues
>
> Someone brought up that SAR is not the most fortunate name. It sounds a bit
> like SARS, and the phonetic sound is apparently close to a not-too-fortunate
> Hungarian word.
>
> We could consider SR (storage record) or SUR (storage usage record)
> This really doesn't matter.
>
>
> 8. Sharing Elements with Usage Record Standard
>
> Some of the elements are identical in both name and semantics to the ones in
> usage record. We do not suggest sharing the elements as such (same namespace),
> as it would make the standard rely on the UR standard, and hence make it less
> self-contained. The UR standard is only used in a few systems, and is likely to
> be replaced with a new standard sometime. Furthermore the implementation gains
> of sharing the names are very small, if they even exist.
>
>
> We have definitely missed something in this, but we hope this can be a start
> for the discussion in Brussels. If you see problems or issues with this record
> please let us know.
>
>
>       Best regards, Henrik Thostrup Jensen & Jon Kerr Nilsen
>
>
> --
>    ur-wg mailing list
>    ur-wg at ogf.org
>    http://www.ogf.org/mailman/listinfo/ur-wg
>    

-- 
+------------------------------------------------------------+
|Dr. John Alan Kennedy          Rechenzentrum Garching (RZG) |
|Mail:  jkennedy at rzg.mpg.de     Boltzmannstrasse 2           |
|Phone: +49 89 3299 2694        85748 Garching               |
|Fax:   +49 89 3299 1301                                     |
+------------------------------------------------------------+

-------------- next part --------------
A non-text attachment was scrubbed...
Name: OGF21-UR-WG_Presentation_with_changes.ppt
Type: application/vnd.ms-powerpoint
Size: 461312 bytes
Desc: not available
Url : http://www.ogf.org/pipermail/ur-wg/attachments/20101013/f470f458/attachment-0001.ppt