[UR-WG] Proposals for UR version 2

piro at to.infn.it piro at to.infn.it
Wed Sep 13 16:07:50 CDT 2006


Hi all!

This morning at the UR-WG session at GGF18 we briefly discussed a few
proposals for version 2 of the UR.

To involve those who could not attend in the discussion, I am sending
you some slides on my proposals, in two versions: one that also contains
some information on our accounting system DGAS (not essential for the
discussion and not presented at the meeting, but the info might be
interesting as a use case), and one that contains only the proposals
themselves.

Some comments on the individual proposals:
(1) This proposal is based on existing use cases: we actually have VOs in
LCG that need more detailed user information, such as the group and role
of the user within the VO. One specific use case: BIOINFOGRID decided to
become part (a group) of the BIOMED VO, but still wants distinct accounting
statistics in order to compare its resource usage to that of the rest of
the VO. The data fields we propose for this information can be considered
part of the user's identity. The VO is the most important one: I know of
users who are subscribed, with the same user DN/certificate, to multiple
VOs. Which VO should be charged for a job submitted by such a user?
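
To illustrate the BIOINFOGRID case, here is a minimal sketch in Python
rather than in the UR's XML. The field names vo, group and role, the
helper cpu_seconds_by_vo_group and the example DN are assumptions made up
for illustration; they are not taken from the slides or from the UR
schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UserIdentity:
    global_username: str          # user DN from the certificate
    vo: Optional[str] = None      # hypothetical field: VO the job is charged to
    group: Optional[str] = None   # hypothetical field: group within the VO
    role: Optional[str] = None    # hypothetical field: role within the VO


def cpu_seconds_by_vo_group(records):
    """Sum CPU seconds per (vo, group) from (UserIdentity, cpu_seconds) pairs."""
    totals = defaultdict(float)
    for identity, cpu_seconds in records:
        totals[(identity.vo, identity.group)] += cpu_seconds
    return dict(totals)


# The same (invented) DN submits jobs under different VOs and groups.
DN = "/C=IT/O=Example/CN=Example User"
records = [
    (UserIdentity(DN, vo="biomed", group="bioinfogrid"), 120.0),
    (UserIdentity(DN, vo="biomed"), 300.0),
    (UserIdentity(DN, vo="othervo"), 50.0),
]
totals = cpu_seconds_by_vo_group(records)
```

With the DN alone, the three records could not be told apart; with the
VO and group in the identity, each one lands in its own bucket.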

(2) We propose an element for all ResourceIdentity information, in
analogy to UserIdentity and JobIdentity. This element would contain
fields that are already present, such as MachineName, Queue and Host. It
should also contain a SiteName (that could also be put into MachineName,
but then where would the cluster name go? A site name is semantically
different from a machine name). The most important proposal for this
ResourceIdentity, however, is a field that can be used to specify the
global resource ID within the Grid environment (in our case the LCG
ComputingElement ID, which is based on the Globus contact string). We deem
this most important because for some Grid environments it might be
impossible to reconstruct the correct global resource ID from fields like
MachineName (especially if the meaning of machine name is left open),
Host and Queue. Moreover, given that we have a GlobalUsername (and
ds:KeyInfo) and a GlobalJobId, it is only logical to also have a
GlobalResourceId ("Resource" because it might then also be used for
accounting of other resources, like storage, in the future).
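
A minimal Python sketch of what such a ResourceIdentity could carry. The
Python attribute names, the helper resource_account_key and the example
ComputingElement IDs are assumptions for illustration only; the actual
element layout would have to be agreed in the schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResourceIdentity:
    global_resource_id: str             # proposed: Grid-wide resource ID
    site_name: Optional[str] = None     # proposed: distinct from machine_name
    machine_name: Optional[str] = None  # already present in the UR
    host: Optional[str] = None          # already present in the UR
    queue: Optional[str] = None         # already present in the UR


def resource_account_key(resource: ResourceIdentity) -> str:
    """Return the key under which usage is booked.

    The key is the explicit global ID, not a value guessed from
    machine_name, host and queue.
    """
    return resource.global_resource_id


# Invented example values: two queues behind the same host.
short_queue = ResourceIdentity(
    global_resource_id="ce.example.org:2119/jobmanager-lcgpbs-short",
    site_name="EXAMPLE-SITE", host="ce.example.org", queue="short",
)
long_queue = ResourceIdentity(
    global_resource_id="ce.example.org:2119/jobmanager-lcgpbs-long",
    site_name="EXAMPLE-SITE", host="ce.example.org", queue="long",
)
```

An accounting system can then use the global ID directly as the resource
account, without having to know how a given Grid composes it.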

(3) Proposal 3 has caused some heavy discussion between two fundamentally
different points of view (in my opinion both understandable and
reasonable):
a) leave out of the UR any information that is not directly related to
resource usage (performance is not directly an issue of the job that is
running on the resource)
b) include all information that is necessary to interpret the usage values
(CPU time doesn't mean much when performance differs significantly: 1 CPU
sec on a low-performance host is not equal to 1 CPU sec on a
high-performance host)
BUT: These performance values (which ones to take? which ones make sense?)
might also be specified by means of the existing extension framework.
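
To make point b) concrete, here is a small Python sketch of how a
performance value could be used to make CPU times comparable. The linear
scaling, the function name and the notion of a single "score" per host
are assumptions for illustration; which benchmark to use is exactly the
open question above.

```python
def normalized_cpu_seconds(cpu_seconds: float,
                           host_score: float,
                           reference_score: float) -> float:
    """Scale CPU time to a reference host, assuming linear scaling.

    host_score and reference_score are hypothetical benchmark values in
    the same unit; a higher score means a faster host.
    """
    if host_score <= 0 or reference_score <= 0:
        raise ValueError("benchmark scores must be positive")
    return cpu_seconds * host_score / reference_score


# 100 CPU sec on a host half as fast as the reference host
slow = normalized_cpu_seconds(100.0, host_score=500.0, reference_score=1000.0)
# 100 CPU sec on a host twice as fast as the reference host
fast = normalized_cpu_seconds(100.0, host_score=2000.0, reference_score=1000.0)
```

Under this assumption the same raw CPU time corresponds to four times as
much work on the fast host as on the slow one, which is the information
that is lost if the record carries no performance value at all.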

(4) We should start thinking about going beyond job usage. Storage
accounting, for example, will be an important issue in the near future
(LCG/EGEE for example is currently working on it). We propose to have a
basic usage record type from which more specific usage records
(JobUsageRecord, StorageUsageRecord, ...) can be derived by adding
additional specific elements (JobIdentity for jobs, FileIdentity for files
on a storage resource, etc.)
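
The derivation idea can be sketched with Python classes (in the schema
itself it would be done by type extension). All field names below, and
the helper resource_account, are assumptions for illustration; only the
type names JobUsageRecord and StorageUsageRecord come from the proposal.

```python
from dataclasses import dataclass


@dataclass
class UsageRecord:
    """Hypothetical base type: what every kind of usage record shares."""
    record_id: str
    global_username: str
    global_resource_id: str


@dataclass
class JobUsageRecord(UsageRecord):
    """Adds job-specific elements (JobIdentity, CPU time, ...)."""
    global_job_id: str = ""
    cpu_seconds: float = 0.0


@dataclass
class StorageUsageRecord(UsageRecord):
    """Adds storage-specific elements (FileIdentity, size, ...)."""
    file_id: str = ""
    bytes_stored: int = 0


def resource_account(record: UsageRecord) -> str:
    """Works on any derived record because it only uses base fields."""
    return record.global_resource_id


job = JobUsageRecord("r1", "/CN=Example User", "ce.example.org",
                     global_job_id="job-1", cpu_seconds=42.0)
stored = StorageUsageRecord("r2", "/CN=Example User", "se.example.org",
                            file_id="file-1", bytes_stored=1024)
```

Code that only needs the common part (who used which resource) can treat
job and storage records alike, while the specific fields stay in the
derived types.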

(5) There was no time to look at this proposal at the meeting. We
should decide whether requested/reserved resources can be considered as
"used". It is, for example, most likely that reserved storage space will
be charged to a user. Some data fields might occur multiple times (Disk,
Network, ...), each with an additional attribute (I called it "usage", but
there surely are better names) that specifies whether the resource has
actually been "consumed" or only "reserved/requested". The default should
be "consumed" for reasons of backward compatibility (old URs would still
be valid).
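
A minimal Python sketch of how a consumer could apply the "consumed"
default. The XML fragment is a simplified, namespace-free stand-in for a
real usage record, and the attribute values and the function name
split_by_usage are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified fragment: the first Disk element has no "usage" attribute,
# as in an old (version 1) record.
FRAGMENT = """
<UsageRecord>
  <Disk>40</Disk>
  <Disk usage="reserved">100</Disk>
  <Network usage="consumed">7</Network>
</UsageRecord>
"""


def split_by_usage(xml_text):
    """Sum element values per (element name, usage kind)."""
    totals = {}
    for element in ET.fromstring(xml_text):
        # A missing attribute is treated as "consumed" (backward compatible).
        usage = element.get("usage", "consumed")
        key = (element.tag, usage)
        totals[key] = totals.get(key, 0.0) + float(element.text)
    return totals


totals = split_by_usage(FRAGMENT)
```

Old records without the attribute are still read correctly, and a
charging policy can decide separately what to do with the "reserved"
amounts.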

Among these proposals I deem (1) important and (2) essential. (4)
might be important for achieving a common framework for more generic usage
accounting (it would not be nice if the UR were used only for jobs while
storage was handled by a completely different approach). Proposals
(3) and (5) are just ideas, and might also be handled by the extension
framework. But we should always think twice before just saying "well, that
can be done with an extension", otherwise we might end up with N different
versions (through different extensions) of the UR in N different Grid
environments. This might undermine the standardization effort. A specific
use case: DGAS requires the GlobalResourceId, since it specifies the
resource account to which a usage record should be associated. If we use
extensions for that, and other Grids/accounting systems use other
extensions for the same thing, then we might not be able to exchange usage
information between these accounting systems.

Please check the proposals, think about them, think also about actual use
cases, and then let's have a fruitful discussion! :o)

Cheers,

Rosario.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: DGAS_GGF18_Washington-2006-09.pdf
Type: application/pdf
Size: 1420395 bytes
Desc: not available
Url : http://www.ogf.org/pipermail/ur-wg/attachments/20060913/da78704d/attachment-0002.pdf 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: DGAS_GGF18_Washington-onlyURproposals-2006-09.pdf
Type: application/pdf
Size: 678712 bytes
Desc: not available
Url : http://www.ogf.org/pipermail/ur-wg/attachments/20060913/da78704d/attachment-0003.pdf 

