[ogsa-wg] Re: [ogsa-rss-wg] Re:[RSS Architecture Discussion]
Soonwook Hwang
hwang at grid.nii.ac.jp
Tue Dec 13 23:31:39 CST 2005
Mathias and Donal,
If the coallocation issue has to be addressed
in the context of RSS, I agree with your idea of letting
the generation of an ordered set of possible coallocation
plans be done by EPS. In fact, that's exactly what we have been
exploring for the last two years in the NAREGI project.
Let's assume that you want to run an MPI job that requires
a large number of nodes, and also assume that there is no
single resource with that many nodes. In this case,
there is no way to run the MPI job on a single resource.
Either the MPI job fails to run, or it has to run across
multiple resources with the help of some coallocation
mechanism. To deal with this situation, the
NAREGI Super Scheduler (consisting of EPS and CSG) works
as follows: CSG just returns a set of candidate resources
to EPS, and EPS then generates a set of feasible
co-allocation plans based on those candidate resources
and the application's requirements (e.g.,
network bandwidth, latency, etc.). In our current
implementation, since we haven't defined any ranking
function yet, EPS randomly chooses one of the possible
co-allocation plans and then tries to make hard reservations
with the local resources associated with the selected plan.
If the reservations succeed, EPS returns the co-allocation
plan to the job manager.
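
To make the flow above concrete, here is a minimal Python sketch. It is not the NAREGI implementation: the names Resource, feasible_plans, choose_and_reserve and try_reserve are hypothetical, only node counts are modelled (bandwidth and latency requirements are left out), and returning None when the reservation fails is my assumption, since that case is not described above.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A candidate resource as CSG might return it (hypothetical shape)."""
    name: str
    free_nodes: int


def feasible_plans(candidates, nodes_needed):
    """Return minimal groups of candidates whose free nodes cover the request."""
    plans = []
    # Grow the group size so smaller (minimal) plans are found first.
    for size in range(1, len(candidates) + 1):
        for combo in itertools.combinations(candidates, size):
            if sum(r.free_nodes for r in combo) < nodes_needed:
                continue
            # Skip groups that merely extend a plan we already have.
            if any(set(p) <= set(combo) for p in plans):
                continue
            plans.append(combo)
    return plans


def choose_and_reserve(plans, try_reserve, rng=random):
    """Pick one plan at random (no ranking function yet) and try to reserve it.

    try_reserve is a caller-supplied callable standing in for the hard
    reservations with local resources; it returns True on success.
    """
    if not plans:
        return None
    plan = rng.choice(plans)
    return plan if try_reserve(plan) else None


if __name__ == "__main__":
    candidates = [Resource("a", 64), Resource("b", 64), Resource("c", 32)]
    plans = feasible_plans(candidates, nodes_needed=100)
    print(choose_and_reserve(plans, try_reserve=lambda plan: True))
```

Once a ranking function exists, rng.choice would be replaced by picking the best-ranked plan.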
I am wondering what job types EPS/CSG should be
able to handle as input. Examples of different job types include a
single job, a set of independent jobs (i.e., no dependency constraints
between jobs), workflow jobs (i.e., dependency constraints between jobs),
and MPI jobs (where co-allocation may be one of the key requirements).
I am also wondering how these different job types will affect the design
of the EPS/CSG interface and protocol. The way to describe resource
requirements for job execution may differ depending on the type
of job. Is the resource element of the JSDL v1 spec sufficient to
describe the requirements of jobs of different types?
I suspect that, depending on the job type, the parameters used for the
ranking function/mechanism are likely to be different, and that we might
end up having a different EPS/CSG interface for each job type.
Soonwook
-----Original Message-----
From: owner-ogsa-wg at ggf.org [mailto:owner-ogsa-wg at ggf.org] On Behalf Of
Donal K. Fellows
Sent: Tuesday, December 13, 2005 11:33 PM
To: ogsa-wg at ggf.org; ogsa-rss-wg at ggf.org
Subject: [ogsa-wg] Re: [ogsa-rss-wg] Re:[RSS Architecture Discussion]
Mathias Dalheimer wrote:
> There is no way of supporting e.g. coallocation of resources in the RSS.
That's not quite true. Provided there is a way of expressing it (JSDL
does not give us enough) coallocation can be supported by the EPS,
though it would just be returning possible coallocation plans and not
ones backed up by hard reservations. (As you said in text I elided, that
is stuff that is up to the job manager and reservation service). Just as
with the CSG, arranging for the EPS to return an ordered set of plans is
likely to be the right approach, especially as it allows the JM to
decide what to do if the first plan falls through (and different plans
might be needed depending on exactly what happened too).
Donal.
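
The point about an ordered set of plans letting the JM decide what to do when the first plan falls through can be sketched in Python as below. The names first_reservable_plan and try_reserve are hypothetical, plans are shown as plain tuples of resource names, and the sketch simply moves on to the next plan; it does not model choosing a different plan depending on what exactly went wrong.

```python
def first_reservable_plan(ordered_plans, try_reserve):
    """Walk the plans in the order the EPS returned them.

    try_reserve is a caller-supplied callable (an assumption here) that
    stands in for the job manager asking the reservation service to back
    a plan with hard reservations; it returns True on success.
    Returns the first plan that could be reserved plus the plans that
    fell through, or (None, all_plans) if none could be reserved.
    """
    fell_through = []
    for plan in ordered_plans:
        if try_reserve(plan):
            return plan, fell_through
        fell_through.append(plan)
    return None, fell_through


if __name__ == "__main__":
    plans = [("a", "b"), ("a", "c", "d"), ("e",)]
    # Pretend resource "a" is unavailable, so only the last plan succeeds.
    chosen, failed = first_reservable_plan(plans, lambda p: "a" not in p)
    print(chosen, failed)
```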