[ogsa-bes-wg] RE: Questions and potential changes to BES, as seen from HPC Profile point-of-view

Wed Jun 7 17:40:58 CDT 2006

Hi;

The cluster job scheduler and meta-scheduler scenarios you mention are
key ones that the HPC profile work is intended to cover (as base case
and as an extension of the base case).  Hence I, at least, can't afford
to ignore them and therefore have to ask how one would support them
using BES and JSDL as the foundations of the HPC profile
work.  In my opinion BES has to be able to support the compute cluster
scenario well, or it risks becoming a somewhat inappropriate basis for
the HPC profile design.

One thing to keep in mind on the security/authorization issue is the
following: it seems a bit strange (to me) to have an interface that
consists of operations that only some clients can invoke.  It seems like
a better design to have one interface on which users can invoke any
operation and a separate interface that is available to administrators
with additional privileges.  I suspect this will also be simpler and
more feasible from the point-of-view of what existing tooling can
provide/support.
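
As a rough illustration of the split suggested above, here is a minimal
Python sketch.  All class and method names (ToyContainer, UserInterface,
AdminInterface, stop_accepting, etc.) are hypothetical and are not taken
from the BES specification; the point is only that each interface
consists entirely of operations its clients may invoke.

```python
class ToyContainer:
    """Shared state behind both interfaces (hypothetical, not from BES)."""

    def __init__(self):
        self.accepting = True
        self.activities = {}


class UserInterface:
    """Every operation here may be invoked by any user of the service."""

    def __init__(self, container):
        self._container = container

    def create_activity(self, job_description):
        if not self._container.accepting:
            raise RuntimeError("service is not accepting new activities")
        activity_id = f"activity-{len(self._container.activities) + 1}"
        self._container.activities[activity_id] = job_description
        return activity_id

    def list_activities(self):
        return sorted(self._container.activities)


class AdminInterface:
    """Separate interface, offered only to clients with admin privileges."""

    def __init__(self, container):
        self._container = container

    def stop_accepting(self):
        self._container.accepting = False

    def start_accepting(self):
        self._container.accepting = True
```

With this shape, access control can be applied per interface rather than
per operation: a client that is never handed the AdminInterface has no
administrative operations to attempt.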

Marvin.

-----Original Message-----
From: owner-ogsa-bes-wg at ggf.org [mailto:owner-ogsa-bes-wg at ggf.org] On
Behalf Of Karl Czajkowski
Sent: Tuesday, June 06, 2006 2:40 AM
To: Donal K. Fellows
Cc: Peter Lane; Marvin Theimer; ogsa-bes-wg at ggf.org; Ed Lassettre
Subject: Re: [ogsa-bes-wg] RE: Questions and potential changes to BES,
as seen from HPC Profile point-of-view

This (somewhat repetitive) discussion makes me think that we still do
not have a precise enough nomenclature for the various aspects of jobs
that matter in describing the architecture. I think we (the HPC and
Grid Job management community) spend an inordinate amount of time
talking in circles to reestablish common ground because of this...
The terms are important not only to have unambiguous debates, but also
to make the semantics in specifications clear, concise, and relevant.

At a high level, this thread has revisited (and these are my best
efforts at describing them for discussion, and not an assertion of
what to call things in specifications):

   1. Service management interfaces
      a. Possible generic control (start, stop, deploy, undeploy, etc.)
      b. Existing BES-specific controls (accept activities, etc.)
   [Has strong overlap with WSDM, CIM, et al.]

   2. Generic Activity management interfaces (the main point of BES)
      a. Activity creation and destruction
      b. BES service "resource capabilities" (what can be run)
      c. Single and Array activity monitoring and control
   [Has overlap with provisioning and resource management]

   3. Extended Activity management interfaces
      a. HPC Suspend/resume
      b. POSIX or Windows-specific controls
      c. Job File I/O
      d. ... (huge range of domain-specific stuff in future)
   [Has overlap with existing "legacy" grid services]

   4. Extensibility Discovery/Negotiation mechanisms

And I will add:

   5. Applications/Services

By this last one, I mean that BES is envisioned to support management
of complex services through extensions.  I think there is a
distinction to be made between the "business" end of such a service
(5) and its management interface as a special kind of activity (3).  I
often get an uncomfortable feeling that some people are expecting the
"activity EPR" to expose the business interface, while I and some
others expect it to only expose the extended management interface.
This extended interface might well include some
discovery/bootstrapping capabilities to help find the business
endpoint(s) of the activity.
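
To make the distinction concrete, here is a minimal Python sketch of the
second reading, in which the activity EPR exposes only management
operations plus a discovery hook.  The names (ActivityManagementView,
discover_business_endpoints, the example URL) are hypothetical and
chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ActivityManagementView:
    """What the activity EPR exposes under this reading: management only."""

    activity_id: str
    state: str = "Running"
    # Addresses of the activity's business interface(s); the management
    # view only helps locate them and does not proxy business calls.
    business_endpoints: List[str] = field(default_factory=list)

    def suspend(self):
        self.state = "Suspended"

    def resume(self):
        self.state = "Running"

    def discover_business_endpoints(self):
        # Return a copy so callers cannot modify the internal list.
        return list(self.business_endpoints)


view = ActivityManagementView(
    activity_id="activity-1",
    business_endpoints=["https://example.org/hypothetical-service"],
)
```

A client would use the view to control the activity and to look up where
the business interface lives, and would then talk to that endpoint
directly.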

I will also point out that, in some sense, BES as a generic activity
manager (2) ought to be capable enough to recursively act as the
manager of another BES service (1a).  As an analogy, a well-known BES
service might allow creation of VOs and VO services.  Some of these VO
services might be new BES instances.  It seems to me that the
BES-service management interfaces (1b) of the second-tier BES should
be part of the extended activity interface (3) of those BES-type
activities when they exist within the first tier's environment.  On
the other hand, the ability to create activities in the second tier is
part of the "business" interface of those BES services. In my opinion,
it does not belong in the activity management view of the service
within the first tier BES service! [Note, I mention this to try to cut
deeply into the distinctions between interfaces. I realize recursive
use of BES is not on the top of everyone's must-have scenario list.]

Also, as Marvin and Donal are pointing out, we need to consider
various roles and scenarios.  On the other hand, Donal's comments lead
towards that slippery slope of assuming too much about implementation,
e.g. that authorization is scoped at endpoints.  Do we wish to assume
this?  I know it is easy in Globus land, but we can also handle
per-operation authorization.  We are, as implementers, a little
hog-tied as far as providing finer authorization scoping, e.g. access
controls on specific message contents.  It is a tricky area, fraught
with usability problems and non-interoperability.
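
The difference between the two scoping choices can be sketched in a few
lines of Python.  The tables, caller names, and operation names below
are invented for illustration and do not describe any particular
implementation, Globus or otherwise.

```python
# Endpoint-scoped: one decision per endpoint, covering all its operations.
ENDPOINT_ACL = {
    "user-endpoint": {"alice", "bob"},
    "admin-endpoint": {"bob"},
}

# Per-operation: one decision per operation, regardless of endpoint.
OPERATION_ACL = {
    "create_activity": {"alice", "bob"},
    "get_status": {"alice", "bob"},
    "stop_accepting": {"bob"},
}


def allowed_at_endpoint(caller, endpoint):
    """True if the caller may use every operation at this endpoint."""
    return caller in ENDPOINT_ACL.get(endpoint, set())


def allowed_for_operation(caller, operation):
    """True if the caller may invoke this one operation."""
    return caller in OPERATION_ACL.get(operation, set())
```

Endpoint scoping forces the interface split to follow the privilege
split, while per-operation scoping lets a single interface carry
operations with different privilege requirements.  Neither table covers
the finer case of access controls on specific message contents.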

karl

-- 
Karl Czajkowski
karlcz at univa.com