[ogsa-bes-wg] Questions and potential changes to BES, as seen from HPC Profile point-of-view

Marvin Theimer theimer at microsoft.com
Wed Jun 7 17:48:26 CDT 2006


Hi;

 

I totally understand that JSDL 1.0 isn't the "final" answer.  I view my
email as an attempt at constructive suggestions for potential
changes. :-)

 

Currently the POSIX extension to JSDL includes various things that I
would argue are more generic than POSIX.  I think it would be
interesting to define a common subset (one that includes things like the
executable pathname, command-line args, and working directory) as well as
a WindowsApplicationType that describes those things that are
Windows-specific.
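
To make the proposed split concrete, here is a rough Python sketch. It is
only an illustration under stated assumptions: the class names, the
field names, and the choice of which fields count as "common" are
hypothetical and are not taken from the JSDL schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CommonApplication:
    """Hypothetical OS-neutral subset: what any executable needs."""
    executable: str
    arguments: List[str] = field(default_factory=list)
    working_directory: Optional[str] = None


@dataclass
class PosixApplication(CommonApplication):
    """POSIX-only extras (illustrative choice of fields)."""
    user_name: Optional[str] = None
    group_name: Optional[str] = None


@dataclass
class WindowsApplication(CommonApplication):
    """Windows-only extras (hypothetical field, not from any spec)."""
    domain_account: Optional[str] = None


def common_view(app: CommonApplication) -> Dict[str, object]:
    """Return only the OS-neutral part of any application description."""
    return {
        "executable": app.executable,
        "arguments": list(app.arguments),
        "working_directory": app.working_directory,
    }
```

A consumer that only understands the common subset could call
common_view() on either subtype and ignore the OS-specific remainder.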

 

It's OK if provenance is considered out-of-scope for JSDL.  But we may
still want to define a simple provenance notion for BES.

 

Marvin.

 

________________________________

From: A S McGough [mailto:asm at doc.ic.ac.uk] 
Sent: Tuesday, June 06, 2006 8:40 AM
To: Marvin Theimer
Cc: ogsa-bes-wg at ggf.org; Ed Lassettre
Subject: Re: [ogsa-bes-wg] Questions and potential changes to BES, as
seen from HPC Profile point-of-view

 

Hi,

In-lined,

Marvin Theimer wrote:
<snip>



*        Information model: 

1.      JSDL seems to inherently be focused on describing a single job
or a single computational resource.  For example, it has no notion of
describing all the differing compute nodes of a (heterogeneous) compute
cluster.  By incorporating JSDL elements into the BES information model
it seems that BES is foreclosing the ability to describe things like
compute clusters.  This issue also affects what can get returned from
GetActivityJSDLDocuments.  If I'm wrong about this, then it seems like
it would be worth having an explicit explanation about how to achieve
this functionality somewhere in the specification.

Let me say here that JSDL 1.0 is not meant to be a final answer. The
resource model used within it is meant to be a "rough place holder",
as we are expecting that the CIM people will provide something much
more robust and appropriate. We simply wanted something simple to get
things going. We hope that when this model is available we can roll it
into a later release of JSDL. If you're interested in following this up,
I'd suggest getting involved with the OGSA information model people (and,
obviously, the JSDL people).



2.      The BES information model now includes various posix-specific
elements of JSDL.  How would other systems - such as a Windows system -
be described?

The posix elements are only inside the POSIXApplicationType. The best
way would be to define a WindowsApplicationType (or something similar).
The JSDL group would be very interested in this.



3.      Is there any notion of specifying that all compute nodes should
have the same value for some attribute (e.g. CPU architecture, CPU
speed, NIC)?  This seems to be missing from the JSDL specification,
but seems very important for BES if it is to support things like compute
clusters.
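
One way to read the "same value on all compute nodes" requirement is as
a uniformity check over per-node attribute maps. The Python sketch below
is only an illustration: the attribute names and the dictionary
representation of a node are assumptions, not part of JSDL or BES.

```python
from typing import Dict, Iterable, List

# Hypothetical representation: one attribute map per compute node.
Node = Dict[str, str]


def non_uniform_attributes(nodes: List[Node],
                           required_uniform: Iterable[str]) -> List[str]:
    """Return the attributes whose values differ across nodes.

    An attribute that is missing on at least one node is also reported,
    since uniformity cannot be established for it.
    """
    offending = []
    for attr in required_uniform:
        values = {node.get(attr) for node in nodes}
        if len(values) > 1 or None in values:
            offending.append(attr)
    return offending
```

An empty result would mean the cluster satisfies the uniformity
constraint for every attribute listed.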

Again, this will hopefully come in the future.



4.      Some of the elements seem either incompletely specified, have
definitions that are open to multiple interpretations, or have
definitions that would be very difficult to implement in practice.  In
particular:

5.      CPU architecture seems like it can't describe all the variations
- let alone all the peripherals such as GPUs - that a computing resource
might have (let alone a cluster).

6.      CPU speed seems like the tip of an iceberg having to do with
characterizing the performance of a system, which will depend on all
manner of things like details of the processor chip used, cache sizes,
bus used, etc.

7.      Network bandwidth: is this the theoretical maximum of the NIC on
a compute node or is it the current bandwidth actually available in a
(shared) system?  Note that the latter is difficult to measure in a
practically useful way.  Note also that network bandwidth only describes
one aspect of communications performance and that several others are
arguably equally important (e.g. latency).

All this leads to the question of whether BES will have a notion of
extending the information model that is supplied. If so, then that leads
to the question of what the base case should be and whether it should
include a smaller set of things than is currently listed in the spec.

Are there any plans to tighten the definitions of some of the more vague
information elements?  (I guess this really is an issue more for the
JSDL WG than for BES.)

Again - this is where we are now planning to go with JSDL. JSDL 1.0
should be seen as a starting point and not the end. We hope most of
these things can be handled through extensions to JSDL 1.0. Those that
can't be handled that way will need to be added in future versions.



8.      GetActivityJSDLDocuments returns a JSDL document for each
specified activity.  Is this sufficient to capture the entire
"provenance" for what has happened to the activity?  In particular,
would it be sufficient to allow someone to (a) run the same activity on
another BES service (assuming same hardware and software) and get the
same results and (b) debug what has happened to an errant activity?  I
would argue that both capabilities have proven to be important in actual
systems.

A JSDL document is (by the definition of our charter) a Job Submission
document. As such, things like provenance were ruled out of scope (not by
us but by GGF in general - it was felt that this was too much to do all
in one go). However, there is now scope to go back and re-address
these issues. There were some interesting discussions at the last GGF
meeting about where non-submission information could be placed. The
suggestions were to place it in a wrapper around the JSDL document
or (and, from my recollection, this was the more popular option) at
the outermost level of the JSDL document.
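
For illustration only, the two placements can be sketched in Python with
xml.etree.ElementTree. The provenance namespace and the ActivityRecord
and Provenance element names are hypothetical; only the JSDL namespace
and the JobDefinition/JobDescription elements come from JSDL 1.0.

```python
import xml.etree.ElementTree as ET

JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
PROV_NS = "urn:example:provenance"  # hypothetical namespace


def make_job() -> ET.Element:
    """Build a minimal, empty JSDL JobDefinition skeleton."""
    job = ET.Element(f"{{{JSDL_NS}}}JobDefinition")
    ET.SubElement(job, f"{{{JSDL_NS}}}JobDescription")
    return job


def wrap(job: ET.Element, note: str) -> ET.Element:
    """Option 1: a wrapper element around an untouched JSDL document."""
    wrapper = ET.Element(f"{{{PROV_NS}}}ActivityRecord")
    ET.SubElement(wrapper, f"{{{PROV_NS}}}Provenance").text = note
    wrapper.append(job)
    return wrapper


def embed(job: ET.Element, note: str) -> ET.Element:
    """Option 2: an extra element at the outermost level of the document."""
    ET.SubElement(job, f"{{{PROV_NS}}}Provenance").text = note
    return job
```

With the wrapper the submitted document stays byte-for-byte what the
client sent; with embedding, consumers have to skip elements from a
namespace they do not know.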

Hope this helps,

steve..


