[ogsa-bes-wg] Questions and potential changes to BES, as seen from HPC Profile point-of-view

A S McGough asm at doc.ic.ac.uk
Tue Jun 6 10:39:44 CDT 2006


Hi,

Comments inline.

Marvin Theimer wrote:
<snip>
> ·        Information model:
>
> ·        JSDL seems to inherently be focused on describing a single 
> job or a single computational resource.  For example, it has no notion 
> of describing all the differing compute nodes of a (heterogeneous) 
> compute cluster.  By incorporating JSDL elements into the BES 
> information model it seems that BES is foreclosing the ability to 
> describe things like compute clusters.  This issue also affects what 
> can get returned from GetActivityJSDLDocuments.  If I'm wrong about 
> this, then it seems like it would be worth having an explicit 
> explanation about how to achieve this functionality somewhere in the 
> specification.
>
Let me say here that JSDL 1.0 is not meant to be a final answer. The 
resource model used within it is meant to be a "rough placeholder", 
as we expect the CIM people to provide something much more robust and 
appropriate. We simply wanted something simple to get things going. We 
hope that when this model is available we can roll it into a later 
release of JSDL. If you're interested in following this up, I'd suggest 
getting involved with the OGSA information model people (and, 
obviously, the JSDL people).
>
> ·        The BES information model now includes various posix-specific 
> elements of JSDL.  How would other systems -- such as a Windows system 
> -- be described?
>
The POSIX elements appear only inside the POSIXApplicationType. The best 
way to describe other systems would be to define a 
WindowsApplicationType (or something similar). The JSDL group would be 
very interested in this.
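
A minimal sketch of what such an extension might look like, built here 
with Python's standard xml.etree.ElementTree. The two GGF namespaces are 
the JSDL 1.0 ones as I recall them; the Windows namespace, the 
WindowsApplication element and its children are purely hypothetical 
(nothing like them is defined by JSDL 1.0), and q(), build_job() and the 
two *_application() helpers are names invented for this illustration.

```python
import xml.etree.ElementTree as ET

JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"
# Hypothetical namespace: JSDL 1.0 defines no Windows application type.
WIN_NS = "http://example.org/jsdl-windows"


def q(ns, tag):
    """Return a namespace-qualified tag in ElementTree's {ns}tag form."""
    return f"{{{ns}}}{tag}"


def posix_application(executable, args):
    """Existing style: POSIX-specific details live inside POSIXApplication."""
    app = ET.Element(q(POSIX_NS, "POSIXApplication"))
    ET.SubElement(app, q(POSIX_NS, "Executable")).text = executable
    for arg in args:
        ET.SubElement(app, q(POSIX_NS, "Argument")).text = arg
    return app


def windows_application(executable, args):
    """Hypothetical sibling extension; element names are assumptions."""
    app = ET.Element(q(WIN_NS, "WindowsApplication"))
    ET.SubElement(app, q(WIN_NS, "Executable")).text = executable
    for arg in args:
        ET.SubElement(app, q(WIN_NS, "Argument")).text = arg
    return app


def build_job(application_element):
    """Wrap an application-specific element in the generic JSDL skeleton."""
    job = ET.Element(q(JSDL_NS, "JobDefinition"))
    desc = ET.SubElement(job, q(JSDL_NS, "JobDescription"))
    generic_app = ET.SubElement(desc, q(JSDL_NS, "Application"))
    generic_app.append(application_element)
    return job


job = build_job(windows_application(r"C:\tools\sim.exe", ["/fast"]))
print(ET.tostring(job, encoding="unicode"))
```

The point of the sketch is only that the generic JobDefinition / 
JobDescription / Application skeleton stays the same and the 
platform-specific part is swapped; the real element set would be for the 
JSDL group to define.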
>
> ·        Is there any notion of specifying that all compute nodes 
> should have the /same/ value for some attribute (e.g. CPU 
> architecture, CPU speed, NIC card)?  This seems to be missing from the 
> JSDL specification, but seems very important for BES if it is to 
> support things like compute clusters.
>
Again, this will hopefully come in the future.
>
> ·        Some of the elements seem either incompletely specified, have 
> definitions that are open to multiple interpretations, or have 
> definitions that would be very difficult to implement in practice.  In 
> particular:
>
> ·        CPU architecture seems like it can't describe all the 
> variations -- let alone all the peripherals such as GPUs -- that a 
> computing resource might have (let alone a cluster).
>
> ·        CPU speed seems like the tip of an iceberg having to do with 
> characterizing the performance of a system, which will depend on all 
> manner of things like details of the processor chip used, cache sizes, 
> bus used, etc.
>
> ·        Network bandwidth: is this the theoretical maximum of the NIC 
> on a compute node or is it the current bandwidth actually available in 
> a (shared) system?  Note that the latter is difficult to measure in a 
> practically useful way.  Note also that network bandwidth only 
> describes one aspect of communications performance and that several 
> others are arguably equally important (e.g. latency).
>
> All this leads to the question of whether BES will have a notion of 
> extending the information model that is supplied. If so, then that 
> leads to the question of what the base case should be and whether it 
> should include a smaller set of things than is currently listed in the 
> spec.
>
> Are there any plans to tighten the definitions of some of the more 
> vague information elements?  (I guess this really is an issue more for 
> the JSDL WG than for BES.)
>
Again, this is where we are now planning to go with JSDL. JSDL 1.0 
should be seen as a starting point and not the end. We hope most of 
these things can be handled through extensions to JSDL 1.0. Those that 
can't will need to be added in future versions.
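
As a rough illustration of the extension route, the sketch below adds a 
latency requirement next to the existing bandwidth element. It assumes 
that Resources accepts elements from other namespaces and that 
IndividualNetworkBandwidth takes a LowerBoundedRange child (both worth 
checking against the JSDL 1.0 text). The extension namespace, the 
NetworkLatency element, its unit attribute, and the helper names are all 
hypothetical.

```python
import xml.etree.ElementTree as ET

JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
# Hypothetical extension namespace, invented for this sketch.
EXT_NS = "http://example.org/jsdl-ext"


def q(ns, tag):
    """Return a namespace-qualified tag in ElementTree's {ns}tag form."""
    return f"{{{ns}}}{tag}"


def build_resources(min_bandwidth_bps, max_latency_ms):
    """Build a Resources element with one core and one extension child."""
    res = ET.Element(q(JSDL_NS, "Resources"))
    # Core JSDL element (layout assumed from the 1.0 spec).
    bw = ET.SubElement(res, q(JSDL_NS, "IndividualNetworkBandwidth"))
    ET.SubElement(bw, q(JSDL_NS, "LowerBoundedRange")).text = str(min_bandwidth_bps)
    # Hypothetical extension element: not part of JSDL 1.0.
    latency = ET.SubElement(res, q(EXT_NS, "NetworkLatency"))
    latency.set("unit", "ms")
    latency.text = str(max_latency_ms)
    return res


def split_children(resources):
    """Separate core JSDL children from extension children by namespace."""
    core, extensions = [], []
    for child in resources:
        ns, local = child.tag[1:].split("}", 1)
        (core if ns == JSDL_NS else extensions).append(local)
    return core, extensions


core, extensions = split_children(build_resources(1.0e9, 5))
print("core:", core)
print("extensions:", extensions)
```

A consumer that only understands JSDL 1.0 can use a split like this to 
decide whether to ignore or reject the elements it does not recognise; 
which of the two is right is a policy question the sketch does not 
answer.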
>
> ·        GetActivityJSDLDocuments returns a JSDL document for each 
> specified activity.  Is this sufficient to capture the entire 
> "provenance" for what has happened to the activity?  In particular, 
> would it be sufficient to allow someone to (a) run the same activity 
> on another BES service (assuming same hardware and software) and get 
> the same results and (b) debug what has happened to an errant 
> activity?  I would argue that both capabilities have proven to be 
> important in actual systems.
>
A JSDL document is (by the definition of our charter) a Job Submission 
document. As such, things like provenance were ruled out of scope (not 
by us but by GGF in general; it was felt that this was too much to do 
all in one go). However, there is now scope to go back and re-address 
these issues. There were some interesting discussions at the last GGF 
meeting about where non-submission information could be placed. The 
suggestions included placing it in a wrapper around the JSDL document 
or (from my recollection the more popular option) placing it at the 
outermost level of the JSDL document.
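
To make the two placements concrete, here is a small sketch. It reads 
"outermost level" as a sibling of JobDescription inside JobDefinition, 
which is my interpretation rather than anything agreed. The record 
namespace and the ActivityRecord, Provenance and ExecutedOn elements are 
hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
# Hypothetical namespace and element names, invented for this sketch.
REC_NS = "http://example.org/activity-record"


def q(ns, tag):
    """Return a namespace-qualified tag in ElementTree's {ns}tag form."""
    return f"{{{ns}}}{tag}"


def minimal_job(name):
    """Build a bare JSDL job definition carrying only a job name."""
    job = ET.Element(q(JSDL_NS, "JobDefinition"))
    desc = ET.SubElement(job, q(JSDL_NS, "JobDescription"))
    ident = ET.SubElement(desc, q(JSDL_NS, "JobIdentification"))
    ET.SubElement(ident, q(JSDL_NS, "JobName")).text = name
    return job


def provenance(executed_on):
    """Build a hypothetical block of non-submission information."""
    prov = ET.Element(q(REC_NS, "Provenance"))
    ET.SubElement(prov, q(REC_NS, "ExecutedOn")).text = executed_on
    return prov


def as_wrapper(job, prov):
    """Option 1: a wrapper element holds the untouched JSDL plus the extras."""
    record = ET.Element(q(REC_NS, "ActivityRecord"))
    record.append(job)
    record.append(prov)
    return record


def at_outermost_level(job, prov):
    """Option 2: the extras sit inside JobDefinition, beside JobDescription."""
    job.append(prov)
    return job


wrapped = as_wrapper(minimal_job("demo"), provenance("node17"))
inlined = at_outermost_level(minimal_job("demo"), provenance("node17"))
print(ET.tostring(wrapped, encoding="unicode"))
print(ET.tostring(inlined, encoding="unicode"))
```

With the wrapper, the JSDL document stays exactly as submitted and can 
be extracted unchanged; with the second option, a single document carries 
everything but is no longer purely a submission document.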

Hope this helps,

steve..
>
-- 
------------------------------------------------------------------------
Dr A. Stephen McGough                       http://www.doc.ic.ac.uk/~asm
------------------------------------------------------------------------
Technical Coordinator, London e-Science Centre, Imperial College London,
Department of Computing, 180 Queen's Gate, London SW7 2BZ, UK
tel: +44 (0)207-594-8409                        fax: +44 (0)207-581-8024
------------------------------------------------------------------------
