[jsdl-wg] Questions and potential changes to JSDL, as seen from HPC Profile point-of-view

A S McGough asm at doc.ic.ac.uk
Fri Jun 9 05:03:06 CDT 2006


Hi Marvin,

Thanks for the (fairly long) email - you've raised quite a few 
interesting points, which I'll address inline below. First off I'd just 
like to say that the JSDL document is meant to be a language 
specification, so a large number of the issues about how JSDL should be 
used and what implementations have to support are not really in scope 
for that document. However, I do agree with you that such a document 
needs to exist - but for all uses of JSDL, not just HPC. I would like to 
take your straw man and use it as the starting point for the section of 
that document on "using JSDL for HPC". Let me know what you think.

More comments below:

Marvin Theimer wrote:
>
> Hi;
>
> Coming from the point-of-view of the HPC Profile working group, I have 
> several questions about JSDL, as well as some straw man thoughts about 
> how JSDL should/could relate to the HPC Profile specification that I'm 
> involved with.  Some of my questions lead me to restrictions on JSDL 
> that an HPC profile specification might make.  Other questions lead to 
> potential changes that might be made as part of creating future 
> versions of JSDL.  (I'm well aware that JSDL 1.0 was meant as a 
> starting point rather than the final word on job submission 
> descriptions and so please interpret my questions as being an attempt 
> at constructive suggestions rather than a criticism of a very fine 
> first step by the JSDL working group.)
>
Will do.
>
> At a high level, there are several general questions that came up when 
> reading the JSDL 1.0 specification:
>
> ·        Can JSDL documents describe jobs other than Linux/Unix/Posix 
> jobs?  For example, things like mount points and mount sources do not 
> map in a completely straight-forward manner to how file systems are 
> provided in the Windows world.
>
The idea is that JSDL (possibly through extensions) will be able to 
describe all kinds of jobs that can be submitted. This may include 
database queries, control of instruments or web invocations. We did the 
POSIX job type first because that was the one the majority of people in 
the group wanted. We're currently working on a ParallelApplication 
extension for JSDL. We'd be more than happy to see whether an extension 
for Windows (or any other system) can be done through tweaks to the 
existing setup or by adding a new extension.
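
For readers following along, a minimal JSDL 1.0 document using the POSIX 
extension looks roughly like the sketch below (namespaces as in the 1.0 
spec; the executable and argument values are just placeholders). A 
Windows or other extension would presumably follow the same pattern: a 
new application element in its own namespace inside <jsdl:Application>.

```xml
<jsdl:JobDefinition
    xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
    xmlns:jsdl-posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
  <jsdl:JobDescription>
    <jsdl:Application>
      <!-- The POSIX extension element; an alternative extension would
           substitute its own element here. -->
      <jsdl-posix:POSIXApplication>
        <jsdl-posix:Executable>/bin/echo</jsdl-posix:Executable>
        <jsdl-posix:Argument>hello</jsdl-posix:Argument>
      </jsdl-posix:POSIXApplication>
    </jsdl:Application>
  </jsdl:JobDescription>
</jsdl:JobDefinition>
```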

Could you say more about how file systems don't map to the Windows 
world? My naive assumption was that they could be made to fit.
>
> ·        Is JSDL expressive enough to describe all the needs of a 
> job?  For example, it is unclear how one would specify a requirement 
> for something like a particular instruction set variation of the IA86 
> architecture (e.g. the SSE3 version of the Pentium) or how one would 
> specify that AMD processors are required rather than Intel ones 
> (because the optimized libraries and the optimizations generated by 
> the compiler used will differ for each).  For another example, it is 
> unclear how one would specify that all the compute nodes used for 
> something like an MPI job should have the same hardware.
>
No - I doubt JSDL in its current state is expressive enough to describe 
the needs of all jobs. We're working on this with the information model 
people in the OGSA group at the moment - please help! I liked some of 
your ideas on this below, by the way.
>
> ·        How will JSDL's normative set of enumeration values for 
> things like processor architecture and operating system be kept 
> up-to-date and relevant?  Also, how should things like operating 
> system version get specified in a normative manner that will enable 
> interoperability among multiple clients and job scheduling services?  
> For example, things like Linux and Windows versions are constantly 
> being introduced, each with potentially significant differences in 
> capabilities that a job might depend on.  Without a normative way of 
> specifying these constantly evolving version sets it will be 
> difficult, if not impossible, to create interoperable job submission 
> clients and job scheduling services (including meta-scheduling 
> services where multiple schedulers must interoperate with each other).
>
Agreed. We don't yet have a way to add to the normative enumerations. 
You suggest below moving these into a separate document so that they can 
be updated more easily - that seems a good idea. As for OS versioning, I 
have my own ideas, though JSDL doesn't have a central plan yet. Again, 
input here would be appreciated.
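
For context, the fixed enumerations in question appear in the Resources 
element, roughly as sketched below (element names from the 1.0 spec; the 
version string is purely illustrative - JSDL currently says nothing 
about how to normalise it, which is exactly the gap being discussed):

```xml
<jsdl:Resources xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl">
  <jsdl:OperatingSystem>
    <jsdl:OperatingSystemType>
      <!-- Drawn from the spec's fixed, CIM-derived enumeration. -->
      <jsdl:OperatingSystemName>LINUX</jsdl:OperatingSystemName>
    </jsdl:OperatingSystemType>
    <!-- Free-form string today; no normative versioning scheme. -->
    <jsdl:OperatingSystemVersion>2.6.16</jsdl:OperatingSystemVersion>
  </jsdl:OperatingSystem>
  <jsdl:CPUArchitecture>
    <!-- Fixed enumeration; no way to say "SSE3" or "AMD only". -->
    <jsdl:CPUArchitectureName>x86</jsdl:CPUArchitectureName>
  </jsdl:CPUArchitecture>
</jsdl:Resources>
```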
>
> ·        Although JSDL specifies a means of including additional 
> non-normative elements and attributes in a document, non-normative 
> extensions make interoperability difficult.  This implies the need for 
> normative extensions to JSDL beyond the Posix extension currently 
> described in the 1.0 specification.  Are there plans to define 
> additional extension profiles to address the above questions 
> surrounding expressive power and normative descriptions of things like 
> current OS types and versions?
>
Yes. The intention with JSDL has always been to produce more normative 
extensions post JSDL 1.0.
>
> ·        If one accepts the need for a variety of extension profiles 
> then this raises the question of what should be in the base case.  For 
> example, it could be argued that data staging -- with its attendant 
> aspects such as mount points and mount sources -- should be defined in 
> an extension rather than in the core specification that will need to 
> cover a variety of systems beyond just Linux/Unix/Posix.  Similarly, 
> one might argue that the base case should focus on what's 
> /functionally/ necessary to execute a job correctly and should leave 
> things that are "optimization hints", such as CPU speed and network 
> bandwidth specifications, to extension profiles.
>
Personally, I'd agree with you that file staging should be in an 
extension, though the view of the group was that most current DRM 
systems that would consume JSDL treat file staging as a core element. I 
also agree on the idea of "optimization hints".
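
For reference, the core data-staging element under discussion looks 
roughly like this (file name and URI are placeholders; note how 
FilesystemName quietly assumes a working-directory-like context that the 
core spec never defines):

```xml
<jsdl:DataStaging xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl">
  <jsdl:FileName>input.dat</jsdl:FileName>
  <!-- Refers to a declared FileSystem by name, e.g. HOME. -->
  <jsdl:FilesystemName>HOME</jsdl:FilesystemName>
  <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
  <jsdl:DeleteOnTermination>true</jsdl:DeleteOnTermination>
  <jsdl:Source>
    <jsdl:URI>http://example.org/data/input.dat</jsdl:URI>
  </jsdl:Source>
</jsdl:DataStaging>
```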
>
> ·        How are concepts such as IndividualCPUSpeed and 
> IndividualNetworkBandwidth intended to be defined and used in 
> practice?  I understand the concept of specifying things like the 
> amount of physical memory or disk space that a job will require in 
> order to be able to run.  However, CPU speed and network bandwidth 
> don't represent functional requirements for a job -- meaning that a 
> job will correctly run and produce the same results irrespective of 
> the CPU speed and network bandwidth available to it.  Also, the 
> current definitions seem fuzzy: the megahertz number for a CPU does 
> not tell you how fast a given compute node will be able to execute 
> various kinds of jobs, given all the various hardware factors that can 
> affect the performance of a processor (consider the presence/absence 
> of floating point support, the memory caching architecture, etc.).  
> Similarly, is network bandwidth meant to represent the theoretical 
> maximum of a compute node's network interface card?  Is it expected to 
> take into account the performance of the switch that the compute node 
> is attached to?  Since switch performance is partially a function of 
> the pattern of (aggregate) traffic going through it, the network 
> bandwidth that a job such as an MPI application can expect to receive 
> will depend on the /type/ of communications patterns employed by the 
> application.  How should this aspect of network bandwidth be reflected 
> -- if at all -- in the network bandwidth values that a job requests 
> and that compute nodes advertise?
>
As said above we really need to define this in a separate "profile" 
document.
>
> ·        JSDL is intended for describing the requirements of a job 
> being submitted for execution.  To enable matchmaking between 
> submitted jobs and available computational resources there must also 
> be a way of describing existing/available resources.  While much of 
> JSDL can be used for this purpose, it is also clear that various 
> extensions are necessary.  For example, to describe a compute cluster 
> requires that one be able to specify the resources for each compute 
> node in the cluster (which may be a heterogeneous lot).  Similarly, to 
> describe a compute node with multiple network interfaces would require 
> an extension to the current model, which assumes that only a single 
> instance of such things can exist.  This raises the question of 
> whether something other than JSDL is intended to be used for 
> describing available computational resources or whether there are 
> intentions to extend JSDL to enable it to describe such resources.
>
The writing of a resource description language was something we were 
told we couldn't do in the JSDL group. I do agree that it's now 
important that we have one. I think we'd need to go back to GGF (or 
whatever their name is this week) and ask to set up a group to do this. 
Perhaps we could take all the appropriate stuff out of JSDL as a 
starting point?
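
To illustrate that "starting point" idea: one could imagine lifting the 
jsdl:Resources vocabulary out and wrapping it in a container, one entry 
per compute node. To be clear, the outer elements below are entirely 
hypothetical - nothing like them exists in JSDL 1.0:

```xml
<!-- Hypothetical sketch only: "ClusterDescription" and "ComputeNode"
     are invented names; the inner vocabulary is borrowed from
     jsdl:Resources. -->
<ClusterDescription xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl">
  <ComputeNode name="node01">
    <jsdl:IndividualPhysicalMemory>
      <jsdl:Exact>4294967296</jsdl:Exact>
    </jsdl:IndividualPhysicalMemory>
    <jsdl:CPUArchitecture>
      <jsdl:CPUArchitectureName>x86_64</jsdl:CPUArchitectureName>
    </jsdl:CPUArchitecture>
  </ComputeNode>
  <ComputeNode name="node02">
    <!-- A heterogeneous node would simply differ here. -->
  </ComputeNode>
</ClusterDescription>
```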
>
> ·        The current specification stipulates that conformant 
> implementations must be able to parse all the elements and attributes 
> defined in the spec, but doesn't require that any of them be 
> supplied.  Thus, a scheduling service that does nothing could claim to 
> be compliant as long as it can correctly parse JSDL documents.  For 
> interoperability purposes, I would argue that the spec should define a 
> minimum set of elements that any compliant service must be able to 
> supply. Otherwise clients will not be able to make any assumptions 
> about what they can specify in a JSDL document and, in particular, 
> client applications that programmatically submit job submission 
> requests will not be possible since they can't assume that any valid 
> JSDL document will actually be acceptable by any given job submission 
> service.
>
Yes, this is true - though as the current document is a description of 
the JSDL "language", that behaviour is deliberate. These issues should 
all be clarified in the profile document.
>
> ·        I have a number of questions about data staging:
>
> ·        Although the notions of working directory and environment 
> variables are defined in the posix extension, they are implicitly 
> assumed in the data staging section of the core specification.  This 
> implies to me that either (a) data staging is made an extension or (b) 
> these concepts are made a normative, required part of the core 
> specification.
>
Hmm - well spotted. Personally, as I've said, I'd like to see it made 
into an extension. This probably needs some discussion on the list.
>
> ·        Recursive directory copying can be specified, but is not 
> required to be supplied by any job submission service.  This makes it 
> difficult to write applications that programmatically define their 
> data staging needs since they cannot in the current design determine 
> whether any given job submission service implements recursive 
> directory copying.  In practice this may mean that programmatically 
> generated job submissions will only ever use lists of individual files 
> to stage.
>
This is a major problem, as many of the systems currently available out 
there do not support recursive directory copying. Again, we could 
clarify the use of this through an HPC profile.
>
> ·        The current definitions of the well-known file systems seem 
> imprecise to me.  In particular:
>
> ·        What are the navigation rules associated with each?  Can you 
> cd out of the subtree that each represents?  ROOT almost certainly 
> does not allow that.  Is there an assumption that one can cd out of 
> HOME or TMP or SCRATCH?  Hopefully not, since that would make these 
> file systems even more Unix/Linux-centric, plus one would now need to 
> specify what clients can expect to see when they do so.
>
Again, not defined here - though I'd assume we can easily say in the 
profile that you can't cd out of it.
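
For reference, a job currently requests one of these well-known file 
systems by name, roughly as below (the type and size values are 
illustrative; the navigation and sharing semantics are exactly what the 
spec leaves unstated):

```xml
<jsdl:Resources xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl">
  <!-- "SCRATCH" is one of the reserved well-known names. -->
  <jsdl:FileSystem name="SCRATCH">
    <jsdl:FileSystemType>temporary</jsdl:FileSystemType>
    <jsdl:DiskSpace>
      <!-- At least 10 GiB, expressed in bytes. -->
      <jsdl:LowerBoundedRange>10737418240.0</jsdl:LowerBoundedRange>
    </jsdl:DiskSpace>
  </jsdl:FileSystem>
</jsdl:Resources>
```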
>
> ·        What is ROOT intended to be used for?  Are there assumptions 
> about what resides under root?  Are there assumptions about what an 
> application can read/write under the ROOT subtree?  (ROOT also seems 
> like the most Unix-specific of the 4 file system types defined.)
>
Personally I don't have a use for it. Anyone else?
>
> ·        What are the sharing/consistency semantics of each file 
> system in situations where a job is a multi-node application running 
> on something like a cluster?  Is HOME visible to all compute nodes in 
> a data-consistent manner?  I'm guessing that TMP would be assumed to 
> be strictly local to each compute node, so that things like MPI 
> applications would need to be cognizant that they are writing multiple 
> files to multiple separate storage systems when they write to a file 
> in TMP -- and furthermore that data staging of such files after a job 
> has run will result in multiple files that all map to the same target 
> file.
>
Again profile issue.
>
> ·        Can other users write over or delete your data in TMP and/or 
> SCRATCH?  Is data in these file systems visible to other users or does 
> each job get its own private TMP and SCRATCH?
>
Profile.
>
> ·        How long does data in SCRATCH stay around?  Without some 
> normative definition -- or at least a normative lower bound -- on data 
> lifetime clients will have to assume that the data can vanish 
> arbitrarily and things like multi-job workflows will be very difficult 
> to write if they try to take advantage of SCRATCH space to avoid 
> unnecessary data staging actions to/from a computing facility.
>
Profile.
>
> ·        From an interoperability and programmatic submission 
> point-of-view, it is important to know which transports any given job 
> submission service can be expected to support.  This seems like 
> another area where a normative minimal set that all job submission 
> services must implement needs to be defined.
>
This gets very difficult and political! Still, we should be able to come 
up with a core set for the profile.
>
> Given these questions, as well as the mandate for the HPC profile to 
> define a simple base interface (that can cover the HPC use case of 
> submitting jobs to a compute cluster), I would like to present the 
> following straw man proposal for feedback from this community:
>
> ·        Restructure the JSDL specification as a small core 
> specification that must be universally implemented -- i.e. not just 
> parsable, but also suppliable by all compliant job submission services 
> -- and a number of optional extension profiles.
>
Hopefully the language as it stands at the moment (with a few 
exceptions) is a good core set. With profiles for different use cases we 
could mandate what must be implemented, too.
>
> ·        Declare concepts such as executable path, command-line 
> arguments, environment variables, and working directory to be generic 
> and include them in the core JSDL specification rather than the posix 
> extension.  This may enable the core specification to support things 
> like Windows-based jobs (TBD).  The goal here is to define a core JSDL 
> specification that in-and-of-itself could enable job submission to a 
> fairly wide range of execution subsystems, including both the 
> Unix/Linux/Posix world and the Windows world.
>
Why do these need to be in the core? We had problems with a pre-release 
version when they were in the core: people who wanted to do database 
submissions (and other things) were having to force their jobs into 
these elements.
>
> ·        Move data staging to an extension.
>
> ·        Create precise definitions of the various concepts introduced 
> in the data staging extension, including normative requirements about 
> whether or not one can change directory up and out of a file system's 
> root directory, etc.
>
> ·        Define which transports are expected to be implemented by all 
> compliant services.
>
Quite possibly - again through the use of a profile.
>
> ·        Move the various enumeration types -- e.g. for CPU 
> architecture and OS -- to separate specification documents so that 
> they can evolve without requiring corresponding and constant revision 
> of the core JSDL specification.
>
Sounds good. Even better if we can get someone else to update these for us.
>
> ·        Define extension profiles (eventually, not right away) that 
> enable richer description of hardware and software requirements, such 
> as details of the CPU architecture or OS capabilities.  As part of 
> this, move optimization hints, such as CPU speed and network bandwidth 
> elements out of the JSDL core and into a separate extension profile.
>
This should come from the work we are doing with the Information model 
people - please join in.
>
> ·        Embrace the issue of how to specify available resources at an 
> execution subsystem.  Start by defining a base case that allows the 
> description of compute clusters by creating a compound JSDL document 
> that consists of an outer element that ties together a sequence of 
> individual JSDL elements, each of which describes a single compute 
> node of a compute cluster.  Define an explicit notion of extension 
> profiles that could define other ways of describing computational 
> resources beyond just an array of simple JSDL descriptions.
>
I'm not entirely sure what you mean by this one - can you explain 
further?
>
>  
>
> Now, as presented above, my straw man proposal looks like suggestions 
> for changes that might go into a JSDL-1.1 or JSDL-2.0 specification.  
> In the near-term, the HPC profile working group will be exploring what 
> can be done with just JSDL-1.0 and restrictions to that 
> specification.  The restrictions would correspond to disallowing those 
> parts of the JSDL-1.0 specification that the above proposal advocates 
> moving to extension profiles.  It will also explore whether a 
> restricted version of the posix extension could be used to cover most 
> common Windows cases.
>
>  
>
>  
>
> Marvin.
>
OK, for those who have made it this far (possibly not many): I'm going 
to propose a JSDL call on this in a new email so all can see it.

steve..

-- 
------------------------------------------------------------------------
Dr A. Stephen McGough                       http://www.doc.ic.ac.uk/~asm
------------------------------------------------------------------------
Technical Coordinator, London e-Science Centre, Imperial College London,
Department of Computing, 180 Queen's Gate, London SW7 2BZ, UK
tel: +44 (0)207-594-8409                        fax: +44 (0)207-581-8024
------------------------------------------------------------------------
