[DRMAA-WG] DRMAA2 comments

Nadav Brandes nadavbrandes at gmail.com
Fri Apr 29 06:10:39 CDT 2011


Hi guys,

My team and I have finished going over the latest draft of DRMAA2, and we
have some comments, suggestions and questions about it.
We would like to hear your opinions on these issues.


   1. Given a *jobId*, you can easily get its *Job* object using the method
   *JobSession::getJobs(in JobInfo filter)* by passing as the filter a
   *JobInfo* with the wanted *jobId* (a shorthand method such as
   *JobSession::getJob(string jobId)* might be more convenient, but this is a
   different issue). *But*, given a *jobArrayId*, there is no way to get its
   *JobArray* object. This is a significant limitation: it prevents users
   from using the *JobArray* feature in DRMAA the way job arrays are used in
   most batch systems. I think a similar method,
   *JobSession::getJobArrays(in JobArrayInfo filter)*, should be added, or at
   least a method *JobSession::getJobArray(string jobArrayId)*.
   2. A very important feature that many batch systems support is the
   ability to limit the number of jobs in a job array that may run
   simultaneously (in LSF it's called "Slot Limit", and you can read about it at
   http://www-cecpv.u-strasbg.fr/Documentations/lsf/html/lsf6.1_admin/G_jobarrays.html#26618).
   I think that DRMAA could also support this feature by:
      1. Changing the method *JobSession::runBulkJobs* so that it also accepts
      an optional argument *in long slotLimit* (if it is *UNSET*, no slot
      limit is assigned to the new job array).
      2. Adding a new method *JobArray::changeSlotLimit(in long slotLimit)*.
   3. There are some parameters that most batch systems allow changing for
   already submitted jobs, but DRMAA doesn't support changing them. For
   example, DRMAA doesn't let you change the priority or queue of an already
   submitted job. I think that methods *Job::changePriority(in long
   priority)* and *Job::changeQueue(in string queueName)* should be added.
   4. Many batch systems allow rerunning existing jobs. Although DRMAA has a
   field called *rerunnable* in the *JobTemplate* struct, it doesn't allow
   users to actually rerun jobs. Maybe a method *Job::rerun()* could be
   added to DRMAA.
   5. I have a question: does DRMAA support generic resources? For example,
   suppose I have a cluster where some of the nodes have GPU cards, and I want
   to submit jobs that require a certain number of GPUs. I would like the
   batch system to manage this for me, as many batch systems can.
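
To make items 1 to 4 a bit more concrete, here is a rough sketch of how the
proposed methods might look from a user's point of view. It is only an
in-memory mock written in Python, not a binding to any real DRM system, and
none of these methods exist in the current draft: get_job_array,
change_slot_limit, change_priority, change_queue and rerun simply mirror the
names suggested above. The run_bulk_jobs argument list is reduced to a job
count for brevity and is not the signature from the draft, and the state
strings and id format are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Stand-in for DRMAA's UNSET value (assumption: modelled here as None).
UNSET = None


@dataclass
class Job:
    job_id: str
    priority: int = 0
    queue: str = "default"
    state: str = "QUEUED"  # hypothetical state names, for illustration only

    def change_priority(self, priority: int) -> None:
        """Proposed in item 3: change priority after submission."""
        self.priority = priority

    def change_queue(self, queue_name: str) -> None:
        """Proposed in item 3: move an already submitted job to another queue."""
        self.queue = queue_name

    def rerun(self) -> None:
        """Proposed in item 4: put an existing job back into the queue."""
        self.state = "QUEUED"


@dataclass
class JobArray:
    job_array_id: str
    jobs: List[Job]
    slot_limit: Optional[int] = UNSET

    def change_slot_limit(self, slot_limit: Optional[int]) -> None:
        """Proposed in item 2: change how many array jobs may run at once."""
        if slot_limit is not UNSET and slot_limit < 1:
            raise ValueError("slot limit must be positive or UNSET")
        self.slot_limit = slot_limit


class JobSession:
    """In-memory mock of a job session; it does not talk to a batch system."""

    def __init__(self) -> None:
        self._arrays: Dict[str, JobArray] = {}
        self._next_id = 1

    def run_bulk_jobs(self, count: int,
                      slot_limit: Optional[int] = UNSET) -> JobArray:
        """Simplified bulk submission with the optional slot limit (item 2)."""
        array_id = f"array-{self._next_id}"
        self._next_id += 1
        jobs = [Job(job_id=f"{array_id}[{i}]") for i in range(1, count + 1)]
        array = JobArray(array_id, jobs, slot_limit)
        self._arrays[array_id] = array
        return array

    def get_job_array(self, job_array_id: str) -> JobArray:
        """Proposed in item 1: look up a job array by its id."""
        return self._arrays[job_array_id]


def demo() -> JobArray:
    session = JobSession()
    submitted = session.run_bulk_jobs(4, slot_limit=2)
    # Later, possibly elsewhere in the program, only the id is known:
    array = session.get_job_array(submitted.job_array_id)
    array.change_slot_limit(3)
    first = array.jobs[0]
    first.change_priority(10)
    first.change_queue("gpu")
    first.state = "FAILED"
    first.rerun()
    return array
```

The point of the sketch is the lookup by id: once a job array has been
submitted, the id alone should be enough to get the object back and adjust it.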


Thank you for reading all of this. I would very much like to hear what you
think about each of the points above.

Regards,
Nadav