[DRMAA-WG] DRMAA2 comments

Peter Tröger peter at troeger.eu
Fri Apr 29 07:53:04 CDT 2011


Hi Nadav,

thanks (again) for your in-depth analysis. Here are my comments.

> Given a jobId, you can easily get its Job object using the method JobSession::getJobs(in JobInfo filter), if you give it as a filter a JobInfo with the wanted jobId (maybe it would be an easier shorthand if DRMAA had a method JobSession::getJob(string jobId), but this is a different issue). But, given a jobArrayId, there is no way to get its JobArray object, which is a great limitation of DRMAA that doesn't really let users use the JobArray feature in DRMAA as it is used in most batch systems. I think that a similar method JobSession::getJobArrays(in JobArrayInfo filter) should be added, or at least a method JobSession::getJobArray(string jobArrayId).
Symmetry is always good, I see no problem with adding "JobSession::getJobArrays(in JobArrayInfo filter)".
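
To make the symmetry concrete, here is a minimal sketch of how a filter-based lookup for job arrays could behave, mirroring the way getJobs matches on a JobInfo filter. The Python class names, field names, and the matching rule (unset fields match everything) are assumptions for illustration only; they are not taken from the DRMAA2 IDL or from any existing language binding.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


# Hypothetical stand-ins, not the DRMAA2 IDL types or a real binding.
@dataclass
class JobArrayInfo:
    job_array_id: Optional[str] = None   # None plays the role of UNSET
    session_name: Optional[str] = None


@dataclass
class JobArray:
    info: JobArrayInfo


class JobSession:
    def __init__(self, name: str) -> None:
        self.name = name
        self._arrays: List[JobArray] = []

    def run_bulk_jobs(self, job_array_id: str) -> JobArray:
        # A real implementation would submit to the DRM system;
        # here we only record the array so it can be looked up later.
        array = JobArray(JobArrayInfo(job_array_id, self.name))
        self._arrays.append(array)
        return array

    def get_job_arrays(self, flt: Optional[JobArrayInfo] = None) -> List[JobArray]:
        """Return all arrays whose info matches every set field of the filter."""
        if flt is None:
            return list(self._arrays)

        def matches(info: JobArrayInfo) -> bool:
            return all(
                getattr(flt, f.name) is None
                or getattr(flt, f.name) == getattr(info, f.name)
                for f in fields(JobArrayInfo)
            )

        return [a for a in self._arrays if matches(a.info)]


session = JobSession("demo")
session.run_bulk_jobs("array-1")
session.run_bulk_jobs("array-2")
found = session.get_job_arrays(JobArrayInfo(job_array_id="array-2"))
```

With this shape, a getJobArray(string jobArrayId) shorthand would just be the single-result case of the filter call.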
> A very important feature that many batch systems support is the ability to limit the number of jobs in a job array that may run simultaneously (in LSF it's called "Slot Limit" and you can read about it at http://www-cecpv.u-strasbg.fr/Documentations/lsf/html/lsf6.1_admin/G_jobarrays.html#26618). I think that DRMAA can also support this feature by:
> Change the method JobSession::runBulkJobs so it will also accept an optional argument in long slotLimit (if it's UNSET then no slot limit will be assigned to the new job array).
> Add a new method JobArray::changeSlotLimit(in long slotLimit)
This is what JobTemplate::maxSlots is expected to provide.
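
For readers unfamiliar with the slot-limit behaviour described in the question, the following toy sketch shows the intended semantics: no more than a fixed number of tasks of one job array run at the same time. The function name, the tick-based timing, and the data structures are hypothetical; the sketch does not show how a DRM system or a DRMAA implementation maps JobTemplate::maxSlots onto this behaviour.

```python
from collections import deque
from typing import Iterable, List


def run_with_slot_limit(task_ids: Iterable[int], slot_limit: int,
                        ticks_per_task: int = 1) -> List[List[int]]:
    """Toy scheduler: return, for each tick, the task ids running in it."""
    if slot_limit < 1:
        raise ValueError("slot_limit must be at least 1")

    pending = deque(task_ids)
    running = {}      # task id -> remaining ticks
    timeline = []

    while pending or running:
        # Start pending tasks while free slots remain.
        while pending and len(running) < slot_limit:
            running[pending.popleft()] = ticks_per_task
        timeline.append(sorted(running))
        # Advance time by one tick and drop finished tasks.
        running = {t: r - 1 for t, r in running.items() if r - 1 > 0}

    return timeline


# Five array tasks, at most two running simultaneously.
timeline = run_with_slot_limit(range(1, 6), slot_limit=2)
```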

> There are some parameters that most batch systems allow changing for already submitted jobs, but DRMAA doesn't support changing them. For example, DRMAA doesn't let you change the priority or queue of an already submitted job. I think that methods Job::changePriority(in long priority) and Job::changeQueue(in string queueName) should be added.
We discussed the general possibility of changing the attributes of running jobs. There are tons of issues with making such a concept available in a generalized API. One reason is that the DRM system may silently change attributes at queuing time - Grid Engine is one example. In such a case, you cannot know what kind of job attribute state you are actually changing. So you need better monitoring. And so on ... The possibilities and supported attributes for online changes also vary widely between the different systems.
For this reason, DRMAA intentionally leaves out the whole idea - at least until enough people complain ;-)
> Many batch systems allow rerunning existing jobs. Although DRMAA has a field called rerunnable in the JobTemplate struct, it doesn't allow users to actually rerun jobs. Maybe a method Job::rerun() could be added to DRMAA.
The rerunnable flag is intended to allow the DRM system itself to re-run a job. We never had a proposal for such functionality from a user perspective. What would be the expected job state flow in this case? And what is the use case for such functionality if you don't have interactive job support?
> I have a question. Does DRMAA support Generic Resources? (for example, if I have a cluster where some of its nodes have GPU cards, and I want to submit jobs that require a certain amount of GPUs, so I would like the batch system to manage it for me, as many batch systems know how to manage).
Requesting non-standardized resource types and configurations is expected to be covered by the "jobCategory" concept. Examples of job categories are different MPI libraries, OpenMP environments, Java environments, or GPU environments. We hope to organize a community-based list of recommended job category names, which would raise the chances of portability for such job submission applications. A later DRMAA2 version could then integrate these names as an official part of the spec.
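
As a rough illustration of the idea, the sketch below shows a site-side table that translates a job category name into DRM-specific submission options, so that the application only ever names the category. Everything here is assumed for the example: the category names "gpu" and "openmpi", the option strings, and the Python types are made up and do not come from the DRMAA2 spec or any particular DRM system.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


# Hypothetical stand-in for a job template; only the two fields needed here.
@dataclass
class JobTemplate:
    remote_command: str
    job_category: Optional[str] = None   # None plays the role of UNSET


# Hypothetical site configuration maintained by the administrator.
# Both the category names and the option strings are invented.
SITE_CATEGORIES: Dict[str, List[str]] = {
    "gpu": ["-l", "gpu=1"],
    "openmpi": ["-pe", "openmpi", "4"],
}


def native_options(template: JobTemplate) -> List[str]:
    """Translate the template's job category into site-specific options."""
    if template.job_category is None:
        return []
    try:
        return list(SITE_CATEGORIES[template.job_category])
    except KeyError:
        raise ValueError(
            f"unknown job category: {template.job_category!r}"
        ) from None


gpu_options = native_options(JobTemplate("./simulate", job_category="gpu"))
```

The application stays portable as long as the target site defines a category with the same name, which is why an agreed list of names matters.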

Best regards,
Peter.
