[DRMAA-WG] Comments on latest DRMAA documents

Daniel Templeton Dan.Templeton at Sun.COM
Thu Jul 26 17:52:49 CDT 2007


Piotr Domagalski wrote:
> Hi,
>
> We still have some comments on DRMAA specs:
>
> 1. The IDL version says that drmaa_wait(SESSION_ANY) should wait only
> on those jobs that were in the session up until the drmaa_wait call.
> The latest DRMAA 1.0 spec (June, 2007) doesn't mention it.
>
> I remember a quite long discussion [1] where Daniel argued that these
> waiting semantics prevent some useful use cases. As I don't see any
> clear consensus in that discussion, am I right in assuming that the IDL
> version has the final agreed semantics?
>   

Yep.  The IDL doc is the final agreement.  I gave up that fight a long 
time ago. :)
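
For illustration only, here is a toy Python model of the snapshot
semantics described in point 1: a wait on SESSION_ANY considers only the
jobs that were in the session when the wait began. The class and method
names (WaitSession, begin_wait_any, poll_wait_any) are hypothetical and
are not part of any DRMAA binding. A real drmaa_wait blocks, whereas
this sketch polls.

```python
class WaitSession:
    """Toy model of a DRMAA session (hypothetical, not a real binding)."""

    def __init__(self):
        self.jobs = {}  # job_id -> True once the job has finished

    def submit(self, job_id):
        self.jobs[job_id] = False

    def finish(self, job_id):
        self.jobs[job_id] = True

    def begin_wait_any(self):
        # Snapshot the jobs known to the session at the time of the call.
        return frozenset(self.jobs)

    def poll_wait_any(self, snapshot):
        # Only jobs in the snapshot are candidates; later submissions
        # are ignored even if they finish first.
        for job_id in sorted(snapshot):
            if self.jobs.get(job_id):
                del self.jobs[job_id]  # reap the job
                return job_id
        return None  # nothing in the snapshot has finished yet


session = WaitSession()
session.submit("1")
snapshot = session.begin_wait_any()
session.submit("2")              # submitted after the wait began
session.finish("2")
first = session.poll_wait_any(snapshot)
session.finish("1")
second = session.poll_wait_any(snapshot)
```

In this run, first is None because job "2" is not in the snapshot, and
second is "1".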

> 2. I remember asking questions about DRMAA semantics concerning the
> behaviour of drmaa_job_ps on finished (drmaa_wait'ed) jobs [2]. In the
> latest DRMAA 1.0 spec, in 3.1.2 it says:
>
> "Succesfull drmaa_wait() and drmaa_synchronize(), with dispose = true
> parameter, calls will make job id's invalid by reaping the job run
> usage data.".
>
> I think this is the only hint in the spec on what might happen to
> drmaa_job_ps on disposed jobs. In [2] Peter suggested adding something
> more clear but I can't see anything like that in current docs. OK, so
> I first assume that the term "job run usage data" doesn't mean that
> only the "rusage" structure used in drmaa_wait is disposed, but the
> whole jobid is from now on considered invalid, so that drmaa_job_ps
> returns INVALID_JOB. This term, especially with the next point 3.1.3
> talking about "run usage data" with other meaning, is a bit confusing.
>   

I guess we need to open a tracker to clarify the behavior of the 
drmaa_job_ps routine when a job has been reaped.

> But assuming I was right in the previous paragraph, how is one
> supposed to implement drmaa_job_ps which supports jobids from other
> sessions (or jobs submitted not using DRMAA at all)? When I call
> drmaa_wait, it disposes the job, so drmaa_job_ps now returns
> INVALID_JOB, but if I ask for the status in another session, it will
> return it happily. Is this how it's supposed to work?
>   

Submitting a job creates a record in the session context.  When that job 
ends, that job's status can be permanently set in the record as DONE or 
FAILED.  When the job is reaped, the record is reaped.  Now, before a 
job has finished, or when querying a job from outside the submitting 
session, the drmaa_job_ps routine must talk to the DRM to get the job's 
status.  After a job has finished, the DRM is allowed to claim the job 
doesn't exist.  As long as you're in the same session that submitted the 
job, and you haven't reaped the job, you can use the job's session 
record instead of asking the DRM.  If the job has been reaped or you're 
not in the same session, and the DRM doesn't know about the job, you get 
an INVALID_JOB.  If the DRM always knows about every job that was ever 
submitted, then you'll never get an INVALID_JOB, even if the job has 
been reaped.
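
The lookup order above can be sketched in Python. This is a toy model
under stated assumptions, not an implementation of any DRMAA library:
Session, ForgetfulDRM, InvalidJobError and all method names are
hypothetical, and the DRM here is assumed to forget a job as soon as it
finishes.

```python
class InvalidJobError(Exception):
    """Stands in for the INVALID_JOB error (hypothetical name)."""


class ForgetfulDRM:
    """Toy DRM that drops a job as soon as it finishes (an assumption)."""

    def __init__(self):
        self.jobs = {}

    def start(self, job_id):
        self.jobs[job_id] = "RUNNING"

    def finish(self, job_id):
        self.jobs.pop(job_id, None)

    def status(self, job_id):
        return self.jobs.get(job_id)  # None if the DRM has no such job


class Session:
    def __init__(self, drm):
        self.drm = drm
        self.records = {}  # job_id -> final state, or None while running

    def submit(self, job_id):
        self.records[job_id] = None

    def job_finished(self, job_id, final_state):
        # Permanently store DONE or FAILED in the session record.
        if job_id in self.records:
            self.records[job_id] = final_state

    def reap(self, job_id):
        self.records.pop(job_id, None)

    def job_ps(self, job_id):
        final = self.records.get(job_id)
        if final is not None:
            return final  # same session, finished, not yet reaped
        state = self.drm.status(job_id)  # otherwise ask the DRM
        if state is None:
            raise InvalidJobError(job_id)
        return state


def ps_or_invalid(session, job_id):
    try:
        return session.job_ps(job_id)
    except InvalidJobError:
        return "INVALID_JOB"


drm = ForgetfulDRM()
submitter, other = Session(drm), Session(drm)
drm.start("42")
submitter.submit("42")
running = (ps_or_invalid(submitter, "42"), ps_or_invalid(other, "42"))
drm.finish("42")
submitter.job_finished("42", "DONE")
finished = (ps_or_invalid(submitter, "42"), ps_or_invalid(other, "42"))
submitter.reap("42")
reaped = ps_or_invalid(submitter, "42")
```

With a DRM that never forgets jobs, status() would keep returning a
state and neither session would see INVALID_JOB, which matches the last
case described above.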

Daniel

> [1] http://www.ogf.org/pipermail/drmaa-wg/2006-June/000505.htm
> [2] http://www.ogf.org/pipermail/drmaa-wg/2006-May/000453.html
>
>   


