[drmaa-wg] DRMAA test suite

Peter Troeger peter.troeger at hpi.uni-potsdam.de
Fri Jun 23 04:48:06 CDT 2006


> Looking through the IDL spec, it says that drmaa_wait(ANY) will only
> work on jobs submitted up to the time of the drmaa_wait() call.  I don't
> like that.  For drmaa_synchronize(ALL), it makes sense, because
> otherwise the call would block indefinitely in an active system.  With
> drmaa_wait(), however, that change prevents a very useful use case.  Say
> I want to write a thread that waits for jobs to end and places their
> finish information in a data structure for other threads to read.  With
> that caveat applied, if I submit one very long-running job before
> drmaa_wait() gets called, the hundreds of really short jobs that I
> submit after the drmaa_wait() call have to wait for the long-running job
> to end so that the next call to drmaa_wait() can see them.  That's bad,
> and I don't see where it makes anything better.  What problem does
> limiting drmaa_wait() to previously submitted jobs solve?

We had so much discussion about the drmaa_wait() semantics that I am no
longer sure what the exact reason was. To me, it looks like the same
argument as for drmaa_synchronize(). The drmaa_wait() call relies on the
current state of all the jobs in the session: I know that I have
submitted 3 jobs so far, and now I want to wait for all of them. If we
allow other threads to extend the session while drmaa_wait() is running,
we have to define the point of synchronization within the running
drmaa_wait() call, and that is harder to implement.
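
To make the difference concrete, here is a minimal single-threaded
sketch. It is a toy in-memory model, not a DRMAA binding: ToySession,
snapshot() and poll_any() are hypothetical names invented for this
illustration. It only shows which jobs a wait(ANY) can see when the job
set is fixed on entry, compared with a job set that is read live.

```python
class ToySession:
    """Toy in-memory model of a job session (hypothetical, not DRMAA)."""

    def __init__(self):
        self._next_id = 1
        self._running = set()   # submitted, not yet finished
        self._finished = []     # finished but not yet reaped, in finish order

    def submit(self):
        job_id = self._next_id
        self._next_id += 1
        self._running.add(job_id)
        return job_id

    def finish(self, job_id):
        self._running.remove(job_id)
        self._finished.append(job_id)

    def snapshot(self):
        """Ids of all jobs the session knows about right now."""
        return frozenset(self._running) | frozenset(self._finished)

    def poll_any(self, candidates):
        """Reap and return the first finished job in candidates, else None."""
        for job_id in self._finished:
            if job_id in candidates:
                self._finished.remove(job_id)
                return job_id
        return None


session = ToySession()
long_job = session.submit()

# A snapshot-style wait(ANY) would fix its job set here, on entry.
at_call_time = session.snapshot()

# Another thread submits a short job afterwards, and it finishes first.
short_job = session.submit()
session.finish(short_job)

# Snapshot semantics: the short job is invisible to the running wait.
snapshot_result = session.poll_any(at_call_time)

# Live semantics: the short job is reaped immediately.
live_result = session.poll_any(session.snapshot())
```

With the snapshot, the waiter gets nothing until the long job ends; with
the live set, it gets the short job right away. That is exactly the
trade-off under discussion.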

In your particular example, I would expect the second thread to call
drmaa_wait() in parallel as well. In that case, the modified text from
the latest DRMAA document applies:

-- snip

In a multithreaded environment, only the active thread gets the
status of the finished or failed job in that case, while the rest of the
threads continue waiting. If there are no more running or completed jobs
the routine SHOULD return DRMAA_ERRNO_INVALID_JOB error.

-- snip
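
A rough sketch of how I read that paragraph, again as a toy model and
not a DRMAA binding. ToyWaitSession, wait_any() and InvalidJob are
hypothetical names; InvalidJob merely stands in for
DRMAA_ERRNO_INVALID_JOB. Note that this sketch reads the job set live
and deliberately ignores the "submitted before the call" restriction.

```python
import threading


class InvalidJob(Exception):
    """Stands in for DRMAA_ERRNO_INVALID_JOB in this toy model."""


class ToyWaitSession:
    """Toy model of wait(ANY) shared by several threads (hypothetical)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._running = set()   # submitted, not yet finished
        self._finished = []     # finished but not yet reaped

    def submit(self, job_id):
        with self._cond:
            self._running.add(job_id)

    def finish(self, job_id):
        with self._cond:
            self._running.remove(job_id)
            self._finished.append(job_id)
            self._cond.notify_all()

    def wait_any(self):
        with self._cond:
            while True:
                if self._finished:
                    # Exactly one waiter reaps each finished job.
                    job_id = self._finished.pop(0)
                    # Wake the others so they re-check what is left.
                    self._cond.notify_all()
                    return job_id
                if not self._running:
                    # Nothing running and nothing left to reap.
                    raise InvalidJob("no running or completed jobs left")
                self._cond.wait()


session = ToyWaitSession()
session.submit("job-1")

results = []
results_lock = threading.Lock()


def waiter():
    try:
        outcome = session.wait_any()
    except InvalidJob:
        outcome = "INVALID_JOB"
    with results_lock:
        results.append(outcome)


threads = [threading.Thread(target=waiter) for _ in range(2)]
for t in threads:
    t.start()
session.finish("job-1")
for t in threads:
    t.join()
```

Whichever thread gets there first receives the status of job-1; the
other one finds no running or completed jobs and gets the error.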

To summarize, drmaa_wait(SESSION_ANY) is always a bad idea when multiple
threads are submitting jobs. To get a consistent picture, it seems
appropriate to define the function call as a "synchronization point",
where the session state "at this time" acts as input to the method.


Peter.


