[drmaa-wg] Simultaneous waits on the same job id?

Ed Baskerville lists at edbaskerville.com
Fri Jun 23 16:07:50 CDT 2006


This implies that the second call should fail when interpreted in the
context of a multithreaded application, but the spec doesn't really seem
to be written with multithreaded applications in mind. There's no error
code that makes sense here: INVALID_JOB implies that the job data has
already been reaped, but that's not necessarily true, because you
could have something like this:

thread 1: wait(jobId)
thread 2: wait(jobId), immediately returns INVALID_JOB because
there's already a wait in progress
thread 1: wait times out
thread 2: wait(jobId)...completes successfully

So thread 2 is first told that the job data has already been reaped,  
then told that the job is valid (because thread 1 happened to time  
out). That's just weird.
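
To make the sequence above concrete, here is a minimal sketch in Python of a
session that allows only one wait per job id at a time. Everything in it is
made up for illustration (ExclusiveWaitSession, InvalidJobError, WaitTimeout,
submit, finish, is_waiting); it is not the DRMAA API and not how any real
implementation works. The main thread plays the part of thread 2.

```python
import threading
import time


class InvalidJobError(Exception):
    """Hypothetical stand-in for the INVALID_JOB error."""


class WaitTimeout(Exception):
    """Hypothetical stand-in for a timed-out wait."""


class ExclusiveWaitSession:
    """Toy session: at most one wait per job id may be in progress."""

    def __init__(self):
        self._lock = threading.Lock()
        self._waiting = set()   # job ids that currently have a waiter
        self._done = {}         # job id -> Event set when the job ends
        self._results = {}      # job id -> exit status, until reaped

    def submit(self, job_id):
        with self._lock:
            self._done[job_id] = threading.Event()

    def finish(self, job_id, status):
        with self._lock:
            self._results[job_id] = status
            self._done[job_id].set()

    def is_waiting(self, job_id):
        with self._lock:
            return job_id in self._waiting

    def wait(self, job_id, timeout=None):
        with self._lock:
            # Unknown job, already reaped, or someone else is waiting.
            if job_id not in self._done or job_id in self._waiting:
                raise InvalidJobError(job_id)
            self._waiting.add(job_id)
            done = self._done[job_id]
        try:
            if not done.wait(timeout):
                raise WaitTimeout(job_id)
            with self._lock:
                del self._done[job_id]              # reap the job data
                return self._results.pop(job_id)
        finally:
            with self._lock:
                self._waiting.discard(job_id)


def run_scenario():
    session = ExclusiveWaitSession()
    session.submit("job-1")
    outcome = {}

    def thread_1():
        try:
            session.wait("job-1", timeout=0.5)
            outcome["t1"] = "ok"
        except WaitTimeout:
            outcome["t1"] = "timeout"

    t1 = threading.Thread(target=thread_1)
    t1.start()
    while not session.is_waiting("job-1"):   # let thread 1 get in first
        time.sleep(0.001)
    try:
        session.wait("job-1", timeout=0)
        outcome["t2_first"] = "ok"
    except InvalidJobError:
        outcome["t2_first"] = "INVALID_JOB"
    t1.join()                                # thread 1 times out
    session.finish("job-1", 0)
    outcome["t2_second"] = session.wait("job-1", timeout=1.0)
    return outcome
```

In this sketch run_scenario() reports INVALID_JOB for thread 2's first call
and a normal exit status for its second, even though the job data was never
reaped in between.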

Another option is to not return INVALID_JOB until the data *has*
actually been reaped, and have later callers block in the meantime, but
that seems weird too--why make subsequent threads wait if they're
probably just going to get an error message?

If this hasn't been decided, I would propose that a provision be  
added saying that multiple threads are allowed to wait  
simultaneously, and *all of them* get back the job data. It's not too  
hard to implement, at least for Xgrid, and the semantics seem cleaner.
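
As a rough illustration of the proposed semantics (not the Xgrid code, and
not the DRMAA API), here is a small Python sketch. The names SharedWaitSession,
InvalidJobError, WaitTimeout, submit, finish and waiter_count are all
hypothetical. The assumption it makes is that the job data is kept until the
last thread that was waiting has picked it up, and only then reaped, so a
later wait gets INVALID_JOB.

```python
import threading
import time


class InvalidJobError(Exception):
    """Hypothetical stand-in for the INVALID_JOB error."""


class WaitTimeout(Exception):
    """Hypothetical stand-in for a timed-out wait."""


class _Job:
    def __init__(self):
        self.done = threading.Event()
        self.status = None
        self.waiters = 0    # number of waits currently in progress


class SharedWaitSession:
    """Toy session: every simultaneous waiter gets the job data back."""

    def __init__(self):
        self._lock = threading.Lock()
        self._jobs = {}

    def submit(self, job_id):
        with self._lock:
            self._jobs[job_id] = _Job()

    def finish(self, job_id, status):
        with self._lock:
            job = self._jobs[job_id]
            job.status = status
            job.done.set()

    def waiter_count(self, job_id):
        with self._lock:
            return self._jobs[job_id].waiters

    def wait(self, job_id, timeout=None):
        with self._lock:
            job = self._jobs.get(job_id)
            if job is None:             # unknown or already reaped
                raise InvalidJobError(job_id)
            job.waiters += 1
        job.done.wait(timeout)
        with self._lock:
            job.waiters -= 1
            if not job.done.is_set():
                raise WaitTimeout(job_id)
            if job.waiters == 0:        # last waiter reaps the data
                self._jobs.pop(job_id, None)
            return job.status


def demo(n_threads=3):
    session = SharedWaitSession()
    session.submit("job-1")
    results = []
    results_lock = threading.Lock()

    def waiter():
        status = session.wait("job-1", timeout=5.0)
        with results_lock:
            results.append(status)

    threads = [threading.Thread(target=waiter) for _ in range(n_threads)]
    for t in threads:
        t.start()
    while session.waiter_count("job-1") < n_threads:
        time.sleep(0.001)               # let every thread start waiting
    session.finish("job-1", 7)
    for t in threads:
        t.join()
    try:
        session.wait("job-1", timeout=0)
        late = "ok"
    except InvalidJobError:
        late = "INVALID_JOB"
    return results, late
```

Here every thread that was waiting when the job finished gets the same exit
status, and INVALID_JOB only ever means what it says: the data is gone.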

--Ed


On Jun 23, 2006, at 1:52 PM, Rajic, Hrabri wrote:

> Good question.  There is no such provision in the spec.  One thread
> would need to be the first ...
>
> Hrabri
>
>
>> -----Original Message-----
>> From: owner-drmaa-wg at ggf.org [mailto:owner-drmaa-wg at ggf.org]
>> On Behalf Of Ed Baskerville
>> Sent: Friday, June 23, 2006 3:32 PM
>> To: DRMAA Working Group
>> Subject: [drmaa-wg] Simultaneous waits on the same job id?
>>
>> With all the discussion of wait in multithreaded contexts, I thought
>> I'd throw out another related question...
>>
>> Are multiple threads allowed to wait simultaneously on the same job
>> id and get back results, or is it required that one of them gets back
>> DRMAA_ERRNO_INVALID_JOB? That is, is it possible for the data to be
>> reaped simultaneously for multiple waiting threads, or must only one
>> of them be lucky enough to get the data back?
>>
>> For Xgrid, either way is straightforward to implement; obviously
>> having the option of returning data to multiple simultaneous calls
>> would be nice, but I want to get the semantics right.
>>
>> --Ed
>




