[drmaa-wg] Simultaneous waits on the same job id?

Daniel Templeton Dan.Templeton at Sun.COM
Fri Jun 23 18:22:47 CDT 2006


Ed,

I'm with you 100%, but this was discussed at length, and the group 
(minus me) decided that it was important to follow the POSIX wait(2) 
semantics, in which only one thread will succeed in waiting and the 
others will fail when the job is reaped.
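
To make those semantics concrete, here is a minimal sketch (POSIX systems 
only) of two threads waiting on the same child process. It is an 
illustration of plain waitpid() behaviour, not DRMAA code; the function 
name race_two_waiters and the outcome labels are made up for this example.

```python
import os
import threading
import time


def race_two_waiters(exit_code=7):
    """Fork one child, then let two threads call waitpid() on it."""
    pid = os.fork()
    if pid == 0:
        # Child: stay alive briefly so the waiters can block, then exit.
        time.sleep(0.2)
        os._exit(exit_code)

    outcomes = []
    lock = threading.Lock()

    def waiter():
        try:
            _, status = os.waitpid(pid, 0)
            result = ("reaped", os.WEXITSTATUS(status))
        except ChildProcessError:
            # errno ECHILD: the other thread already reaped the child.
            result = ("ECHILD", None)
        with lock:
            outcomes.append(result)

    threads = [threading.Thread(target=waiter) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Sort by label so the result does not depend on which thread won.
    return sorted(outcomes, key=lambda o: o[0])


if __name__ == "__main__":
    print(race_two_waiters())
```

Exactly one thread gets the exit status; the other gets ECHILD, which is 
the role DRMAA_ERRNO_INVALID_JOB plays in the DRMAA case.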

At some point I had found another POSIX wait variant which allowed all 
waiting threads to receive the exit status information, but I have since 
forgotten what it was, and Sun's new email retention policy has deleted 
the email.
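
For comparison, the "every waiter gets the exit status" behaviour can be 
sketched at the library level roughly as follows. This is not the POSIX 
variant mentioned above and not anything in the DRMAA spec; JobTable, 
finish, and wait are hypothetical names, and the sketch assumes the 
completed status is simply kept around rather than reaped.

```python
import threading


class JobTable:
    """Hypothetical job registry; not part of any DRMAA binding."""

    def __init__(self):
        self._cond = threading.Condition()
        self._status = {}  # job_id -> exit status, kept after completion

    def finish(self, job_id, exit_status):
        # Called once by whatever component monitors the job.
        with self._cond:
            self._status[job_id] = exit_status
            self._cond.notify_all()  # wake every waiter, not just one

    def wait(self, job_id, timeout=None):
        # Returns the exit status, or None if the timeout expires first.
        with self._cond:
            done = self._cond.wait_for(
                lambda: job_id in self._status, timeout
            )
            return self._status[job_id] if done else None


def demo_broadcast_wait(n_waiters=3):
    table = JobTable()
    results = []
    lock = threading.Lock()

    def waiter():
        status = table.wait("job-1", timeout=5.0)
        with lock:
            results.append(status)

    threads = [threading.Thread(target=waiter) for _ in range(n_waiters)]
    for t in threads:
        t.start()
    table.finish("job-1", 0)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(demo_broadcast_wait())
```

A waiter that times out just returns None and leaves the record intact, so 
a later wait on the same job id can still succeed. The open question with 
this approach is when the stored status may finally be discarded.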

Daniel

Ed Baskerville wrote:
> This implies that the second call should fail when interpreted in the 
> context of a multithreaded application, but it doesn't really seem to 
> be written with a multithreaded application in mind. There's no error code that 
> makes sense here: INVALID_JOB implies that the job data has already 
> been reaped, but that's not necessarily true, because you could have 
> something like this:
>
> thread 1: wait(jobId)
> thread 2: wait(jobId), immediately returns INVALID_JOB because 
> there's already a wait in progress
> thread 1: wait times out
> thread 2: wait(jobId)...completes successfully
>
> So thread 2 is first told that the job data has already been reaped, 
> then told that the job is valid (because thread 1 happened to time 
> out). That's just weird.
>
> Another option is to simply not return INVALID_JOB until the data 
> *has* been reaped (or not), but that seems weird too--why make 
> subsequent threads wait if they're probably just going to get an error 
> message?
>
> If this hasn't been decided, I would propose that a provision be added 
> saying that multiple threads are allowed to wait simultaneously, and 
> *all of them* get back the job data. It's not too hard to implement, 
> at least for Xgrid, and the semantics seem cleaner.
>
> --Ed
>
>
> On Jun 23, 2006, at 1:52 PM, Rajic, Hrabri wrote:
>
>> Good question.  There is no such provision in the spec.  One thread
>> would need to be the first ...
>>
>> Hrabri
>>
>>
>>> -----Original Message-----
>>> From: owner-drmaa-wg at ggf.org [mailto:owner-drmaa-wg at ggf.org]
>>> On Behalf Of Ed Baskerville
>>> Sent: Friday, June 23, 2006 3:32 PM
>>> To: DRMAA Working Group
>>> Subject: [drmaa-wg] Simultaneous waits on the same job id?
>>>
>>> With all the discussion of wait in multithreaded contexts, I thought
>>> I'd throw out another related question...
>>>
>>> Are multiple threads allowed to wait simultaneously on the same job
>>> id and get back results, or is it required that one of them gets back
>>> DRMAA_ERRNO_INVALID_JOB? That is, is it possible for the data to be
>>> reaped simultaneously for multiple waiting threads, or must only one
>>> of them be lucky enough to get the data back?
>>>
>>> For Xgrid, either way is straightforward to implement; obviously
>>> having the option of returning data to multiple simultaneous calls
>>> would be nice, but I want to get the semantics right.
>>>
>>> --Ed
>>
>




