[drmaa-wg] Simultaneous waits on the same job id?

Daniel Templeton Dan.Templeton at Sun.COM
Wed Jun 28 08:13:09 CDT 2006


Peter,

I think this issue is different from call-time semantics.  This issue is 
about the use of ALL being forgiving.  For example, if I do a 
control(ALL, TERMINATE) and one of the jobs ends before the control() 
call can kill it, I would not expect the call to fail.  Same thing with 
synchronize().  If I do a synchronize(ALL) and a job is reaped before 
the call completes, that's OK.  To me, that's a very different statement 
from saying that the ALL constant represents all of the jobs that were 
submitted at the time of the call, and I think that should be clarified 
in the spec for synchronize() and control().

The statement that I think we're making here is that a call that uses 
the ALL constant will operate on the list of jobs as it was at 
submission time minus any jobs that go out of scope for the operation 
during the run time of the operation.
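
To make that statement concrete, here is a rough Python sketch of the 
behavior I have in mind.  It is a toy model, not any DRMAA binding: 
ToySession, InvalidJob, control_terminate() and the during_call hook are 
all names made up for illustration.  The job list is fixed when the call 
starts, and jobs that end or are reaped while the call runs are skipped 
instead of causing a failure.

```python
# Toy model only: ToySession, InvalidJob and every method here are
# hypothetical illustrations, not the API of any real DRMAA binding.
class InvalidJob(Exception):
    """Raised when an explicitly named job is unknown at call time."""


class ToySession:
    ALL = object()  # stand-in for the ALL job-id constant

    def __init__(self):
        # job id -> "running" or "done"; reaped jobs are removed entirely
        self.jobs = {}

    def submit(self, job_id):
        self.jobs[job_id] = "running"

    def reap(self, job_id):
        # What a concurrent wait() or synchronize() would do to the job info.
        self.jobs.pop(job_id, None)

    def control_terminate(self, job_ids, during_call=lambda session: None):
        # Fix the target list once, at call time.
        if job_ids is ToySession.ALL:
            targets = list(self.jobs)
        else:
            targets = list(job_ids)
            for job_id in targets:
                if job_id not in self.jobs:
                    raise InvalidJob(job_id)

        # Hook that simulates other threads acting while the call runs.
        during_call(self)

        killed = []
        for job_id in targets:
            # Forgiving part: a job that ended or was reaped after the
            # call started has gone out of scope and is skipped silently.
            if self.jobs.get(job_id) == "running":
                self.jobs[job_id] = "done"
                killed.append(job_id)
        return killed
```

With jobs 5, 6 and 7 submitted, control_terminate(ToySession.ALL) still 
succeeds if job 5 is reaped during the call; it just acts on 6 and 7.  A 
job submitted during the call is not picked up, because the list was 
fixed when the call started.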

Am I wrong?

Daniel

Peter Troeger wrote:
>
>> Ed,
>>
>> What Andreas is trying to get at is that, since synchronize() doesn't 
>> return job information, multiple simultaneous synchronize() calls 
>> will all succeed.  Even if synchronize() reaps job info, since other 
>> calls to synchronize() don't care whether they actually are able to 
>> find the job info, it's all good.  As long as synchronize() is 
>> running, reaped jobs are viewed as simply having ended, so not even a 
>> call to wait() will cause a call to synchronize() to fail.
>>
>> Is that really correct?  Can anyone else confirm?  I think we should 
>> probably have a tracker to clarify that point in the (g2) spec.
>
> I confirm. The DRMAA group agreed (in another discussion) that sync() 
> has call-time context semantics. The text from the spec says more or 
> less the same. Therefore, your synchronize call should succeed if jobs 
> 5,6,7 existed at the time when thread 2 invokes the operation.
>
> Peter.
>
>>
>> Daniel
>>
>> Andreas.Haas at Sun.COM wrote:
>>> On Fri, 23 Jun 2006, Ed Baskerville wrote:
>>>
>>>> OK, I'll do that. But one more question: how do these issues apply 
>>>> to synchronize? Consider the following sequence:
>>>>
>>>> thread 1: wait(job id 5)
>>>> thread 2: synchronize(job ids 5,6,7)
>>>> [job id 5 finishes]
>>>> ¿thread 2 synchronize call fails?
>>>> [job id 7 finishes]
>>>> [job id 6 finishes]
>>>> ¿thread 2 synchronize call succeeds?
>>>>
>>>> Should synchronize fail with INVALID_JOB as soon as any of the ids 
>>>> it's waiting on are reaped? Or should it eventually succeed?
>>>
>>> There is no reason for synchronize to fail, as long as none of the 
>>> jobs was reaped when synchronize() gets issued.
>>>
>>> Andreas
>>
>




