[drmaa-wg] Simultaneous waits on the same job id?
Daniel Templeton
Dan.Templeton at Sun.COM
Wed Jun 28 08:13:09 CDT 2006
Peter,
I think this issue is different from call-time semantics. This issue is
about the use of ALL being forgiving. For example, if I do a
control(ALL, TERMINATE) and one of the jobs ends before the control()
call can kill it, I would not expect the call to fail. Same thing with
synchronize(). If I do a synchronize(ALL) and a job is reaped before
the call completes, that's OK. To me, that's a very different statement
from saying that the ALL constant represents all of the jobs that were
submitted at the time of the call, and I think that should be clarified
in the spec for synchronize() and control().
The statement that I think we're making here is that a call that uses
the ALL constant will operate on the list of jobs as it was at
submission time minus any jobs that go out of scope for the operation
during the run time of the operation.
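
As a rough illustration of that statement, here is a minimal Python sketch of a toy session. It is not the DRMAA API: ToySession, begin(), finish(), job_ended() and InvalidJobError are hypothetical names made up for this example. It assumes an explicit id list is checked once, when the call is issued, while ALL is simply a snapshot of whatever is live at that moment and can never fail. In both cases, jobs that go away while the call is running are skipped quietly.

```python
ALL = object()  # hypothetical stand-in for the "all jobs in the session" constant


class InvalidJobError(Exception):
    """Raised when an explicitly named job is already gone at call time."""


class ToySession:
    """Toy model of a session; illustrative only, not a DRMAA binding."""

    def __init__(self, job_ids):
        self.live = set(job_ids)

    def job_ended(self, job_id):
        # A job finishes or is reaped by some other caller.
        self.live.discard(job_id)

    def begin(self, target):
        # Call time: fix the list of jobs the operation will work on.
        if target is ALL:
            return sorted(self.live)  # forgiving: nothing to validate
        missing = sorted(set(target) - self.live)
        if missing:
            raise InvalidJobError(missing)
        return sorted(target)

    def finish(self, snapshot):
        # Run time: act only on jobs still in scope, skip the rest silently.
        acted_on = [j for j in snapshot if j in self.live]
        self.live.difference_update(acted_on)
        return acted_on


session = ToySession([5, 6, 7])
snapshot = session.begin(ALL)   # list as it was when the call was issued
session.job_ended(5)            # job 5 goes out of scope during the call
print(session.finish(snapshot))  # prints [6, 7]
```

The split into begin() and finish() is only there so the example can show a job disappearing between call time and completion; a real implementation would do both inside a single call.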
Am I wrong?
Daniel
Peter Troeger wrote:
>
>> Ed,
>>
>> What Andreas is trying to get at is that since synchronize() doesn't
>> return job information, multiple simultaneous synchronize() calls
>> will all succeed. Even if synchronize() reaps job info, since other
>> calls to synchronize() don't care whether they actually are able to
>> find the job info, it's all good. As long as synchronize() is
>> running, reaped jobs are viewed as simply having ended, so not even a
>> call to wait() will cause a call to synchronize() to fail.
>>
>> Is that really correct? Can anyone else confirm? I think we should
>> probably have a tracker to clarify that point in the (g2) spec.
>
> I confirm. The DRMAA group agreed (in another discussion) that sync()
> has call-time context semantics. The text from the spec says more or
> less the same. Therefore, your synchronize call should succeed if jobs
> 5,6,7 existed at the time when thread 2 invokes the operation.
>
> Peter.
>
>>
>> Daniel
>>
>> Andreas.Haas at Sun.COM wrote:
>>> On Fri, 23 Jun 2006, Ed Baskerville wrote:
>>>
>>>> OK, I'll do that. But one more question: how do these issues apply
>>>> to synchronize? Consider the following sequence:
>>>>
>>>> thread 1: wait(job id 5)
>>>> thread 2: synchronize(job ids 5,6,7)
>>>> [job id 5 finishes]
>>>> ¿thread 2 synchronize call fails?
>>>> [job id 7 finishes]
>>>> [job id 6 finishes]
>>>> ¿thread 2 synchronize call succeeds?
>>>>
>>>> Should synchronize fail with INVALID_JOB as soon as any of the ids
>>>> it's waiting on are reaped? Or should it eventually succeed?
>>>
>>> There is no reason for synchronize to fail, as long as none of the
>>> jobs was reaped when synchronize() gets issued.
>>>
>>> Andreas
>>
>