[drmaa-wg] Questions
Peter Troeger
peter.troeger at hpi.uni-potsdam.de
Thu Apr 7 04:09:49 CDT 2005
>> The routine return code would need to indicate a compound error; BTW we
>> do not have such error code defined, and the detailed error message
>> would need to detail what happened.
>
>
> In other words, the spec completely fails to address this case.
> Something to keep in mind for 1.1 or 2.0.
I added a tracker item for the issue.
>>> When doing a drmaa_control(DRMAA_JOB_IDS_SESSION_ALL), what is the
>>> contract on failure, i.e. in what state will the jobs be left? In the
>>> case of a job failure, does that mean that all jobs will be left in the
>>> state that they were in before the call? If so, that's going to cause
>>> serious implementation problems. If not, that's going to cause serious
>>> usability problems.
>>
>> Transactional interface would be quite useful here ...
>> If a routine exits/fails during the call there is no good recourse.
>
> Exactly the point I'm making. Without transactions, it's hard to use.
> With transactions, it's hard to implement.
Demanding transactional behavior seems unrealistic to me. Most other
groups (e.g. OGSA) have similar problems; take for example the
SetResourceProperties operation in the WS-ResourceProperties specification
(chapter 7). The usual approach is to declare the behavior on partial
failure as implementation-dependent.
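
To make the trade-off concrete, here is a minimal sketch of a non-transactional bulk operation that reports a compound error. It is not the DRMAA API: control_all, CompoundControlError, suspend and the job dictionary are names made up for this illustration. The action is applied job by job with no rollback, and the error tells the caller which jobs changed state and which did not.

```python
class CompoundControlError(Exception):
    """Hypothetical compound error: some jobs were changed, some were not."""

    def __init__(self, succeeded, failed):
        self.succeeded = succeeded  # job ids whose state was changed
        self.failed = failed        # job id -> reason for the failure
        total = len(succeeded) + len(failed)
        super().__init__(f"{len(failed)} of {total} jobs failed")


def control_all(jobs, action):
    """Apply `action` to every job without rollback (non-transactional).

    `jobs` maps job id -> state; this stands in for "all jobs in the session".
    """
    succeeded, failed = [], {}
    for job_id in sorted(jobs):
        try:
            jobs[job_id] = action(job_id, jobs[job_id])
            succeeded.append(job_id)
        except Exception as exc:
            # Keep going: jobs already changed stay changed.
            failed[job_id] = str(exc)
    if failed:
        raise CompoundControlError(succeeded, failed)
    return succeeded


def suspend(job_id, state):
    """Toy action: only running jobs can be suspended."""
    if state != "running":
        raise ValueError(f"cannot suspend job in state {state!r}")
    return "suspended"
```

With this contract the caller has to inspect the error to learn the resulting state of each job, which is the usability cost of not having transactions.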
>>> (DRMAA_JOB_IDS_SESSION_ALL), but another thread "steals" the job exit
>>> info with a call to drmaa_wait()? I would assume that the synchronize
>>> thread should just assume that the job finished, even though its job
>>> record is gone. That is what the SGE implementation does.
>>
>>
>> Ha, races with job reaping info. The developers would need to be
>> careful in multithreaded environments ... some guidelines would be
>> necessary, but preferably outside of the normative docs.
>
>
> The reason I bring it up is that this particular case is non-obvious.
> It's clear that waiting for the same job twice is bad, but it's not so
> clear when waiting for any or all.
The upshot seems to be that the spec needs more clarification on
multithreading issues. Is it worthwhile to open a tracker item for
this, in order to collect all the specific findings?
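
As one candidate finding, here is a small sketch of the "stolen exit info" case, following the behavior described above for the SGE implementation. ToySession and its methods are hypothetical and only model the bookkeeping, not a real DRMAA session. The point is that synchronize treats a job whose record has vanished as finished, while a second wait on the same job is an error.

```python
import threading


class ToySession:
    """Hypothetical stand-in for a DRMAA session; not the real API."""

    def __init__(self):
        self._cond = threading.Condition()
        self._records = {}  # job id -> exit status, None while still running

    def run_job(self, job_id):
        with self._cond:
            self._records[job_id] = None

    def finish_job(self, job_id, status):
        with self._cond:
            self._records[job_id] = status
            self._cond.notify_all()

    def _done(self, job_id):
        # A missing record means some thread already reaped the job.
        # (Simplification: an id that never existed also counts as done.)
        return job_id not in self._records or self._records[job_id] is not None

    def wait(self, job_id):
        """Block until the job finishes, then reap (delete) its record."""
        with self._cond:
            if job_id not in self._records:
                raise KeyError(job_id)
            self._cond.wait_for(lambda: self._done(job_id))
            if job_id not in self._records:
                # Another thread reaped the job while we were blocked.
                raise KeyError(job_id)
            return self._records.pop(job_id)

    def synchronize(self, job_ids=None):
        """Block until the given jobs (default: all known) are finished.

        Does not reap, and does not fail if a record was reaped meanwhile.
        """
        with self._cond:
            pending = set(self._records) if job_ids is None else set(job_ids)
            self._cond.wait_for(lambda: all(self._done(j) for j in pending))
```

Whether a vanished record should count as "finished" or raise an error is exactly the kind of choice the guidelines would need to spell out.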
Regards,
Peter.