[DRMAA-WG] Some thoughts on drmaa_wifaborted vs. DRMAA spec

Piotr Domagalski piotr.domagalski at man.poznan.pl
Thu Aug 7 09:36:01 CDT 2008


Hi Roger,

On Thu, Aug 7, 2008 at 3:51 PM, Roger Brobst <rogerb at cadence.com> wrote:
> Without commenting on any specific implementation,
> I feel comfortable that the section of the DRMAA spec
> pertaining to drmaa_wif{exited,signalled,aborted}
> is written as intended.
>
> In particular, once a process is started, it should
> eventually end by exiting or by being signalled.
> In the former case, the exit value should be accessible.
> In the latter case, the signal should be accessible.
>
> Once a process has been started by the DRM (and enters
> the 'running state') wifaborted should never be true
> for the job.

OK, I totally agree -- that does make sense (apart from the fact that
there's no equivalent of wifaborted in POSIX). I'm curious about
others' opinions, especially Andreas', as he probably knows the
details of the SGE implementation.

Once we reach consensus, I'll have to look into our LSF and PBS
implementations to have this checked and, if needed, fixed. As far as
I remember, they behave like SGE, i.e. aborted = true for all
terminated jobs, whatever state they were in.
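
To make the reading Roger describes above concrete, here is a minimal
sketch in C. It is only a model of the intended semantics: the job_end
struct and the model_* functions are hypothetical names made up for
illustration. They are not the DRMAA API and not code from any of the
implementations mentioned in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical record of how a job ended; not part of the DRMAA API. */
typedef struct {
    bool ever_ran;      /* did the DRM start the job (running state)? */
    bool killed_by_sig; /* meaningful only if ever_ran                */
    int  exit_value;    /* meaningful only if ever_ran, not signalled */
    int  term_signal;   /* meaningful only if ever_ran and signalled  */
} job_end;

/* Aborted: the job ended without ever entering the running state. */
bool model_wifaborted(const job_end *j)
{
    return !j->ever_ran;
}

/* Exited: the job ran and ended by exiting; exit_value is accessible. */
bool model_wifexited(const job_end *j)
{
    return j->ever_ran && !j->killed_by_sig;
}

/* Signalled: the job ran and was ended by a signal; term_signal is
 * accessible. */
bool model_wifsignaled(const job_end *j)
{
    return j->ever_ran && j->killed_by_sig;
}

/* Under this reading exactly one of the three predicates holds for
 * any finished job. */
void model_check_exclusive(const job_end *j)
{
    int n = model_wifaborted(j) + model_wifexited(j) + model_wifsignaled(j);
    assert(n == 1);
}
```

In this model a job removed while still pending would be recorded with
ever_ran = false and report only "aborted", while a job killed after
it had started would report only "signalled", never "aborted".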

P.S. I thought it'd be appropriate to move this discussion back to
drmaa-wg at ogf.org

-- 
Piotr Domagalski

