[DRMAA-WG] wifexited and wifsignalled confusion continues

Roger Brobst rogerb at cadence.com
Wed Nov 12 09:26:21 CST 2008


Hi Piotr.

I believe during a drmaa teleconf (over a year ago)
it was agreed that the single testcase which validates
a wide range of exit codes should be split into two
testcases (one for below 128, the other for above).
I haven't had an opportunity to dig through the archives
to substantiate my recollection.

I think the suggestion to handle 126 and 127 specially
deserves additional discussion, but it introduces its
own issues:

If the command is a shell script like:
    #!/bin/sh
    sleep 30  # or solve the world's problems
    exec /some/nonExistent/program

I would expect the shell to exit with status 127
(because /some/nonExistent/program was not found;
126 would mean it was found but is not executable).

It would be incorrect for the parent of the shell
to interpret this as 'job never started' since the
shell could perform any number of tasks before the
failed exec.
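
A minimal sketch of that scenario, assuming a POSIX sh and
that the placeholder path /some/nonExistent/program really
does not exist. The inner shell produces output (standing in
for real work) before the exec fails:

```shell
#!/bin/sh
# Sketch only: /some/nonExistent/program is a placeholder path
# that is assumed not to exist on this machine.
# The inner shell prints a line (standing in for real work) and
# then tries to exec the missing program.
output=$(sh -c '
    echo "work done"
    exec /some/nonExistent/program
' 2>/dev/null)
status=$?   # exit status of the inner shell

echo "output before exec: $output"
echo "exit status: $status"
```

With the shells I would expect to see here, the work is
visible in $output and the status is still 127, so the status
alone cannot tell 'never started' from 'ran, then failed to
exec'.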

-Roger

----Original Message----
From: "Piotr Domagalski" <piotr.domagalski at fedstage.com>
Subject: Re: [DRMAA-WG] wifexited and wifsignalled confusion continues
Date: Wed, 12 Nov 2008 13:10:19 +0100

Would you mind relaxing it even more? I.e. to test only codes from 0 to 125?

Reading "man 1posix exit":

RATIONALE
       As explained in other sections, certain exit status values have
       been reserved for special uses and should be used by applications
       only for those purposes:

        126   A file to be executed was found, but it was not an
              executable utility.
        127   A utility to be executed was not found.
       >128   A command was interrupted by a signal.

This way, a DRMAA implementation could interpret exit codes 126 and
127 itself: the job would end up in DRMAA_PS_FAILED with
drmaa_wifaborted() returning true because of a wrong executable,
instead of reporting an exit status of 126 or 127 and leaving the
interpretation up to the user.
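
A rough sketch of the mapping I have in mind, written as a shell
function for illustration only. classify_status and its three labels
are hypothetical names, not part of DRMAA; a real implementation would
do this inside the library and report it through drmaa_wifaborted()
and friends:

```shell
#!/bin/sh
# Hypothetical helper: maps a raw exit status to the proposed
# interpretation. The names "aborted", "signaled" and "exited" are
# illustrative labels, not DRMAA identifiers.
classify_status() {
    if [ "$1" -eq 126 ] || [ "$1" -eq 127 ]; then
        echo "aborted"     # wrong executable: not executable / not found
    elif [ "$1" -gt 128 ]; then
        echo "signaled"    # command was interrupted by a signal
    else
        echo "exited"      # ordinary exit status, left to the user
    fi
}

for code in 0 125 126 127 137; do
    echo "$code -> $(classify_status "$code")"
done
```

Only codes 0 to 125 would then be left for the test suite to check as
plain exit statuses.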

-- 
Piotr Domagalski

