[DRMAA-WG] normal exit status causes drmaa_wifaborted

Tim Harsch harsch1 at llnl.gov
Wed Mar 28 19:07:40 CDT 2007


Hi Daniel,
    Thanks so much for your help.  I'm still trying to determine the 
problem, but your help has gotten me further along, I think.  My apologies 
about the setup in that script I sent.  It was a leftover from the original 
test case...  After reading your message I altered the script to have just 
"csh" as the cmd, and two args, "-c" and "'exit 1'".  I got the same results. 
I ran your Java test, which gave me the results you show here.  My next step 
is to mimic your Java test with a bare-minimum Perl test and see if they 
produce the same results.

    I'll get back to you soon on that...

For now, thanks a bunch!

----- Original Message ----- 
From: "Daniel Templeton" <Dan.Templeton at Sun.COM>
To: "Tim Harsch" <harsch1 at llnl.gov>
Cc: "DRMAA-WG" <drmaa-wg at gridforum.org>
Sent: Wednesday, March 28, 2007 10:48 AM
Subject: Re: [DRMAA-WG] normal exit status causes drmaa_wifaborted


> Tim,
>
> Looks like something localized to the Perl binding or your
> configuration.  I did the same test on the Java language binding, which
> is also based on the C binding, and it worked fine for me.  Output
> below, program attached.
>
> Could the problem be that you're sending the full command line as the
> remote command and "1" as the args, instead of "csh" as the remote
> command and "-c", "'exit 1'" as the args?  What is the meaning of
> setting the args to "1"?
>
> ---
>
> % java -cp /sge/lib/drmaa.jar:. -d64 Test
> Exited: true
> Aborted: false
> Signaled: false
>
> ---
>
> Daniel
>
> Tim Harsch wrote:
>> I don't understand why causing a simple non-zero exit status is
>> causing drmaa_wifaborted to be set.
>>
>> The easiest way for me to demo this is to change line 38 of
>> t/08_posix_tests.t of the Schedule::DRMAAc CPAN module to be
>> my $remote_cmd = "csh -c 'exit 1'";
>>
>> And then running "make test TEST_VERBOSE=1", which would produce:
>> <SNIP>
>> ok 12 - drmaa_wait says jobid did not change?
>> #     Failed test (t/08_posix_tests.t at line 83)
>> not ok 13 - drmaa_wait should say there is more info available in POSIX funcs
>> ok 15 - drmaa_wifaborted error?
>> #     Failed test (t/08_posix_tests.t at line 90)
>> not ok 16 - normal job should not abort.
>> ok 17 - drmaa_wifexited returned 3 of 3 args
>> ok 18 - drmaa_wifexited error?
>> #     Failed test (t/08_posix_tests.t at line 97)
>> not ok 19 - normal job should exit.
>> <SNIP>
>>
>> I've attached test 8 to this email, in case you want to see how the
>> calls are made in Perl.
>>
>> Any ideas?
>>
>> Thanks,
>> Tim Harsch
>> ------------------------------------------------------------------------
>>
>> --
>>   drmaa-wg mailing list
>>   drmaa-wg at ogf.org
>>   http://www.ogf.org/mailman/listinfo/drmaa-wg
>>
>
>


--------------------------------------------------------------------------------


> import org.ggf.drmaa.*;
>
> public class Test {
>     public static void main(String[] args) throws Exception {
>         // Open a DRMAA session with an empty contact string.
>         Session s = SessionFactory.getFactory().getSession();
>         s.init("");
>
>         // The remote command and its arguments are set separately.
>         JobTemplate jt = s.createJobTemplate();
>         jt.setRemoteCommand("/usr/bin/csh");
>         jt.setArgs(new String[] {"-c", "'exit 1'"});
>
>         // Submit the job and block until it finishes.
>         String job = s.runJob(jt);
>         JobInfo ji = s.wait(job, Session.TIMEOUT_WAIT_FOREVER);
>
>         // Report how the job terminated.
>         System.out.println("Exited: " + ji.hasExited());
>         System.out.println("Aborted: " + ji.wasAborted());
>         System.out.println("Signaled: " + ji.hasSignaled());
>
>         // Release the template and close the session.
>         s.deleteJobTemplate(jt);
>         s.exit();
>     }
> }
> 
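
The distinction raised in this thread, between passing the whole command
line as the remote command and passing the program and its arguments
separately, can be illustrated locally without any DRMAA installation.  The
sketch below is only an analogy: it uses java.lang.ProcessBuilder rather than
DRMAA, substitutes "sh" for "csh" on the assumption that "sh" is on the PATH,
and the class name ExitStatusDemo and helper run() are hypothetical.  The
inner quotes around "exit 1" are dropped here because ProcessBuilder does not
re-parse arguments through a shell; whether a given DRMAA implementation
needs them is not something this sketch establishes.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

class ExitStatusDemo {
    // Hypothetical helper: start a local process from separate argv
    // elements and return its exit status once it has finished.
    static int run(List<String> argv) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(argv).inheritIO().start();
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Program and arguments passed separately: the shell starts,
        // runs "exit 1", and terminates normally with a non-zero status.
        int status = run(Arrays.asList("sh", "-c", "exit 1"));
        System.out.println("Exit status: " + status);

        // Whole command line passed as one element: it is treated as
        // the name of the executable, so the process cannot be started.
        try {
            run(Arrays.asList("sh -c 'exit 1'"));
        } catch (IOException e) {
            System.out.println("Could not start: " + e.getMessage());
        }
    }
}
```

On a POSIX system the first call should report a plain non-zero exit, while
the second fails before any process runs.  That is the local counterpart of
a job that exits normally versus one that never starts properly.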

