[DRMAA-WG] wifexited and wifsignalled confusion continues

Peter Tröger peter at troeger.eu
Thu Nov 6 02:09:49 CST 2008


Hi,

>> Yes, if a DRM uses a shell to start the client-specified program
>> and the shell uses the convention of conveying that the child
>> was terminated by exiting with 128+sigNum, then the DRM may not
>> be able to distinguish between a child exit(137) and being
>> terminated by sigNum=9.
>> This is a DRM implementation issue.
> 
> I was actually hoping for some discussion of what a DRMAA
> implementation should look like in this case. And also (that's mainly
> for Peter), what the test suite should look like. For example, it
> currently tests exit statuses 0..255, which would obviously fail if we
> wanted to assume that drmaa_wifexited is true only for 0..128 and use
> the remaining values for signal numbers.

I am not the C binding expert, even though I maintain the test 
suite. Most test cases were originally written for SGE, and may 
therefore be too SGE-specific. We have already relaxed a lot of tests 
to better match the spec itself. This sounds like just another such 
case. If you guys agree on 128, we can put that in.
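
To make the ambiguity concrete, here is a minimal Python sketch of the 128 cutoff discussed above. It assumes the DRM only sees the wrapping shell's one-byte exit status and that the shell reports a signal-terminated child as 128+sigNum. The function name classify_shell_status and the constant SHELL_SIGNAL_BASE are hypothetical and are not part of DRMAA or the test suite.

```python
# Assumption: the shell convention "128 + signal number" quoted above.
SHELL_SIGNAL_BASE = 128


def classify_shell_status(status):
    """Interpret a shell-reported exit status under the 0..128 cutoff.

    Returns ("exited", code) for 0..128 and ("signaled", signum) for 129..255.
    """
    if not 0 <= status <= 255:
        raise ValueError("exit status must be in 0..255")
    if status <= SHELL_SIGNAL_BASE:
        return ("exited", status)
    return ("signaled", status - SHELL_SIGNAL_BASE)


if __name__ == "__main__":
    for status in (0, 128, 137):
        print(status, classify_shell_status(status))
```

Under this rule a job that really calls exit(137) is reported as terminated by signal 9, so a test that loops over exit statuses 0..255 and expects wifexited for each of them would fail for 129..255.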

> The thing is that the current test suite (again, Peter?) checks that
> a signalled DRMAA job is both wifsignaled and wifexited. That kind of
> puzzled me.

This is a bug, and it should have been fixed as of test suite 1.6.0 
(check the CHANGELOG). We have had this discussion before. Please send 
me a patch.
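
For comparison, the POSIX wait macros never report both conditions for the same wait status. The sketch below uses Python's os.WIFEXITED and os.WIFSIGNALED on a forked child as a stand-in for drmaa_wifexited and drmaa_wifsignaled. It is only an analogy (POSIX-only, and the helper wait_flags is hypothetical), not code from the test suite.

```python
import os
import signal


def wait_flags(child_action):
    """Fork, run child_action in the child, return (wifexited, wifsignaled)."""
    pid = os.fork()
    if pid == 0:
        try:
            child_action()
        finally:
            # Only reached if child_action returns or raises.
            os._exit(70)
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status), os.WIFSIGNALED(status)


if __name__ == "__main__":
    # Child exits normally with status 137.
    print(wait_flags(lambda: os._exit(137)))
    # Child is terminated by SIGKILL.
    print(wait_flags(lambda: os.kill(os.getpid(), signal.SIGKILL)))
```

A corrected test would expect the same pattern: a job killed by a signal is wifsignaled but not wifexited.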

Thanks for helping, Piotr!

/Peter.


