[DRMAA-WG] TERMINATED vs. FAILED - reloaded

Mariusz Mamoński mamonski at man.poznan.pl
Tue May 5 16:24:37 CDT 2009


2009/5/5 Peter Tröger <peter at troeger.eu>:
> Dear all,
>
> The March 31st conference call decided on the following strategy
> regarding job state model extension:
>
> --- snip
>
>> 4. TERMINATED vs. FAILED state discussion:
>> http://www.ogf.org/pipermail/drmaa-wg/2009-March/001012.html
>
> Option 2 from the original mail is now highly preferred. TERMINATED
> state should express that an external entity (e.g. user or DRM system)
> stopped the job before finishing. For POSIX-aligned systems, this
> could be formulated as reception of a signal by "the job". In
> contrast, FAILED state now expresses that the application stopped on
> its own before finishing. For POSIX-aligned systems, this could be
> formulated as reception of a signal "by the job's application process".
>
> We ask for comments from PBS and LSF experts (FedStage ?!?). Do these
> systems provide enough error information to distinguish between these
> two states? For SGE and Condor, Dan and Peter already agreed.
In LSF it seems to be feasible (by checking all events related to the job).
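
To illustrate what "checking all events" could look like, here is a minimal sketch in Python. It is not tied to the real LSF event log: the JobEvent record, its field names, and the event kinds "KILL_REQUESTED" and "EXITED" are hypothetical names made up for this example. The assumed rule is that a job whose exit was preceded by an external kill request counts as TERMINATED, while a job that exited with a non-zero status on its own counts as FAILED.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class JobEvent:
    # Hypothetical event record; these are NOT real LSF field names.
    kind: str                          # "KILL_REQUESTED" or "EXITED"
    initiator: Optional[str] = None    # e.g. "user" or "drm" for kill requests
    exit_status: Optional[int] = None  # only set for "EXITED" events


def classify(events: Iterable[JobEvent]) -> str:
    """Map a chronological event list to a coarse job state."""
    kill_seen = False
    for event in events:
        if event.kind == "KILL_REQUESTED":
            # An external entity (user or DRM system) asked to stop the job.
            kill_seen = True
        elif event.kind == "EXITED":
            if kill_seen:
                return "TERMINATED"
            # No external request: the application ended on its own.
            return "DONE" if event.exit_status == 0 else "FAILED"
    # No exit event recorded yet.
    return "RUNNING"


if __name__ == "__main__":
    killed = [JobEvent("KILL_REQUESTED", initiator="user"),
              JobEvent("EXITED", exit_status=137)]
    crashed = [JobEvent("EXITED", exit_status=1)]
    print(classify(killed), classify(crashed))
```

The sketch only works if the DRM records who requested the stop, which is the information that appears to be missing in PBS.
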
>
> --- snip
>
> Piotr from FedStage informed me that the proposed distinction does not
> seem to be implementable in PBS. One solution could be to detect the
> 'requested' termination only in the DRMAA library. Dan already expressed
> that this would not reflect the original idea. An intentional job
> termination by another user would then lead to FAILED instead of TERMINATED.
>
> Since we already rejected Options 1 and 3 in the last phone calls, we
> are left with Option 4 as the last solution: There will be no new TERMINATED
> state. The new job sub-state concept will allow expressing the job
> failure details, but only in a DRM-specific way.
>
> We will finally vote on this in the next call.
>
> Best regards,
> Peter.
>
>
> --
>  drmaa-wg mailing list
>  drmaa-wg at ogf.org
>  http://www.ogf.org/mailman/listinfo/drmaa-wg
>



-- 
Mariusz Mamonski

