[DRMAA-WG] Conf call minutes - Jan 20th 2009
Piotr Domagalski
piotr.domagalski at fedstage.com
Wed Jan 21 05:59:50 CST 2009
On Tue, Jan 20, 2009 at 7:50 PM, Daniel Gruber <D.Gruber at sun.com> wrote:
> Undetermined - is it a valid job state?
> -> Yes! Undetermined = Error
> -> Condor: it is permanent some time
> -> Need to clarify if this means "don't try again"
> or "try it again"
But does that mean that the undetermined state will go away and the
function will return an error instead?
> Distinction between state "failed" and "terminated"
> -> "Failed" := user can fix it (through changes on job template for
> example)
> -> "Terminated" := error the user can't fix
I thought of Terminated as the state the job gets into if it was
killed via drmaa_control() or possibly deleted locally in the DRMS (by
admin or user), but the latter may be optional functionality.
> Why should we support extensible state?
> -> basically for reporting
> -> problem: difficult to implement in C
It might be modelled similarly to BES, so that there are standard
states that one can additionally inherit from to get more detailed
states. In C it might be done in the following way (kind of OOP
programming in C):
typedef struct {
    int standard_state;
} drmaa_state_t;
That would be standardised. But the implementation might want to
extend it and then it might actually return:
typedef struct {
    drmaa_state_t super;        /* must be the first member */
    int my_own_specific_state;
} drmaa_sge_state_t;
If the "client" wants to use only standard states, it uses a pointer
to the first structure and thus doesn't see the detailed state (e.g.
general hold state + implementation-specific user/admin hold). But
when the client knows it is using a specific DRMAA implementation, it
may cast the general structure to the impl-specific one. Kind of a
hack, but AFAIR it is C standards compliant: a pointer to a structure,
suitably converted, points to its first member and vice versa. So
pointers to these two structures should be interchangeable, as long as
the standard structure is the first member of the extended one.
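
To make the idea a bit more concrete, here is a minimal sketch of how
both kinds of clients might look. The state constants and the helper
functions generic_state() and sge_detail() are made up for illustration
only; they are not part of DRMAA or of any existing implementation.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */

/* Hypothetical standard state values, for illustration only. */
enum { DRMAA_STATE_RUNNING = 1, DRMAA_STATE_HOLD = 2 };

/* The part that would be standardised. */
typedef struct {
    int standard_state;
} drmaa_state_t;

/* Hypothetical implementation-specific detail values. */
enum { SGE_HOLD_USER = 1, SGE_HOLD_ADMIN = 2 };

/* Implementation extension: the standard struct is the first member. */
typedef struct {
    drmaa_state_t super;
    int my_own_specific_state;
} drmaa_sge_state_t;

/* The cast below relies on 'super' sitting at offset 0. */
static_assert(offsetof(drmaa_sge_state_t, super) == 0,
              "super must be the first member");

/* Portable client: only looks at the standard state. */
int generic_state(const drmaa_state_t *s)
{
    return s->standard_state;
}

/* Implementation-aware client: the caller must know that the object
 * really is a drmaa_sge_state_t, otherwise the cast is invalid. */
int sge_detail(const drmaa_state_t *s)
{
    const drmaa_sge_state_t *sge = (const drmaa_sge_state_t *)s;
    return sge->my_own_specific_state;
}
```

A portable client would call only generic_state() on whatever pointer
the library hands back. The downcast in sge_detail() is safe only if
the object was really created as a drmaa_sge_state_t, so in practice
the client would need some way to check which implementation it is
talking to before using it.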
--
Piotr Domagalski
FedStage Systems Ltd.