[saga-rg] job states...

Andre Merzky andre at merzky.net
Fri Feb 10 20:54:16 CST 2006


I agree - the file transfer state models are not needed for SAGA.
We don't have any actions on these states anyway.

Andre.


Quoting [Christopher Smith] (Feb 11 2006):
> 
> Sure.
> 
> As mentioned ... I think maybe supporting a subset of BES is ok. Much of the
> state model wrt file transfer state modelling I think is not required for
> SAGA.
> 
> -- Chris
> 
> 
> 
> On 10/2/06 18:46, "Andre Merzky" <andre at merzky.net> wrote:
> 
> > Ok, then I'll do that in the strawman.  I would appreciate
> > if you could glance over it after commit, for a sanity
> > check.
> > 
> > Thanks, Andre.
> > 
> > 
> > Quoting [Christopher Smith] (Feb 11 2006):
> >> 
> >> It makes sense to keep the state models in sync.
> >> 
> >> -- Chris
> >> 
> >> 
> >> On 10/2/06 18:26, "Andre Merzky" <andre at merzky.net> wrote:
> >> 
> >>> Quoting [Christopher Smith] (Feb 11 2006):
> >>>> 
> >>>> What I meant by that comment is that where it is a subset, it should
> >>>> reflect the BES terminology. I think that the number of states
> >>>> represented is enough already. ;-)
> >>> 
> >>> Would it make sense to just copy the BES state diagram?
> >>> 
> >>> It did not exist when we (== you ;-) drafted the SAGA job
> >>> states - if it had been around then, we might have
> >>> copied it already.
> >>> 
> >>> Apart from the SystemXXX/UserXXX states, and from Hold,
> >>> it is not that much different from the SAGA model anyway.
> >>> 
> >>> Cheers, Andre.
> >>> 
> >>> 
> >>>> -- Chris
> >>>> 
> >>>> 
> >>>> On 10/2/06 17:30, "Andre Merzky" <andre at merzky.net> wrote:
> >>>> 
> >>>>> Hi Chris, 
> >>>>> 
> >>>>> many thanks for the answers! :-)
> >>>>> 
> >>>>>> By the way ... I believe that the state diagram should at least be
> >>>>>> a subset of the BES state diagram ... we should adopt the same names.
> >>>>> 
> >>>>> I agree, kind of - I would say that the SAGA job state
> >>>>> diagram should at _most_ be a subset of the BES state diagram.
> >>>>> It could be _S_implier :-)
> >>>>> 
> >>>>> Cheers, Andre.
> >>>>> 
> >>>>> 
> >>>>> Quoting [Christopher Smith] (Feb 10 2006):
> >>>>>> Date: Fri, 10 Feb 2006 13:41:18 -0800
> >>>>>> Subject: Re: [saga-rg] job states...
> >>>>>> From: Christopher Smith <csmith at platform.com>
> >>>>>> To: Simple API for Grid Applications WG <saga-rg at ggf.org>
> >>>>>> 
> >>>>>> On 4/2/06 11:18, "Andre Merzky" <andre at merzky.net> wrote:
> >>>>>> 
> >>>>>> Ok ... I'll try to answer these, at least from my viewpoint.
> >>>>>> 
> >>>>>>> 
> >>>>>>> I think that diagram is wrong, isn't it?  Well, here are my
> >>>>>>> questions:
> >>>>>>> 
> >>>>>>>   - if we submit a job, it's immediately Queued - is that
> >>>>>>>     right?  Should it be pending before (e.g. as long as the
> >>>>>>>     queuing request travels the middleware layers)?
> >>>>>>> 
> >>>>>> To me, Queued is the same as Pending. Pending is probably a better
> >>>>>> word for this. Can't remember where the Queued name came from, as LSF
> >>>>>> uses PEND.
> >>>>>> 
> >>>>>>>   - can the hold and suspend states be reached only from
> >>>>>>>     'Running', or from elsewhere as well?
> >>>>>>> 
> >>>>>> You can only go into a Hold state from Pending, I think, or directly into
> >>>>>> Hold on submission.
> >>>>>> 
> >>>>>>>   - What is the difference between 'Hold' and 'Suspend'?
> >>>>>>> 
> >>>>>> A Hold state tells the scheduler/broker not to consider this job for
> >>>>>> scheduling/dispatch until the hold is explicitly released.
> >>>>>> 
> >>>>>>>   - Are there signals defined (apart from KILL) which change
> >>>>>>>     the job state?  I guess that is not as simple as saying
> >>>>>>>     SUSP does suspend - that state is probably defined by
> >>>>>>>     the scheduler, not by the OS...
> >>>>>>> 
> >>>>>> Right ... this is implementation dependent on the mechanism used to
> >>>>>> suspend a job (might be a signal, might be some other mechanism). What
> >>>>>> is important is that there is an operation to initiate the state
> >>>>>> transition.
> >>>>>> 
> >>>>>>>   - What is the use case for distinguishing between UserHold
> >>>>>>>     and SystemHold, or between UserSuspend and
> >>>>>>>     SystemSuspend?
> >>>>>>>    
> >>>>>> If I preempt workload, the system will put it into a SystemSuspend state
> >>>>>> that a user cannot cause a switch out of, otherwise a system may become
> >>>>>> oversubscribed due to the preempted and preempting jobs running at the
> >>>>>> same time. A UserSuspend can be entered and exited by the user, and is
> >>>>>> often used to hold processing to check progress, etc.
> >>>>>>  
> >>>>>> 
> >>>>>> By the way ... I believe that the state diagram should at least be
> >>>>>> a subset of the BES state diagram ... we should adopt the same names.
> >>>>>> 
> >>>>>> -- Chris
> >>>>> 
> >>>>> 
> >>> 
> >>> 
> > 
> > 
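
To make the hold/suspend rules from the thread above concrete, here is a
minimal Python sketch of them as a transition table. It is only an
illustration, not the SAGA or BES state model: the names Job, State, Actor,
apply and the operation strings ("hold", "release", "dispatch", "suspend",
"resume") are hypothetical, and the rule that suspend is entered only from
Running is an assumption, since the thread leaves that question open. What
it does take from the thread: Hold is entered from Pending or directly on
submission, a held job is not dispatched until the hold is released, a user
can enter and leave UserSuspend, and a user cannot leave SystemSuspend.

```python
from enum import Enum


class State(Enum):
    PENDING = "Pending"            # called "Queued" in the older diagram
    HELD = "Hold"
    RUNNING = "Running"
    USER_SUSPENDED = "UserSuspend"
    SYSTEM_SUSPENDED = "SystemSuspend"


class Actor(Enum):
    USER = "user"
    SYSTEM = "system"


# (current state, operation, actor) -> next state.
# Anything not listed here is rejected.
TRANSITIONS = {
    # Hold is entered from Pending and must be released explicitly.
    (State.PENDING, "hold", Actor.USER): State.HELD,
    (State.HELD, "release", Actor.USER): State.PENDING,
    # Only pending (not held) jobs are considered for dispatch.
    (State.PENDING, "dispatch", Actor.SYSTEM): State.RUNNING,
    # Assumption: suspend is only reachable from Running.
    (State.RUNNING, "suspend", Actor.USER): State.USER_SUSPENDED,
    (State.USER_SUSPENDED, "resume", Actor.USER): State.RUNNING,
    # Preemption: only the system can undo a SystemSuspend.
    (State.RUNNING, "suspend", Actor.SYSTEM): State.SYSTEM_SUSPENDED,
    (State.SYSTEM_SUSPENDED, "resume", Actor.SYSTEM): State.RUNNING,
}


class Job:
    def __init__(self, hold_on_submit=False):
        # A job may go directly into Hold on submission.
        self.state = State.HELD if hold_on_submit else State.PENDING

    def apply(self, operation, actor):
        """Perform one operation, or raise ValueError if it is not allowed."""
        key = (self.state, operation, actor)
        if key not in TRANSITIONS:
            raise ValueError(
                f"{actor.value} cannot {operation} a job in {self.state.value}"
            )
        self.state = TRANSITIONS[key]
        return self.state
```

With this table, a job that the system has suspended rejects
apply("resume", Actor.USER) with a ValueError, which is the
oversubscription guard described for SystemSuspend. How the suspend is
actually carried out (a signal or some other mechanism) stays outside the
model; only the operation that triggers the transition is represented.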



-- 
"So much time, so little to do..."  -- Garfield




