[saga-rg] job states...

Rajic, Hrabri hrabri.rajic at intel.com
Mon Feb 13 12:14:07 CST 2006


They are not.  Grid is a richer environment. 

Hrabri

>-----Original Message-----
>From: owner-saga-rg at ggf.org [mailto:owner-saga-rg at ggf.org] On Behalf Of
>Thilo Kielmann
>Sent: Sunday, February 12, 2006 1:09 AM
>To: Andre Merzky
>Cc: Christopher Smith; Simple API for Grid Applications WG
>Subject: Re: [saga-rg] job states...
>
>Curious question:
>
>SAGA should align job states with both BES and DRMAA ?
>Are they the same to begin with?
>
>Thilo
>
>On Sat, Feb 11, 2006 at 03:54:16AM +0100, Andre Merzky wrote:
>> Date: Sat, 11 Feb 2006 03:54:16 +0100
>> From: Andre Merzky <andre at merzky.net>
>> To: Christopher Smith <csmith at platform.com>
>> Cc: Andre Merzky <andre at merzky.net>,
>> 	Simple API for Grid Applications WG <saga-rg at ggf.org>
>> Subject: Re: [saga-rg] job states...
>>
>> I agree - the file transfer state models are not needed for SAGA.
>> We don't have any actions on these states anyway.
>>
>> Andre.
>>
>>
>> Quoting [Christopher Smith] (Feb 11 2006):
>> >
>> > Sure.
>> >
>> > As mentioned ... I think maybe supporting a subset of BES is ok. Much of the
>> > state model wrt file transfer state modelling I think is not required for
>> > SAGA.
>> >
>> > -- Chris
>> >
>> >
>> >
>> > On 10/2/06 18:46, "Andre Merzky" <andre at merzky.net> wrote:
>> >
>> > > Ok, then I'll do that in the strawman.  I would appreciate
>> > > if you could glance over it after commit, for a sanity
>> > > check.
>> > >
>> > > Thanks, Andre.
>> > >
>> > >
>> > > Quoting [Christopher Smith] (Feb 11 2006):
>> > >>
>> > >> It makes sense to keep the state models in sync.
>> > >>
>> > >> -- Chris
>> > >>
>> > >>
>> > >> On 10/2/06 18:26, "Andre Merzky" <andre at merzky.net> wrote:
>> > >>
>> > >>> Quoting [Christopher Smith] (Feb 11 2006):
>> > >>>>
>> > >>>> What I meant by that comment is that where it is a subset, it should
>> > >>>> reflect the BES terminology. I think that the number of states
>> > >>>> represented is enough already. ;-)
>> > >>>
>> > >>> Would it make sense to just copy the BES state diagram?
>> > >>>
>> > >>> It did not exist when we (== you ;-) drafted the SAGA job
>> > >>> states - if it had been around then, we might have
>> > >>> copied it already.
>> > >>>
>> > >>> Apart from the SystemXXX/UserXXX states, and from Hold,
>> > >>> it is not that much different from the SAGA model anyway.
>> > >>>
>> > >>> Cheers, Andre.
>> > >>>
>> > >>>
>> > >>>> -- Chris
>> > >>>>
>> > >>>>
>> > >>>> On 10/2/06 17:30, "Andre Merzky" <andre at merzky.net> wrote:
>> > >>>>
>> > >>>>> Hi Chris,
>> > >>>>>
>> > >>>>> many thanks for the answers! :-)
>> > >>>>>
>> > >>>>>> By the way ... I believe that the state diagram should at least be a
>> > >>>>>> subset of the BES state diagram ... we should adopt the same names.
>> > >>>>>
>> > >>>>> I agree, kind of - I would say that the SAGA job state
>> > >>>>> diagram should at _most_ be a subset of the BES state diagram.
>> > >>>>> It could be _S_implier :-)
>> > >>>>>
>> > >>>>> Cheers, Andre.
>> > >>>>>
>> > >>>>>
>> > >>>>> Quoting [Christopher Smith] (Feb 10 2006):
>> > >>>>>> Date: Fri, 10 Feb 2006 13:41:18 -0800
>> > >>>>>> Subject: Re: [saga-rg] job states...
>> > >>>>>> From: Christopher Smith <csmith at platform.com>
>> > >>>>>> To: Simple API for Grid Applications WG <saga-rg at ggf.org>
>> > >>>>>>
>> > >>>>>> On 4/2/06 11:18, "Andre Merzky" <andre at merzky.net> wrote:
>> > >>>>>>
>> > >>>>>> Ok ... I'll try to answer these, at least from my viewpoint.
>> > >>>>>>
>> > >>>>>>>
>> > >>>>>>> I think that diagram is wrong, isn't it?  Well, here are my
>> > >>>>>>> questions:
>> > >>>>>>>
>> > >>>>>>>   - if we submit a job, it's immediately Queued - is that
>> > >>>>>>>     right?  Should it be pending before (e.g. as long as the
>> > >>>>>>>     queuing request travels the middleware layers)?
>> > >>>>>>>
>> > >>>>>> To me, Queued is the same as Pending. Pending is probably a better word
>> > >>>>>> for this. Can't remember where the Queued name came from, as LSF uses PEND.
>> > >>>>>>
>> > >>>>>>>   - can the hold and suspend states be reached only from
>> > >>>>>>>     'Running', or from elsewhere as well?
>> > >>>>>>>
>> > >>>>>> You can only go into a Hold state from Pending, I think, or directly into
>> > >>>>>> Hold on submission.
>> > >>>>>>
>> > >>>>>>>   - What is the difference between 'Hold' and 'Suspend'?
>> > >>>>>>>
>> > >>>>>> A Hold state tells the scheduler/broker not to consider this job for
>> > >>>>>> scheduling/dispatch until the hold is explicitly released.
>> > >>>>>>
>> > >>>>>>>   - Are there signals defined (apart from KILL) which change
>> > >>>>>>>     the job state?  I guess that is not as simple as saying
>> > >>>>>>>     SUSP does suspend - that state is probably defined by
>> > >>>>>>>     the scheduler, not by the OS...
>> > >>>>>>>
>> > >>>>>> Right ... this is implementation dependent on the mechanism used to
>> > >>>>>> suspend a job (might be a signal, might be some other mechanism). What is
>> > >>>>>> important is that there is an operation to initiate the state transition.
>> > >>>>>>
>> > >>>>>>>   - What is the use case for distinguishing between UserHold
>> > >>>>>>>     and SystemHold, or between UserSuspend and
>> > >>>>>>>     SystemSuspend?
>> > >>>>>>>
>> > >>>>>> If I preempt workload, the system will put it into a SystemSuspend state
>> > >>>>>> that a user cannot cause a switch out of, otherwise a system may become
>> > >>>>>> oversubscribed due to the preempted and preempting jobs running at the
>> > >>>>>> same time. A UserSuspend can be entered and exited by the user, and is often
>> > >>>>>> used to hold processing to check progress, etc.
>> > >>>>>>
>> > >>>>>>
>> > >>>>>> By the way ... I believe that the state diagram should at least be a
>> > >>>>>> subset of the BES state diagram ... we should adopt the same names.
>> > >>>>>>
>> > >>>>>> -- Chris
>> > >>>>>
>> > >>>>>
>> > >>>
>> > >>>
>> > >
>> > >
>>
>>
>>
>> --
>> "So much time, so little to do..."  -- Garfield
>>
>
>
>
>--
>Thilo Kielmann
>http://www.cs.vu.nl/~kielmann/
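
The Hold/Suspend rules discussed in the thread can be sketched as a small transition table. This is only an illustration, not the SAGA, BES or DRMAA state model: the names State, Actor, TRANSITIONS and can_transition are made up for this sketch. It assumes that suspension is entered from Running and that either a user or the system may place and release a hold, neither of which the thread settles. Hold directly on submission and terminal states are left out.

```python
from enum import Enum, auto


class State(Enum):
    PENDING = auto()         # called "Queued" in the diagram under discussion
    HOLD = auto()            # not considered for dispatch until released
    RUNNING = auto()
    USER_SUSPEND = auto()    # entered and exited by the user
    SYSTEM_SUSPEND = auto()  # e.g. preemption; the user cannot leave it


class Actor(Enum):
    USER = auto()
    SYSTEM = auto()


# (source, target) -> actors allowed to trigger the transition.
# Who may hold/release is an assumption; the thread only says the hold
# must be "explicitly released".
TRANSITIONS = {
    (State.PENDING, State.HOLD): {Actor.USER, Actor.SYSTEM},
    (State.HOLD, State.PENDING): {Actor.USER, Actor.SYSTEM},
    (State.PENDING, State.RUNNING): {Actor.SYSTEM},
    # Assumption: suspension is only reachable from RUNNING.
    (State.RUNNING, State.USER_SUSPEND): {Actor.USER},
    (State.USER_SUSPEND, State.RUNNING): {Actor.USER},
    (State.RUNNING, State.SYSTEM_SUSPEND): {Actor.SYSTEM},
    (State.SYSTEM_SUSPEND, State.RUNNING): {Actor.SYSTEM},
}


def can_transition(src: State, dst: State, actor: Actor) -> bool:
    """Return True if `actor` may move a job from `src` to `dst`."""
    return actor in TRANSITIONS.get((src, dst), set())
```

With this table, a user can resume a job from USER_SUSPEND but not from SYSTEM_SUSPEND, which is the oversubscription guard Chris describes, and HOLD is reachable only from PENDING.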
