[saga-rg] job states...

Christopher Smith csmith at platform.com
Fri Feb 10 20:18:50 CST 2006


What I meant by that comment is that where it is a subset, it should reflect
the BES terminology. I think that the number of states represented is enough
already. ;-)

-- Chris


On 10/2/06 17:30, "Andre Merzky" <andre at merzky.net> wrote:

> Hi Chris, 
> 
> many thanks for the answers! :-)
> 
>> By the way ... I believe that the state diagram should at least be a subset
>> of the BES state diagram ... we should adopt the same names.
> 
> I agree, kind of - I would say that the SAGA job state
> diagram should at _most_ be a subset of the BES state diagram.
> It could be _S_implier :-)
> 
> Cheers, Andre.
> 
> 
> Quoting [Christopher Smith] (Feb 10 2006):
>> Date: Fri, 10 Feb 2006 13:41:18 -0800
>> Subject: Re: [saga-rg] job states...
>> From: Christopher Smith <csmith at platform.com>
>> To: Simple API for Grid Applications WG <saga-rg at ggf.org>
>> 
>> On 4/2/06 11:18, "Andre Merzky" <andre at merzky.net> wrote:
>> 
>> Ok ... I'll try to answer these, at least from my viewpoint.
>> 
>>> 
>>> I think that diagram is wrong, isn't it?  Well, here are my
>>> questions:
>>> 
>>>   - if we submit a job, it's immediately Queued - is that
>>>     right?  Should it be pending before (e.g. as long as the
>>>     queuing request travels the middleware layers)?
>>> 
>> To me, Queued is the same as Pending. Pending is probably a better word for
>> this. Can't remember where the Queued name came from, as LSF uses PEND.
>> 
>>>   - can the hold and suspend states be reached only from
>>>     'Running', or from elsewhere as well?
>>> 
>> You can only go into a Hold state from Pending, I think, or directly into
>> Hold on submission.
>> 
>>>   - What is the difference between 'Hold' and 'Suspend'?
>>> 
>> A Hold state tells the scheduler/broker not to consider this job for
>> scheduling/dispatch until the hold is explicitly released.
>> 
>>>   - Are there signals defined (apart from KILL) which change
>>>     the job state?  I guess that is not as simple as saying
>>>     SUSP does suspend - that state is probably defined by
>>>     the scheduler, not by the OS...
>>> 
>> Right ... this is implementation dependent on the mechanism used to suspend
>> a job (might be a signal, might be some other mechanism). What is important
>> is that there is an operation to initiate the state transition.
>> 
>>>   - What is the use case for distinguishing between UserHold
>>>     and SystemHold, or between UserSuspend and
>>>     SystemSuspend?
>>>    
>> If I preempt workload, the system will put it into a SystemSuspend state
>> that a user cannot cause a switch out of, otherwise a system may become
>> oversubscribed due to the preempted and preempting jobs running at the same
>> time. A UserSuspend can be entered and exited by the user, and is often used
>> to hold processing to check progress, etc.
>>  
>> 
>> By the way ... I believe that the state diagram should at least be a subset
>> of the BES state diagram ... we should adopt the same names.
>> 
>> -- Chris
> 
> 
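
For reference, the transitions described in the quoted answers can be
sketched as a small lookup table. This is only an illustration, not the
SAGA or BES state model: the state names follow the thread, but the
JobState enum, the TRANSITIONS table and can_transition() are
hypothetical names. It also assumes that suspension happens from
Running, that the system is the actor that releases a SystemHold, and
that dispatch is a system action; final states are left out.

```python
from enum import Enum


class JobState(Enum):
    PENDING = "Pending"            # called "Queued" in the diagram under discussion
    USER_HOLD = "UserHold"
    SYSTEM_HOLD = "SystemHold"
    RUNNING = "Running"
    USER_SUSPEND = "UserSuspend"
    SYSTEM_SUSPEND = "SystemSuspend"


USER = "user"
SYSTEM = "system"

# (from_state, to_state) -> actors allowed to initiate the transition.
# Who may leave SystemHold is an assumption; the thread only states that a
# user cannot switch a job out of SystemSuspend.
TRANSITIONS = {
    (JobState.PENDING, JobState.USER_HOLD): {USER},
    (JobState.USER_HOLD, JobState.PENDING): {USER},
    (JobState.PENDING, JobState.SYSTEM_HOLD): {SYSTEM},
    (JobState.SYSTEM_HOLD, JobState.PENDING): {SYSTEM},
    (JobState.PENDING, JobState.RUNNING): {SYSTEM},        # dispatch
    (JobState.RUNNING, JobState.USER_SUSPEND): {USER},
    (JobState.USER_SUSPEND, JobState.RUNNING): {USER},
    (JobState.RUNNING, JobState.SYSTEM_SUSPEND): {SYSTEM},  # preemption
    (JobState.SYSTEM_SUSPEND, JobState.RUNNING): {SYSTEM},
}


def can_transition(src, dst, actor):
    """Return True if `actor` may move a job from `src` to `dst`."""
    return actor in TRANSITIONS.get((src, dst), set())
```

With this table, can_transition(JobState.SYSTEM_SUSPEND,
JobState.RUNNING, USER) is False, which matches the oversubscription
argument above: a user cannot resume a job that the system preempted.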




