[saga-rg] Job states proposal

Mon Feb 20 08:44:57 CST 2006

Andre Merzky wrote:
> Quoting [Raul Sirvent] (Feb 20 2006):
> 
>>Hi. May I disagree with these SAGA states? 
> 
> 
> Sure :-)
> 
> 
>>In our use case (GRID 
>>superscalar) it is important to know when the transfer of input files 
>>ends after a job has been submitted (the transferred files can be used by 
>>other jobs). I don't see how you can know that in this state diagram.
> 
> 
> You would do the following:
> 
> 
>   void print_job_state (saga::job & j)
>   {
>     std::cout << "job state: " << j.get_state ()        << " - "
>                                << j.get_state_detail () << std::endl;
>   }
> 
>   void run_job (saga::job_description & jd)
>   {
>     saga::job_server js;
>     saga::job        j = js.create (jd);
> 
>     print_job_state (j); // job state: "New" - "BES:New"
> 
>     j.run ();
> 
>     print_job_state (j); // job state: "Running" - "BES:Pending"
> 
>     sleep (1);
>     print_job_state (j); // job state: "Running" - "BES:StagingIn"
>   
>     sleep (1);
>     print_job_state (j); // job state: "Running" - "BES:Running"
> 
>     sleep (1);
>     print_job_state (j); // job state: "Running" - "BES:StagingOut"
> 
>     sleep (1);
>     print_job_state (j); // job state: "Done" - "BES:Done"
>   }
> 
> You would be able to monitor both the simple saga job state
> and the more verbose job_state_detail.  For both you can add
> callbacks which would notify you of state changes
> asynchronously.
> 
> Does that make sense to you?
> 
> The prints would be enums for state, and strings for state
> details.  Also, the method names might be slightly different
> - so the code is not 100% correct - however, it should be
> close, and I hope it's illustrative :-)

Yes, it is clear now. Many thanks Andre.

Raul.
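
A rough sketch of the callback-based monitoring Andre describes above, aimed at the GRID superscalar case of noticing when input staging ends. Everything in it is hypothetical: MockJob, State, on_change and update are stand-ins invented for illustration, not the SAGA API (whose method names, as Andre notes, may differ). Only the detail strings are taken from his example.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; these are NOT the real SAGA types.
enum class State { New, Running, Done, Failed };

const char* state_name(State s) {
    switch (s) {
        case State::New:     return "New";
        case State::Running: return "Running";
        case State::Done:    return "Done";
        case State::Failed:  return "Failed";
    }
    return "Unknown";
}

class MockJob {
public:
    using Callback = std::function<void(State, const std::string&)>;

    // Register a callback fired whenever the state or the state detail changes.
    void on_change(Callback cb) { callbacks_.push_back(std::move(cb)); }

    // Called by the (simulated) backend to report progress.
    void update(State s, const std::string& detail) {
        if (s == state_ && detail == detail_) return;  // nothing changed
        state_ = s;
        detail_ = detail;
        for (const auto& cb : callbacks_) cb(state_, detail_);
    }

    State get_state() const { return state_; }
    const std::string& get_state_detail() const { return detail_; }

private:
    State state_ = State::New;
    std::string detail_ = "BES:New";
    std::vector<Callback> callbacks_;
};

struct Observation {
    std::vector<std::string> log;   // one "state - detail" entry per change
    bool staging_in_done = false;   // set once the job leaves "BES:StagingIn"
};

// Replays the state sequence from the example and records what a callback sees.
Observation replay_example() {
    MockJob job;
    Observation obs;
    std::string previous = job.get_state_detail();
    job.on_change([&obs, &previous](State s, const std::string& detail) {
        obs.log.push_back(std::string(state_name(s)) + " - " + detail);
        if (previous == "BES:StagingIn") obs.staging_in_done = true;
        previous = detail;
    });
    job.update(State::Running, "BES:Pending");
    job.update(State::Running, "BES:StagingIn");
    job.update(State::Running, "BES:Running");
    job.update(State::Running, "BES:StagingOut");
    job.update(State::Done,    "BES:Done");
    return obs;
}
```

The simple state stays "Running" throughout staging; it is the change of detail away from "BES:StagingIn" that tells a dependent job its input files are in place.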

> 
> Cheers, Andre.
>     
>     
> 
>>Cheers. Raul.
>>
>>Andre Merzky wrote:
>>
>>>In addition, please see attached the updated SAGA state
>>>diagram.  Note that the BES states are taken from the BES
>>>doc.  The SAGA states would then apply to both jobs and
>>>tasks, with the same semantics.  The task cancelled state
>>>would be folded into the Failed state.  The Cancelling state
>>>we discussed in Athens would be ShuttingDown, so a sub state
>>>of Running.
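
One way to read the folding described above is as a lookup from the detailed (BES-level) state to the simple SAGA state. The sketch below is only an illustration under assumptions: the "BES:..." spellings for ShuttingDown, Cancelled and Failed are guessed by analogy with the detail strings in the code example earlier in this thread, the names SagaState and saga_state_for are made up, and the Unknown fallback is borrowed from Chris's proposal rather than from the attached diagram.

```cpp
#include <map>
#include <string>

// Hypothetical simple state set; not taken from a SAGA header.
enum class SagaState { New, Running, Done, Failed, Unknown };

// Maps a detailed state string to the simple state it is folded into.
SagaState saga_state_for(const std::string& detail) {
    static const std::map<std::string, SagaState> table = {
        {"BES:New",          SagaState::New},
        {"BES:Pending",      SagaState::Running},
        {"BES:StagingIn",    SagaState::Running},
        {"BES:Running",      SagaState::Running},
        {"BES:StagingOut",   SagaState::Running},
        {"BES:ShuttingDown", SagaState::Running},  // the former "Cancelling"
        {"BES:Done",         SagaState::Done},
        {"BES:Failed",       SagaState::Failed},
        {"BES:Cancelled",    SagaState::Failed},   // cancelled folds into Failed
    };
    const auto it = table.find(detail);
    return it == table.end() ? SagaState::Unknown : it->second;
}
```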
>>>
>>>Cheers, Andre.
>>>
>>>
>>>Quoting [Andre Merzky] (Feb 17 2006):
>>>
>>>
>>>>Date: Fri, 17 Feb 2006 21:26:58 +0200
>>>>From: Andre Merzky <andre at merzky.net>
>>>>To: Christopher Smith <csmith at platform.com>
>>>>Cc: Simple API for Grid Applications WG <saga-rg at ggf.org>
>>>>Subject: Re: [saga-rg] Job states proposal
>>>>
>>>>Hi Chris, 
>>>>
>>>>thanks for the discussions at GGF, and for this proposal!
>>>>
>>>>I am supportive of the proposal.  It basically means that we
>>>>maintain a very simple state model for SAGA, with a small,
>>>>fixed number of states - but allow a more detailed state
>>>>model to be exposed if needed.  I am all for it.
>>>>
>>>>As for the suspend and hold state: I would, contrary to
>>>>Chris, move the hold state into the details, and only expose
>>>>the suspend state.  Reason would be that we have 
>>>>suspend/resume methods, but no hold, and no use cases for
>>>>hold.  
>>>>
>>>>Well, Chris' argument to that was, IIRC, that we actually
>>>>should have hold (as a flag to submit), as it is supported
>>>>by most backends.
>>>>
>>>>I guess that is a minor detail; either way will work fine in
>>>>the end.
>>>>
>>>>Cheers, Andre.
>>>>
>>>>
>>>>
>>>>Quoting [Christopher Smith] (Feb 16 2006):
>>>>
>>>>
>>>>>Date: Thu, 16 Feb 2006 05:05:15 -0800
>>>>>Subject: [saga-rg] Job states proposal
>>>>>From: Christopher Smith <csmith at platform.com>
>>>>>To: Simple API for Grid Applications WG <saga-rg at ggf.org>
>>>>>
>>>>>Ok ... here's my proposal for modelling the SAGA job states.
>>>>>
>>>>>There is a typical path through the states:
>>>>>
>>>>>New --> Pending --> Running --> Finished
>>>>>
>>>>>You could also submit with a hold, or put a Pending job on hold:
>>>>>
>>>>>New --> Hold <--> Pending
>>>>>
>>>>>You can suspend a job (and resume it):
>>>>>
>>>>>Running <--> Suspended
>>>>>
>>>>>The job can fail in execution:
>>>>>
>>>>>Running --> Failed
>>>>>
>>>>>And the job can be cancelled:
>>>>>
>>>>>Running --> Cancelled
>>>>>New --> Cancelled
>>>>>Pending --> Cancelled
>>>>>Suspended --> Cancelled
>>>>>
>>>>>All states can go to "Unknown".
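
The transitions listed above fit in a small table. This is a minimal sketch, not part of the proposal: the names BasicState and can_transition are made up here, and "All states can go to Unknown" is read as allowing that move from any state. The list as written has no Hold --> Cancelled edge, so the sketch rejects it.

```cpp
#include <set>
#include <utility>

// Hypothetical enumeration of the basic states named in the proposal.
enum class BasicState {
    New, Hold, Pending, Running, Suspended, Finished, Failed, Cancelled, Unknown
};

// Returns true if the proposal lists a transition from `from` to `to`.
bool can_transition(BasicState from, BasicState to) {
    using S = BasicState;
    if (to == S::Unknown) return true;  // every state may become Unknown

    static const std::set<std::pair<S, S>> allowed = {
        {S::New, S::Pending}, {S::Pending, S::Running}, {S::Running, S::Finished},
        {S::New, S::Hold}, {S::Hold, S::Pending}, {S::Pending, S::Hold},
        {S::Running, S::Suspended}, {S::Suspended, S::Running},
        {S::Running, S::Failed},
        {S::Running, S::Cancelled}, {S::New, S::Cancelled},
        {S::Pending, S::Cancelled}, {S::Suspended, S::Cancelled},
    };
    return allowed.count(std::make_pair(from, to)) > 0;
}
```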
>>>>>
>>>>>Let's not bother modelling User and System suspended and hold states,
>>>>>pending some more experience and let's not bother with all the pre/post
>>>>>states and file staging stuff pending experience and use cases, but see
>>>>>below on extended states.
>>>>>
>>>>>This model pretty much preserves what we had before (adding unknown, I
>>>>>believe), and it represents the "basic" SAGA state model.
>>>>>
>>>>>Now ... in order to have the ability to add states (say to support the
>>>>>BES state model), we can define a class for JobState that contains the
>>>>>BaseState (one of the enumerated states from above) and that has an
>>>>>ExtendedState attribute defined as a vector of strings. The job can then
>>>>>represent some of the extended states such as "Running PreExecution" or
>>>>>"Running StagingIn", etc, etc. By using the attribute mechanism, coders
>>>>>can get started with SAGA implementations now, and as we extend the state
>>>>>model according to experience and use cases, we merely need to specify
>>>>>acceptable strings, rather than change the underlying data structures.
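
The JobState class described above might look roughly like the following. This is a sketch under assumptions, not an agreed SAGA interface: the member names, the describe() helper and the choice to join strings with spaces (so the result reads like "Running StagingIn") are all invented for illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical enumeration of the basic states from the proposal.
enum class BaseState {
    New, Hold, Pending, Running, Suspended, Finished, Failed, Cancelled, Unknown
};

const char* base_name(BaseState s) {
    switch (s) {
        case BaseState::New:       return "New";
        case BaseState::Hold:      return "Hold";
        case BaseState::Pending:   return "Pending";
        case BaseState::Running:   return "Running";
        case BaseState::Suspended: return "Suspended";
        case BaseState::Finished:  return "Finished";
        case BaseState::Failed:    return "Failed";
        case BaseState::Cancelled: return "Cancelled";
        case BaseState::Unknown:   return "Unknown";
    }
    return "Unknown";
}

class JobState {
public:
    explicit JobState(BaseState base, std::vector<std::string> extended = {})
        : base_(base), extended_(std::move(extended)) {}

    // Fixed enumerated part; callers that only need the basic model stop here.
    BaseState base() const { return base_; }

    // Open-ended part: new detail strings need no change to this class.
    const std::vector<std::string>& extended() const { return extended_; }

    // Base state name followed by any extended strings, space separated.
    std::string describe() const {
        std::string out = base_name(base_);
        for (const auto& e : extended_) {
            out += ' ';
            out += e;
        }
        return out;
    }

private:
    BaseState base_;
    std::vector<std::string> extended_;
};
```

Supporting a new detailed state then only means agreeing on another acceptable string; the enumeration and the class layout stay as they are.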
>>>>>
>>>>>My $0.02
>>>>>
>>>>>-- Chris
>>>>>
>>>>>
>>>>>------------------------------------------------------------------------
>>>>>