[saga-rg] Job states proposal

Andre Merzky andre at merzky.net
Mon Feb 20 08:38:03 CST 2006


Quoting [Raul Sirvent] (Feb 20 2006):
> 
> Hi. May I disagree with these SAGA states?

Sure :-)

> In our use case (GRID
> superscalar) it is important to know when the transfer of input files
> ends after a job has been submitted (the transferred files can be used by
> other jobs). I don't see how you can know that in this state diagram.

You would do the following:


  void print_job_state (saga::job & j)
  {
    cout << "job state: " << j.get_state ()        << " - "
                          << j.get_state_detail () << endl;
  }

  void run_job (saga::job_description & jd)
  {
    saga::job_server js;
    saga::job        j = js.create (jd);

    print_job_state (j); // job state: "New" - "BES:New"

    j.run ();

    print_job_state (j); // job state: "Running" - "BES:Pending"

    sleep (1);
    print_job_state (j); // job state: "Running" - "BES:StagingIn"
  
    sleep (1);
    print_job_state (j); // job state: "Running" - "BES:Running"

    sleep (1);
    print_job_state (j); // job state: "Running" - "BES:StagingOut"

    sleep (1);
    print_job_state (j); // job state: "Done" - "BES:Done"
  }

You would be able to monitor both the simple saga job state
and the more verbose job_state_detail.  For both you can add
callbacks which would notify you of state changes
asynchronously.

Does that make sense to you?

The printed values would be enums for the state, and strings
for the state detail.  Also, the method names might be
slightly different - so the code is not 100% correct -
however, it should be close, and I hope it's illustrative :-)
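
To make the callback idea a bit more concrete, here is a small
self-contained sketch.  It does not use the SAGA API at all:
MockJob, State, add_callback, update and StageInWatcher are all
made-up names for illustration, and the callbacks are invoked
synchronously inside update() to keep the example short (a real
implementation would presumably deliver them asynchronously).
It only shows how an application could detect the end of
stage-in from the state detail string:

```cpp
// Illustrative mock only: none of these names are part of the SAGA API.
#include <functional>
#include <string>
#include <utility>
#include <vector>

enum class State { New, Running, Done, Failed };

class MockJob {
public:
  using Callback = std::function<void(State, const std::string&)>;

  // Register a callback that fires on every state or detail change.
  void add_callback(Callback cb) { callbacks_.push_back(std::move(cb)); }

  // Stand-in for the backend reporting a new state (hypothetical).
  void update(State s, const std::string& detail) {
    if (s == state_ && detail == detail_) return;  // nothing changed
    state_  = s;
    detail_ = detail;
    for (const auto& cb : callbacks_) cb(state_, detail_);
  }

  State get_state() const { return state_; }
  const std::string& get_state_detail() const { return detail_; }

private:
  State state_ = State::New;
  std::string detail_ = "BES:New";
  std::vector<Callback> callbacks_;
};

// Sets `done` once the detail has been "BES:StagingIn" and then changes
// to something else, i.e. once the input files have been transferred.
struct StageInWatcher {
  bool was_staging = false;
  bool done = false;

  void operator()(State, const std::string& detail) {
    if (detail == "BES:StagingIn") {
      was_staging = true;
    } else if (was_staging) {
      done = true;
    }
  }
};
```

An application would register the watcher once, e.g. via
j.add_callback([&w](State s, const std::string& d) { w(s, d); }),
and could start dependent jobs as soon as w.done becomes true.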

Cheers, Andre.
    
    
> 
> Cheers. Raul.
> 
> Andre Merzky wrote:
> >In addition, please see attached the updated SAGA state
> >diagram.  Note that the BES states are taken from the BES
> >doc.  The SAGA states would then apply to both jobs and
> >tasks, with the same semantics.  The task cancelled state
> >would be folded into the Failed state.  The Cancelling state
> >we discussed in Athens would be ShuttingDown, so a sub state
> >of Running.
> >
> >Cheers, Andre.
> >
> >
> >Quoting [Andre Merzky] (Feb 17 2006):
> >
> >>Date: Fri, 17 Feb 2006 21:26:58 +0200
> >>From: Andre Merzky <andre at merzky.net>
> >>To: Christopher Smith <csmith at platform.com>
> >>Cc: Simple API for Grid Applications WG <saga-rg at ggf.org>
> >>Subject: Re: [saga-rg] Job states proposal
> >>
> >>Hi Chris, 
> >>
> >>thanks for the discussions at GGF, and for this proposal!
> >>
> >>I am supportive of the proposal.  It basically means that we
> >>maintain a very simple state model for SAGA, with a very
> >>finite number of states - but allow a more detailed state
> >>model to be exposed if needed.  I am all for it.
> >>
> >>As for the suspend and hold state: I would, contrary to
> >>Chris, move the hold state into the details, and only expose
> >>the suspend state.  Reason would be that we have 
> >>suspend/resume methods, but no hold, and no use cases for
> >>hold.  
> >>
> >>Well, Chris' argument to that was, IIRC, that we actually
> >>should have hold (as a flag to submit), as it is supported
> >>by most backends.
> >>
> >>I guess that is a minor detail, either way will work fine in
> >>the end.
> >>
> >>Cheers, Andre.
> >>
> >>
> >>
> >>Quoting [Christopher Smith] (Feb 16 2006):
> >>
> >>>Date: Thu, 16 Feb 2006 05:05:15 -0800
> >>>Subject: [saga-rg] Job states proposal
> >>>From: Christopher Smith <csmith at platform.com>
> >>>To: Simple API for Grid Applications WG <saga-rg at ggf.org>
> >>>
> >>>Ok ... here's my proposal for modelling the SAGA job states.
> >>>
> >>>There is a typical path through the states:
> >>>
> >>>New --> Pending --> Running --> Finished
> >>>
> >>>You could also submit with a hold, or put a Pending job on hold:
> >>>
> >>>New --> Hold <--> Pending
> >>>
> >>>You can suspend a job (and resume it):
> >>>
> >>>Running <--> Suspended
> >>>
> >>>The job can fail in execution:
> >>>
> >>>Running --> Failed
> >>>
> >>>And the job can be cancelled:
> >>>
> >>>Running --> Cancelled
> >>>New --> Cancelled
> >>>Pending --> Cancelled
> >>>Suspended --> Cancelled
> >>>
> >>>All states can go to "Unknown".
> >>>
> >>>Let's not bother modelling User and System suspended and hold states,
> >>>pending some more experience and let's not bother with all the pre/post
> >>>states and file staging stuff pending experience and use cases, but see
> >>>below on extended states.
> >>>
> >>>This model pretty much preserves what we had before (adding unknown, I
> >>>believe), and it represents the "basic" SAGA state model.
> >>>
> >>>Now ... in order to have the ability to add states (say to support the
> >>>BES state model), we can define a class for JobState that contains the
> >>>BaseState (one of the enumerated states from above) and that has an
> >>>ExtendedState attribute defined as a vector of strings. The job can then
> >>>represent some of the extended states such as "Running PreExecution" or
> >>>"Running StagingIn", etc, etc. By using the attribute mechanism, coders
> >>>can get started with SAGA implementations now, and as we extend the state
> >>>model according to experience and use cases, we merely need to specify
> >>>acceptable strings, rather than change the underlying data structures.
> >>>
> >>>My $0.02
> >>>
> >>>-- Chris
> >>>
> >>>
> >>>------------------------------------------------------------------------
> >>>
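
For reference, the model Chris proposes above (a base state plus
a vector of extended state strings, with the transitions he
lists) could be sketched roughly as follows.  This is only one
possible reading of the proposal, not agreed SAGA API: the names
BaseState, JobState and transition_allowed are assumptions, and
transitions the proposal does not mention (for example out of
Hold to Cancelled, or out of Unknown) are simply treated as not
allowed here.

```cpp
// Sketch of the proposed state model; names are assumptions, not SAGA API.
#include <set>
#include <string>
#include <utility>
#include <vector>

enum class BaseState {
  New, Hold, Pending, Running, Suspended,
  Finished, Failed, Cancelled, Unknown
};

// Base state plus free-form extended state strings, e.g. {"StagingIn"}.
struct JobState {
  BaseState base = BaseState::New;
  std::vector<std::string> extended;
};

// True if the proposal lists a transition from `from` to `to`.
bool transition_allowed(BaseState from, BaseState to) {
  using B = BaseState;
  if (to == B::Unknown) return true;  // "All states can go to Unknown"

  static const std::set<std::pair<B, B>> allowed = {
    // typical path
    {B::New, B::Pending}, {B::Pending, B::Running}, {B::Running, B::Finished},
    // hold
    {B::New, B::Hold}, {B::Hold, B::Pending}, {B::Pending, B::Hold},
    // suspend / resume
    {B::Running, B::Suspended}, {B::Suspended, B::Running},
    // failure
    {B::Running, B::Failed},
    // cancellation
    {B::Running, B::Cancelled}, {B::New, B::Cancelled},
    {B::Pending, B::Cancelled}, {B::Suspended, B::Cancelled},
  };
  return allowed.count(std::make_pair(from, to)) > 0;
}
```

Adding a BES-style detail then only means agreeing on new
strings for `extended`; the enum and the transition table stay
untouched.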



-- 
"So much time, so little to do..."  -- Garfield




