[ogsa-wg] Job State proposal made to SAGA-RG

Steve Loughran steve_loughran at hpl.hp.com
Tue Apr 25 05:14:50 CDT 2006


Christopher Smith wrote:
> Hi all,
> 
> Per Marvin's comments ...
> 
> Here is a pointer to the proposal for modelling job states that I made to
> the SAGA group last February.
> 
> http://www-unix.gridforum.org/mail_archive/saga-rg/2006/02/msg00107.html
> 
> -- Chris
> 

Presumably suspend<-->resume is optional? That is, there are some things 
that cannot be suspended, or at least they suspend but cannot resume? (*)
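
A minimal sketch of what optional suspend/resume could look like in a job state model. The state names, the `Job` class and the `can_suspend`/`can_resume` flags are hypothetical, chosen for illustration; they are not taken from the SAGA proposal or from CDDLM.

```python
from enum import Enum


class JobState(Enum):
    # Hypothetical state names, not taken from the SAGA proposal or CDDLM.
    RUNNING = "running"
    SUSPENDED = "suspended"


class UnsupportedTransition(Exception):
    """Raised when a job cannot perform the requested transition."""


class Job:
    def __init__(self, can_suspend=False, can_resume=False):
        # Suspend and resume are separate capabilities: a job may be able
        # to suspend without being able to come back.
        self.can_suspend = can_suspend
        self.can_resume = can_resume
        self.state = JobState.RUNNING

    def suspend(self):
        if not self.can_suspend:
            raise UnsupportedTransition("suspend is not supported")
        if self.state is not JobState.RUNNING:
            raise UnsupportedTransition("can only suspend a running job")
        self.state = JobState.SUSPENDED

    def resume(self):
        if not self.can_resume:
            raise UnsupportedTransition("resume is not supported")
        if self.state is not JobState.SUSPENDED:
            raise UnsupportedTransition("can only resume a suspended job")
        self.state = JobState.RUNNING
```

A caller can check the two flags before attempting a transition, or catch `UnsupportedTransition` and fall back to shutting the job down and redeploying it.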

That is something that is not in the CDDLM model (which is based on the 
WSDM state model), because it's a lot harder to suspend things like a 
database and a server hosting many active connections. It's a lot easier 
to shut it down and redeploy later, relying on the application to be 
able to continue when it is redeployed. Which is a good idea for 
anything you want to be resilient. [1]

If you have VM images you can suspend them, but at least as far as 
VMware is concerned, the apps don't get warned before and after, so they 
have a worse experience than on a laptop, where apps and drivers get 
warned that they are about to suspend and told that they have woken up. 
All you know about on a VMware-hosted image is that the clock suddenly 
jumps and all your active TCP connections start throwing errors. You may 
have suspended, but the world still turns.
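
Since the clock jump is the only signal the guest gets, an application could watch for it itself. The following is a rough sketch, not a VMware facility: `clock_jumped`, `watch` and the 5-second tolerance are made up for illustration, and whether the guest's wall clock actually jumps after a resume depends on the hypervisor and its time synchronisation settings.

```python
import time


def clock_jumped(previous, current, expected_interval, tolerance=5.0):
    """Return True if the wall clock moved much more (or less) than expected.

    previous and current are wall-clock samples in seconds;
    expected_interval is how long the caller meant to sleep between them.
    The tolerance value is an arbitrary illustrative choice.
    """
    drift = (current - previous) - expected_interval
    return abs(drift) > tolerance


def watch(on_jump, interval=1.0, tolerance=5.0, iterations=None):
    """Sample the wall clock every `interval` seconds and report jumps.

    on_jump is called with the observed gap in seconds, so the application
    can, for example, drop and reopen its TCP connections.
    iterations=None means run forever.
    """
    last = time.time()
    count = 0
    while iterations is None or count < iterations:
        time.sleep(interval)
        now = time.time()
        if clock_jumped(last, now, interval, tolerance):
            on_jump(now - last)
        last = now
        count += 1
```

This only tells the application after the fact that something happened; it is no substitute for the before-and-after notifications a laptop gives.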


(*) Even on ACPI laptops there is evidence of a de-facto suspend state, 
S6: the sleep from which laptops do not recover. [2]

[1] http://swig.stanford.edu/~candea/papers/crashonly/
[2] http://www.hpl.hp.com/techreports/2000/HPL-2000-21.html




