[saga-rg] task states

'Andre Merzky' andre at merzky.net
Thu Dec 1 11:47:02 CST 2005


Hi Haresh, 

sorry for the late answer...


Quoting [Haresh Bhatt] (Nov 14 2005):
> 
> Hi Andre and All,
> 
> Sorry for delayed response as I was out of touch to my mailbox.
> 
> I will revert back on Job states. I have some comment on your response with
> respect to task handling.
> 
> >>For the task states: tasks in SAGA are just handles to
> >>asynchroneous operations really, so fairly simple things.
> 
> I feel it will nice to have state of Suspend and Resume for task
> (asynchronous operations). This is will be useful while handling several
> issues especially during fault recovery/tolerance time period as well as
> synchronization among asynchronous tasks. We have faced such problems in
> real life environment and solved using Suspend and resume mechanism.

Could you give us an example for such situations?

Also, I am somewhat unsure about the semantics of a task
suspend.  Assume you do a remote seek operation
asynchronously.  In terms of implementation, that would
potentially look like this:

main thread: 

  01  saga::file f (url);
  02  saga::task t = f.seek_async (100, saga::file::SEEK_SET);
  03  
  04  sleep (2);
  05  t.suspend ();

right?

Now the SAGA implementation would, at line 2, spawn a
separate thread.  That thread would issue, for example,
a single SOAP call to a remote site to change the state of a
remote file handle (EPR or so).  The thread would then
do a blocking wait for the SOAP answer message or so.

Now, what would a suspend do, e.g. if issued after sending
the request, and before receiving the answer?  

It could keep the answer undelivered to the application
while in suspended state, but I don't see the value of that
(the application could just as well ignore the fact that the
task is finished).

Or it could refuse to accept the answer while in suspended
state, but that might well mean that the connection drops,
and a later resume would be impossible or very difficult.

The most useful semantics I could imagine is to talk to the
remote side again, and to request the remote seek to be
halted (that might make more sense for a remote file
transfer or so...).  But assuming that arbitrary remote
operations can be suspended is very optimistic (apart from
file transfer, I could not think of any, really).
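
To make that last option a bit more concrete, here is a rough
sketch.  All names (task_sketch, remote_can_pause, ...) are
made up for illustration and are not SAGA API; the remote
round trip is left out, and the sketch only shows that
suspend () has to be allowed to fail for operations the
remote side cannot halt:

```cpp
// Hypothetical sketch, not SAGA API: a task handle whose suspend()
// only succeeds if the remote operation is able to halt.
enum class task_state { Running, Suspended };

class task_sketch {
public:
    // 'remote_can_pause' is an assumption: true e.g. for a file
    // transfer with restart support, false e.g. for a remote seek.
    explicit task_sketch(bool remote_can_pause)
        : remote_can_pause_(remote_can_pause) {}

    // A real implementation would ask the remote side to halt here.
    // Returns false, leaving the state untouched, if that is not possible.
    bool suspend() {
        if (state_ != task_state::Running || !remote_can_pause_) return false;
        state_ = task_state::Suspended;
        return true;
    }

    // Only a task that was actually suspended can be resumed.
    bool resume() {
        if (state_ != task_state::Suspended) return false;
        state_ = task_state::Running;
        return true;
    }

    task_state get_state() const { return state_; }

private:
    bool       remote_can_pause_;
    task_state state_ = task_state::Running;
};
```

With this, a task_sketch created with 'false' (the seek case)
simply stays Running when suspend () is called, which is
about all an application could rely on in general.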


> >>For most implementations, a async call (i.e. task) will
> >>probably spawn a thread which performs a operation (e.g.
> >>remote file copy), and watch that thread.  So a 'failed
> >>task' will mean that the file copy failed, not that the
> >>thread was killed -- FAILED so always would mean a user
> >>handled error. 
> 
> Take a case:  Remote file copy is not supporting RESUME and network
> connectivity is failed. Fault tolerance mechanism (to switch over to
> alternate path) takes little more time than the Remote File Copy time out
> period. This may cause remote file copy to fail and waste the complete time
> period of the partial file transfer. In such a case, one would like to
> suspend the Remote File copy thread and will resume back when network is
> restored back. Thus the Suspend and Resume states (as well as API facility
> to suspend and resume) will help to make environment more and effective
> fault tolerant. These states will also help in proper time-accounting. 

I don't think that evaluation of network status belongs at
the application level.  Instead, a clever SAGA
implementation could handle network drops/switches
transparently.

So IMHO the task should continue to run on a network drop,
w/o the application seeing any problem really.  So the state
would stay Running.  The implementation, however, would
detect the network drop, wait for the replacement network to
come up, and use a restart marker or such to continue the
operation.
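
Roughly, the worker thread inside the implementation could
do something like the following.  This is only a sketch under
my own assumptions: network_drop, send_chunk,
wait_for_network and max_reconnects are hypothetical names,
not SAGA API, and the actual transfer is passed in as a
callback:

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>

// Hypothetical: thrown by the transport layer when the link goes away.
struct network_drop : std::runtime_error {
    network_drop() : std::runtime_error("network drop") {}
};

// Hypothetical callbacks: send_chunk transfers data starting at 'offset'
// and returns the number of bytes sent (throws network_drop on failure);
// wait_for_network blocks until a replacement network is available.
using send_chunk_fn       = std::function<std::size_t(std::size_t offset)>;
using wait_for_network_fn = std::function<void()>;

// Transfers 'total' bytes, restarting from the last marker after a drop.
// The task state seen by the application stays Running all the time.
// Returns the number of reconnects that were needed.
inline int copy_with_restart(std::size_t total,
                             const send_chunk_fn& send_chunk,
                             const wait_for_network_fn& wait_for_network,
                             int max_reconnects = 5) {
    std::size_t marker     = 0;  // restart marker: bytes known to have arrived
    int         reconnects = 0;

    while (marker < total) {
        try {
            std::size_t sent = send_chunk(marker);
            if (sent == 0) throw std::logic_error("send_chunk made no progress");
            marker += sent;
        } catch (const network_drop&) {
            // Give up (task becomes Failed) only after too many drops.
            if (++reconnects > max_reconnects) throw;
            wait_for_network();  // then continue from 'marker'
        }
    }
    return reconnects;
}
```

Only when the reconnect limit is exceeded does the error
propagate, and only then would the task end up as Failed.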

Would that make sense to you?

Cheers, Andre.

-- 
+-----------------------------------------------------------------+
| Andre Merzky                      | phon: +31 - 20 - 598 - 7759 |
| Vrije Universiteit Amsterdam (VU) | fax : +31 - 20 - 598 - 7653 |
| Dept. of Computer Science         | mail: merzky at cs.vu.nl       |
| De Boelelaan 1083a                | www:  http://www.merzky.net |
| 1081 HV Amsterdam, Netherlands    |                             |
+-----------------------------------------------------------------+




