[saga-rg] SAGA thread model

Mon Jul 17 11:10:57 CDT 2006

Hi All,

I was just following VU's discussion and would like to comment on it.

As I understand it, tasks are by definition independent of
each other, because they are asynchronous operations, right?

With this in mind, SAGA users should be responsible for managing
possible race conditions themselves. They are the only ones
who are aware of the exact nature of the calls they put into
tasks. That is why it is easier for them to avoid race
conditions in their own code than for the SAGA spec to avoid
race conditions in every case without knowing the exact
semantics.
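
To illustrate the point, here is a minimal sketch in plain Python
(standard-library threads, not the SAGA API). The names SharedCounter
and run_tasks are made up for this example. It assumes several
asynchronous tasks update one local object, and shows the application
itself guarding the update, since only the application knows that
the update must be atomic:

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedCounter:
    """Hypothetical application object shared by several async tasks."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def add(self, n):
        # The read-modify-write below must not interleave with other tasks.
        # The application knows this, so the application takes the lock.
        with self._lock:
            self.value += n


def run_tasks(counter, n_tasks=8, increments=1000):
    """Start n_tasks asynchronous tasks that all update the same counter."""

    def task():
        for _ in range(increments):
            counter.add(1)

    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        futures = [pool.submit(task) for _ in range(n_tasks)]
        for f in futures:
            f.result()  # wait for completion and re-raise any task error
    return counter.value


if __name__ == "__main__":
    print(run_tasks(SharedCounter()))
```

The task layer here (the thread pool) knows nothing about the
counter; the lock lives entirely in user code.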

Copying around objects which are used by tasks seems to be
problematic, as the different copies need to be synchronised
afterwards (as Andre pointed out).

Hence, I would propose to leave the problem of race conditions
to the end user, because there is probably no general solution
which would not contradict certain use cases.

regards,
Stephan

On Mon, 17 Jul 2006, Andre Merzky wrote:

> Quoting [Thilo Kielmann] (Jul 17 2006):
> >
> > Giving it another thought, I think it isn't as bad as my last mail assumed
> > it to be.
> >
> > The difference is that SAGA tasks aren't threads. They are kind-of single
> > operations (e.g., a single file.read, but no sequence of multiple of such
> > operations). It is just that the obvious implementation in Java would be
> > using threads for everything asynchronous...
>
> Right, that is what SAGA tasks are: they represent an async
> operation.
>
>
> > Still, if state sharing between multiple tasks (or between a task and
> > the main thread) is desired, we need to start out by defining data consistency
> > of all local and remote objects...
>
> We already had a discussion about this, which was triggered
> by similar comments from Felix.  The result of the
> discussion was (cited from the spec intro):
>
>   \subsubsection{Consistency Model}
>
>    We had a lengthy discussion about consistency models, with the
>    agreement that the consistency model is to be defined and
>    documented by the implementation.  The API spec itself does not
>    assume any specific consistency model, as we feel that (a) POSIX
>    consistency is not achievable within reasonable effort/performance,
>    (b) if the user assumes the worst (no consistency), he will still
>    be able to make good use of the API, and (c) reality will be
>    somewhere in the middle.
>
> After discussing further with some OGSA folx at last GGF, I
> added:
>
>    Implementors SHOULD, however, strive to implement ``At Most Once''
>    consistency, as that seems (a) to be generally supported by most
>    Grid middleware, (b) implementable in distributed systems with
>    reasonable effort, and (c) useful and intuitively expected by most
>    end users.
>
> There has been some recent discussion on the BES and OGSA
> lists about At-Least-Once and At-Most-Once, but I am pretty
> positive that our use cases benefit from At-Most-Once most.
>
> Is that what you are looking for?
>
> It is not really saying anything about the shared state of
> objects, and the life time consequences for these objects
> (that is what this thread originally tried to discuss).
>
> For that, I tried to clean up the intro once more, see the
> CVS version of
>
>   \subsubsection{Life Time Management}
>
> in the light of what we discussed about consistency and
> tasks, does that section make sense to you?
>
> Cheers, Andre.
>
>
> > Thilo
> >
> > On Mon, Jul 17, 2006 at 04:16:26AM +0200, Thilo Kielmann wrote:
> > > Date: Mon, 17 Jul 2006 04:16:26 +0200
> > > From: Thilo Kielmann <kielmann at cs.vu.nl>
> > > To: saga-rg at ggf.org
> > > Subject: [saga-rg] SAGA thread model
> > >
> > > I am sorry to say, but SAGA's task model seems to me severely flawed.
> > > This is for two reasons:
> > >
> > > 1. the "main" thread (executing sync operations)
> > >    needs to be considered as yet another task
> > > 2. there must be a concise definition of the shared state. current solutions
> > >    are ad-hoc and mostly undefined.
> > >    shared state is:
> > >      - local objects, shared between multiple tasks of the same process
> > >        here: definition of synchronization between tasks
> > >      - remote objects, in the service(s)
> > >        here: definition of legal execution orders
> > >
> > > so far, I can see only a few incidental definitions, but they are far from
> > > being concise.
> > >
> > > "tasks in a bulk operation have to be independent"
> > > "a task cancel is doing 'best effort' but can not guarantee cancelation"
> > >
> > > The latter, BTW, is a special case, because this is about connection
> > > termination for which you can formally prove that there is no protocol that
> > > can guarantee this AND notify both parties of successful termination.
> > >
> > > To be constructive:
> > > what the task model must do first thing is
> > > - define tasks
> > > - define which data is shared between tasks and which concurrency control
> > >   happens on this shared data
> > >
> > > That is the only way to define clearly what tasks will do in the event of
> > > sharing, really.
> > >
> > > You may want to look at:
> > >
> > > http://www.amazon.com/gp/product/0201695812/qid=1153102128/sr=2-3/ref=pd_bbs_b_2_3/002-2045221-3597631?s=books&v=glance&n=283155
> > >
> > > This is:
> > > Doug Lea, "Concurrent Programming in Java: Design Principles and Patterns"
> > >
> > > This book uses 280 pages on objects, shared state and concurrency control
> > > before using 95 pages for the thread operations...
> > >
> > >
> > > --
> > > Thilo Kielmann                                 http://www.cs.vu.nl/~kielmann/
> --
> "So much time, so little to do..."  -- Garfield
>
>