[saga-rg] tasks and jobs...

Tue Feb 7 11:25:06 CST 2006

Hi Graeme, 

thanks for the pointer!  

I still have some questions if you don't mind...

What I have seen at the URL you sent is how to program
threads in Java.  What I am unsure about is what a
Java-SAGA programmer would actually _do_.

For example, if he wants to do an async seek on a file, what
would he do?

Let's try:

  - Create a new class which is 'Runnable', and inherits
    'saga::file'
  - implement 'run' to perform a seek

Can't be, because run is void (or can you change that?)

try again:

  - Create a new class which is 'Runnable', and inherits
    'saga::file'
  - have seek do nothing but store its args
  - have run perform the seek async

Hmm, again not nice, as your object needs to be very
stateful: e.g. store args for many reads, which are later
run

try again:

  - saga::file is already 'Runnable'
  - no, wait, it still has only one 'run' method, so I can't
    do an async seek and then an async read...

You see, I am really on the wrong track - sorry, this
probably looks very foolish to you...

I would very much appreciate a code example.  That usually
helps me most :-)

I will google a bit; I am sure there are, for example,
async file classes around.
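
For reference, one shape such an example could take is
sketched below.  It is only an illustration, not part of any
SAGA Java binding: it assumes the Java 5.0
java.util.concurrent utilities, and the AsyncFile class with
its seek/read/close methods is a hypothetical wrapper around
java.io.RandomAccessFile.  Each call hands a Callable to a
single-threaded executor and returns a Future, so the value
that a void run() cannot deliver comes back through
Future.get(), and a seek followed by a read stays in order
without the object having to store arguments for later.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper for illustration; not a SAGA class.
class AsyncFile {
    private final RandomAccessFile file;
    // A single worker thread runs queued operations in submission order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    AsyncFile(File path) throws IOException {
        file = new RandomAccessFile(path, "r");
    }

    // Returns at once; the Future later yields the new file position.
    Future<Long> seek(final long offset) {
        return worker.submit(new Callable<Long>() {
            public Long call() throws IOException {
                file.seek(offset);
                return file.getFilePointer();
            }
        });
    }

    // Returns at once; the Future later yields the number of bytes read.
    Future<Integer> read(final byte[] buffer) {
        return worker.submit(new Callable<Integer>() {
            public Integer call() throws IOException {
                return file.read(buffer);
            }
        });
    }

    // Waits up to 10 seconds for queued operations, then closes the file.
    void close() throws IOException, InterruptedException {
        worker.shutdown();
        worker.awaitTermination(10, TimeUnit.SECONDS);
        file.close();
    }
}

class AsyncSeekDemo {
    public static void main(String[] args) throws Exception {
        // Create a small scratch file to seek and read in.
        File tmp = File.createTempFile("saga-demo", ".txt");
        tmp.deleteOnExit();
        FileOutputStream out = new FileOutputStream(tmp);
        out.write("0123456789".getBytes("US-ASCII"));
        out.close();

        AsyncFile f = new AsyncFile(tmp);
        byte[] buf = new byte[3];
        Future<Long> pos = f.seek(4);      // queued, returns immediately
        Future<Integer> n = f.read(buf);   // runs after the seek
        System.out.println("pos=" + pos.get() + " read=" + n.get()
                + " data=" + new String(buf, "US-ASCII"));
        f.close();
    }
}
```

Waiting on many such operations at once could probably be
built on ExecutorService.invokeAll(), which takes a
collection of Callables and returns their Futures once all
have completed; whether that is close enough to a
saga::task_container is left open here.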

Thanks, Andre.

Quoting [G.E.POUND at soton.ac.uk] (Feb 07 2006):
> Date: Tue,  7 Feb 2006 14:33:34 +0000
> From: G.E.POUND at soton.ac.uk
> To: Andre Merzky <andre at merzky.net>
> Cc: G.E.POUND at soton.ac.uk, Thilo Kielmann <kielmann at cs.vu.nl>,
> 	saga-rg at ggf.org
> Subject: Re: [saga-rg] tasks and jobs...
> 
> Andre,
> 
> 
> The benefits of the application of the TaskContainer semantics to job
> submission are compelling. Whilst the separation of JobService.submitJob()
> into JobService.create() and Job.run() methods adds a little complexity,
> the possibility of optimised bulk operations may justify this.
> 
> The role of the JobService changes a little: it may be used to create Job
> objects without submitting them to the resource manager. Will
> JobService.runJob() invoke the Job object that has been submitted to the
> resource manager?
> 
> ---
> 
> I am yet to be convinced that the 'Task' and 'JobManagement' namespaces
> should be linked. Whilst the semantics of these namespaces are similar they
> are designed for different purposes; a simple model of asynchronous method
> calls in the client, and the submission of batch jobs to a remote resource.
> 
> The advantage of keeping these two namespaces separate is to avoid an
> unnecessary dependency between these two different areas of the API. For
> example; if in the future additional methods are required to support
> asynchronous method calls these would be reflected in the JobManagement
> package. [Or if the Task namespace were altered to support language
> specific features, see below]
> 
> The advantage of linking the namespaces would be to allow all classes
> implementing TaskContainer to handle Job objects.
> 
> ---
> 
> In Java the basis for multithreading support is sub-classing the
> java.lang.Thread class (or implementing the java.lang.Runnable interface).
> In practice SAGA implementations could return inner classes that may be
> called asynchronously (by either of these approaches) for each API method.
> This is essentially the whole story (prior to Java 5.0), and the whole
> approach could be encapsulated within the SAGA.Task model.
> 
> The problem with encapsulation within the SAGA.Task model is that the
> fine-grain control over the threads (when sub-classing java.lang.Thread) is
> lost. Furthermore when using Java 5.0 it would not be possible to leverage
> the high-level concurrency utilities now available
> (java.util.concurrent.*). For example the task scheduling framework would
> be appropriate to control the execution of the threads.
> 
> For more details see:
> http://java.sun.com/docs/books/tutorial/essential/threads/index.html
> 
> In C# the model is nicer than in Java. Methods may be invoked asynchronously
> by creating a 'delegate' for that method using System.Threading.ThreadStart.
> Evaluation of the delegates in different threads can be managed at a
> high-level via the thread pool.
> 
> When working with these languages there will be some functionality that
> it will be important not to preclude. In Java <5.0 there may be little
> above and beyond that which is available in SAGA.Task of value to most
> users, and encapsulation may be the correct approach. However the
> high-level support for concurrency available in Java 5.0 and C# is
> certainly important. I am unsure whether access to this functionality is
> best achieved by bypassing or altering the SAGA.Task namespace in the
> language bindings.
> 
> 
> Graeme
> 
> 
> 
> 
> 
> Quoting Andre Merzky <andre at merzky.net>:
> 
> > Hi Graeme,
> >
> > Quoting [G.E.POUND at soton.ac.uk] (Feb 06 2006):
> > >
> > > Andre,
> > >
> > > I have a couple of reservations about this action that you may be able
> > > to answer.
> > >
> > > I had been hoping to avoid implementing the 'Task' namespace in the
> > > Java bindings and encourage developers to use the language's support
> > > for threading to allow asynchronous method calls in the client code.
> >
> > I understood that much from your last comments, but my
> > limited knowledge does not allow me to give a qualified
> > answer I'm afraid.  Hmm, maybe we can work this out together
> > :-)
> >
> > Could you post some code examples to demonstrate how an
> > asynchronous seek (for example) would be coded in a Java
> > application?
> >
> > The C++ code would be:
> >
> >   saga::file f (url);
> >   saga::task t = f.seek <saga::task> (off, whence, &pos);
> >
> >   t.run  ();
> >   t.wait ();
> >
> >
> > Next question is, how would a java application manage many
> > tasks - i.e. is there something similar to a
> > saga::task_container ?
> >
> >   saga::task_container tc;
> >
> >   tc.add (task_1);
> >   tc.add (task_2);
> >   tc.add (task_3);
> >
> >   tc.run  ();
> >   tc.wait ();
> >
> >
> > > I am therefore concerned about creating a dependency between the
> > > 'JobManagement' namespace and the 'Task' namespace.
> >
> > I think, for the java bindings that would mean that job
> > implements the task interface.  Apart from the run you
> > mention below, the semantic of the task interface is
> > actually included in job already, more or less, only the
> > methods and states are differently named  (that is one
> > motivation for the proposal really).
> >
> >
> > > The submission of remote jobs is naturally asynchronous, and there are
> > > natural semantic parallels to the asynchronous model described by the
> > > 'Task' namespace. However from my reading of the API I understood these
> > > two models to be independent in purpose; creating a dependency could
> > > hinder the natural description (and development) of these two areas of
> > > the API.
> >
> > The two models are not really different in purpose.  Would you
> > see an advantage of having them truly separate?
> >
> > >
> > > The idea of a job container for the management of a large number of
> > > remote jobs is useful. However the TaskContainer does not appear to be
> > > wholly compatible; the run() method to start the asynchronous operations
> > > is unnecessary for jobs that have been submitted to a remote resource.
> >
> > You are right.  Well, that might be somewhat subtle, but in
> > the example code below, the submit_job() call is accompanied
> > by a new create_job().  That creates a job which needs to be
> > run(), which would make it compatible with the task model.
> >
> >   saga::job j1 = job_server.create_job (jd);
> >   // job state is 'pending'
> >
> >   j1.run  ();
> >   // job state is 'running' or so - 'not pending'
> >
> >   j1.wait ();
> >   // job state is Done or Failed
> >
> >
> >
> >   saga::job j2 = job_server.submit_job (jd);
> >   // job state is 'running' or so - 'not pending'
> >
> >   j2.wait ();
> >   // job state is Done or Failed
> >
> >
> > That is very similar to the semantics we have for tasks...
> >
> >
> > One additional reason for this proposal is that we want to
> > approach the bulk operations soon.  Consider a parameter
> > sweep, where 100,000 jobs are to be run.
> >
> >  for ( i = 0; i < 100000; i++ )
> >  {
> >    jobs[i] = job_server.submit (jd[i]); // SUBMIT
> >  }
> >
> >  for ( i = 0; i < 100000; i++ )
> >  {
> >    jobs[i].wait ();
> >  }
> >
> > As each submission is very likely to involve at least
> > one remote operation (they are independent), that will take
> > a lot of time.
> >
> > Compare that to:
> >
> >  task_container tc;
> >
> >  for ( i = 0; i < 100000; i++ )
> >  {
> >    tc.add (job_server.create (jd[i])); // CREATE
> >  }
> >
> >  tc.run  ();
> >  tc.wait ();
> >
> > It is rather straightforward to optimize the task container
> > for bulk job submission, or bulk operations in general (we
> > hope).
> >
> >
> > What I am not sure about is what that would look like in
> > native Java.  Are there similar mechanisms?
> >
> >
> > Looking forward to your comments,
> >
> >   Andre.
> >
> >
> > > Graeme
> > >
> > >
> > > Quoting Andre Merzky <andre at merzky.net>:
> > >
> > > > Hi,
> > > >
> > > > I just had a discussion with Thilo about the topic, as he
> > > > and I obviously talked somewhat orthogonally to each other...
> > > >
> > > > Well, now we have the same opinion, kind of, and I have
> > > > barely any bruises...  Anyway, I want to summarize our point
> > > > here, as I probably was not really clear in my initial post.
> > > >
> > > > Sorry if re-iteration of the topic bores you...
> > > >
> > > > So, we have tasks, which represent async operations, with a
> > > > couple of states attached, and the ability to call run(),
> > > > wait() and cancel() on these.  And we can collect them in
> > > > containers, and wait() on many of these tasks conveniently.
> > > >
> > > > And then we have jobs, which represent remote executables,
> > > > with a couple of states attached, and the ability to call
> > > > run (== create them), wait() and cancel().  And some more
> > > > methods.  And we can't collect them in containers right now,
> > > > but would like to.
> > > >
> > > > You see the similarities, right?  It's even more obvious in
> > > > code:
> > > >
> > > >   Tasks:
> > > >   --------------------------------------------
> > > >     task_container tc;
> > > >     task t = file.copy <saga::task> (...);
> > > >          t.run  (   );
> > > >          t.wait (1.0);
> > > >
> > > >         tc.add  (t);
> > > >         tc.wait ( );
> > > >   --------------------------------------------
> > > >
> > > >   Jobs:
> > > >   --------------------------------------------
> > > >     job_container jc;
> > > >     job j = job_server.submit (job_descr);
> > > >         j.wait (1.0);
> > > >
> > > >        jc.add  (j);
> > > >        jc.wait ( );
> > > >   --------------------------------------------
> > > >
> > > >   slightly changed:
> > > >   --------------------------------------------
> > > >     job_container jc;
> > > >   ! job j = job_server.create (job_descr);
> > > >   +     j.run  (   );
> > > >         j.wait (1.0);
> > > >
> > > >        jc.add  (j);
> > > >        jc.wait ( );
> > > >   --------------------------------------------
> > > >
> > > > The similarities are obvious I think.  Now, if job would
> > > > IMPLEMENT the task interface (or inherit from task), we
> > > > would unify both classes, and hence:
> > > >
> > > >   - simplify jobs (leave only those methods which are
> > > >     specific to jobs, like migrate, signal, ...)
> > > >
> > > >   - allow putting jobs into task containers, efficiently
> > > >     handling large numbers of jobs and other tasks
> > > >
> > > >   - have the API and used paradigms more uniform.
> > > >
> > > > Also, if later tasks get suspendable, as Gregor rightly
> > > > suggested, we can move more methods to tasks, w/o breaking
> > > > the paradigms.
> > > >
> > > > In terms of state, the following mappings would be appropriate:
> > > >
> > > >   job::Pending -> task::Pending
> > > >   job::Done    -> task::Done
> > > >   job::Failed  -> task::Failed
> > > >   job::???     -> task::Cancelled
> > > >
> > > >   job::Queued,Running,Pre/Poststaging,... -> task::Running
> > > >
> > > > So, no adjustments to the state models are needed AFAICS,
> > > > apart from Cancelled
> > > > (Does it make sense   on jobs?
> > > >  Should job::stop     be job::cancel?
> > > >  Should tasks::cancel be task::job?)
> > > >
> > > > Hope that clarifies things.  I think Gregor was on target
> > > > with his remarks, and Hartmut signalled consent as well.
> > > > And I think I convinced Thilo (Andre rubs his bruises).
> > > >
> > > > Unless there is any opposition, I'll go ahead and document
> > > > that in the strawman then, ok?
> > > >
> > > > Cheers, Andre.
> > > >
> > > >
> > > > Quoting [Thilo Kielmann] (Feb 04 2006):
> > > > > Date: Sat, 4 Feb 2006 22:29:41 +0100
> > > > > From: Thilo Kielmann <kielmann at cs.vu.nl>
> > > > > To: saga-rg at ggf.org
> > > > > Subject: Re: [saga-rg] tasks and jobs...
> > > > >
> > > > > > Wouldn't it be useful to have jobs implementing the task
> > > > > > interface?
> > > > >
> > > > > Certainly, no.
> > > > > Jobs and Tasks are two different things, and they are this on
> > > > > purpose.
> > > > >
> > > > > However, Tasks always have been the mechanism for asynchronous
> > > > > operation, which is kind-of obsoleted by having asynchronous ops
> > > > > directly.
> > > > >
> > > > > If you want to work on the "S" of SAGA: why not unify both Tasks
> > > > > and Jobs into a better "Job" notion, and do local asynchronous
> > > > > operations via async, local calls?
> > > > >
> > > > > Thilo
> > > > --
> > > > "So much time, so little to do..."  -- Garfield

-- 
"So much time, so little to do..."  -- Garfield