[saga-rg] Comments on the job API (revision 1.4)

Mon Mar 6 15:17:44 CST 2006

Quoting [Graeme Pound] (Mar 03 2006):
> 
> Hi,
> 
> I have some comments on the job section of the Strawman API (revision 1.4).

Great :-)  Let's see...

> Thanks
>   Graeme
> 
> 
> 
>  -3.37 I do not understand the purpose of job_service.get_self() which 
> returns a Job object.

The use case is the following (pseudo code-ish):

int main ()
{
  saga::job_service js ("xyz");
  saga::job         me = js.get_self ();

  saga::job_status  s  = me.get_status (); // should be running ;-)

  // start to do some work

  // after a while, do something to MY instance
  me.migrate ("some new big resource");

  return (0);
}

Basically, the job returned represents the application
calling get_self, and allows that application to perform
actions on itself (like suspending itself via the resource
manager).

I should make that clearer in the spec, I guess... ;-)

>  -3.38 I like the removal of the JobInfo and JobExitStatus objects from 
> the API, and adding this information as attributes to Job. This 
> streamlines the API, the concept of read only attributes also makes 
> sense [to me] within the context of the Job object.

great :-)

>  -3.39 I like the simplification of the job_state enumeration. NB This 
> is called 'state' in the SIDL and should be corrected to 'job_state'.

Thanks, fixed.

>  -3.41 The separation of the JobService.submitJob() method into two 
> methods JobService.createJob() [create the job object] and Job.run() 
> [start the job] has not been seamless and there are several 
> conceptual and practical problems. This needs to be fine-tuned further. 
> On many [all?] resource managers there is no separation between 
> submitting a job to the resource manager and manually starting the job, 

Interesting point.  How does BES envision the use of its New
state?  I guess the job is then not bound to a resource
manager in that state...

> this raises the following problem:
>         - Job objects are identified via the job_id which is described 
> as the  "job identifier as returned by the resource manager". 
> Unfortunately since this information will only be available upon 
> submission of the job (via Job.run()) this breaks the methods 
> JobService.list() and JobService.getJob(). It is now impossible to 
> manage an index of Job objects within JobService based upon the job ID.

Right, that's impossible.  I think a job-id should only be
assigned after the job has been run().  Also, the job should
not be listed before then.

But see below.
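
To make the "no job-id and no listing before run()" idea a bit
more concrete, here is a minimal sketch.  None of this is the
SAGA API: the mock namespace, has_job_id, register_job and the
"job-N" id format are all made up for illustration, and the
counter merely stands in for a resource manager handing out
ids.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical mock types (NOT the SAGA API).  They only show that a
// job has no id, and is not listed, before run() was called.
namespace mock {

class job_service;

class job {
public:
  job (job_service & js, std::string exe)
    : js_ (js), exe_ (std::move (exe)) {}

  bool                has_job_id () const { return !id_.empty (); }
  const std::string & get_job_id () const { return id_; }

  void run ();  // defined below, needs the complete job_service

private:
  job_service & js_;
  std::string   exe_;
  std::string   id_;   // stays empty until run()
};

class job_service {
public:
  // pure factory: no contact with the (imaginary) resource manager
  job create_job (const std::string & exe) { return job (*this, exe); }

  // only jobs which have been run() show up here
  std::vector<std::string> list () const { return ids_; }

private:
  friend class job;

  // stands in for the resource manager assigning an id on submission
  std::string register_job ()
  {
    std::string id = "job-" + std::to_string (++counter_);
    ids_.push_back (id);
    return id;
  }

  std::vector<std::string> ids_;
  int                      counter_ = 0;
};

inline void job::run ()
{
  // assign the id exactly once, at submission time
  if (!has_job_id ()) id_ = js_.register_job ();
}

} // namespace mock
```

With that, a freshly created job has no id and list() stays
empty; both change only once run() has been called.  Whether
the real job_service should keep such an index at all is
exactly the open question.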

>         - What is the conceptual relationship between the JobService 
> and the resource manager? At present this is a little confused in a 
> couple of ways.
>             #1 Should there be a one to one relationship between 
> JobService instances and resource managers; i.e. should the resource 
> manager endpoint be specified in the JobService constructor (or 
> otherwise as an argument to JobService.createJob())?

You pointed out the missing resource manager specification
before - and we concluded that it should be specified in the
create-job method.  That is what we have right now.  As
such, the relation of job_service to RM would be 1:n (or
even n:m I guess).  

However, I wonder how list_jobs is supposed to work: should
the job_service query all _available_ RMs?  Or should list
also allow/require an RM to be specified? *scratch*

The job should be associated with an RM only if it is
running, I guess (as you state above, the job-id and job
listing are useless otherwise).

As I stated earlier, I do not have much experience with RMs.
However, pondering your comments, the following seems
possible as well:

  1)
    // we need a job description
    saga::job_description jdes; // fill it...

    // keep job and job_service independent
    saga::job             j  (jdes);
    saga::job_service     js ("RM");

    // associate job and js
    js.submit_job (j); 

  2)
    // we need a job description
    saga::job_description jdes; // fill it...

    // keep job and job_service independent
    saga::job             j  (jdes);
    saga::job_service     js;

    // associate job and js, and RM
    js.submit_job (j, "RM"); 

Both versions would imply that a job_id is only available
after submit.  The js.run_job method would be unaffected
(there seems to be no issue with that).
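
For what it's worth, the two variants can coexist as overloads.
The following is only a sketch under my own assumptions, not a
proposal for the spec text: the sketch namespace, the id() and
rm() accessors, the "RM:N" id format and the exceptions thrown
are all invented here.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical types (NOT the SAGA API) showing both submit_job variants.
namespace sketch {

struct job_description { std::string executable; };

class job {
public:
  explicit job (job_description jd) : jd_ (std::move (jd)) {}

  const std::string & id () const { return id_; }  // "" before submit
  const std::string & rm () const { return rm_; }  // "" before submit

private:
  friend class job_service;
  job_description jd_;
  std::string     id_;
  std::string     rm_;
};

class job_service {
public:
  job_service () = default;                                        // variant 2
  explicit job_service (std::string rm) : rm_ (std::move (rm)) {}  // variant 1

  // variant 1: use the RM given to the constructor
  void submit_job (job & j) { submit_job (j, rm_); }

  // variant 2: RM named per submission
  void submit_job (job & j, const std::string & rm)
  {
    if (rm.empty ())    throw std::invalid_argument ("no resource manager");
    if (!j.id_.empty ()) throw std::logic_error ("job already submitted");

    j.rm_ = rm;                                      // bind job to the RM
    j.id_ = rm + ":" + std::to_string (++counter_);  // id exists only now
  }

private:
  std::string rm_;
  int         counter_ = 0;
};

} // namespace sketch
```

In both cases the job object exists independently of any
job_service, and gets its id and its RM binding only in
submit_job.  Calling the one-argument form on a service
constructed without an RM fails in this sketch, which is one
possible answer to the 1:1 versus 1:n question.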

>             #2 The term JobService implies a close relationship to the 
> resource manager. Previously JobService.submitJob() corresponded to 
> communication with the resource manager. Now JobService.createJob() 
> corresponds to the creation of instances of the Job class; the JobService 
> is acting as a factory. It may be beneficial to rename JobService to 
> JobFactory to clarify the relationship.
> 
>  -3.42 Should Job objects created by runJob() be added to the index 
> managed by the JobService? 

Yes.

> Or in other terms, should JobService.runJob() 
> be a Java 'static' method?

Uhm, is that related?

Cheers, Andre.

-- 
"So much time, so little to do..."  -- Garfield