[ogsa-wg] effective use of resource lifetimes in grid infrastructure

Ian Foster foster at mcs.anl.gov
Mon Dec 6 15:06:25 CST 2004


Steve:

You can indeed put the ManagedJob WS-Resource on a different host.

You might find this URL relevant: 
http://www-unix.globus.org/toolkit/docs/development/3.9.3/execution/wsgram/WS_GRAM_Approach.html

Regards -- Ian.

At 01:12 PM 12/6/2004 +0000, Steve Loughran wrote:
>Ian Foster wrote:
>>   Steve:
>>A variety of semantics and connections are possible between a 
>>"WS-Resource" and an "entity that the WS-Resource represents", including 
>>both your (a) and (b) below. I don't believe that the implied resource 
>>pattern implies that one particular approach be adopted.
>>The following are some rough notes on how we have chosen to handle things 
>>in the GT4 GRAM service. This may perhaps be relevant to your problem.
>>The approach that we take in GT4 GRAM is as follows:
>>1) A GRAM ManagedJobFactory defines a "create job" operation that:
>>a) creates a job, and also
>>b) creates a ManagedJob WS-Resource, which represents the resource 
>>manager's view of the job.
>>2) The ManagedJob WS-Resource and the job are then linked as follows:
>>a) Destroying the ManagedJob WS-Resource kills the job
>>b) State changes in the job are reflected in the ManagedJob WS-Resource
>>c) Termination of the job also destroys the ManagedJob WS-Resource, but 
>>not immediately: we find that you typically want to leave the ManagedJob 
>>state around for "a while" after the job terminates, so that clients can 
>>figure out what happened to the job after the fact
>>Regards -- Ian.
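
The linkage described in points 1) and 2) above can be sketched roughly as
follows. This is only an illustrative model, not GT4 GRAM code: the class
names Job and ManagedJob, the simplified state names, the linger_seconds
parameter and the sweep() method are all assumptions made for the example.
Destruction after termination is modelled as a scheduled termination time
that a periodic sweep enforces.

```python
import enum
import time


class JobState(enum.Enum):
    # Hypothetical, simplified states; not the actual GRAM state set.
    PENDING = "Pending"
    ACTIVE = "Active"
    DONE = "Done"
    KILLED = "Killed"


TERMINAL = {JobState.DONE, JobState.KILLED}


class Job:
    """Hypothetical stand-in for a job owned by a local resource manager."""

    def __init__(self, description):
        self.description = description
        self.state = JobState.PENDING
        self.killed = False
        self._listeners = []

    def on_state_change(self, callback):
        self._listeners.append(callback)

    def set_state(self, state):
        self.state = state
        for callback in self._listeners:
            callback(state)

    def kill(self):
        self.killed = True
        self.set_state(JobState.KILLED)


class ManagedJob:
    """Hypothetical model of the resource manager's view of one job."""

    def __init__(self, job, linger_seconds, clock):
        self.job = job
        self.state = job.state
        self.linger_seconds = linger_seconds
        self.clock = clock
        self.termination_time = None  # None: no destruction scheduled yet
        self.destroyed = False
        job.on_state_change(self._mirror)

    def _mirror(self, state):
        # 2b) state changes in the job are reflected in the resource
        self.state = state
        if state in TERMINAL and self.termination_time is None:
            # 2c) job termination schedules destruction, but not immediately
            self.termination_time = self.clock() + self.linger_seconds

    def destroy(self):
        # 2a) destroying the resource kills the job if it is still running
        if self.destroyed:
            return
        self.destroyed = True
        if self.job.state not in TERMINAL:
            self.job.kill()

    def sweep(self):
        """Destroy the resource once its scheduled termination time passes."""
        due = (self.termination_time is not None
               and self.clock() >= self.termination_time)
        if due:
            self.destroy()
        return self.destroyed


class ManagedJobFactory:
    """1) "create job" creates both the job and its ManagedJob resource."""

    def __init__(self, linger_seconds=300.0, clock=time.monotonic):
        self.linger_seconds = linger_seconds
        self.clock = clock

    def create_job(self, description):
        job = Job(description)
        return ManagedJob(job, self.linger_seconds, self.clock)
```

The clock is injected so that the "a while" interval can be exercised without
real waiting; a client that polls the resource during that interval still
sees the final job state.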
>
>Ian,
>
>What is your fault tolerance strategy here?
>
>Is every ManagedJob WS-Resource hosted on the same host (and perhaps, same 
>process) as the job itself?
>
>This would mean that there is no way for the ManagedJob EPR to fail 
>without the job itself failing, but would require the entire set of job 
>hosts to be visible for inbound SOAP messages. And prevent you moving a 
>job from one node to another without some difficulty (the classic CORBA 
>object-moved problem, I believe, though HTTP 301/302 redirect responses would work if 
>only SOAP stacks processed them reliably)
>
>I am trying to do a design which would enable (though would not require) 
>only a subset of nodes -call them portal nodes- to be visible to outside 
>callers, with the rest of the nodes only accessible to the portal itself. 
>Once I assume this architecture, modelling the resources gets complex, as 
>EPRs contain routing info that may become invalid if a portal node fails.
>
>-steve
>
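
One way to picture the portal-node concern raised above is the toy model
below. It is a sketch under stated assumptions, not a design proposed in this
thread: the EPR is reduced to an address plus a resource key, and the Cluster
class, its reissue() method and the example.org URLs are hypothetical. It
shows only that an EPR naming a failed portal stops working even though the
job state behind the portals is intact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EPR:
    """Simplified endpoint reference: an address plus a resource key."""
    address: str       # portal URL; the routing info that can go stale
    resource_key: str  # identifies the job resource behind the portals


class Cluster:
    """Hypothetical cluster: jobs live on internal nodes behind portals."""

    def __init__(self, portals):
        self.live_portals = set(portals)
        self.jobs = {}  # resource_key -> job state, held on internal nodes

    def fail_portal(self, address):
        self.live_portals.discard(address)

    def invoke(self, epr):
        # A call can only arrive through a portal that is still up.
        if epr.address not in self.live_portals:
            raise ConnectionError(f"portal {epr.address} is unreachable")
        return self.jobs[epr.resource_key]

    def reissue(self, epr):
        """Return an EPR for the same resource via a surviving portal."""
        if not self.live_portals:
            raise ConnectionError("no portal nodes are reachable")
        return EPR(sorted(self.live_portals)[0], epr.resource_key)
```

In this model the resource key survives a portal failure while the address
does not, which is why the client needs some out-of-band way to obtain a
fresh EPR; how that would be done in practice is left open here.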

_______________________________________________________________
Ian Foster                    www.mcs.anl.gov/~foster
Math & Computer Science Div.  Dept of Computer Science
Argonne National Laboratory   The University of Chicago
Argonne, IL 60439, U.S.A.     Chicago, IL 60637, U.S.A.
Tel: 630 252 4619             Fax: 630 252 1997
         Globus Alliance, www.globus.org