[ogsa-wg] Paper proposing "evolutionary vertical design efforts"

Christopher Smith csmith at platform.com
Tue Mar 21 18:16:14 CST 2006


Hi Karl,

Fair enough ... it's not exactly what you meant, but your points explaining
how you can implement an idempotent protocol are also relevant to my
assertion that this is not a high priority within our customer base, since
they can check for at-most-once submission using the mechanisms that you
described.

Our customers who want it can do it.

If a standardized submission protocol also has these characteristics
(doesn't BES have submit with hold?), then clients can get at-most-once
submission if they want. Yes ... as Ian pointed out, this puts more burden
on the client.

-- Chris
 


On 21/3/06 16:01, "Karl Czajkowski" <karlcz at univa.com> wrote:

> On Mar 21, Christopher Smith modulated:
>> No Ian ... I'm not saying that the ability to tell whether your job has
>> been submitted is not important. What I am saying is that for systems
>> like LSF, implementing this in the submission protocol is not necessary
>> as there are other ways of figuring this out (such as Karl outlined in
>> another email). Thus, having this protocol implemented is not important
>> for our customers, who might rather see other features added to the
>> product.
>> 
>> -- Chris
>> 
> 
> Chris, that is not an accurate characterization of what I wrote.
> 
> I said that it is easy to implement an idempotent message protocol in
> front of two different LSF local submission mechanisms (hold+release
> and job name annotations), and therefore was implying that it is not a
> significant implementation burden to support idempotence in a standard
> protocol!
> 
> I suspect that most schedulers have some "client provided job name"
> option that can be used in a general adapter solution:
> 
>    1. standard protocol client chooses unique idempotence ID
> 
>    2. standard protocol client sends message, possibly more than once
> 
>    3. standard protocol service receives message, possibly more than once
> 
>       a. implement a persistent atomic <client-ID, engine-ID, job-ID> map
> 
>       b. # log client-ID for protocol idempotence
>          if <client-ID, *, *> is not in map
>          then 
>              engine-ID := new unique ID()
>              enter <client-ID, engine-ID, nil> into map
>          else
>              find engine-ID for client-ID in map
>          endif
> 
>       c. # log local ID and job ID for local idempotence
>          if <client-ID, engine-ID, nil> is in map
>          then
>              if job system has a job annotated with engine-ID
>              then
>                 job-ID := job system ID for job annotated with engine-ID
>              else
>                 job-ID := submit(job annotated with engine-ID)
>              endif
>              enter <client-ID, engine-ID, job-ID> into map
>          else
>              find job-ID for engine-ID in map
>          endif
> 
>       this process is crash-recoverable to provide at-most-once semantics
>       for local job submission, with as much reliability as there is in
>       the persistent log mechanism.  it also requires that jobs "linger"
>       in the local scheduler (or accessible log files) long enough for
>       recovery to take place.
> 
> This works in practice if the service engine can formulate unique IDs
> that are unlikely to have a collision with any other client-specified
> name annotations for jobs.  I use a separate client-ID and engine-ID
> to clarify that this solution does not depend on the client following
> a unique job naming convention specific to the local scheduler.  Of
> course, the client must follow the standard protocol's conventions for
> unique message naming.
> 
> 
> karl
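
Steps 3a-3c in the quoted message can be sketched roughly as below. This is
only an illustration under stated assumptions, not anyone's actual adapter:
FakeScheduler is a hypothetical stand-in for a local job system that supports
client-provided job name annotations, IdempotentSubmitter and its submit()
method are invented names, and an SQLite table plays the role of the
persistent atomic <client-ID, engine-ID, job-ID> map.

```python
import sqlite3
import uuid


class FakeScheduler:
    """Hypothetical stand-in for a local job system with name annotations."""

    def __init__(self):
        self._jobs = {}  # job_id -> annotation
        self._next_id = 1

    def find_by_annotation(self, annotation):
        # Return the job ID carrying this annotation, or None if absent.
        for job_id, name in self._jobs.items():
            if name == annotation:
                return job_id
        return None

    def submit(self, annotation):
        job_id = self._next_id
        self._next_id += 1
        self._jobs[job_id] = annotation
        return job_id


class IdempotentSubmitter:
    """Sketch of the service side (steps 3a-3c); all names are assumptions."""

    def __init__(self, scheduler, db_path=":memory:"):
        self.scheduler = scheduler
        # Step a: the persistent map. Pass a file path for real persistence.
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS submissions ("
            "client_id TEXT PRIMARY KEY, "
            "engine_id TEXT NOT NULL, "
            "job_id INTEGER)"
        )
        self.db.commit()

    def submit(self, client_id):
        # Step b: log the client ID once; a repeated message is a no-op here.
        with self.db:
            self.db.execute(
                "INSERT OR IGNORE INTO submissions "
                "(client_id, engine_id, job_id) VALUES (?, ?, NULL)",
                (client_id, "engine-" + uuid.uuid4().hex),
            )
        engine_id, job_id = self.db.execute(
            "SELECT engine_id, job_id FROM submissions WHERE client_id = ?",
            (client_id,),
        ).fetchone()

        # Step c: if no job ID is logged yet, look for a job already
        # annotated with engine_id (covers a crash between submit and log),
        # and only submit when none exists.
        if job_id is None:
            job_id = self.scheduler.find_by_annotation(engine_id)
            if job_id is None:
                job_id = self.scheduler.submit(engine_id)
            with self.db:
                self.db.execute(
                    "UPDATE submissions SET job_id = ? WHERE client_id = ?",
                    (job_id, client_id),
                )
        return job_id
```

Calling submit() twice with the same client ID returns the same job ID and
leaves one job in the scheduler. As the quoted message notes, recovery in
step c only works while the job is still visible to the local scheduler.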




