[ogsa-wg] RE: Modeling State: Technical Questions

Tue Apr 5 08:44:18 CDT 2005

Dear Ian,

I don't think that the approach I proposed forces the user to do more
than they would have to do anyway if EPRs were used. It is still the
case that someone has to manage the EPRs to the resources in WSRF. This
is similar to what happens in the real world. The online bookstore will
ask for my credit card number (a URI), or the book store will ask for an
ISBN (another URI) or multiple ISBNs if I want to buy multiple books.
The banking service will ask for my bank account number (another URI
perhaps).

Also, there is no reason why a "kill all my jobs" message couldn't also
be supported. But please note that this message is now addressed to the
service (the container of resources) and not, as in the case of WSRF, to
a specific resource. This is no different from what I am advocating.

Also... to Steve's point about partial failure. If one wishes atomic
transaction semantics, I don't see the difference between the two
approaches...

Atomic

  Msg -> resource 1

  Msg -> resource 2

  Msg -> resource 3

End Atomic

Vs

Msg

  Atomic

    Resource 1

    Resource 2

    Resource 3

  End Atomic

In fact, I would argue that the latter is better because:

1. It uses fewer messages (and, Steve, I am not assuming only HTTP and
the optimisations that may be supported)

2. I can more easily deal with the failures in an application-specific
manner since my atomic TX semantics do not span multiple msgs.
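
To make the second form concrete, here is a minimal Java sketch, assuming
an in-memory service that holds all the resources itself. JobService,
suspendAll and the State enum are hypothetical names used only for
illustration; they are not taken from WSRF or any OGSA specification. One
message names every job, and the service restores a snapshot if any single
job cannot be suspended, so the atomic scope never spans more than one
message.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class JobService {
    enum State { RUNNING, SUSPENDED }

    // Hypothetical in-memory resource table: job id -> lifecycle state.
    private final Map<String, State> jobs = new HashMap<>();

    void register(String id) {
        jobs.put(id, State.RUNNING);
    }

    State stateOf(String id) {
        return jobs.get(id);
    }

    // Handles one message that names several jobs.
    // Either every named job is suspended, or none is.
    boolean suspendAll(List<String> ids) {
        Map<String, State> snapshot = new HashMap<>(jobs);
        for (String id : ids) {
            if (jobs.get(id) != State.RUNNING) {
                // Unknown or already suspended job: undo everything done so far.
                jobs.clear();
                jobs.putAll(snapshot);
                return false;
            }
            jobs.put(id, State.SUSPENDED);
        }
        return true;
    }
}
```

Because the rollback is local to the service, the caller sees a single
success or failure for the whole message and can react to it in whatever
application-specific way it likes.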

(Anyway... who wants to do atomic TXs over the Web anyway? :-)

Regards,

--
Savas Parastatidis
http://savas.parastatidis.name

________________________________

From: Ian Foster [mailto:foster at mcs.anl.gov] 
Sent: Tuesday, April 05, 2005 2:22 PM
To: Steve Loughran; Savas Parastatidis
Cc: Mark McKeown; Karl Czajkowski; Dennis Gannon; Samuel Meder; ogsa-wg;
dave.pearson at oracle.com; gray at microsoft.com; humphrey at cs.virginia.edu;
grimshaw at virginia.edu; aherbert at microsoft.com; gcf at indiana.edu;
mark.linesch at hp.com; Frank Siebenlist; Tony Hey; Dave Berry
Subject: Re: [ogsa-wg] RE: Modeling State: Technical Questions

Steve's note raises a key point for me: do we really want to force the
user (as Savas seems to be advocating) to keep track of jobs running at
a remote site? 

I'd rather send a request "kill all my jobs" or "kill all my jobs that
have run for more than a day" to the factory than carefully keep track
of all jobs that I have active, and how long they have been running, so
that I can send the big document (or stream) discussed below. 

Ian.

At 02:10 PM 4/5/2005 +0100, Steve Loughran wrote:

Savas Parastatidis wrote:

Dear all,
I think something needs to be clarified with regards to handling
multiple jobs with one message. The beauty of document-oriented
interactions is that you can do things like...
<job-details-request>
  <job-id>urn:ogsa:job:guid:bla-bla-bla-001</job-id>
  <job-id>urn:ogsa:job:guid:bla-bla-bla-010</job-id>
  <job-id>urn:ogsa:job:guid:bla-bla-bla-002</job-id>
  <job-id>urn:ogsa:job:guid:bla-bla-bla-029</job-id>
</job-details-request>
Or
<job-suspend-request>
  <job-id>urn:ogsa:job:guid:bla-bla-bla-002</job-id>
  <job-id>urn:ogsa:job:guid:bla-bla-bla-005</job-id>
  <job-id>urn:ogsa:job:guid:bla-bla-bla-008</job-id>
</job-suspend-request>
The schema for the above document can allow anything from 0 to N number
of <job-id> elements.
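
On the receiving side, pulling the identifiers out of such a document is
short work. The following is a minimal sketch using the standard Java DOM
API; the class name JobIdExtractor and its method are hypothetical, and the
element name job-id is simply the one used in the example documents above.
It returns an empty list when the document carries no job-id elements, in
line with the 0 to N cardinality.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class JobIdExtractor {
    // Returns the text of every <job-id> element, in document order.
    static List<String> jobIds(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("job-id");
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                ids.add(nodes.item(i).getTextContent().trim());
            }
            return ids;
        } catch (Exception e) {
            // Malformed XML or parser setup problems end up here.
            throw new IllegalArgumentException("not a readable request document", e);
        }
    }
}
```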

The trouble with any bulk operation is that you have to handle partial
failure. You need either atomic operations (not long-lived transactions
over HTTP, Savas, I wouldn't be that daft), or a way of indicating that
only a bit went wrong.

Hence the 207 Multi-Status response in WebDav, the "something failed,
look in the message". WebDav is still single instance (here a RESTy
URL), but you can set >1 property and so have partial failure.

SOAP just has SOAPFault and extensions; no explicit multiple failure
response. WS-RF-ResourceProperties has a similar problem with
SetResourceProperties, but a different failure model in which any
failure to set can result in a WS-BaseFault, indicating which failed,
but providing no apparent information on which worked.

It seems to me that if you want to bulk stuff, you do need ways of (a)
handling partial failure and (b) declaring what happens on partial
failure. For the curious, WebDav's failure mode on file operations
(MOVE, COPY) is explicitly declared to be that of failed file operations
of Win98 on a FAT32 filesystem  [1,2]
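
One way to picture point (a) is a response that carries an outcome per
item, in the spirit of a multi-status reply. This is only a sketch under
assumptions: BulkSuspend, Outcome and the method names are hypothetical,
the set of known jobs stands in for whatever the service really manages,
and the actual state change is left out.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class BulkSuspend {
    enum Outcome { OK, NO_SUCH_JOB }

    // Returns one outcome per requested id, in request order.
    static Map<String, Outcome> suspend(Set<String> knownJobs, List<String> requested) {
        Map<String, Outcome> results = new LinkedHashMap<>();
        for (String id : requested) {
            // A real service would change the job's state here;
            // this sketch only records what would be reported back.
            results.put(id, knownJobs.contains(id) ? Outcome.OK : Outcome.NO_SUCH_JOB);
        }
        return results;
    }

    // True when some items worked and some did not.
    static boolean partialFailure(Map<String, Outcome> results) {
        return results.containsValue(Outcome.OK)
                && results.containsValue(Outcome.NO_SUCH_JOB);
    }
}
```

With a per-item map the caller learns both which requests failed and which
worked, which is the information a single fault does not obviously provide.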

Alternatively, you don't go for bulk operations, neither on multiple
jobs nor on multiple properties of a job (remember, WS-RF doesn't
declare atomic/transacted property operations, so all you do here is
increase the window of instability, a window that already exists).
Instead you just stream a series of operations over the same HTTP1.1
connection, assuming that everything is accessible at the same far-end
host, and get a series of responses (in request order; HTTP1.1
pipelining requires the server to answer in the order received).

This could be efficient, and you could do better handling of failure.
But you do need a SOAP stack that can keep an HTTP1.1 channel open for
multiple requests. Axis doesn't, even if you get httpclient to do the
HTTP work; I don't know about .NET/WSE. You also need developers to
model the communication correctly. Manipulating JAXRPC proxies as if
they represent remote objects is *clearly* the wrong way to do it. You'd
almost want to model a queue of requests waiting to be POSTed, a queue
you can fill up then push out. Something like this, in your Java-era
language of choice :-

//different queues for SOAP, REST
Queue q=new Soap12RequestQueue();

q.add(new StatePut(job1.uri,Job.LIFECYCLE,Job.SUSPENDED));
//let the queue reorder stuff if it wants to
q.add(new StatePut(job2.uri,Job.LIFECYCLE,Job.SUSPENDED),
      Queue.POSITION_OPTIMAL);
q.add(new StatePut(job3.uri,Job.LIFECYCLE,Job.SUSPENDED),
      Queue.POSITION_LAST);

q.setEventHandler(this);
q.nonBlockingSubmit();

No, there is no code behind this example, and I am avoiding any hints as
to what the event handler would look like. I think the key point is that
once you embrace remote operations as async actions, then you can model
the manipulations differently.  Note also that I am representing job
suspension not as an explicit suspend() operation, but as a request to
put a job into the suspended state. This API could work with our friend
REST just as easily as with WS-RF...
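
For readers who want something that compiles, the following is one possible
reading of the queue idea, not the API sketched above. Everything in it is
an assumption made for illustration: StatePutQueue, PutListener and the
transport function are hypothetical, the reordering hints are dropped, and
a plain Java function stands in for the SOAP or REST call. Each request is
reported to the listener on its own, so one failure does not hide the
outcome of the others.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// A request to put one property of one resource into a given state.
class StatePut {
    final String uri;
    final String property;
    final String value;

    StatePut(String uri, String property, String value) {
        this.uri = uri;
        this.property = property;
        this.value = value;
    }
}

// Called once per request when its response (or failure) is known.
interface PutListener {
    void onResult(StatePut request, boolean ok);
}

class StatePutQueue {
    private final List<StatePut> pending = new ArrayList<>();
    private final Function<StatePut, Boolean> transport;
    private PutListener listener = (request, ok) -> { };

    // The transport is a stand-in for the real wire call.
    StatePutQueue(Function<StatePut, Boolean> transport) {
        this.transport = transport;
    }

    void add(StatePut request) {
        pending.add(request);
    }

    void setEventHandler(PutListener listener) {
        this.listener = listener;
    }

    // Sends every queued request on a background thread and returns at once.
    // The returned future completes after the last listener call.
    CompletableFuture<Void> nonBlockingSubmit() {
        List<StatePut> batch = new ArrayList<>(pending);
        pending.clear();
        PutListener target = listener;
        return CompletableFuture.runAsync(() -> {
            for (StatePut request : batch) {
                boolean ok;
                try {
                    ok = transport.apply(request);
                } catch (RuntimeException e) {
                    ok = false; // a thrown error counts as a failed request
                }
                target.onResult(request, ok);
            }
        });
    }
}
```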

Anyway Savas, to conclude: do you have any evidence that a single
document is suboptimal compared to a sequence of requests over an open
HTTP/1.1 connection? That is, assuming we ignore the SHOULD in the
HTTP1.1 specification "Clients SHOULD NOT pipeline requests using
non-idempotent methods or non-idempotent sequences of methods" [3]

-Steve

[1] WebDav http://www.ietf.org/rfc/rfc2518.txt S8.9.2

"after encountering an error moving a non-collection
   resource as part of an infinite depth move, the server SHOULD try to
   finish as much of the original move operation as possible."

[2]
http://lists.w3.org/Archives/Public/w3c-dist-auth/1997JulSep/0177.html

[3] RFC2616 HTTP1.1

_______________________________________________________________
Ian Foster                    www.mcs.anl.gov/~foster
Math & Computer Science Div.  Dept of Computer Science
Argonne National Laboratory   The University of Chicago    
Argonne, IL 60439, U.S.A.     Chicago, IL 60637, U.S.A.
Tel: 630 252 4619             Fax: 630 252 1997
        Globus Alliance, www.globus.org <http://www.globus.org/> 
