[ogsa-wg] RE: Modeling State: Technical Questions

Steve Loughran steve_loughran at hpl.hp.com
Tue Apr 5 08:10:07 CDT 2005

Savas Parastatidis wrote:
> Dear all,
> I think something needs to be clarified with regards to handling
> multiple jobs with one message. The beauty of document-oriented
> interactions is that you can do things like...
> <job-details-request>
>   <job-id>urn:ogsa:job:guid:bla-bla-bla-001</job-id>
>   <job-id>urn:ogsa:job:guid:bla-bla-bla-010</job-id>
>   <job-id>urn:ogsa:job:guid:bla-bla-bla-002</job-id>
>   <job-id>urn:ogsa:job:guid:bla-bla-bla-029</job-id>
> </job-details-request>
> Or
> <job-suspend-request>
>   <job-id>urn:ogsa:job:guid:bla-bla-bla-002</job-id>
>   <job-id>urn:ogsa:job:guid:bla-bla-bla-005</job-id>
>   <job-id>urn:ogsa:job:guid:bla-bla-bla-008</job-id>
> </job-suspend-request>
> The schema for the above document can allow anything from 0 to N number
> of <job-id> elements.

The trouble with any bulk operation is that you have to handle partial 
failure. You need either atomic operations (not long-lived transactions 
over HTTP, Savas, I wouldn't be that daft), or a way of indicating that 
only a bit went wrong.

Hence the 207 Multi-Status response in WebDav, the "something failed, 
look in the message". WebDav is still single instance (here a RESTy 
URL), but you can set >1 property and so have partial failure.

SOAP just has SOAPFault and extensions; no explicit multiple failure 
response. WS-RF-ResourceProperties has a similar problem with 
SetResourceProperties, but a different failure model in which any 
failure to set can result in a WS-BaseFault, indicating which failed, 
but providing no apparent information on which worked.

It seems to me that if you want to bulk stuff, you do need ways of (a) 
handling partial failure and (b) declaring what happens on partial 
failure. For the curious, WebDav's failure mode on file operations 
(MOVE, COPY) is explicitly declared to be that of failed file operations 
of Win98 on a FAT32 filesystem [1,2].
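
As a rough illustration of point (a), here is a minimal Java sketch of a bulk 
response that reports one status per job id, loosely in the spirit of a 207 
Multi-Status body. Everything in it (ItemStatus, BulkResult, BulkSuspend, the 
"keep going after a failure" policy) is a hypothetical name or assumption made 
up for this example, not part of WebDav, SOAP or WS-RF.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.stream.Collectors;

// Hypothetical per-item outcome: did the operation on this one id succeed?
final class ItemStatus {
    final boolean ok;
    final String detail;

    ItemStatus(boolean ok, String detail) {
        this.ok = ok;
        this.detail = detail;
    }
}

// Hypothetical bulk response: one status per job id, kept in request order.
final class BulkResult {
    private final Map<String, ItemStatus> statuses = new LinkedHashMap<>();

    void record(String jobId, ItemStatus status) {
        statuses.put(jobId, status);
    }

    boolean allSucceeded() {
        return statuses.values().stream().allMatch(s -> s.ok);
    }

    List<String> succeededIds() {
        return idsWhere(true);
    }

    List<String> failedIds() {
        return idsWhere(false);
    }

    private List<String> idsWhere(boolean ok) {
        return statuses.entrySet().stream()
                .filter(e -> e.getValue().ok == ok)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}

final class BulkSuspend {
    // Assumed policy: apply the action to every id and keep going after a
    // failure, so the caller learns both what failed and what worked.
    static BulkResult apply(List<String> jobIds, Consumer<String> action) {
        BulkResult result = new BulkResult();
        for (String id : jobIds) {
            try {
                action.accept(id);
                result.record(id, new ItemStatus(true, "OK"));
            } catch (RuntimeException e) {
                result.record(id, new ItemStatus(false, String.valueOf(e.getMessage())));
            }
        }
        return result;
    }
}
```

The point of the shape is only that the response names every item, so the 
caller is not left guessing which half of the request took effect.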

Alternatively, you don't go for bulk operations, either on multiple 
jobs or on multiple properties of a job (remember, WS-RF doesn't 
declare atomic/transacted property operations, so all you do here is 
increase the window of instability, a window that already exists). 
Instead you just stream a series of operations over the same HTTP/1.1 
connection, assuming that everything is accessible at the same far-end 
host, and get a series of responses back (in request order, if the 
server follows the HTTP/1.1 pipelining rules).

This could be efficient, and you could do better handling of failure. 
But you do need a SOAP stack that can keep an HTTP/1.1 channel open for 
multiple requests. Axis doesn't, even if you get httpclient to do the 
HTTP work; I don't know about .NET/WSE. You also need developers to 
model the communication correctly. Manipulating JAXRPC proxies as if 
they represent remote objects is *clearly* the wrong way to do it. You'd 
almost want to model a queue of requests waiting to be POSTed, a queue 
you can fill up then push out. Something like this, in your Java-era 
language of choice :-

//different queues for SOAP, REST
Queue q=new Soap12RequestQueue();

q.add(new StatePut(job1.uri,Job.LIFECYCLE,Job.SUSPENDED));
//let the queue reorder stuff if it wants to


No, there is no code behind this example, and I am avoiding any hints as 
to what the event handler would look like. I think the key point is that 
once you embrace remote operations as async actions, then you can model 
the manipulations differently.  Note also that I am representing job 
suspension not as an explicit suspend() operation, but as a request to 
put a job into the suspended state. This API could work with our friend 
REST just as easily as with WS-RF...
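
A minimal sketch of what such a fill-then-push queue might look like in Java, 
under stated assumptions. RequestQueue, StatePut, ResponseHandler and the 
transport function are all hypothetical names invented for this example; the 
transport stands in for whatever SOAP or REST binding does the actual POST, 
and flush() runs synchronously here purely to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical request: "put property <property> of resource <uri> into <value>".
final class StatePut {
    final String uri;
    final String property;
    final String value;

    StatePut(String uri, String property, String value) {
        this.uri = uri;
        this.property = property;
        this.value = value;
    }
}

// Hypothetical callback: error is null on success, response is null on failure.
interface ResponseHandler {
    void onResponse(StatePut request, String response, RuntimeException error);
}

// Hypothetical queue: fill it up, then push everything out in one go.
final class RequestQueue {
    private final List<StatePut> pending = new ArrayList<>();
    // Stand-in for the wire binding (SOAP, REST, ...); an assumption of this sketch.
    private final Function<StatePut, String> transport;

    RequestQueue(Function<StatePut, String> transport) {
        this.transport = transport;
    }

    void add(StatePut request) {
        pending.add(request);
    }

    int size() {
        return pending.size();
    }

    // Sends every pending request and reports each outcome separately, so one
    // failed request does not hide the results of the others.
    void flush(ResponseHandler handler) {
        List<StatePut> batch = new ArrayList<>(pending);
        pending.clear();
        for (StatePut request : batch) {
            String response = null;
            RuntimeException error = null;
            try {
                response = transport.apply(request);
            } catch (RuntimeException e) {
                error = e;
            }
            handler.onResponse(request, response, error);
        }
    }
}
```

Because each request gets its own outcome, partial failure falls out of the 
model for free: the caller sees exactly which state changes were accepted and 
which were not.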

Anyway Savas, to conclude: do you have any evidence that a single 
document is suboptimal compared to a sequence of requests over an open 
HTTP/1.1 connection? That is, assuming we ignore the SHOULD in the 
HTTP/1.1 specification: "Clients SHOULD NOT pipeline requests using 
non-idempotent methods or non-idempotent sequences of methods" [3].


[1] WebDav http://www.ietf.org/rfc/rfc2518.txt S8.9.2

"after encountering an error moving a non-collection
    resource as part of an infinite depth move, the server SHOULD try to
    finish as much of the original move operation as possible."

[2] http://lists.w3.org/Archives/Public/w3c-dist-auth/1997JulSep/0177.html

[3] RFC 2616 (HTTP/1.1) S8.1.2.2
