[GRAAP-WG] Feedback on "Challenges in EU Grid Contracts"

Tue Aug 28 04:07:08 CDT 2007

On Aug 28, parkinm at cs.man.ac.uk modulated:

> However, I don't think a message acknowledgement (in its most general 
> sense) is "a transport-level problem" (I'm not sure if you were agreeing or 
> disagreeing to this point and what your reference to SOAP/HTTP means - it 
> isn't clear).
> 

There are multiple schools of thought on web service messaging, and I
am not sure which school I belong to. :-) SOAP is a transport-agnostic
standard that has conventionally been used with HTTP transport but
could just as easily be mapped to SMTP, local procedure calls, a
reliable message queueing system, or any other message-bearing
infrastructure.

It becomes murky because many of the arguments assume certain other
software engineering practices in addition to the standards
themselves.  For example, there are those who want to separate the
basic content-bearing messages, e.g. the abstract offer-accept
protocol, from the implementation, e.g. offer-as-SOAP-over-HTTP or
offer-as-SOAP-in-reliable-message-queue.  There is nothing in the WSDL
for a protocol which says that an in/out (RPC-style) message exchange
has to occur in any particular amount of time.  A year-long
asynchronous offer-accept exchange could be represented in a single
SOAP in/out message pair, if the transport bindings were chosen
appropriately to allow reliable routing and processing of the messages.
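
As a rough illustration of that separation (a sketch only: Offer,
Decision, DirectBinding, QueuedBinding and accept_small_jobs are
hypothetical names, not part of WS-Agreement, WSDL, or any SOAP
toolkit), the same abstract offer/accept logic can sit behind a
synchronous binding or a store-and-forward one, and only the timing of
the answer differs:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Offer:
    offer_id: str
    cpu_hours: int


@dataclass(frozen=True)
class Decision:
    offer_id: str
    accepted: bool


# The abstract responder: pure decision logic, unaware of any transport.
Responder = Callable[[Offer], Decision]


class DirectBinding:
    """Stand-in for a synchronous binding such as SOAP over HTTP."""

    def __init__(self, responder: Responder) -> None:
        self.responder = responder
        self.decisions: Dict[str, Decision] = {}

    def send(self, offer: Offer) -> None:
        # The decision is available as soon as send() returns.
        self.decisions[offer.offer_id] = self.responder(offer)

    def poll(self, offer_id: str) -> Optional[Decision]:
        return self.decisions.get(offer_id)


class QueuedBinding:
    """Stand-in for a store-and-forward binding such as a message queue."""

    def __init__(self, responder: Responder) -> None:
        self.responder = responder
        self.queue: List[Offer] = []
        self.decisions: Dict[str, Decision] = {}

    def send(self, offer: Offer) -> None:
        # Only enqueue; the responder may see the offer much later.
        self.queue.append(offer)

    def deliver(self) -> None:
        # Simulates the queue eventually handing offers to the responder.
        while self.queue:
            offer = self.queue.pop(0)
            self.decisions[offer.offer_id] = self.responder(offer)

    def poll(self, offer_id: str) -> Optional[Decision]:
        return self.decisions.get(offer_id)


def accept_small_jobs(offer: Offer) -> Decision:
    # Hypothetical acceptance policy, used only for this example.
    return Decision(offer.offer_id, accepted=offer.cpu_hours <= 100)
```

With the queued binding, poll() returns None until deliver() has run,
however long that takes; the responder logic itself is unchanged.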

At the same time, the layers of abstraction are expected (by most
people) to leak on occasion, e.g. the application becoming aware that
a TCP/IP error prevented the SOAP-over-HTTP message delivery, or the
reliable message queue taking responsibility for the message so the
sender does not have to attempt retries, etc.

> For example, how do I know that a service didn't crash between the 
> transport-level message being accepted and the application processing it?
> 

What difference would it make? :-) Is there some difference between
crashing after the transport ACK and after the application ACK?  You seem to
be assuming that the application is somehow more trustworthy than the
transport to buffer/persist a message until application restart can
make progress.  This is an implementation choice that is not captured
at all in the WSDL, but in the endpoint software stacks and the SOAP
bindings that are in use...
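
To make the point concrete, here is a minimal sketch of one such
implementation choice, assuming a receiving layer that journals each
message to disk before it acknowledges. DurableInbox and its methods
are hypothetical names for illustration, not an API from any SOAP
stack. Whichever layer sends the ACK, what matters is that some layer
persisted the message so that processing can resume after a restart:

```python
import json
import os
import tempfile


class DurableInbox:
    """Append-only journal: persist first, acknowledge second."""

    def __init__(self, path):
        self.path = path

    def _append(self, record):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())  # force the record to stable storage

    def receive(self, msg_id, body):
        # The ACK is returned only after the message is on disk.
        self._append({"event": "recv", "id": msg_id, "body": body})
        return "ACK"

    def mark_done(self, msg_id):
        self._append({"event": "done", "id": msg_id})

    def unprocessed(self):
        # Replay the journal: received messages without a "done" record.
        pending = {}
        if not os.path.exists(self.path):
            return pending
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                rec = json.loads(line)
                if rec["event"] == "recv":
                    pending[rec["id"]] = rec["body"]
                else:
                    pending.pop(rec["id"], None)
        return pending


def demo():
    path = os.path.join(tempfile.mkdtemp(), "inbox.jsonl")
    inbox = DurableInbox(path)
    ack = inbox.receive("offer-1", "run job X")
    del inbox  # simulated crash after the ACK, before any processing

    restarted = DurableInbox(path)  # the restart sees the same journal
    recovered = restarted.unprocessed()
    for msg_id in list(recovered):
        restarted.mark_done(msg_id)  # the application finally processes it
    return ack, recovered, restarted.unprocessed()
```

Here demo() acknowledges the message, "crashes", and still finds the
offer waiting after the restart.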

In any case, I am not advocating one view or the other, but trying to
provide background on why the mixture of approaches has been defined
in WS-Agreement.

> >   This is a crucial technical point in WS-Agreement. In order to
> >   define the acceptance moment as when the acceptance is delivered to
> >   the initiator (as in your contract law discussion) would require a
> >   reliable delivery model where some trustworthy (third-party)
> >   observer can witness the delivery.
> 
> No, not necessarily. Note that business (generally) operates quite well 
> without third-parties and often uses unreliable messaging to form 
> contracts. There is also a well-known example (in legal circles, anyway) 
> used to illustrate the loss of acceptance messages where the mailbox rule 
> is not being applied, i.e. when acceptance is delivered to the offeror. It 
> does not require a "reliable delivery model" or "a trustworthy observer" to 
> be in place.
> 

I think you are bringing in some of those intuitions I warned
about... your legal example works because of a presumption of courts
and evidential proof of due diligence etc.  We could not readily
assume such an environment for resolving WS-Agreement negotiations,
since audit systems and legal systems are not within scope of the
GRAAP-WG.  We just toss up our hands and say one party is at the mercy
of the other party sending notice of the decision.  It becomes a
separate quality of implementation or trust issue for offerors to
choose offerees.

This was important in order to provide a stepping stone for existing,
traditional resource managers to be updated with WS-Agreement. The
underlying offer/accept handshake and decision bias is very much
consistent with conventional job-submission systems.  We were also
focused on relatively frequent (and automatic) real-time negotiation,
i.e. the job and transient network QoS usage scenarios that motivated
the GRAAP charter and its predecessors in the GARA and SNAP work. I
think concerns about legal status and financial obligations took a
back seat to concerns about practical distributed resource management.

All the same, I think our solution MAY be applied soundly in an
environment where more audit and legal rules are available to resolve
conflicts in retrospect. We took pains to provide the runtime
extensibility handling so that profiles/extensions could be defined
for things such as digital signature of offers and acceptance messages
(early public comments worried about getting digital signatures for
non-repudiation, and we tried to enable this without making it
mandatory for traditional Grid use cases).

> I agree, this is why we use the 'mail box' rule as otherwise we get into 
> evidential issues about how many times the accept was attempted to be 
> delivered, etc. But the offeror is still contracted whether they like it or 
> not.
> 

If the obligations can be contested based on "how many times the
accept was attempted to be delivered, etc.", then what does it mean to
say the offeror is "contracted"? (I ask this only out of academic
interest, since you have already advocated the mail box rule.)

> The message must be an offer/not be an offer. It can't be a non-obligating 
> offer, in that case it would be something else.
> 

OK, I'll accept this tautology and rephrase my statement to say that
there is a practical ambiguity in whether the message is an offer or
not, if one is not using the mail box rule, because the initiator can
pretend not to have received the acceptance.

> A side question: what do you mean by "this information is distributed 
> asynchronously to all parties".

I was thinking of the co-allocation scenario where multiple SLAs are
being formed with different providers.  The initiator (in the advance
reservation case) is acting as the coordinator to obtain multiple
advance reservation SLAs (to prepare) and then obtaining matching
claim SLAs (to commit) if he decides to go forward with the entire
transaction.
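
A minimal sketch of that coordinator pattern follows, assuming each
provider can be modelled as a callable that accepts or rejects a
"reserve" or "claim" offer. The function co_allocate, the status
strings, and the provider signature are all hypothetical and are not
defined by WS-Agreement:

```python
from typing import Callable, Dict, Tuple

# A provider takes (kind, request) and returns True to accept the offer.
Provider = Callable[[str, str], bool]


def co_allocate(providers: Dict[str, Provider],
                request: str) -> Tuple[bool, Dict[str, str]]:
    status = {name: "not asked" for name in providers}

    # Prepare: obtain an advance reservation SLA from every provider.
    for name, provider in providers.items():
        if provider("reserve", request):
            status[name] = "reserved"
        else:
            status[name] = "rejected"
            # Abandon the transaction; earlier reservations go unclaimed.
            return False, status

    # Commit: the initiator goes forward and obtains matching claim SLAs.
    for name, provider in providers.items():
        if provider("claim", request):
            status[name] = "claimed"
        else:
            status[name] = "claim rejected"

    return all(s == "claimed" for s in status.values()), status


def always_accept(kind: str, request: str) -> bool:
    return True


def never_reserve(kind: str, request: str) -> bool:
    return kind != "reserve"
```

For example, co_allocate({"cpu": always_accept, "net": never_reserve},
"job-42") stops at the prepare step and reports the "cpu" reservation
as reserved but never claimed.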

> Thanks for taking the time to provide these clarifications. However they 
> don't address the other question of the resource provider having to 'lock' 
> its resources in a 2PC-style when it makes an offer, as was also discussed 
> (and a solution proposed) in our paper.
> 

I am not sure I know what question or solution you mean here.  All I
see in my reading of the document is a statement that a resource
provider should be the offeree to avoid live lock.  Is this what you
mean?

WS-Agreement defines the initiator/responder roles (offeror and
offeree as you say) and is silent on resource provider/consumer roles
because these concepts are inherently domain-specific. However, in our
example of a job system, we illustrate the use of JSDL-based terms for
a client to make a job offer that the computing scheduler can accept
or reject.  This is our expected idiomatic application of
WS-Agreement, which matches your preferred scenario.

Your example in the document has a (to my eye) unusual scenario where
a user "asks for a quote" and the computing service makes the offer.
This is something we could have handled in SNAP via our invitation
message, but the GRAAP-WG intentionally removed this capability.  As I
described in an earlier email, one could approximate this through a
sophisticated use of advertisements (templates) and some template
exchange system, but I think it is safe to say we marginalized this
use case.

Interestingly, we found that there are differences of opinion as to
what the "resource" is in some concrete examples!  It really depends
on the domain, market, and community as to which aspect/role of the
SLA is the rarer resource worth optimizing.  If we assume that
SLAs are about exchange of resource/service, the question is whether
one party has a higher opportunity cost in obligating their resource in
exchange for the other party's resource.  If so, it would be
preferable to allow the party with the higher opportunity cost to act
as the offeree.

karl

-- 
Karl Czajkowski
Software Architect

Univa Corporation
1001 Warrenville Road, Suite 550
Lisle, IL 60532

karlcz at univa.com
www.univa.com