[Nsi-wg] TransactionId

Wed Jul 20 22:40:36 CDT 2011

Henrik,

Thank you for the feedback.  I have a few comments I will place in line, but first I think I should clarify a couple of the design patterns used in the NSI protocol.

1. Although NSA can be ordered in a tree, the shape of that tree is dictated by the path of the reservation.  Each reservation can be a differently shaped tree, with a child in one reservation being a parent to the same NSA in another.  This leads to more of a peer relationship.  This is important to understand since the orientation of the RA/PA relationship between a pair of NSA can change with almost every reservation.

2. The NSI protocol is session-less.  We have modelled the entire protocol around the state machine of a connection reservation, and this state machine exists independently of a session.  We have introduced a set of asynchronous operations/messages to manipulate state transitions, and to communicate successful or unsuccessful attempts at transitions.  Some people view the NSI protocol as a message-oriented protocol, while some view it as a remote-operation-oriented protocol.  Either could hold depending on how you view the way we invoke transitions on the state machine.  Is "reservation" an operation or a message?

3. I have attempted to be faithful to the standard SOAP web services design patterns when creating the specification.  We follow a standard asynchronous messaging and eventing SOAP model; however, there is always room for improvement as we move forward with implementations.

Now on to the specific comments.

On 2011-07-20, at 5:13 AM, Henrik Thostrup Jensen wrote:

> Hi
> 
> On Tue, 19 Jul 2011, John MacAuley wrote:
> 
>> This weekend we had a lively discussion on transactionId.  In the end it was agreed that we would continue
>> to support this identifier, however it will be renamed.  I have currently defined it as a uuid for
>> uniqueness as is done in many implementations using a transactionId.
> 
> I'm going to chip in a bit here.
> 
> I think a big source of the confusion (and also for the failed messages) 
> is that the protocol has switched from RPC to message based, and is now 
> being wrapped into an RPC protocol (WSDL/SOAP). I think this is giving us 
> the worst of both worlds and is needlessly complicated.

I think that during the meetings on the weekend we were looking at how we could simplify the protocol and close the open holes that exist.  We questioned the existing protocol, and every proposed change, to see if it would simplify the code and the handling of the state machine.  I enjoyed the discussions since this was the most detailed dive into the protocol and modelling the whole group has taken.  The fact we were arguing over the usefulness of transactionId and returning state information in failed messages means we are extremely close to complete.  I just wish we had been at this point 6-8 months ago.  Excellent progress.

> 
> If we choose a message-based protocol, there shouldn't be any direct 
> responses to requests as such. Instead the requester would receive 
> messages when the state is updated at the provider, e.g.:
> 
> R -> Reserve -> P
> P -> Status  -> R (switched from INITIAL to RESERVING)
> P -> Status  -> R (switched from RESERVING to RESERVED)

One of the issues we may have with the current specification is that there is an implicit assumption that when a reservation|provision|release|terminate request message is sent to an NSA (and an ack returned) the NSA transitions into the transient "ing" state.  We could extend the existing specification to explicitly model the "ing" states by doing some additional initial processing on the request message such that the response indicates the "ing" is occurring.  The other way to do this is to create an "ing" message from PA to RA similar to the confirmed.  This would explicitly transition us to the "ing" state in the state machine.

We could also compress the "ing", confirmed, and failed messages into a single status message, but I am not sure if this would simplify anything.
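
To make the "ing" discussion concrete, here is a minimal Python sketch of the reserve leg of the state machine with the transient state modelled explicitly.  It is an illustration only, not the specification: the state names come from the example quoted above, "reserveFailed" and the state it leads to are assumptions, and the Connection class is hypothetical.

```python
from enum import Enum


class State(Enum):
    INITIAL = "INITIAL"
    RESERVING = "RESERVING"   # the transient "ing" state, modelled explicitly
    RESERVED = "RESERVED"


# (current state, message) -> next state.
# Assumption: a failed reservation falls back to INITIAL.
TRANSITIONS = {
    (State.INITIAL, "reserve"): State.RESERVING,
    (State.RESERVING, "reserveConfirmed"): State.RESERVED,
    (State.RESERVING, "reserveFailed"): State.INITIAL,
}


class Connection:
    """Hypothetical holder for one connection reservation's state."""

    def __init__(self, connection_id):
        self.connection_id = connection_id
        self.state = State.INITIAL
        self.history = [State.INITIAL]

    def handle(self, message):
        """Apply one message; raise if it is not valid in the current state."""
        key = (self.state, message)
        if key not in TRANSITIONS:
            raise ValueError(f"{message!r} not allowed in state {self.state.name}")
        self.state = TRANSITIONS[key]
        self.history.append(self.state)
        return self.state
```

Whether the transition into RESERVING is triggered by the ack or by a separate "ing" message from the PA only changes who calls handle("reserve"); the table itself stays the same.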

> 
> That way there would only be one message type from provider to requester 
> (StateUpdate), and there is no need for a request/correlation id, only 
> connection id. Messages can also be duplicated without any risk if the 
> protocol is designed properly.
> 
> If we go with an RPC based model it would look like this:
> 
> R reserves at P
> R blocks until P answers created or failed.

I can argue that what we are implementing is an RPC model with callbacks.  This is a standard asynchronous design pattern for implementations that do not want to block while waiting for results.  I believe most mature RPC-based models support this as well.
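
As a rough illustration of RPC with callbacks, the following Python sketch simulates both sides in one process.  The request returns an acknowledgement straight away, and the outcome arrives later as a separate call in the opposite direction, matched to the request by a uuid.  The class and method names are hypothetical, and the real protocol carries these exchanges as SOAP messages rather than direct method calls.

```python
import uuid


class Provider:
    """Stands in for the PA; queues work instead of doing it inside the request."""

    def __init__(self):
        self.pending = []

    def reserve(self, correlation_id, connection_id, reply_to):
        # Return an acknowledgement at once; the real outcome arrives later.
        self.pending.append((correlation_id, connection_id, reply_to))
        return "ack"

    def process_pending(self):
        # Later, outside the original request, call back with the outcome.
        while self.pending:
            correlation_id, connection_id, reply_to = self.pending.pop(0)
            reply_to.reserve_confirmed(correlation_id, connection_id)


class Requester:
    """Stands in for the RA; matches callbacks to outstanding requests."""

    def __init__(self):
        self.outstanding = {}
        self.confirmed = []

    def request_reservation(self, provider, connection_id):
        correlation_id = str(uuid.uuid4())
        self.outstanding[correlation_id] = connection_id
        ack = provider.reserve(correlation_id, connection_id, reply_to=self)
        return correlation_id, ack

    def reserve_confirmed(self, correlation_id, connection_id):
        # Unknown or duplicate correlation ids are ignored.
        if self.outstanding.pop(correlation_id, None) is not None:
            self.confirmed.append(connection_id)
```

The requester is never blocked between the ack and the confirm, and a duplicated confirm is harmless because its correlation id is no longer outstanding.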

> 
> As it can take some time for a reservation to complete this is also an 
> option:
> 
> R reserves at P
> P answers to R that the reservation request has been received
> R polls P in regular intervals for state updates.
> 
> Note that since we have an RPC design, only the requester initiates a 
> session. As RPC is session based per definition, there is not an actual 
> need for a session/correlation id at NSA level, though it might be the 
> case for the protocol (many protocols do this in order to make multiple 
> requests per connection, but that's a detail).

Not sure I understand your use of "session" here since the term is so overloaded.  But from an RPC point of view I believe our session only exists from the point we issue the request, to the point the acknowledgement response returns.  We have decoupled the confirm from the request session.

Polling is Satan's work and we avoid it at all costs :-)

> 
> What we have now is a very odd design with RPCs going in both directions, 
> which is odd to say the least.
> 

Not in the least bit odd.  There are existing web service standards that follow this model, especially for event delivery using HTTP.  Also, don't forget we have queries going from both parents to children and from children to parents.

> The current design is a bit wrt. to failure handling, e.g. what happens if 
> a reserveConfirmed message/request is not delivered (if the requesting NSA 
> is unavailable for some reason). This can probably be addressed using query, 
> but it is not specified in the protocol what should happen.

If after a period of retries you cannot deliver a confirmed message you can drop it.  The RA will either request the operation again, or catch the missed confirm with an audit using query.
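
A small Python sketch of that recovery path, under stated assumptions: the PA retries a bounded number of times and then drops the message, and the RA later reconciles its view with a query.  The function names, the retry count, and the use of ConnectionError to signal a failed delivery are all illustrative; a real implementation would also wait between attempts.

```python
def deliver_with_retries(send, message, max_attempts=3):
    """Try to deliver `message`; return True on success, False if dropped."""
    for _ in range(max_attempts):
        try:
            send(message)
            return True
        except ConnectionError:
            continue  # a real sender would pause here before retrying
    return False  # give up and drop the message


def audit(local_states, query):
    """RA side: refresh local state from the provider's answer to a query.

    `local_states` maps connection id to the state the RA believes it is in;
    `query` is a callable returning the provider's state for a connection id.
    Returns only the entries that had to be corrected.
    """
    corrected = {}
    for connection_id, local_state in local_states.items():
        remote_state = query(connection_id)
        if remote_state != local_state:
            corrected[connection_id] = remote_state
    local_states.update(corrected)
    return corrected
```

With this split the PA never has to hold undeliverable messages indefinitely, and a missed confirm shows up on the RA side as a state mismatch during the audit.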

> 
>> Now Inder asked if we could make it a sequence id to help with processing on the RA side.  Comments?
> 
> I'm glad you decided not to on this one. Sequences usually end up having 
> to deal with holes by introducing hold-back queues and bending over 
> backwards to avoid duplicates etc. (and if holes or duplicates are okay, 
> one should not have used sequences anyway).
> 
Took me endless hours on a plane to think it through.  Actually, I realized it about ten minutes after I sent the message and turned off my phone for the flight. :-)

> 
>     Best regards, Henrik
> 
>  Henrik Thostrup Jensen <htj at ndgf.org>
>  NORDUnet / Nordic Data Grid Facility.
