[Nsi-wg] New state machine with two phase reserve and modify

Jerry Sobieski jerry at nordu.net
Tue Jul 3 17:23:31 EDT 2012


Hi John-

Regarding this state machine-
We really need to consider how this Modify can be handled more like 
ballet than hockey.  :-)


--- First, what are we trying to solve with 2PC?   The multi-domain 
pass/fail scenario, where some domains succeed and some fail so that the 
successful modifications must be rolled back, is a resource management 
issue - not a protocol state issue.  The successful domains will still 
have to be rolled back - we don't escape this fact.   It's just that the 
RA is no longer responsible for doing it.   You are placing the task on 
the PA, and forcing the RA to issue two messages to make sure he wants 
what he just asked for.

A better approach would be to think of this differently - think of the 
new modified connection as a separate new connection ... what is it about 
the new connection that you want?  The same path as the old connection?  
Ok. We do not want to release the old resources into the wild...we may 
lose them.  OK.  We want to minimize the impact of the resource 
modification on the data flow.  OK.

So make a constraint for the path finder: "You must use resources that 
are assigned to ( "available" | "Connection X" )."  The protocol runs 
normally... The reservation is made.  Resources are already multi-booked 
to different connections for scheduling purposes, so double booking 
resources to two connections simultaneously should not be technically 
difficult.  We just need to make sure we can switch them over 
appropriately - but fortunately this isn't a state machine issue, it's a 
configuration issue and resource database issue.
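
To make the constraint above concrete, here is a minimal sketch in Python. 
It is only an illustration under stated assumptions: the names Resource, 
usable_for, reserve_shadow and switch_over are hypothetical and are not 
part of NSI or of any existing path finder, and scheduling windows are 
ignored for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    # Ids of the connections currently booked on this resource.
    # An empty set means the resource is "available".
    holders: set = field(default_factory=set)


def usable_for(resource, shadow_of):
    """Constraint: the resource is available, or held only by Connection X."""
    return not resource.holders or resource.holders == {shadow_of}


def reserve_shadow(path, new_id, shadow_of):
    """Book every resource on the candidate path to new_id as well.

    Returns False and changes nothing if any resource violates the constraint.
    """
    if not all(usable_for(r, shadow_of) for r in path):
        return False
    for r in path:
        r.holders.add(new_id)  # double booking: old and new connection share it
    return True


def switch_over(path, new_id, old_id):
    """Drop the old connection's claim once the new connection takes over."""
    for r in path:
        if new_id in r.holders:
            r.holders.discard(old_id)
```

In this sketch the reservation of the new connection is an ordinary 
reservation; only the database lookup and the final switch-over know 
that two connections share the same resources.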

Please see my alternative "shadow" proposal that does not require 
wholesale replacement of the state machine.

--- This proposal seems to require 2PC for normal Reservations as well 
as Modifications.   Is this right?   This is a major (and unnecessary 
IMO) departure from the existing State Machine we have worked on so hard 
for over two years.   We had many Very Long discussions on this in early 
2010, in which we decided the simple single Reserve command with an 
active Terminate was preferable to the 2PC (the 2PC requiring a top 
down confirm...doubling the number of messages required, and introducing 
race conditions with timeouts that were not easily resolved).  Why, 
after so long and demonstrated success in v1.0, do we suddenly need to 
change horses to a two phase commit for the entire model?    I fear this 
will cause extensive re-writes ... for what feature again?  To extend a 
schedule?

2PC is an optimization strategy where you /assume a fail/ is likely to 
occur down tree, and the RA does not wish to be responsible for cleaning 
up unneeded reservations.   If you assume reservations will mostly be 
successful, then a 2PC imposes twice the messaging, and imposes this 
coding complexity on all applications so that some very few can use this 
complex approach to extend their schedule?   This is way too complex an 
approach.
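
The messaging trade-off can be illustrated with a rough count. The 
sketch below is a simplified model, not a description of the NSI 
protocol: it assumes one RA talking to n child domains, that every 
request gets exactly one reply, that a failed single-phase reservation 
is cleaned up with a Terminate to each domain that succeeded, and that 
a failed 2PC attempt is left to expire by timeout with no further 
messages. The function names are hypothetical.

```python
def single_phase_messages(outcomes):
    """Messages for a single Reserve, plus Terminates if any domain failed.

    outcomes: list of booleans, True where that child domain reserved OK.
    """
    n = len(outcomes)
    ok = sum(outcomes)
    messages = 2 * n          # Reserve + reply, per domain
    if ok < n:
        messages += 2 * ok    # Terminate + reply, per domain that succeeded
    return messages


def two_phase_messages(outcomes):
    """Messages for reserve-then-commit, with holds expiring on failure."""
    n = len(outcomes)
    messages = 2 * n          # phase 1: Reserve + reply, per domain
    if all(outcomes):
        messages += 2 * n     # phase 2: Commit + reply, per domain
    return messages
```

Under these assumptions, three domains that all succeed cost 6 messages 
single-phase and 12 with 2PC, while one failure out of three costs 10 
single-phase and 6 with 2PC. In this model 2PC only wins when a 
failure actually occurs.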

--- The 10 minute timeout for backing out a reservation implies that 
those resources are blocked until they are freed - which poses all sorts 
of security and resource management problems.  If an agent is 
sophisticated enough to manage a Modify function, it should be 
sophisticated enough to be responsible for retracting unused 
reservations.  Indeed, the 2PC should charge based upon request, rather 
than confirm - since a user could arbitrarily request many big 
connections and just never confirm them.

--- Other protocols recognize long delays in provisioning.   Most of 
these have a "Continuing" message that represents a heartbeat of 
sorts...  If a protocol agent cannot complete its processing in a set 
time - but is still trying, it sends a "Continuing" message to the 
requester to indicate it is still working.   As long as CONT messages 
are being passed between nodes downstream, the upstream nodes will 
continue issuing CONT messages.   If someone along the line gets fed up, 
they issue Terminates appropriately and everything is torn down.  NSI 
could do this as well.
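
A minimal sketch of the requester side of such a heartbeat, in Python, 
might look as follows. The event names "CONFIRMED" and "FAILED", the 
tick-based timing, and the patience parameter are assumptions made for 
illustration; only the "Continuing"/CONT message and the Terminate come 
from the description above.

```python
def wait_for_result(events, patience):
    """Decide what a requester does while a downstream agent is working.

    events:   one entry per timer tick: "CONT", "CONFIRMED", "FAILED",
              or None if nothing arrived during that tick.
    patience: number of consecutive silent ticks tolerated before giving up.
    Returns "CONFIRMED", "FAILED", or "TERMINATE" if the requester gave up.
    """
    silent = 0
    for event in events:
        if event in ("CONFIRMED", "FAILED"):
            return event          # downstream finished one way or the other
        if event == "CONT":
            silent = 0            # still working: reset the silence counter
        else:
            silent += 1
            if silent >= patience:
                return "TERMINATE"  # fed up: tear everything down
    return "TERMINATE"            # stream ended without a final answer
```

An intermediate node would run the same loop toward its children and 
pass each CONT it receives on to its own requester.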

--- In slide 6 (?) you describe a bunch of NRM functional 
requirements.   I think I understand what you mean to do, but we should 
not phrase these as NRM functions.  NSI has nothing to do with NRM 
functions.  It's inconsistent with the spec and can be very misleading 
and confusing.   ->  As far as NSI is concerned, all events are NSI 
protocol events.  For example - given an NSA "John" presiding over a 
local domain "Ottawa", and say peering with two domains, Guy/Cambridge 
and Inder/Berkeley, all interactions between John and Guy or John and 
Inder are handled by the NSI CS protocol.   The local Ottawa domain - which 
John considers his "local" domain - is managed by John's NRM consisting 
of a dozen minions with hockey sticks.   From an NSI standpoint, that 
local Ottawa domain could be placed under control of another [more 
sophisticated] NSA Jerry, and John would then interact with Jerry/Ottawa 
using the NSI protocol instead of body checks. (:-)   As far as NSI is 
concerned, the interaction John had with Ottawa using his traditional 
ways is functionally and equivalently replaced by NSA Jerry running 
NSI.  Thus, the *only* events that NSI needs to stipulate are those NSI 
events.  :-)   (The NSA _/implementation/_ is responsible for translating 
any local events into their appropriate NSI protocol events, but the spec 
does not care anything about this translation.)

--- In general, it seems to me this whole proposal is trying to dance 
around the issue of adding or removing resources to a connection in 
service using our existing inflexible path finding and constraints 
specifications.   All this means is the Path Finder should be able to 
compute a new path using the existing allocated resources - albeit 
allocated to a specific connection.  It seems that if we were able to 
indicate this simple change to path finders, then all the funky modify 
state becomes unnecessary.  We just reserve a connection that can do 
double booking of resources, and provision it when it is confirmed.   If 
the new connection overlays the existing connection, all the better.  
Right?   And we can use existing primitives.

Best regards
Jerry
On 7/2/12 11:06 PM, John MacAuley wrote:
> Peoples,
>
> Here is the new and improved NSI CS state machine fresh off the presses and ready for your viewing pleasure.  Please study it and prepare questions for the Wednesday call.  We would like to close on this action ASAP.
>
> Thank you,
> John.
>
>
> _______________________________________________
> nsi-wg mailing list
> nsi-wg at ogf.org
> https://www.ogf.org/mailman/listinfo/nsi-wg