[Nsi-wg] time issue

Wed Oct 6 13:02:05 CDT 2010

Some comments below. I will include some of this in a revised version of my doc later today.

On Oct 6, 2010, at 10:15 AM, Jerry Sobieski wrote:

> Hi John-  This is a pretty good summary.   Just a few comments (I can never resist:-) inline...
> 
> John Vollbrecht wrote:
>> I would suggest that there are at least several  issues that need to be resolved here.  I try to define the issues and what my solution to each of them would be - at least right now.
>> 
>> --
>> 
>> 
>> One is how to manage time across multiple NSAs - this is a problem when there are two NSAs, and more of a problem when there may be multiple NSAs involved in making a connection.
>> 
>> 1) Time synchronization.  Jeff has suggested that we require NSAs to run NTP to synchronize time between NSAs.  The only downside I see is that it requires each NSA to run NTP.  If we agree on this approach, then we might want to investigate whether only provider NSAs need to run NTP, assuming a non-provider will talk to a provider; this might make it easier for applications to be NSAs.
>> 
>> The upside is that a request for starting and ending time is understood the same way by all NSAs.  This means that when a request goes to multiple NSAs in an authorization sequence, they will all see the same time and understand it the same way.
>> 
>> *I vote for requiring NTP and working out the details - especially the potential difference in scheduling between segments because of possible skew in NTP time across NSAs.
>>  
> IMO - if we *require* the NSAs to be synchronized around a common global time (which is what NTP does), we need to do three things:
>    1. state clearly and concisely *why* this is required (I do not think it is *required*)
>    2. specify the accuracy that is required to meet #1
>    3. come up with a means to verify time synchronization (since out-of-sync NSAs must not be tolerated)
> 
> Else, we can *recommend* that the NSAs *should* use a protocol such as NTP to synchronize their time.   We can still say why this is useful, but we do not require such synchronization.    (I think this is the right approach).
> 
> Either way, the protocol must still be designed to handle discrepancies between NSAs.  Synchronization is not perfect, nor does it address the non-deterministic processing times and propagation delays that exist in distributed architectures such as NSI.  These are "timing" issues, not "Time" issues (if you get my point:-).   The protocol state machines in different NSAs should converge to a common state for a connection given enough time for messages to filter appropriately through the service tree and underlying network.  This is what I think we need to consider in more detail.
> 
> I vote that we *recommend* NTP or a similar protocol be used to synchronize day clocks so that coordinated scheduling across independent agents is most effective.

I think discussing the purpose of synching clocks is important.  In my view the main reason is to allow automatic provisioning to happen at the same time in all NRMs that are involved in creating a connection.  I think NTP is a relatively simple way to be sure that all agents see the same time, though there are probably other ways.  I think this is a topic that can be broken out from other discussions of time, if we assume that somehow all NRMs will have a "close enough" agreement on time.  We can include "close enough" in the final timing diagram as a "time skew" whose size depends on how the time synching is done.
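
A rough sketch of what a "time skew" band could mean, in Python.  The clock offsets are made-up numbers and both helper names are hypothetical, not anything defined by NSI.  It assumes each NRM starts provisioning when its own clock reads the agreed start time, so the true instants differ by at most the spread of the clock offsets.

```python
from datetime import datetime, timedelta, timezone

def true_fire_time(agreed_start, clock_offset):
    """True instant at which a node whose clock runs `clock_offset` ahead
    of true time sees its local clock reach `agreed_start`."""
    return agreed_start - clock_offset

def worst_case_skew(clock_offsets):
    """Largest gap between the true instants at which any two nodes act."""
    offsets = list(clock_offsets)
    return max(offsets) - min(offsets)

# Hypothetical offsets of three NRM clocks relative to true time.
offsets = {
    "nrm-a": timedelta(milliseconds=12),
    "nrm-b": timedelta(milliseconds=-30),
    "nrm-c": timedelta(milliseconds=5),
}
agreed_start = datetime(2010, 10, 6, 18, 0, 0, tzinfo=timezone.utc)

fire_times = {name: true_fire_time(agreed_start, off)
              for name, off in offsets.items()}
skew = worst_case_skew(offsets.values())
print("time skew band:", skew)
```

How large the offsets really are depends on how the clocks are synchronized, which is why the band is left as a parameter here.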

>> 2) Relationship between available and scheduled time
>> 2a) available time definition
>> I think we agree that start and end time (or duration) in a request will be for available time.  Available time is time when the connection is actually available to carry traffic.  This is the time that is sync'd using NTP.  Note that available time is always an estimate because startup duration is variable.
>>  
> No.  The time that is sync'd using NTP is the *system clock* in each NSA.   
I agree - I wasn't trying to imply anything else.

> The Available Time or the Resource Time are simply static database entries in the ResourceDB, or in a "ConnectionDB".  These times do not need synchronization; they are specified by users or NSAs.   It's the system clocks that need to be synchronized.   A "scheduler" process in each NSA periodically compares the time of a scheduled event (say Resource Time) to the "current time" as represented by the system clock.  If these two times match, a functional routine is called to process that event.   It's important to recognize that any of the times we put in the ResourceDB represent allocation windows that we want to coincide, but it is the system clock - that is continually changing - that gets synchronized by NTP.
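
A minimal sketch of such a scheduler, in Python.  The `Scheduler` class and its methods are hypothetical names for illustration only.  One assumption differs from the wording above: the sketch fires an entry when the clock is at or past the stored time, rather than on an exact match, since a periodic check can easily step over the exact instant.

```python
import heapq
from datetime import datetime, timedelta, timezone

class Scheduler:
    """Fires callbacks once the system clock reaches their stored time."""

    def __init__(self, now=lambda: datetime.now(timezone.utc)):
        self._now = now      # injectable clock, so the sketch can be tested
        self._queue = []     # heap of (due_time, sequence, callback)
        self._seq = 0        # tie-breaker so callbacks are never compared

    def schedule(self, due_time, callback):
        heapq.heappush(self._queue, (due_time, self._seq, callback))
        self._seq += 1

    def tick(self):
        """Run every entry whose stored time is at or before the clock reading."""
        fired = 0
        while self._queue and self._queue[0][0] <= self._now():
            _, _, callback = heapq.heappop(self._queue)
            callback()
            fired += 1
        return fired

# The stored Resource Time is static; only the clock reading changes.
clock = [datetime(2010, 10, 6, 17, 59, tzinfo=timezone.utc)]
events = []
sched = Scheduler(now=lambda: clock[0])
resource_time = datetime(2010, 10, 6, 18, 0, tzinfo=timezone.utc)
sched.schedule(resource_time, lambda: events.append("start provisioning"))

fired_early = sched.tick()          # clock has not reached Resource Time yet
clock[0] += timedelta(minutes=1)    # the clock advances to Resource Time
fired_on_time = sched.tick()
```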
> 
> I do concur that the Available Time represents the time at which the circuit is intended to be In-Service and carrying traffic.
> 
> I disagree that the Available Time is an estimate.   It is a target.  It is the Requested Available Time (from the RA) or the Committed Available Time (from a PA).   
Actually, I think it is the estimated (committed) time from the PA.  It includes an estimated start time, which is what causes the total available time to vary.  The teardown time is committed - the network *will* start teardown at a specific time.
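
To illustrate with a small Python sketch (the function name and the durations are hypothetical, chosen only for the example): provisioning is launched early by the estimated duration, so any error in that estimate moves the actual start, while teardown still begins at the committed end.

```python
from datetime import datetime, timedelta, timezone

def actual_available_window(committed_start, committed_end,
                            estimated_provisioning, actual_provisioning):
    """Return (actual_start, end) of the in-service window.

    Provisioning starts `estimated_provisioning` before the committed start
    and takes `actual_provisioning` to complete.  The end is fixed because
    teardown begins at the committed end regardless.
    """
    provisioning_start = committed_start - estimated_provisioning
    actual_start = provisioning_start + actual_provisioning
    return actual_start, committed_end

committed_start = datetime(2010, 10, 6, 18, 0, tzinfo=timezone.utc)
committed_end = committed_start + timedelta(hours=1)

# Estimate was 2 minutes, but provisioning actually took 3.
actual_start, actual_end = actual_available_window(
    committed_start, committed_end,
    estimated_provisioning=timedelta(minutes=2),
    actual_provisioning=timedelta(minutes=3),
)
available_for = actual_end - actual_start
print(actual_start, available_for)
```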

> But it is not an estimate.  The *estimated* time is the "Estimated Provisioning Duration" which cannot be known exactly in advance.   The variance in Actual Provisioning Duration causes the Actual Available Time to vary - which is why you called it an estimate.   But IMO we should be clear that there are a number of times being batted about regarding when the circuit is supposed to be usable.   IMO there are three:  Requested Available Time,
 [I would say estimated]
> Committed Available Time, and Actual Available Time.
>> 2b) resource or scheduled time
>> This is the time a resource is scheduled for by its own NRM.  It calculates this time by inserting startup time for resources it controls before the requested start time, and inserting teardown time after the requested end time.  This time is different for every NRM and may be different for different equipment within an NRM - that is an NRM implementation matter.  In automated provisioning, resource time is not shared with other NSAs.  The impact of this will be that when trying to schedule connections close to each other, the connection reservations must be separated by the startup and teardown times in participating NSAs.
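
A small Python sketch of this calculation.  The function names and the startup and teardown figures are hypothetical; a real NRM would use whatever values apply to its own equipment.

```python
from datetime import datetime, timedelta, timezone

def resource_window(requested_start, requested_end, startup, teardown):
    """Pad the requested (available) window with startup and teardown time."""
    return requested_start - startup, requested_end + teardown

def windows_conflict(a, b):
    """True if two (start, end) windows overlap; ends are treated as exclusive."""
    return a[0] < b[1] and b[0] < a[1]

# Hypothetical figures for one NRM: 2 minutes to start up, 1 minute to tear down.
STARTUP = timedelta(minutes=2)
TEARDOWN = timedelta(minutes=1)
t0 = datetime(2010, 10, 6, 18, 0, tzinfo=timezone.utc)
hour = timedelta(hours=1)

first = resource_window(t0, t0 + hour, STARTUP, TEARDOWN)

# Back-to-back request: its available time starts exactly when the first ends.
back_to_back = resource_window(t0 + hour, t0 + 2 * hour, STARTUP, TEARDOWN)

# Same request with its start pushed back by startup + teardown.
separated = resource_window(t0 + hour + STARTUP + TEARDOWN, t0 + 2 * hour,
                            STARTUP, TEARDOWN)

print(windows_conflict(first, back_to_back))
print(windows_conflict(first, separated))
```

The back-to-back request collides on resource time even though the two available times do not overlap, which is the separation requirement described above.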
>> 
>> *I would like to see us accept these time concepts in talking about time issues.  There are probably issues I  haven't thought of yet that should be included in the descriptions.
>> 
>>  
> With the caveats and nuances I mentioned above about sync and the Avail Time distinctions, I think you have captured the sentiment well.
> 
>> 3) Difference between automatic and manual provisioning
>> 3a) Automatic provisioning is defined as having each NRM initiate connection so that it is estimated to be available at the requested start time. The time is estimated. It would be good to put some sort of bound on this, similar to what is done for NTP.  This requires that the provider be able to make an estimate of startup and teardown time such that it occurs in a predictable way.  Failure of a provider to do so would be an issue for SLA.
>> 
>>  
> I would be pedantic and say the NSI does not specify what the NRM does.  NSI *only* specifies what NSAs do. 
> Auto Start is defined as each NSA initiating provisioning independently based upon a locally scheduled provisioning start time.    Manual start is where each NSA awaits a ProvisionRequest message from another authorized NSA to initiate provisioning.
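
The two start modes can be sketched in Python as follows.  The class, field, and method names are hypothetical and are not taken from any NSI specification; authorization is reduced to a boolean for the sake of the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum

class StartMode(Enum):
    AUTO = "auto"
    MANUAL = "manual"

@dataclass
class Reservation:
    mode: StartMode
    provisioning_start: datetime   # locally scheduled; only consulted for AUTO
    provisioning: bool = False

    def on_clock(self, now):
        """Auto start: the local schedule alone triggers provisioning."""
        if self.mode is StartMode.AUTO and now >= self.provisioning_start:
            self.provisioning = True

    def on_provision_request(self, sender_authorized):
        """Manual start: only a request from an authorized NSA triggers it."""
        if self.mode is StartMode.MANUAL and sender_authorized:
            self.provisioning = True

start = datetime(2010, 10, 6, 18, 0, tzinfo=timezone.utc)
auto = Reservation(StartMode.AUTO, start)
manual = Reservation(StartMode.MANUAL, start)

# The clock passing the scheduled time starts the auto reservation only.
for r in (auto, manual):
    r.on_clock(start + timedelta(seconds=1))
auto_after_clock, manual_after_clock = auto.provisioning, manual.provisioning

# The manual reservation starts only when an authorized request arrives.
manual.on_provision_request(sender_authorized=True)
```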
> 
> I think trying to bound the error on "best estimates" is pointless, and frankly out of scope.  The start time is either met, or it is not.    It's like On-time Departure figures for an airline...they are estimates and do not guarantee anything.  They are of questionable accuracy, self-stated, and not consistently computed.   Is "On-Time" actually "On Time" or within 5 minutes of scheduled?  Does an hour late count as a missed departure the same as a departure that is 5 minutes late?
> The way to improve this IMO is to simply say that, a priori, "Available Time" is either Requested Available Time or Committed Available Time.   If the post facto Actual Available Time occurred after the Committed Available Time, tough.  It happens, it's unavoidable, get over it.    From a protocol standpoint, once the Committed Available Time has passed, the RA has a decision to make:  Do I stay? Or do I go?  i.e. should I wait a little while to see if things fall into place?  Or should I give up and tear down the reservation?   These are the only protocol options.
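
One way to picture the "stay or go" choice is the Python sketch below.  The `patience` parameter is purely an assumption made for illustration: the text above does not say how long an RA should wait, and the names here are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum

class Decision(Enum):
    NOT_DUE = "committed time not reached"
    IN_SERVICE = "circuit is up"
    STAY = "wait a little longer"
    GO = "give up and tear down the reservation"

def ra_decision(now, committed_available, in_service, patience):
    """Decide what the RA does; `patience` is a local policy, not protocol."""
    if in_service:
        return Decision.IN_SERVICE
    if now < committed_available:
        return Decision.NOT_DUE
    if now - committed_available <= patience:
        return Decision.STAY
    return Decision.GO

committed = datetime(2010, 10, 6, 18, 0, tzinfo=timezone.utc)
patience = timedelta(minutes=5)   # hypothetical RA policy

two_min_late = ra_decision(committed + timedelta(minutes=2), committed,
                           in_service=False, patience=patience)
ten_min_late = ra_decision(committed + timedelta(minutes=10), committed,
                           in_service=False, patience=patience)
print(two_min_late.value, "/", ten_min_late.value)
```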
> 
> The user may consult an SLA to see if there is recourse, but that is not part of the NSI protocol.   The savvy user would institute an SLA before the fact to ensure that there is substantial incentive for the Provider to meet the Committed Available Time.
> 
> Doing some studies to track on-time departures could be useful, but it is not part of the NSI protocol.   It is way off track in my opinion.  It is something another tool or other software (perfSONAR?) should be doing.
> 
>> 3b) Manual provisioning is defined as having a provision message sent to an NSA to initiate provisioning.  This capability requires that the requestor know when a connection is available to be provisioned.  Presumably this time is the start time in the reservation minus startup time.    How the requestor knows this time is not clear, though it is possible to build a protocol that would make it available.  In addition, determining the start time when a provision request is sent through a sequence of NSAs is also difficult, though not impossible.
>> * I would vote to include automatic provisioning in the V1 architecture and protocol, and include manual provisioning as a future item in the architecture and not in the V1 protocol.  This gives us time to explore manual provisioning and its use cases before trying to define how it is implemented.
>> 
>>  
> I agree.  Leave Manual provisioning for future.
> 
> However, we still have ASAP requests to deal with.   I think manual start and ASAP circuits may share many of the same challenges:-)    This is due in large part to our requirement that reservation must precede all provisioning and estimated provisioning time (pure craziness:-)
This is also the point I think Tomohiro makes in his note.

> 
> To elaborate a bit on the topic of manual provisioning...  If the requested start time represented the start of the reservation (what we call Resource Time), then manual provisioning is a snap.  The RA stipulates when the beginning of provisioning will take place, and then the RA initiates it at that time.   This is in fact a very simple and deterministic process because it is entirely message driven from the top of the tree down to the NRMs.   We only run into confusion when we cross purposes and say the requested start time from the RA represents a time when we want the provisioning to have completed - and then say we'll initiate the provisioning.   These are IMHO counter to one another.
> 
> IMO, for manual circuits, the Requested Start Time is the beginning of the Provisioning phase, or the "Resource Time" referenced above.   And for auto-start circuits, the Requested Start Time can represent either the Resource Time or the Available Time (Provisioning Start or In-Service Start respectively).     For autostart, there is no fundamental dependency that we prepend an estimated provisioning time to the requested start time...that is just a convention by which we agreed the network would assume responsibility for meeting the user's deadline.   We could agree otherwise as well.
> 
> Frankly, it would make for a simpler and more reliable protocol if the Requested Start Time always represented the beginning of the Resource Reservation, AND provisioning time was always considered part of the reservation.   This would keep us out of the game of estimating future performance...which will never be exact, and always wasteful.   
This seems a network-centric view.  From an application point of view it would be simpler if time was always the available time, even if it is a good estimate.  I think this is where you have an issue with SLA (I might be wrong of course).  If the available time is something the network promises to provide (within some bound), then it can be spelled out in an SLA.  If it doesn't meet that time, then the terms of the SLA kick in.  If you need something in the protocol to determine if the SLA has been met, then this should be included in the protocol.  Saying one never knows when a connection will be available is a recipe for having no one ever use the service.

>  As a complementary suggestion, I would have the End Time always represent the end of the reservation, and any de-provisioning that takes place does so after that time.   Finally, this would make it easy to implement either autostart or manual start as all NSAs would have reserved the resources for a common start time.
> 
> Jerry
>