[Nsi-wg] NSI error handling draft - next version
John Vollbrecht
jrv at internet2.edu
Tue May 4 16:20:40 CDT 2010
Seems like this is a good place to think about the relationship of
Management and Service planes. This is -- I think -- different than the
relationship between transport and service planes. Interesting -- the
planes picture might come into its own.
John
On Apr 28, 2010, at 2:38 PM, Inder Monga wrote:
> John,
>
> Great points about administrative and maintenance procedures.
>
> We would have to assume that the NSA/NRM gets an event with the right
> "notification" of the reason for the topology change - through the
> OSS/network management platform. Otherwise, we will not be able to
> differentiate between the causes of a topology change, and will not be
> able to estimate the duration of that change, as in the case of
> maintenance. We can assume the default case to be #1 if the NSA/NRM is
> not notified of the exact cause.
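
The defaulting rule above can be sketched in a few lines. This is only an illustration, assuming the three causes enumerated in John MacAuley's message quoted below; the names `Cause`, `TopologyEvent`, and `effective_cause` are hypothetical and do not come from the draft.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Cause(Enum):
    """Hypothetical labels for the three topology-change causes."""
    FAILURE = 1            # case #1: unplanned, restoration time unknown
    PERMANENT_REMOVAL = 2  # case #2: administrator removes the link for good
    MAINTENANCE = 3        # case #3: planned window with a start and an end


@dataclass
class TopologyEvent:
    link_id: str
    cause: Optional[Cause] = None  # None when no reason was reported


def effective_cause(event: TopologyEvent) -> Cause:
    """Fall back to case #1 when the exact cause was not notified."""
    return event.cause if event.cause is not None else Cause.FAILURE
```

Treating an unreported cause as a failure is the conservative choice, since no duration can be assumed for it.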
>
> Thanks,
> inder
>
> On Apr 28, 2010, at 8:50 AM, John MacAuley wrote:
>
>> Peoples,
>>
>> Had someone show up in my office so I missed the conversation over
>> "Resource change from available to not available." I thought I
>> would provide some input on the topic based on my DRAC experiences.
>>
>> I think there are three types of events that can initiate a
>> topology change that should be understood when defining the error
>> handling. Two of these are actually not errors but normal
>> operating procedures within a network:
>>
>> 1. Physical network failure resulting in a topology change -
>> typically the temporary removal of a link from topology with no
>> knowledge of when it will be restored.
>>
>> 2. The permanent removal of a link from the topology by a network
>> administrator. Actually, this one should include the
>> reconfiguration of the network where an entire node could be removed.
>>
>> 3. The temporary removal of a link by a network administrator for
>> maintenance purposes. This will typically have a defined start
>> and end time based on the maintenance window.
>>
>> #1 is interesting in that it impacts existing schedules in an
>> in-service state, reserved schedules not yet in service, and any new
>> reservation requests.
>>
>> a) Those schedules in service using the links impacted by the
>> topology change may undergo some type of restoration. If this was
>> a protected circuit then the underlying transport will restore the
>> service and we may not want to do anything about it. If this was
>> an unprotected service then perhaps a re-dial could be initiated by
>> the NRM in an attempt to achieve a lazy restore.
>>
>> b) Depending on the estimated length of the temporary topology
>> change we may need to recompute the paths of those schedules
>> reserved but not yet provisioned. We should not recompute the
>> paths from the point of failure to the end of time, but only for some
>> predefined floating window optimistic enough to give the failure
>> time to recover and to reduce the number of schedules that would be
>> recomputed. For example, a floating one hour window would mean all
>> reservations up to an hour in the future that could be impacted by
>> the failure can be recomputed. If the failure is cleared and the
>> topology is restored then there is a one hour window that should
>> have been cleared. The interesting side-effect is we now have a
>> window of time to make sure the link remains trouble free. The
>> question is: have we blocked that link from use, or can a new
>> schedule use the remaining hour if it comes in after the trouble
>> has cleared?
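
The floating-window idea in (b) can be illustrated as follows. This is a minimal sketch under stated assumptions: the `Reservation` type, its fields, and the `needs_recompute` function are hypothetical, and the one hour default simply mirrors the example in the text. It selects only reserved (not yet started) schedules that use the failed link and start inside the window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import FrozenSet, List


@dataclass
class Reservation:
    res_id: str
    start: datetime
    end: datetime
    links: FrozenSet[str]  # links on the currently computed path


def needs_recompute(reservations: List[Reservation],
                    failed_link: str,
                    failure_time: datetime,
                    window: timedelta = timedelta(hours=1)) -> List[str]:
    """Return ids of reserved schedules that use the failed link and
    start within [failure_time, failure_time + window)."""
    horizon = failure_time + window
    return [r.res_id for r in reservations
            if failed_link in r.links
            and failure_time <= r.start < horizon]
```

Because the window floats, the same check would be repeated as time advances while the failure persists, so later reservations are recomputed only if the link is still down when they come within range.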
>>
>> c) If a new reservation request for a future point in time arrives
>> while a failure has taken the link out of topology, do we remove the
>> link from computation, or do we add an optimistic guard time after
>> which we can assume the link will be restored?
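
The two options raised in (c) can be written as one small policy function. This is a sketch only: the message leaves the choice open, and the name `link_usable` and the convention that `guard=None` means "always exclude the failed link" are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def link_usable(request_start: datetime,
                failure_time: datetime,
                guard: Optional[timedelta]) -> bool:
    """Decide whether a failed link may be used for a new request.

    guard=None  -> option 1: exclude the link while the failure stands.
    guard=delta -> option 2: assume the link is restored once
                   failure_time + delta has passed.
    """
    if guard is None:
        return False
    return request_start >= failure_time + guard
```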
>>
>> #2 is different from a fault condition in that an administrator has
>> removed the link from topology. We can model this gracefully if we
>> can have a high priority (preemptive) administration reservation
>> that can block the bandwidth on a link from the point in time the
>> link will be removed through to infinity. Any schedules this
>> preemptive schedule impacts will need to be recomputed as discussed
>> in the previous example or, if provisioned, switched to protection
>> or re-dialed to restore. At some point on or after the start of the
>> preemptive schedule the link can be permanently removed from
>> topology and the reservation blocking that link cleared.
>>
>> #3 is similar to #2 except there is a defined end time for the
>> preemptive schedule blocking the link. Only reservations
>> overlapping with the maintenance window would need to be
>> recomputed. Obviously, any provisioned schedules would need to be
>> switched to protection or re-dialed to restore.
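
Cases #2 and #3 differ only in the end time of the blocking reservation, which a short sketch makes concrete. The types `AdminBlock` and `Reservation` and the function `impacted` are hypothetical, intervals are assumed to be half-open, and `datetime.max` stands in for "infinity".

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

FOREVER = datetime.max  # stand-in for "until infinity" in case #2


@dataclass
class AdminBlock:
    link_id: str
    start: datetime
    end: datetime = FOREVER  # case #3 supplies the maintenance end time


@dataclass
class Reservation:
    res_id: str
    link_id: str
    start: datetime
    end: datetime


def impacted(block: AdminBlock, reservations: List[Reservation]) -> List[str]:
    """Ids of reservations on the blocked link whose half-open interval
    [start, end) overlaps the block's interval."""
    return [r.res_id for r in reservations
            if r.link_id == block.link_id
            and r.start < block.end
            and r.end > block.start]
```

With a defined end time only the reservations overlapping the maintenance window are returned; with the default end, every reservation that is still active at or after the block start is returned.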
>>
>> John.
>>
>> On 10-04-28 2:14 AM, Inder Monga wrote:
>>>
>>> Hi All,
>>>
>>> An updated draft based on comments. We attached a table in the
>>> front to summarize and use it for discussions. Look forward to
>>> discussing this tomorrow.
>>>
>>> Thanks,
>>> Inder
>>>
>>>
>>>
>>> On Apr 20, 2010, at 10:49 PM, Chin Guok wrote:
>>>
>>>> Hi all,
>>>>
>>>> I've attached a draft of the error handling section that Inder
>>>> and I came up with for the NSI Architecture document.
>>>>
>>>> This is a rough first draft, and there are some obvious portions
>>>> missing, but it gives an idea of where we are heading.
>>>>
>>>> Comments are most welcome.
>>>>
>>>> Thanks.
>>>>
>>>> - Chin
>>>>
>>>> <NSI Error Handling Chin_Inder v2.docx>
>>>> _______________________________________________
>>>> nsi-wg mailing list
>>>> nsi-wg at ogf.org
>>>> http://www.ogf.org/mailman/listinfo/nsi-wg
>>>
>>
>
> ---
> Inder Monga http://100gbs.lbl.gov
> imonga at es.net http://www.es.net
> (510) 499 8065 (c)
> (510) 486 6531 (o)
>