[Nsi-wg] Pathfinding, Labels, and Topology ([still] a bit long)

Thu Dec 1 04:45:44 CST 2011

Hi,

On to the fine points of topology handling:

On 1 Dec 2011, at 00:44, Jerry Sobieski wrote:
>>> Our current topology model works only in the sense that a 1 in 4 chance of getting the right VLAN across a network is acceptable. However, we're still using only 4 VLANs; once we go to 4096, we get a 1 in 4096 chance.
> In most cases you could look at this as actually more likely to work. In SC topo if we had one VLAN in use (25% utilization), we had a 75% chance of a successful second choice. In the scenario above if we have one VLAN in use, we have a 99.92% chance of a correct hit for the second VLAN (!). And we would still have better chances for the first 1000 VLANs we randomly choose!!!! And if we have 1000 VLANs in service (25% utilization) we still have a 75% chance of a successful choice.

I'm not talking about availability, I'm talking about compatibility.
As it is right now, the inter-domain pathfinding is done by the Aggregator NSA agent.
This aggregator agent has a view of the inter-domain topology, where STPs are mapped to ports with VLANs. So, we assume there are 4096 different STPs for one port.
The actual value of the VLAN label is not available to the Aggregator agent when it is doing inter-domain pathfinding.
A path planned through an inter-domain object currently consists of a consecutive list of STPs.

For example (domains are identified using the first letter):
 [source, A1, B22, B78, C6, C42, D09, destination]

We do not know the value of the underlying VLAN labels. The inter-domain segments in this case are (A1,B22), (B78,C6) and (C42,D09). Since we have 4096 labels, there are 4096 different options for each of those segments.
How do we know that we use the same VLAN label on all three of those segments? It could very well be that this is an empty network and all of them are available, yet we still only have a 1 in 4096^2 = 16,777,216 chance of getting the right option.
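
To make the arithmetic above concrete, here is a minimal Python sketch. It assumes that the domain is the first letter of the STP name (as in the example path), that each inter-domain segment picks one of 4096 labels uniformly and independently, and that no domain can swap labels. The function names are hypothetical and not taken from any NSI implementation.

```python
from fractions import Fraction

LABELS = 4096  # VLAN labels per port, as assumed in this thread


def inter_domain_segments(path):
    """Return consecutive STP pairs whose domains (first letter) differ."""
    stps = path[1:-1]  # drop the 'source' and 'destination' placeholders
    return [(a, b) for a, b in zip(stps, stps[1:]) if a[0] != b[0]]


def chance_of_matching_labels(num_segments, labels=LABELS):
    """Probability that independent uniform picks agree on every segment.

    The first segment may take any label; each later one must match it.
    """
    if num_segments < 1:
        return Fraction(1)
    return Fraction(1, labels ** (num_segments - 1))


path = ["source", "A1", "B22", "B78", "C6", "C42", "D09", "destination"]
segments = inter_domain_segments(path)
print(segments)  # [('A1', 'B22'), ('B78', 'C6'), ('C42', 'D09')]
print(chance_of_matching_labels(len(segments)))  # 1/16777216
```

Every additional inter-domain segment multiplies the denominator by another 4096, which is the exponential decrease referred to elsewhere in this thread.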

>>> In the demonstration at SC we relied on the human to make requests from one endpoint to another endpoint, using the same VLAN. I have not seen any requests made using different VLAN labels.
>>> Also, I have seen and heard that NSA implementations used the last part of the ID to figure out the correct label and use that in their pathfinding algorithms.
>>> I do not think that that is a desirable solution.
> While I am afraid and *literally appalled (!)* that some NSAs may have indeed parsed the STP name for a vlan hint, this was incorrect and is easily broken. It makes totally incorrect use of the topo information and is a really REALLY BAD assumption. (I put that vlan info in the STP tag to make it easier for developers to debug things - not as a shortcut for anything...rest assured the next topo file will have no such human readability.) This is like parsing an IP Hostname (www.google.com) to recover its IP address...it doesn't work. I can easily create a topology that describes the same SC layout that breaks those implementations. Would you trust other networks to be so exacting? STPs are symbolic references - they do not contain any technology specific information themselves.

We are all aware of that. However, with the current topology in the demonstration we did not have any other option.

> So I don't want to hear that seriously flawed implementations and weak pathfinders are the driving excuse for changing the topology model or the abstractions of the architecture.

I am not putting this up as an excuse, I'm observing reality.
Pathfinders will indeed have to become more robust and do retries on failed paths. Given my calculations above, that retry should be really robust. We will also have to update our timeout values to a timescale of days, since trying 16 million different paths is going to take a long time.
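
As a rough illustration of that timescale, the sketch below turns the 1 in 4096^2 figure into an expected number of blind retries and a wall-clock estimate. The cost of one second per reservation attempt is purely an assumed number for illustration, not a measurement of any NSA, and the function names are hypothetical.

```python
from datetime import timedelta

LABELS = 4096
SEGMENTS = 3


def expected_attempts(labels=LABELS, segments=SEGMENTS):
    """Mean number of blind tries until all segments share a label.

    Each try succeeds with probability 1 / labels**(segments - 1), so the
    mean of the resulting geometric distribution is labels**(segments - 1).
    """
    return labels ** (segments - 1)


def expected_duration(seconds_per_attempt, labels=LABELS, segments=SEGMENTS):
    """Mean wall-clock time, given an ASSUMED cost per reservation attempt."""
    total_seconds = seconds_per_attempt * expected_attempts(labels, segments)
    return timedelta(seconds=total_seconds)


print(expected_attempts())          # 16777216
print(expected_duration(1.0).days)  # 194
```

Under that assumed one second per attempt, the mean search time is on the order of months, so even a much faster attempt rate still lands in the range of days.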

>>> Let me reiterate:
>>> The current NSI implementation is completely unaware of labels. This makes it near impossible to make informed decisions about paths crossing several domains. For each domain a path crosses the chance of finding the right path decreases exponentially.
> What do you mean by an "informed" decision? Even if you knew all about the labels there is no guarantee that the other constraints on the connection are available, i.e. the endpoint (labeled or otherwise) is just one constraint that must be met for success.
> 
> The chance of finding a successful path is a function of the number of labels, the diameter of the network, *AND* the availability of those labels, *AND* the algorithm for selecting the trial order by the RA, *AND* most importantly the availability of the other transit resources. Yes the worst case is exponential...but the *likelihood* of the worst case is of equal importance. The easiest way to reduce the likelihood of a worst case exhaustive search is to provide *MORE* STPs and do a random trial order. This would make the likelihood of a hit comparatively much higher. Of course a better solution would be to have access to all topology state...but that poses equally exponentially complex issues and is not going to happen either.
>>> 
>>> The only way to make label unaware pathfinding work is by making 4096 versions of each of the different domains in the global network.
> While this would work, it's not the *only* way to work. Proof: It worked for SC.

I do not want to hear that it worked for SC. We had a flawed implementation on a toy-scale model, where humans were imposing constraints on which paths were requested.

>>>  The connections between those different networks will then depend on the label-swapping capabilities of those networks.
> Sigh. Let's face it: the reason VLANs pose a problem is that they block easily. The better networks will implement label-swapping switching technologies. Flat VLANs just don't scale well on a global basis, particularly with existing conventional Ethernet hardware. For instance: even if you knew VLAN 1780 was available between StarLight and NetherLight and also available between StarLight and ESnet, if 1780 was in use on the port facing JGNX it would be unavailable to any other crossconnect. It would be blocked for your use between NL and ESnet, which means you would have to select a different egress VLAN at NL *and* at ESnet. So just knowing which VLANs are available on one port does not tell you whether they are available internally, or how likely that is. It's a crap shoot. A guess. A shot in the dark. Conventional Ethernet sucks for global provisioning. Accept this my child and enlightenment will open your eyes. (:-)

Conventional Ethernet sucks balls. But we're pretty much stuck with it for a good while yet.
Over the next few years label swapping will become easier to do, but on the other hand we will also see a rise in wavelength requests.
Wavelength switching is possible, but costly; it is much easier to use the same wavelength through a whole path.
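
The continuity constraint (the same VLAN or wavelength end to end when no swapping is available) can be sketched as a set intersection over per-segment label availability. This is only a sketch of what a label-aware pathfinder could do; the current NSI topology does not expose these labels, and the availability values and function name below are made up for illustration.

```python
def usable_labels(per_segment_availability):
    """Labels free on every segment of a path with no label swapping."""
    segments = iter(per_segment_availability)
    common = set(next(segments, set()))
    for available in segments:
        common &= set(available)
    return common


# Hypothetical availability per inter-domain segment (made-up values).
availability = [
    {1779, 1780, 1781},  # e.g. A1-B22
    {1780, 1781},        # e.g. B78-C6
    {1781, 1790},        # e.g. C42-D09
]
print(sorted(usable_labels(availability)))  # [1781]
```

An empty result means no single label works end to end, so at least one domain on the path would have to swap labels. As noted in the quoted text above, a label being free on the inter-domain ports still says nothing about whether it is free inside each domain.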

>>> Note also that the number of domain descriptions will increase exponentially as soon as we start considering multi-layer networks.
> I am not sure I agree with this. Topology hiding and transfer functions make this a far simpler problem. The overall complexity is not reduced, but we delegate responsibility to agents who have the detailed information and authority to allocate the resources. So the more topology and state you try to express the harder the problem becomes. At some point we have to accept that summarization is the only way we can hope to make this scale and that pathfinding will be a non-deterministic process - based on probabilities of success, but guesses nonetheless. We want to always "choose wisely" but understand that we won't always be so lucky.
> 

> Thanks for your dedication to this issue, Jeroen.   I appreciate your intensity.

Likewise! I appreciate the discussion. But please try to keep this concise.

Jeroen.