[Nsi-wg] NSI Issues list – Rio de Janeiro Demo

Jerry Sobieski jerry at nordu.net
Sat Sep 17 21:44:21 CDT 2011


Hey John and Radek...some comments and additions to your comments.. 
rabid drooling is my own...:-)
Jerry

On 9/14/11 5:23 PM, John MacAuley wrote:
> Comments in line.
>
> On 2011-09-14, at 12:37 PM, Radek Krzywania wrote
>
>>
>> - GenericAcknowledgmentType - should xxxConfirmed/Failed messages be 
>> sent before returning this type, or asynchronously? It could be 
>> removed as well
>
> The acknowledgement should be returned at the point you have accepted 
> the message for processing. The other messages come later 
> (considerably later in some cases).
>
> I validate the key attributes first just in case I need to send a 
> fault message back instead of the ACK.  For example, on a reservation 
> I verify the correlationId, replyTo, providerNSA, requesterNSA, and 
> connectionId are present.  I also verify that the message is destined 
> for me.  Once this is done I dispatch it for processing and return the 
> ACK.  There is a small chance that the Confirmed/Failed beats the ACK 
> back, but you should defensively code for that.
Hmm, I think I need to take you to task on this, John.   The *ACK* is an 
MTL function that should not have anything to do with NSI layer message 
evaluation, i.e. it's not the MTL's responsibility to send an NSI 
response of any type.    The ACK (as opposed to a confirm or fail) is an 
indication that the message was received, nothing else.  An ACK should 
be returned regardless of whether an NSI message is proper or otherwise.

Curious: Why do you verify that the message is "for you"?  How did you 
get it if it was not for you?  But again, checking the NSA fields is 
really an NSI function - not the MTL.

I do want to agree that naming of requesterNSA and providerNSA in the 
NSI msg header is a bad decision.  The MTL is forced to look at the NSI 
message primitive types ( Request/Response types) to determine if the 
"requesterNSA" is the destination for this message or if the 
"providerNSA" is the destination NSA for the message.  This is broken in 
the NSI spec.  The MTL should not need to inspect any NSI layer context 
to handle the message.  Just deliver to the indicated NSA.     We should 
rename these two header fields to "sourceNSA" and "destinationNSA".   
The NSI layer only uses these fields to indicate where the message is 
going and who is sending it.  The NSI layer (the State Machine semantic 
routine) already knows which is the RA and which is the PA.  And the MTL 
should not need to care - only to use these fields to direct the 
message.  Renaming will simplify the handling and correct a blatant 
layer violation.
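
As a rough illustration of the renaming proposed above, here is a minimal Python sketch of an MTL that routes purely on a destination field and never looks inside the NSI payload. The names MtlEnvelope, deliver, and the example URNs are hypothetical and not taken from the spec or the WSDL.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class MtlEnvelope:
    """Hypothetical transport envelope; field names follow the proposed rename."""
    source_nsa: str        # proposed "sourceNSA" header field
    destination_nsa: str   # proposed "destinationNSA" header field
    payload: bytes         # opaque NSI message; the MTL never parses it


def deliver(envelope: MtlEnvelope,
            local_nsas: Dict[str, Callable[[MtlEnvelope], None]]) -> bool:
    """Hand the envelope to the local NSA named in destination_nsa.

    Returns True if delivered, False if no such NSA is hosted here.
    The NSI primitive type (request or response) is never consulted.
    """
    handler = local_nsas.get(envelope.destination_nsa)
    if handler is None:
        return False
    handler(envelope)
    return True
```

With requesterNSA/providerNSA the deliver() step would first have to work out whether the payload is a request or a response; with source/destination fields that inspection is not needed.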
>>
>> - queryConfirmed, queryFailed on provider interface - breaks consistency
>>
>> - query on requester interface - breaks consistency
>
> These were requested for error handling so that a providerNSA could 
> query the requesterNSA to correlate schedule information after a 
> restart.  Are people aware they can get a query across the requester 
> NSA interface?
Ugh- another "extension" not in the spec.

While we have discussed it, there is next to nothing in the way of 
recovery processing in the spec.   And so anything inserted in 
anticipation of how that *might* be performed is also jumping the gun 
and should be extracted.  We have a pretty comprehensive Query() 
primitive now that should be able to reveal anything about reservations 
next door...what more is necessary?

Regarding the "interface" note....  This doesn't make sense to me.   The 
spec does not reference "interfaces" for primitives at all.  The spec 
says that an NSA supports a set of primitives - all of them.  It 
differentiates a "role" for an NSA in a particular connection instance, 
but does not explicitly ascribe primitives to said roles.   Mostly 
because these roles are only a small part of how the NSA functions.     
In practice, perhaps some implementations may never expect to handle or 
utilize certain primitives - this is ok, but this is an 
_/implementation/_ issue - not an NSI spec issue.   All of this 
continued confusion about whether a primitive is a PA primitive or an RA 
primitive is unnecessary and was caused by adding stuff to the WSDL that 
wasn't in the spec and wasn't discussed or agreed to.  ARRGH!!!  There 
should be one WSDL for all NSAs. Period.  And that WSDL is supposed to 
support a simple guaranteed message transport of NSI messages.   If you 
continue to try to bury NSI processing in the message transport layer 
and then try to optimize things by jamming Great Ideas into the wsdl 
without discussion with the group you will continue to have confusion.

We agreed that the NSI layer would be independent of the MTL layer.   
The WSDL should not be violating that explicit decision of the Working 
Group by parsing NSI layer messaging.

Fundamentally, splitting the WSDL into two actors is not something the 
protocol required and has caused a tremendous amount of unnecessary 
confusion.   I realize it was done with good intention, _/but it was not 
part of the spec./_   The spec says the protocol consists of all of 
these primitives.  If a user only wants to be an RA, fine! But that is a 
user decision and the spec does not need to care.  Someone can build them a 
user API library that allows a user to ignore those routines he'll never 
use, but we don't go jamming stuff in the spec for this.

Indeed, if the WSDL was truly observing the layers, it would not have 
*ANY* RA or PA primitives, it would have MTL functions such as send(), 
Acknowledge()  and receive() or some such.   The event handler in the 
NSI layer should be dispatching the correct software processing. ...
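
A minimal sketch of the NSI-layer event handler described above, assuming the MTL has already handed over a decoded message. The class name NsiDispatcher, the dict-shaped message, and the "primitive" key are assumptions made for illustration only; nothing here comes from the WSDL or the spec.

```python
from typing import Callable, Dict


class NsiDispatcher:
    """Hypothetical NSI-layer event handler: maps primitive names to routines."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[dict], str]] = {}

    def register(self, primitive: str, handler: Callable[[dict], str]) -> None:
        """Attach the processing routine for one primitive."""
        self._handlers[primitive] = handler

    def dispatch(self, message: dict) -> str:
        """Run the routine registered for the message's primitive.

        An implementation that never registers some primitive simply
        reports it as unsupported; the transport underneath is unaffected.
        """
        handler = self._handlers.get(message["primitive"])
        if handler is None:
            return "unsupported primitive: " + message["primitive"]
        return handler(message)
```

In this arrangement whether an NSA acts as RA or PA for a given connection is decided by which routines it registers, not by which transport interface the message arrived on.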

I suggest we bite the bullet and re-combine the two WSDLs asap so there 
is only one to deal with.
And for version 2.0 we relegate the WSDL to the trash heap - or barring 
that to being the proper and respectful MTL it is supposed to be.
>
>>
>> - maybe allow suggestion that either chain or tree reservation model 
>> is preferred in reservation request
>
> This is an interesting feature.  Is it just a routing suggestion?
Ugh!  Why on earth would you do that?

First, such a decision requires a substantive detailed understanding of 
both the end to end data plane topology as well as a fair amount of 
knowledge about the control plane accessibility.
If the RA is smart enough to know it wants a tree or chain 
decomposition, then the RA should make the appropriate segmentation 
itself.  Period.   That's why we have both in the protocol - because the 
group felt that the RAs would want to make those decisions themselves.   
If the RA elects to delegate the request to some PA, then let the PA do 
its job...as it deems appropriate.   It seems oxymoronic to send a 
request to another NSA because you were too lazy or ill informed to make 
the reservation yourself, and then tell the other NSA how to do it.  
This seems utterly silly.:-)

If the RA can't effect the segmentation it desires by doing so itself, 
then you have to ask the question *why* it cannot do so.  If it does 
not know enough topology, then how does it know to ask for chain or 
tree?   If it knows the topology but does not have the authority with 
other NSAs, then why would you think an intermediary will fare better 
using those same authorization credentials?   (It's like the 14 yr old 
kid who can't buy beer asking another clerk to buy the beer for him....!:-)

All NSAs are alike.  Period.   They *all* have the protocol capabilities 
to do what they want to do.   They can segment any path they wish, they 
can contact any NSA they wish, ...  there is nothing *in the protocol* 
that separates NSAs functionally.  Policy decides most of this for 
NSI...we don't need to add more nuanced overhead to the protocol.

Not to get too foamed at the mouth,...but the protocol is extremely 
powerful as it is.   It already gives anyone the ability to create an 
NSA, and then the ability to ask for just about anything they want.   
But it requires authorization and knowledge of topology to do some of 
these things.   If your NSA does not have the topology knowledge or 
authority to do something itself, it's either not very smart, or not very 
trusted... and it probably does not have the authority or topology 
because other NSAs don't think it can make good decisions... Ergo- bad 
idea to circumvent the PA's responsibility by letting an RA tell it what 
to do.

>> Now my two cents:
>>
>> - We experienced from the demo, that sending a provision request just 
>> after reservation request, results in simple ack. So in fact we are 
>> unaware if the reservation is in auto provision state or not. We just 
>> get provisioned status update when reservation start time comes. We 
>> can wait 1 minute for that during demo, but if the service is about 
>> to start in 4 weeks...
> Yes.  Functioning as defined.  I am concerned about this as well.  It 
> causes complicated code (nothing to do with Web Services JERRY, but 
> with the protocol).
>
:-)

Hmm,

I agree the provisionConfirmed() should not be left hanging for weeks.   
That *is* an NSI problem.

I disagree, however, if a single "ACK" (not an NSI Response msg) is the 
only acknowledgment for two messages...that's wrong.    The spec says 
that NSI messages are assumed to be guaranteed delivery.    If you send 
an NSI message, the receiving MTL should ACK that message as soon as the 
message is cleanly received and properly queued for NSI layer processing 
at the receiver. And every message received from the same NSA should be 
queued and ACK'd serially in this fashion as well.   We decided that the 
ACK in the layers of John's overly simplified WS* MTL would be used for 
that purpose (I think the ACKs are actually a SOAP function...?)

If a Provision arrives immediately following a Reserve, the MTL should 
not care. Period.  The MTL should not care what types of NSI messages it 
is sending or receiving.   Each message  should be handled as just 
another message received from another NSA - and as soon as it is queued 
for NSI layer, an ACK should be sent back.  The MTL should *never* 
*EVER* (!!!) be examining the NSI context to evaluate the validity of a 
msg or acting at the NSI layer.   Doing so is a layer violation.   The 
ACKs are part of the MTL...not NSI. NSI assumes all messages are 
delivered and only enters error processing if/when the primitive 
Response() timeout occurs.
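
The queue-then-ACK behavior argued for above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name Mtl, the method receive(), and the literal "ACK" return value are hypothetical stand-ins for whatever the real transport binding provides.

```python
import queue


class Mtl:
    """Hypothetical message transport layer: queue first, then acknowledge."""

    def __init__(self) -> None:
        # FIFO queue drained later by the NSI layer.
        self.nsi_queue = queue.Queue()

    def receive(self, raw_message: bytes) -> str:
        """Accept one message from a peer NSA.

        The message is queued without being parsed or validated, and the
        ACK only says "received and queued". Confirmed/Failed responses
        are produced later by the NSI layer, not here.
        """
        self.nsi_queue.put(raw_message)
        return "ACK"
```

A Reserve followed immediately by a Provision would result in two receive() calls and two ACKs, with the messages sitting in the queue in arrival order.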

As for the autoprovision state or not...  That is a different issue.  If 
NSA Allen sends a Resv() to NSA Betty, immediately followed by a Prov() 
to Betty... IMHO, Betty should return a ReserveConfirm() as soon as 
resources are allocated, and a ProvisionConfirmed() as soon as Betty has 
evaluated the ProvisionReq() and noted in her db to autostart the 
connection.  Perhaps we can set a flag or a state in the 
ProvisionConfirmed response to indicate the circuit is in autostart 
mode, not "in service".     I am not certain we need another message 
when an AutoStart is performed and the circuit goes into service...that's 
why the circuit was autostarted in the first place - so the RA doesn't 
want or need to deal with such messaging.    If the RA wants, it can 
Query() the connection at the start time to see what the control plane 
thinks the state/status is...
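
One possible shape for the flag suggested above, as a hedged Python sketch. The state names, the Connection class, and the dict-shaped response are invented for illustration; the spec does not currently define an autostart state in ProvisionConfirmed.

```python
from dataclasses import dataclass
from enum import Enum


class ConnState(Enum):
    RESERVED = "reserved"
    AUTO_START = "auto_start"    # hypothetical: provisioned, waiting for start time
    IN_SERVICE = "in_service"


@dataclass
class Connection:
    connection_id: str
    start_time: float            # seconds since epoch, for simplicity
    state: ConnState = ConnState.RESERVED


def on_provision(conn: Connection, now: float) -> dict:
    """Answer a Provision request immediately instead of at start time."""
    if now < conn.start_time:
        conn.state = ConnState.AUTO_START   # noted in the db for autostart
    else:
        conn.state = ConnState.IN_SERVICE
    return {
        "type": "provisionConfirmed",
        "connectionId": conn.connection_id,
        "state": conn.state.value,
    }
```

The point of the sketch is only that the response goes back right away and carries enough state for the RA to tell autostart from in service, so nothing hangs for weeks.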

So I propose:
1) we need a clearer definition of the MTL API ...what the MTL should be 
doing,
2) we need to review the SM - I think we do not want Provision() 
responses hanging out for possibly weeks or months in this autoStart 
scenario.


>>
>> - People were asking for query message, to retrieve the reservation 
>> status at all partners and a path.
"People" ?   Anyone can Query() a ConnectionID.   This was solved in SLC 
I thought...   If they have proper Authorization credentials, they will 
receive a full dump of the entire tree. Soup to nuts.  If their 
credentials are not all powerful, they will get as far down the tree as 
their credentials will allow - and then the tree will stop.   They will 
however learn which NSA rejected the child Query().  With this, the RA 
can issue a directed Query() to that NSA presumably with different (more 
powerful) credentials and can fill out the tree.   If the RA does not 
have proper Authorization, he is SOL.   Sorry - that's why it's called 
"Authorization".   (:-)
>>
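
The Query() behavior described above (descend the tree as far as the credentials allow, and report which NSA refused) could look roughly like the following Python sketch. The Nsa class, the credential strings, and the dict-shaped result are assumptions for illustration, not the actual Query() message format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Nsa:
    name: str
    authorized: Set[str]                        # credentials this NSA accepts
    children: List["Nsa"] = field(default_factory=list)


def query(nsa: Nsa, credential: str) -> Dict:
    """Return the reservation tree visible to the given credential.

    Where the credential is refused the branch stops, but the refusing
    NSA is still named so the RA can send it a directed query later.
    """
    if credential not in nsa.authorized:
        return {"nsa": nsa.name, "rejected": True}
    return {
        "nsa": nsa.name,
        "rejected": False,
        "children": [query(child, credential) for child in nsa.children],
    }
```

A caller holding a credential accepted everywhere gets the whole tree; a weaker credential yields a partial tree whose leaves show where the refusals happened.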
>> - People were also interested if the same path will be configured for 
>> each provision message
I don't know that we are explicit in the spec about this...  I had 
always assumed that a detailed Path would be reserved at the time of the 
ReservationReq().   Though it could be modified up to start time as long 
as it met the original reserved service parameters.   I would assert 
that once the connection is provisioned that the path should be fixed 
until the end of the reservation.   A provision() should not change 
that.   In essence, the PA must meet the parameters specified in the 
reservation request, but is otherwise free to manage its resources as it 
sees fit up until the start of the reservation.  At the start of the 
reservation, the path is locked so that it will remain the same physical 
path until terminated.  Provisioning simply changes the state and 
perhaps should also shut the connection to prevent data flow, but does 
not otherwise change the configuration of the resources.
>> - protocol does not specify it, it’s actually up to domain, but in 
>> some cases it may be important (see Express scenario later)
>> - Express was also interested in specifying path for the reservation. 
>> E.g. they have one reservation now, and another in two months - two 
>> reservations need to go the same path. We can't solve it now. 
>> Express also experienced issues with any connections they get - they 
>> need to tune it before use. So they do that about 2 weeks before 
>> operations, and check again about 1 day or hours before sending data. 
>> This scenario is a real user scenario, which we should discuss. I've 
>> invited the guy (apology - I forgot his name, but I will find him at 
>> the venue during break) from Express for the March OGF meeting in 
>> Oxford so we could discuss it.
>>
IMO, This is an example of trying to recreate the problems we are trying 
to solve.  We are trying to break the circle of fear where Networks are 
allowed to get away with empty performance promises.    If the PA 
actually provides what they confirmed, then a quick test at the start of 
the reservation will verify it.  Indeed, such a test should be performed 
by both the PA(s) and the RA.   If the service is properly requested 
(User's responsibility) and properly provisioned (Network's 
responsibility) why would a connection then need "tuning" ?...it either 
functions as requested, or it does not.

This is really a change in mindset - a user should not have to "tune" 
the provider's delivered service.  The PA better &^%$& well know how to 
configure their equipment to meet their committed service levels.   And 
a PA should not ignore the importance of service provisioning details 
such as port buffers, burst characteristics, capacity interleaving, and 
the like and expect it to fly.  The networks ignore these details at 
their own risk.   The wise PA will have thoroughly tested as many 
possible service instance scenarios as possible prior to offering a 
service.  They are after all, *guaranteeing* the service.

However, there are two problems here: a) there is little pain imposed 
upon R&E providers when they miss their guaranteed service levels, and 
b) even a well intentioned provider - particularly early on - is likely 
to miss some aspects that may impact service...The PA is still at fault, 
but due to an honest oversight rather than incompetence or 
indifference.   I don't think we need to or should change the protocol 
to address competency issues in the operational or engineering staffs.   
If the user wants the same connection on a day to day or week to week 
basis, then reserve the connection and hold it...forever if need be.   
If the system is working correctly, tuning of the connection is not 
something that will be required.   If it does not work correctly, and 
the connection is flawed, then we need to figure out why - it's not 
likely the protocol.   I think we should take the network PA at its 
word: that a reservation confirmation means the PA has agreed to do 
this, and that we can rely on the PA to do so.   And we flame them well 
and hold their feet to the fire when they miss.   If the user does not 
trust the PA but continues to use the PA after the PA has repeatedly 
failed to deliver the confirmed performance, then the RA is not very 
smart...or the RA needs a bigger stick.

Thanks for reading so much technical stuff...
Jerry
>> Best regards
>> Radek
>>
>> ________________________________________________________________________
>> Radoslaw Krzywania                      Network Research and Development
>> radek.krzywania at man.poznan.pl        Poznan Supercomputing and
>>                                         Networking Center
>> +48 61 850 25 26                        http://www.man.poznan.pl
>> ________________________________________________________________________
>>
>>> -----Original Message-----
>>> From: nsi-wg-bounces at ogf.org [mailto:nsi-wg-bounces at ogf.org] On Behalf
>>> Of John MacAuley
>>> Sent: Wednesday, September 14, 2011 4:46 PM
>>> To: nsi-plugfest at m.aist.go.jp; nsi-wg at ogf.org WG
>>> Subject: [Nsi-wg] NSI Issues list – Rio de Janeiro Demo
>>>
>>> Peoples,
>>>
>>> Here are some of the notes I captured.
>>>
>>> The reservation XML message contains two element tags both named
>>> “reservation”.  The first is the reservation operation and the second
>>> is the reservation information contents.  This will need to be fixed
>>> in the types XSD.
>>>
>>> The “providerNSA” and “requesterNSA” elements in the NSI protocol header
>>> caused some confusion.  In the CS document we stated that this would be
>>> populated with the NSnetwork URN as defined in Salt Lake City. However,
>>> when the demo topology schema was created we introduced a new NSA
>>> object.  Some people used the URN of the NSA object in the “providerNSA”
>>> and “requesterNSA” elements.  We will need to make a definitive
>>> statement as we define the formal NSI topology.  Do we rename the
>>> elements to “providerNSnetwork” and “requesterNSnetwork” and continue
>>> using the NSnetwork URNs, or do we switch over to NSA URN?
>>>
>>> Demo topology definition was an issue, but in the end we had something
>>> workable for the demo.  Going forward we need to formalize NSI topology
>>> definition and how it relates to other schema such as DTOX and NML.  I am
>>> going to start working on the topology discovery/exchange problem to get
>>> us moving forward.
>>>
>>> Security presented interoperability issues as some people enforced
>>> authentication and authorization while others did not.  Some systems
>>> could not provide the needed credentials, which resulted in some
>>> incompatibility.  I believe we are clear on what we want to implement
>>> for NSA-to-NSA security, but we need to formalize the session security
>>> requirements so we can begin to implement in the NSAs.
>>>
>>> Namespace issues – this is not an issue with the protocol specification
>>> but with a few of the less mature SOAP web services toolkits.  They get
>>> confused between the interface and type namespaces for the protocol
>>> elements due to the use of “ref” to reference protocol operation
>>> elements from the types XSD in the interfaces WSDL.
>>>
>>> Now that we have had some demo experience with error handling, I think
>>> it is time to start formalizing values in the NsiExceptionType.  I have
>>> already identified ten generic error conditions, and I am sure there are
>>> many more.  As a follow up to this item, NSA implementations need to
>>> populate the NsiExceptionType.variables with additional information to
>>> help identify the problem that caused the rejection of the message.  I
>>> had done this in OpenDRAC so any errors returned to other NSA would
>>> contain information formatted as follows:
>>>
>>> <messageId>SVC0001</messageId>
>>> <text>Invalid or missing parameter</text>
>>> <variables>
>>>   <Attribute Name="replyTo"
>>>       NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic">
>>>     <AttributeValue xsi:type="xs:string"
>>>         xmlns:xs="http://www.w3.org/2001/XMLSchema"
>>>         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>         >&lt;null&gt;</AttributeValue>
>>>   </Attribute>
>>> </variables>
>>>
>>> Interface version discovery – I had discussed this with the group
>>> previously, but now that we are moving forward, up issuing NSI schema
>>> versions, and adding new interfaces we will need a mechanism for
>>> discovering the interfaces and versions supported by an NSA.  I will
>>> propose something formal when I get a chance.
>>>
>>> John.
>>> _______________________________________________
>>> nsi-wg mailing list
>>> nsi-wg at ogf.org
>>> http://www.ogf.org/mailman/listinfo/nsi-wg
>>
>>
>>
>