draft-ietf-grow-diverse-bgp-path-dist-07.txt   draft-ietf-grow-diverse-bgp-path-dist-08.txt 
GROW Working Group R. Raszuk, Ed. GROW Working Group R. Raszuk, Ed.
Internet-Draft NTT MCL Internet-Draft NTT MCL
Intended status: Informational R. Fernando Intended status: Informational R. Fernando
Expires: November 4, 2012 K. Patel Expires: January 2, 2013 K. Patel
Cisco Systems Cisco Systems
D. McPherson D. McPherson
Verisign Verisign
K. Kumaki K. Kumaki
KDDI Corporation KDDI Corporation
May 3, 2012 July 2012
Distribution of diverse BGP paths. Distribution of diverse BGP paths.
draft-ietf-grow-diverse-bgp-path-dist-07 draft-ietf-grow-diverse-bgp-path-dist-08
Abstract Abstract
The BGP4 protocol specifies the selection and propagation of a single The BGP4 protocol specifies the selection and propagation of a single
best path for each prefix. As defined and widely deployed today BGP best path for each prefix. As defined and widely deployed today BGP
has no mechanisms to distribute alternate paths which are not has no mechanisms to distribute alternate paths which are not
considered best path between its speakers. This behaviour results in considered best path between its speakers. This behaviour results in
a number of disadvantages for new applications and services. a number of disadvantages for new applications and services.
This document presents an alternative mechanism for solving the The main objective of this document is to observe that by simply
problem based on the concept of parallel route reflector planes. adding a new session between a route reflector and its client, an Nth best
Such planes can be built in parallel or they can co-exist on the path can be distributed. Document also compares existing solutions
current route reflection platforms. Document also compares existing and proposed ideas that enable distribution of more paths than just
solutions and proposed ideas that enable distribution of more paths the best path.
than just the best path.
This proposal does not specify any changes to the BGP protocol This proposal does not specify any changes to the BGP protocol
definition. It does not require upgrades to provider edge or core definition. It does not require software upgrade of provider edge
routers nor does it need network wide upgrades. routers acting as route reflector clients.
Status of this Memo Status of this Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on November 4, 2012. This Internet-Draft will expire on January 2, 2013.
Copyright Notice Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 3, line 14 skipping to change at page 3, line 14
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
2. History . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. History . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1. BGP Add-Paths Proposal . . . . . . . . . . . . . . . . . . 4 2.1. BGP Add-Paths Proposal . . . . . . . . . . . . . . . . . . 4
3. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 3. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
4. Multi plane route reflection . . . . . . . . . . . . . . . . . 6 4. Multi plane route reflection . . . . . . . . . . . . . . . . . 6
4.1. Co-located best and backup path RRs . . . . . . . . . . . 9 4.1. Co-located best and backup path RRs . . . . . . . . . . . 9
4.2. Randomly located best and backup path RRs . . . . . . . . 11 4.2. Randomly located best and backup path RRs . . . . . . . . 11
4.3. Multi plane route servers for Internet Exchanges . . . . . 13 4.3. Multi plane route servers for Internet Exchanges . . . . . 14
5. Discussion on current models of IBGP route distribution . . . 14 5. Discussion on current models of IBGP route distribution . . . 14
5.1. Full Mesh . . . . . . . . . . . . . . . . . . . . . . . . 14 5.1. Full Mesh . . . . . . . . . . . . . . . . . . . . . . . . 14
5.2. Confederations . . . . . . . . . . . . . . . . . . . . . . 15 5.2. Confederations . . . . . . . . . . . . . . . . . . . . . . 15
5.3. Route reflectors . . . . . . . . . . . . . . . . . . . . . 16 5.3. Route reflectors . . . . . . . . . . . . . . . . . . . . . 16
6. Deployment considerations . . . . . . . . . . . . . . . . . . 16 6. Deployment considerations . . . . . . . . . . . . . . . . . . 16
7. Summary of benefits . . . . . . . . . . . . . . . . . . . . . 18 7. Summary of benefits . . . . . . . . . . . . . . . . . . . . . 18
8. Applications . . . . . . . . . . . . . . . . . . . . . . . . . 19 8. Applications . . . . . . . . . . . . . . . . . . . . . . . . . 19
9. Security considerations . . . . . . . . . . . . . . . . . . . 19 9. Security considerations . . . . . . . . . . . . . . . . . . . 19
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 19 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20
11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 20 11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 20
12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 20 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 21
13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 20 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 21
13.1. Normative References . . . . . . . . . . . . . . . . . . . 20 13.1. Normative References . . . . . . . . . . . . . . . . . . . 21
13.2. Informative References . . . . . . . . . . . . . . . . . . 21 13.2. Informative References . . . . . . . . . . . . . . . . . . 21
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 22 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 22
1. Introduction 1. Introduction
Current BGP4 [RFC4271] protocol specification allows for the Current BGP4 [RFC4271] protocol specification allows for the
selection and propagation of only one best path for each prefix. The selection and propagation of only one best path for each prefix. The
BGP protocol as defined today has no mechanism to distribute other BGP protocol as defined today has no mechanism to distribute other
then best path between its speakers. This behaviour results in a than best path between its speakers. This behaviour results in a
number of problems in the deployment of new applications and number of problems in the deployment of new applications and
services. services.
This document presents an alternative mechanism for solving the This document presents a mechanism for solving the problem based on
problem based on the concept of parallel route reflector planes. It the conceptual creation of parallel route reflector planes. It also
also compares existing solutions and proposed ideas that enable compares existing solutions and proposes ideas that enable
distribution of more paths than just the best path. The parallel distribution of more paths than just the best path. The parallel
route reflector planes solution brings very significant benefits at a route reflector planes solution brings very significant benefits at a
negligible capex and opex deployment price as compared to the negligible capex and opex deployment price as compared to the
alternative techniques and is being considered by a number of network alternative techniques (full bgp mesh or add-paths) and is being
operators for deployment in their networks. considered by a number of network operators for deployment in their
networks.
This proposal does not specify any changes to the BGP protocol This proposal does not specify any changes to the BGP protocol
definition. It does not require upgrades to provider edge or core definition. It does not require upgrades to provider edge or core
routers nor does it need network wide upgrades. The only upgrade routers nor does it need network wide upgrades. The only upgrade
required is the new functionality on the new or current route required is the new functionality on the new or current route
reflectors. reflectors.
2. History 2. History
The need to disseminate more paths than just the best path is The need to disseminate more paths than just the best path is
primarily driven by three requirements. First is the problem of BGP primarily driven by three requirements. First is the problem of BGP
oscillations [I-D.ietf-idr-route-oscillation]. The second is the oscillations [RFC3345]. The second is the desire for faster
desire for faster reachability restoration in the event of network or reachability restoration in the event of network link or network
network element's failure. The third requirement is to enhance BGP load element's failure. The third requirement is to enhance BGP load
balancing capabilities. Those reasons have led to the proposal of balancing capabilities. Those reasons have led to the proposal of
BGP add-paths [I-D.ietf-idr-add-paths]. BGP add-paths [I-D.ietf-idr-add-paths].
2.1. BGP Add-Paths Proposal 2.1. BGP Add-Paths Proposal
As it has been proven that distribution of only the best path of a As it has been proven that distribution of only the best path of a
route is not sufficient to meet the needs of continuously growing route is not sufficient to meet the needs of continuously growing
number of services carried over BGP the add-paths proposal was number of services carried over BGP, the add-paths proposal was
submitted in 2002 to enable BGP to distribute more then one path. submitted in 2002 to enable BGP to distribute more than one path.
This is achieved by including as a part of the NLRI an additional This is achieved by including as a part of the NLRI an additional
four octet value called the Path Identifier. four octet value called the Path Identifier.
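The encoding change described above can be illustrated with a minimal sketch. It assumes the layout used by the add-paths proposal, in which the four octet Path Identifier is placed in front of the usual BGP4 length and prefix fields; the function names are hypothetical and only IPv4 prefixes are exercised.

```python
import ipaddress
import struct


def encode_nlri(prefix: str) -> bytes:
    """Encode a prefix as a plain BGP4 NLRI: length in bits, then prefix octets."""
    net = ipaddress.ip_network(prefix)
    n_octets = (net.prefixlen + 7) // 8  # only the significant octets are sent
    return bytes([net.prefixlen]) + net.network_address.packed[:n_octets]


def encode_nlri_with_path_id(prefix: str, path_id: int) -> bytes:
    """Prepend the four octet Path Identifier to the plain NLRI encoding."""
    return struct.pack("!I", path_id) + encode_nlri(prefix)
```

With this layout two paths for the same prefix differ only in the leading four octets, which is why a speaker must keep advertisement state per path rather than per prefix.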
The implication of this change on a BGP implementation is that it The implication of this change on a BGP implementation is that it
must now maintain per path, instead of per prefix, peer advertisement must now maintain per path, instead of per prefix, peer advertisement
state to track which of the peers each path was advertised to. This state to track to which of the peers a given path was advertised.
new requirement has its own memory and processing cost. Suffice to
say that it took over 9 years for some commercial BGP implementation This new requirement comes with its own memory and processing cost.
to support the new add-path behaviour in production code, in major
part due to this resource overhead.
An important observation is that distribution of more than one best An important observation is that distribution of more than one best
path by Autonomous System Border Routers (ASBRs) with multiple EBGP path by Autonomous System Border Routers (ASBRs) with multiple EBGP
peers attached to it where no "next hop self" is set may result in peers attached to it where no "next hop self" is set may result in
bestpath selection inconsistency within the autonomous system. bestpath selection inconsistency within the autonomous system.
Therefore it is also required to attach in the form of a new Therefore it is also required to attach in the form of a new
attribute the possible tie breakers and propagate those within the attribute the possible tie breakers and propagate those within the
domain. The example of such attribute for the purpose of fast domain. The example of such attribute for the purpose of fast
connectivity restoration to address that very case of ASBR injecting connectivity restoration to address that very case of ASBR injecting
multiple external paths into the IBGP mesh has been presented and multiple external paths into the IBGP mesh has been presented and
discussed in Fast Connectivity Restoration Using BGP Add-paths discussed in Fast Connectivity Restoration Using BGP Add-paths
[I-D.pmohapat-idr-fast-conn-restore] document. Based on the additionally [I-D.pmohapat-idr-fast-conn-restore] document. Based on the additionally
propagated information also best path selection is recommended to be propagated information also best path selection is recommended to be
modified to make sure that best and backup path selection within the modified to make sure that best and backup path selection within the
domain stays consistent. More discussion on this particular point domain stays consistent. More discussion on this particular point
will be contained in the deployment considerations section below. In will be contained in the deployment considerations section below. In
the proposed solution in this document we observe that in order to the proposed solution in this document we observe that in order to
address most of the applications just use of best external address most of the applications just use of best external
advertisement is required. For ASBRs which are peering to multiple advertisement is required. For ASBRs which are peering to multiple
upstream ASs setting "next hop self" is recommended. upstream domains setting "next hop self" is recommended.
The add paths protocol extensions have to be implemented by all the The add paths protocol extensions have to be implemented by all the
routers within an AS in order for the system to work correctly. It routers within an Autonomous System (AS) in order for the system to
remains quite a research topic to analyze benefits or risk associated work correctly. It remains quite a research topic to analyze
with partial add-paths deployments. The risk becomes even greater in benefits or risk associated with partial add-paths deployments. The
networks not using some form of edge to edge encapsulation. risk becomes even greater in networks not using some form of edge to
edge encapsulation.
The required code modifications include enhancements such as the Fast The required code modifications can offer the foundation for
Connectivity Restoration Using BGP Add-path enhancements such as the Fast Connectivity Restoration Using BGP Add-
[I-D.pmohapat-idr-fast-conn-restore]. The deployment of such path [I-D.pmohapat-idr-fast-conn-restore]. The deployment of such
technology in an entire service provider network requires software technology in an entire service provider network requires software
and perhaps sometimes in the cases of End-of-Engineering or End-of- and perhaps sometimes in the cases of End-of-Engineering or End-of-
Life equipment even hardware upgrades. Such operation may or may not Life equipment even hardware upgrades. Such operation may or may not
be economically feasible. Even if add-path functionality was be economically feasible. Even if add-path functionality was
available today on all commercial routing equipment and across all available today on all commercial routing equipment and across all
vendors, experience indicates that to achieve 100% deployment vendors, experience indicates that to achieve 100% deployment
coverage within any medium or large global network may easily take coverage within any medium or large global network may easily take
years. years.
While it needs to be clearly acknowledged that the add-path mechanism While it needs to be clearly acknowledged that the add-path mechanism
provides the most general way to address the problem of distributing provides the most general way to address the problem of distributing
many paths between BGP speakers, this document provides a much easier many paths between BGP speakers, this document provides a much easier
to deploy solution that requires no modification to the BGP protocol to deploy solution that requires no modification to the BGP protocol
where only a few additional paths may be required. The alternative where only a few additional paths may be required. The alternative
method presented is capable of addressing critical service provider method presented is capable of addressing critical service provider
requirements for disseminating more than a single path across an AS requirements for disseminating more than a single path across an AS
with a significantly lower deployment cost what in the light of set with a significantly lower deployment cost. That in the light of
general network scaling concerns documented in RFC4984 [RFC4984] number of general network scaling concerns documented in RFC4984
"Report from the IAB Workshop on Routing and Addressing" may provide [RFC4984] "Report from the IAB Workshop on Routing and Addressing"
a significant advantage. may provide a significant advantage.
3. Goals 3. Goals
The proposal described in this document is not intended to compete The proposal described in this document is not intended to compete
with add-paths. Instead if deployed it is to be used as a very easy with add-paths. It provides an interim solution until the
method to accommodate the majority of applications which may require standardization and implementation of add-paths and until support for
presence of alternative BGP exit points. that function can be deployed across the network.
It is presented to network operators as a possible choice and It is presented to network operators as a possible choice and
provides those operators who need additional paths today an provides those operators who need additional paths today an
alternative from the need to transition to a full mesh. alternative from the need to transition to a full mesh. The Nth best
path describes a set of N paths with different BGP next hops with no
implication of ordering or preference among said N paths.
It is intended as a way to buy more time allowing for a smoother and It is intended as a way to buy more time allowing for a smoother and
gradual migration where router upgrades will be required for perhaps gradual migration where router upgrades will be required for perhaps
different reasons. It will also allow the time required where different reasons. It will also allow the time required where
standard RP/RE memory size can easily accommodate the associated standard RP/RE memory size can easily accommodate the associated
overhead with other techniques without any compromises. overhead with other techniques without any compromises.
4. Multi plane route reflection 4. Multi plane route reflection
The idea contained in the proposal assumes the use of route The idea contained in the proposal assumes the use of route
reflection within the network. Other techniques as described in the reflection within the network.
following sections already provide means for distribution of
alternate paths today.
Let's observe today's picture of simple route reflected domain: Let's observe today's picture of simple route reflected domain:
ASBR3 ASBR3
*** ***
* * * *
+------------* *-----------+ +------------* *-----------+
| AS1 * * | | AS1 * * |
| *** | | *** |
| | | |
skipping to change at page 7, line 35 skipping to change at page 7, line 35
| *** *** | | *** *** |
| * * * * | | * * * * |
+-----* *---------* *----+ +-----* *---------* *----+
* * * * * * * *
*** *** *** ***
ASBR1 ASBR2 ASBR1 ASBR2
EBGP EBGP
Figure 1: Simple route reflection Figure 1: Simple route reflection
Abbreviations used: RR - Route Reflector, P - Core router.
Figure 1 shows an AS that is connected via EBGP peering at ASBR1 and Figure 1 shows an AS that is connected via EBGP peering at ASBR1 and
ASBR2 to an upstream AS or set of ASes. For a given destination "D" ASBR2 to an upstream AS or set of ASes. For a given destination "D"
ASBR1 and ASBR2 will each have an external path P1 and P2 ASBR1 and ASBR2 may have an external path P1 and P2 respectively.
respectively. The AS network uses two route reflectors RR1 and RR2 The AS network uses two route reflectors RR1 and RR2 for redundancy
for redundancy reasons. The route reflectors propagate the single reasons. The route reflectors propagate the single BGP best path for
BGP best path for each route to all clients. All ASBRs are clients each route to all clients. All ASBRs are clients of RR1 and RR2.
of RR1 and RR2.
Below are the possible cases of the path information that ASBR3 may Below are the possible cases of the path information that ASBR3 may
receive from route reflectors RR1 and RR2: receive from route reflectors RR1 and RR2:
1. When best path tie breaker is the IGP distance: When paths P1 and 1. When best path tie breaker is the IGP distance: When paths P1 and
P2 are considered to be equally good best path candidates the P2 are considered to be equally good best path candidates the
selection will depend on the distance of the path next-hops from selection will depend on the distance of the path next-hops from
the route reflector making the decision. Depending on the the route reflector making the decision. Depending on the
positioning of the route reflectors in the IGP topology they may positioning of the route reflectors in the IGP topology they may
choose the same best path or a different one. In such a case choose the same best path or a different one. In such a case
skipping to change at page 8, line 18 skipping to change at page 8, line 19
Preference: In this case only one path from preferred exit point Preference: In this case only one path from preferred exit point
ASBR will be available to RRs since the other peering ASBR will ASBR will be available to RRs since the other peering ASBR will
consider the IBGP path as best and will not announce (or if consider the IBGP path as best and will not announce (or if
already announced will withdraw) its own external path. The already announced will withdraw) its own external path. The
exception here is the use of BGP Best-External proposal which exception here is the use of BGP Best-External proposal which
will allow stated ASBR to still propagate to the RRs its own will allow stated ASBR to still propagate to the RRs its own
external path. Unfortunately RRs will not be able to distribute external path. Unfortunately RRs will not be able to distribute
it any further to other clients as only the overall best path it any further to other clients as only the overall best path
will be reflected. will be reflected.
There is no requirement of path ordering. The "Nth best path" really
describes a set of N paths with different BGP next hops.
The proposed solution is based on the use of additional route The proposed solution is based on the use of additional route
reflectors or new functionality enabled on the existing route reflectors or new functionality enabled on the existing route
reflectors that instead of distributing the best path for each route reflectors that instead of distributing the best path for each route
will distribute an alternative path other then best. The best path will distribute an alternative path other than best. The best path
(main) reflector plane distributes the best path for each route as it (main) reflector plane distributes the best path for each route as it
does today. The second plane distributes the second best path for does today. The second plane distributes the second best path for
each route and so on. Distribution of N paths for each route can be each route and so on. Distribution of N paths for each route can be
achieved by using N reflector planes. achieved by using N reflector planes.
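The per plane selection can be sketched as follows. This is a simplified illustration and not the behaviour of any particular implementation: the ranking uses only local preference, AS path length and next hop as an arbitrary final tie breaker, which is an assumption standing in for the full BGP best path algorithm, and the names Path, rank and path_for_plane are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Path(NamedTuple):
    next_hop: str     # BGP next hop of the path
    local_pref: int   # higher is preferred
    as_path_len: int  # shorter is preferred


def rank(paths: List[Path]) -> List[Path]:
    """Order candidates by a simplified best path comparison (assumption)."""
    return sorted(paths, key=lambda p: (-p.local_pref, p.as_path_len, p.next_hop))


def path_for_plane(paths: List[Path], plane: int) -> Optional[Path]:
    """Return the path that reflector plane `plane` (1 = best path plane) advertises.

    Paths sharing a next hop with an already selected path are skipped, so
    plane k gets the k-th path with a distinct next hop, or None if absent.
    """
    seen = set()
    distinct = []
    for p in rank(paths):
        if p.next_hop not in seen:
            seen.add(p.next_hop)
            distinct.append(p)
    return distinct[plane - 1] if 1 <= plane <= len(distinct) else None
```

A client peering with planes 1 through N then holds up to N paths with different next hops for the route, without any change to its own BGP code.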
As diverse-path functionality may be enabled on a per peer basis one As diverse-path functionality may be enabled on a per peer basis one
of the deployment models can be realized to continue advertisement of of the deployment models can be realized to continue advertisement of
overall best path from both route reflectors while in addition new overall best path from both route reflectors while in addition new
session can be provisioned to get additional path. That will allow session can be provisioned to get additional path. That will allow
the non interupted use of best path even if one of the RRs goes down the non interrupted use of best path even if one of the RRs goes down
provided that the overall best path is still a valid one. provided that the overall best path is still a valid one.
Each plane of route reflectors is a logical entity and may or may not Each plane of route reflectors is a logical entity and may or may not
be co-located with the existing best path route reflectors. Adding a be co-located with the existing best path route reflectors. Adding a
route reflector plane to a network may be as easy as enabling a route reflector plane to a network may be as easy as enabling a
logical router partition, new BGP process or just a new configuration logical router partition, new BGP process or just a new configuration
knob on an existing route reflector and configuring an additional knob on an existing route reflector and configuring an additional
IBGP session from the current clients if required. There are no code IBGP session from the current clients if required. There are no code
changes required on the route reflector clients for this mechanism to changes required on the route reflector clients for this mechanism to
work. It is easy to observe that the installation of one or more work. It is easy to observe that the installation of one or more
skipping to change at page 10, line 19 skipping to change at page 10, line 48
either as logical or physical new control plane RRs in the same either as logical or physical new control plane RRs in the same
IGP points as RR1 and RR2 respectively. IGP points as RR1 and RR2 respectively.
2. Enabling best-external on ASBRs 2. Enabling best-external on ASBRs
3. Enabling RR1' and RR2' for 2nd plane route reflection. 3. Enabling RR1' and RR2' for 2nd plane route reflection.
Alternatively instructing existing RR1 and RR2 to calculate also Alternatively instructing existing RR1 and RR2 to calculate also
2nd best path. 2nd best path.
4. Unless one of the existing RRs is turned to advertise only 4. Unless one of the existing RRs is turned to advertise only
diverse path to it's current clients configuring new ASBRs-RR' diverse path to its current clients configuring new ASBRs-RR'
IBGP sessions IBGP sessions
The expected behaviour is that under any BGP condition the ASBR3 and The expected behaviour is that under any BGP condition the ASBR3 and
P routers will receive both paths P1 and P2 for destination D. The P routers will receive both paths P1 and P2 for destination D. The
availability of both paths will allow them to implement a number of availability of both paths will allow them to implement a number of
new services as listed in the applications section below. new services as listed in the applications section below.
As an alternative to fully meshing all RRs and RRs' an operator who As an alternative to fully meshing all RRs and RRs' an operator who
has a large number of reflectors deployed today may choose to peer has a large number of reflectors deployed today may choose to peer
newly introduced RRs' to a hierarchical RR' which would be an IBGP newly introduced RRs' to a hierarchical RR' which would be an IBGP
interconnect point within the 2nd plane as well as between planes. interconnect point within the 2nd plane as well as between planes.
One of the deployment models of this scenario can be achieved by One of the deployment models of this scenario can be achieved by
simple upgrade of the existing route reflectors without the need to simple upgrade of the existing route reflectors without the need to
deploy any new logical or physical platforms. Such upgrade would deploy any new logical or physical platforms. Such upgrade would
allow route reflectors to service both upgraded to add-paths peers as allow route reflectors to service both upgraded to add-paths peers as
well as those peers which can not be immediately upgraded while at well as those peers which can not be immediately upgraded while at
the same time allowing to distribute more then single best path. The the same time allowing to distribute more than single best path. The
obvious protocol benefit of using existing RRs to distribute towards obvious protocol benefit of using existing RRs to distribute towards
their clients best and diverse bgp paths over different IBGP session their clients best and diverse bgp paths over different IBGP session
is the automatic assurance that such client would always get is the automatic assurance that such client would always get
different paths with their next hop being different. different paths with their next hop being different.
The way to accomplish this would be to create a separate IBGP session The way to accomplish this would be to create a separate IBGP session
for each N-th BGP path. Such session should be preferably terminated for each N-th BGP path. Such session should be preferably terminated
at a different loopback address of the route reflector. At the BGP at a different loopback address of the route reflector. At the BGP
OPEN stage of each such session a different bgp_router_id may be OPEN stage of each such session a different bgp_router_id may be
used. Correspondingly route reflector should also allow its clients used. Correspondingly route reflector should also allow its clients
skipping to change at page 13, line 7 skipping to change at page 13, line 7
reflectors. reflectors.
4. Enabling RR' or any of the existing RR for 2nd plane path 4. Enabling RR' or any of the existing RR for 2nd plane path
calculation calculation
5. If required fully meshing newly added RRs' with all the other 5. If required fully meshing newly added RRs' with all the other
reflectors in both planes. That condition does not apply if the reflectors in both planes. That condition does not apply if the
newly added RR'(s) already have peering to all ASBRs/PEs. newly added RR'(s) already have peering to all ASBRs/PEs.
6. Unless one of the existing RRs is turned to advertise only 6. Unless one of the existing RRs is turned to advertise only
diverse path to it's current clients configuring new ASBRs-RR' diverse path to its current clients configuring new ASBRs-RR'
IBGP sessions IBGP sessions
In this scenario the operator has the flexibility to introduce the In this scenario the operator has the flexibility to introduce the
new additional route reflector functionality on any existing or new new additional route reflector functionality on any existing or new
hardware in the network. Any of the existing routers that are not hardware in the network. Any of the existing routers that are not
already members of the best path route reflector plane can be easily already members of the best path route reflector plane can be easily
configured to serve the 2nd plane either via using a logical / configured to serve the 2nd plane either via using a logical /
virtual router partition or by having their bgp implementation virtual router partition or by having their bgp implementation
compliant to this specification. compliant to this specification.
skipping to change at page 13, line 45 skipping to change at page 13, line 45
As a result of this solution ASBR3 and other ASBRs peering to RR' As a result of this solution ASBR3 and other ASBRs peering to RR'
will be receiving the 2nd best path. will be receiving the 2nd best path.
Similarly to section 4.1 as an alternative to fully meshing all RRs & Similarly to section 4.1 as an alternative to fully meshing all RRs &
RRs' an operator who may have a large number of reflectors already RRs' an operator who may have a large number of reflectors already
deployed today may choose to peer newly introduced RRs' to a deployed today may choose to peer newly introduced RRs' to a
hierarchical RR' which would be an IBGP interconnect point between hierarchical RR' which would be an IBGP interconnect point between
planes. planes.
It is recommended that an implementation advertise the overall best
path over the Nth diverse-path session if no other BGP path with a
different next hop is present. That is equivalent to today's case
where a client is connected to more than one RR.
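The fallback recommendation above can be illustrated with a minimal sketch. It is not taken from the draft: the `Path` record, the single `preference` number standing in for the full BGP best path selection of [RFC4271], and the function name `diverse_path` are all hypothetical. The choice to fall back to the overall best path whenever fewer than N distinct next hops exist is an assumption that extends the N=2 wording to larger N.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Path:
    prefix: str
    next_hop: str
    # Hypothetical stand-in for the full BGP decision process:
    # a higher value means a more preferred path.
    preference: int


def diverse_path(paths: List[Path], n: int = 2) -> Path:
    """Pick the path to advertise on the Nth diverse-path session.

    Paths are ranked, then reduced to one path per distinct next hop.
    If fewer than n distinct next hops exist, the overall best path is
    advertised instead (assumed generalisation of the N=2 case).
    """
    if not paths:
        raise ValueError("no paths available for this prefix")
    if n < 1:
        raise ValueError("n must be at least 1")

    # sorted() is stable, so equally preferred paths keep input order.
    ranked = sorted(paths, key=lambda p: p.preference, reverse=True)

    distinct: List[Path] = []
    seen_next_hops = set()
    for path in ranked:
        if path.next_hop not in seen_next_hops:
            seen_next_hops.add(path.next_hop)
            distinct.append(path)

    if len(distinct) >= n:
        return distinct[n - 1]
    # No path with a different next hop: fall back to the overall best.
    return ranked[0]
```

With this sketch, a client peering with both the best-path plane and the 2nd plane receives two different next hops when they exist. When they do not, it receives the same path twice, which matches the existing behaviour of a client connected to more than one RR.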
4.3. Multi plane route servers for Internet Exchanges 4.3. Multi plane route servers for Internet Exchanges
Another group of devices where the proposed multi-plane architecture Another group of devices where the proposed multi-plane architecture
may be of particular applicability are EBGP route servers used at may be of particular applicability are EBGP route servers used at
many internet exchange points. many internet exchange points.
In such cases 100s of ISPs are interconnected on a common LAN. In such cases 100s of ISPs are interconnected on a common LAN.
Instead of having 100s of direct EBGP sessions on each exchange Instead of having 100s of direct EBGP sessions on each exchange
client, a single peering is created to the transparent route server. client, a single peering is created to the transparent route server.
The route server can only propagate a single best path. Mandating The route server can only propagate a single best path. Mandating
the upgrade for 100s of different service providers in order to the upgrade for 100s of different service providers in order to
implement add-path may be much more difficult as compared to asking implement add-path may be much more difficult as compared to asking
them for provisioning one new EBGP session to an Nth best-path route them for provisioning one new EBGP session to an Nth best-path route
server plane. That will allow to distribute more then single best server plane. That will allow to distribute more than the single
BGP path from a given route server to such IX peer. best BGP path from a given route server to such Internet Exchange
Point (IX) peer.
The solution proposed in this document fits very well with the The solution proposed in this document fits very well with the
requirement of having broader EBGP path diversity among the members requirement of having broader EBGP path diversity among the members
of any Internet Exchange Point. of any Internet Exchange Point.
5. Discussion on current models of IBGP route distribution 5. Discussion on current models of IBGP route distribution
In today's networks BGP4 operates as specified in [RFC4271] In today's networks BGP4 operates as specified in [RFC4271]
There are a number of technology choices for intra-AS BGP route There are a number of technology choices for intra-AS BGP route
skipping to change at page 14, line 49 skipping to change at page 15, line 10
historically there have been a number of challenges in realizing such historically there have been a number of challenges in realizing such
an IBGP full mesh in a large scale network. While some of these an IBGP full mesh in a large scale network. While some of these
challenges are no longer applicable today some may still apply, to challenges are no longer applicable today some may still apply, to
include the following: include the following:
1. Number of TCP sessions: The number of IBGP sessions on a single 1. Number of TCP sessions: The number of IBGP sessions on a single
router in a full mesh topology of a large scale service provider router in a full mesh topology of a large scale service provider
can easily reach 100s. While on hardware and software used in can easily reach 100s. While on hardware and software used in
the late 70s, 80s and 90s such numbers could be of concern, today the late 70s, 80s and 90s such numbers could be of concern, today
customer requirements for the number of BGP sessions per box are customer requirements for the number of BGP sessions per box are
reaching 1000s. This is already an order of magnitude more then reaching 1000s. This is already an order of magnitude more than
the potential number of IBGP sessions. Advancement in hardware the potential number of IBGP sessions. Advancement in hardware
and software used in production routers mean that running a full and software used in production routers mean that running a full
mesh of IBGP sessions should not be dismissed due to the mesh of IBGP sessions should not be dismissed due to the
resulting number of TCP sessions alone. resulting number of TCP sessions alone.
2. Provisioning: When operating and troubleshooting large networks 2. Provisioning: When operating and troubleshooting large networks
one of the top-most requirements is to keep the design as simple one of the top-most requirements is to keep the design as simple
as possible. When the autonomous systems network is composed of as possible. When the autonomous systems network is composed of
hundreds of nodes it becomes very difficult to manually provision hundreds of nodes it becomes very difficult to manually provision
a full mesh of IBGP sessions. Adding or removing a router a full mesh of IBGP sessions. Adding or removing a router
skipping to change at page 16, line 33 skipping to change at page 16, line 41
networks are exposed to a phenomenon called BGP path starvation which networks are exposed to a phenomenon called BGP path starvation which
essentially results in inability to deliver a number of applications essentially results in inability to deliver a number of applications
discussed later. discussed later.
The route reflection equivalent when interconnecting BGP speakers The route reflection equivalent when interconnecting BGP speakers
between domains is popularly called the Route Server and is globally between domains is popularly called the Route Server and is globally
deployed today in many internet exchange points. deployed today in many internet exchange points.
6. Deployment considerations 6. Deployment considerations
The diverse BGP path dissemination proposal allows the distribution Distribution of diverse BGP paths proposal allows the dissemination
of more paths than just the best-path to route reflector or route of more paths than just the best-path to route reflector or route
server clients of today's BGP4 implementations. As deployment server clients of today's BGP4 implementations. As deployment
recommendation it needs to be mentioned that fast connectivty recommendation it needs to be mentioned that fast connectivity
restoration as well as majority of intra-domain BGP level load restoration as well as majority of intra-domain BGP level load
balancing needs can be accomodated with only two paths (overall best balancing needs can be accommodated with only two paths (overall best
as well as second best). Therefore as deployment recommendation this as well as second best). Therefore as deployment recommendation this
document suggests use of N=2 with diverse-path. document suggests use of N=2 with diverse-path.
From the client's point of view receiving additional paths via From the client's point of view receiving additional paths via
separate IBGP sessions terminated at the new router reflector plane separate IBGP sessions terminated at the new router reflector plane
is functionally equivalent to constructing a full mesh peering is functionally equivalent to constructing a full mesh peering
without the set of problems that such a full mesh would come with, without the set of problems that such a full mesh would come with,
as discussed in an earlier section. as discussed in an earlier section.
By precisely defining the number of reflector planes, network By precisely defining the number of reflector planes, network
skipping to change at page 17, line 34 skipping to change at page 17, line 43
1. ASBRs advertising their single best external paths with no local- 1. ASBRs advertising their single best external paths with no local-
preference or multi-exit-discriminator present. preference or multi-exit-discriminator present.
2. ASBRs advertising their single best external paths with local- 2. ASBRs advertising their single best external paths with local-
preference or multi-exit-discriminator present and with BGP best- preference or multi-exit-discriminator present and with BGP best-
external functionality enabled. external functionality enabled.
3. ASBRs with multiple external paths. 3. ASBRs with multiple external paths.
Let's discuss the 3rd above case in more detail. This describes the This section focuses on discussion of the 3rd above case in more
scenario of a single ASBR connected to multiple EBGP peers. In detail. This describes the scenario of a single ASBR connected to
practice this peering scenario is quite common. It is mostly due to multiple EBGP peers. In practice this peering scenario is quite
the geographic location of EBGP peers and the diversity of those common. It is mostly due to the geographic location of EBGP peers
peers (for example peering to multiple tier 1 ISPs etc...). It is and the diversity of those peers (for example peering to multiple
not designed for failure recovery scenarios as single failure of the tier 1 ISPs etc...). It is not designed for failure recovery
ASBR would simultaneously result in loss of connectivity to all of scenarios as single failure of the ASBR would simultaneously result
the peers. In most medium and large geographically distributed in loss of connectivity to all of the peers. In most medium and
networks there is always another ASBR or multiple ASBRs providing large geographically distributed networks there is always another
peering backups, typically in other geographically diverse locations ASBR or multiple ASBRs providing peering backups, typically in other
in the network. geographically diverse locations in the network.
When an operator uses ASBRs with multiple peerings setting next hop When an operator uses ASBRs with multiple peerings setting next hop
self will effectively allow to locally repair the atomic failure of self will effectively allow to locally repair the atomic failure of
any external peer without any compromise to the data plane. The most any external peer without any compromise to the data plane. The most
common reason for not setting next hop self is traditionally the common reason for not setting next hop self is traditionally the
associated drawback of losing the ability to signal the external associated drawback of losing the ability to signal the external
failures of peering ASBRs or links to those ASBRs by fast IGP failures of peering ASBRs or links to those ASBRs by fast IGP
flooding. Such potential drawback can be easily avoided by using flooding. Such potential drawback can be easily avoided by using
different peering address from the address used for next hop mapping different peering address from the address used for next hop mapping
as well as removing such next hop from IGP at the last possible BGP as well as removing such next hop from IGP at the last possible BGP
skipping to change at page 18, line 29 skipping to change at page 18, line 39
suboptimal exit points is reasonable, so long as forwarding suboptimal exit points is reasonable, so long as forwarding
information loops are not introduced. In the meantime the BGP control information loops are not introduced. In the meantime the BGP control
plane will on its own re-advertise newly elected best external path, plane will on its own re-advertise newly elected best external path,
route reflector planes will calculate their Nth best paths and route reflector planes will calculate their Nth best paths and
propagate them to their clients. The result is that after seconds even if propagate them to their clients. The result is that after seconds even if
potential sub-optimality were encountered it will be quickly and potential sub-optimality were encountered it will be quickly and
naturally healed. naturally healed.
7. Summary of benefits 7. Summary of benefits
The diverse BGP path dissemination proposal provides the following Distribution of diverse BGP paths proposal provides the following
benefits when compared to the alternatives: benefits when compared to the alternatives:
1. No modifications to BGP4 protocol. 1. No modifications to BGP4 protocol.
2. No requirement for upgrades to edge and core routers. Backward 2. No requirement for upgrades to edge and core routers (as required
compatible with the existing BGP deployments. in draft-ietf-idr-add-paths-07). Backward compatible with the
existing BGP deployments.
3. Can be easily enabled by introduction of a new route reflector, 3. Can be easily enabled by introduction of a new route reflector,
route server plane dedicated to the selection and distribution of route server plane dedicated to the selection and distribution of
Nth best-path or just by new configuration of the upgraded Nth best-path or just by new configuration of the upgraded
current route reflector(s). current route reflector(s).
4. Does not require major modification to BGP implementations in the 4. Does not require major modification to BGP implementations in the
entire network which will result in an unnecessary increase of entire network which will result in an unnecessary increase of
memory and CPU consumption due to the shift from today's per memory and CPU consumption due to the shift from today's per
prefix to a per path advertisement state tracking. prefix to a per path advertisement state tracking.
skipping to change at page 19, line 17 skipping to change at page 19, line 27
8. Applications 8. Applications
This section lists the most common applications which require This section lists the most common applications which require
presence of redundant BGP paths: presence of redundant BGP paths:
1. Fast connectivity restoration where backup paths with alternate 1. Fast connectivity restoration where backup paths with alternate
exit points would be pre-installed as well as pre-resolved in the exit points would be pre-installed as well as pre-resolved in the
FIB of routers. That would allow for a local action upon FIB of routers. That would allow for a local action upon
reception of a critical event notification of network / node reception of a critical event notification of network / node
failure. This failure recovery mechaism based on the presence of failure. This failure recovery mechanism based on the presence
backup paths is also suitable for gracefully addressing scheduled of backup paths is also suitable for gracefully addressing
maintenane requirements as described in scheduled maintenance requirements as described in
[I-D.decraene-bgp-graceful-shutdown-requirements]. [I-D.decraene-bgp-graceful-shutdown-requirements].
2. Multi-path load balancing for both IBGP and EBGP. 2. Multi-path load balancing for both IBGP and EBGP.
3. BGP control plane churn reduction both intra-domain and inter- 3. BGP control plane churn reduction both intra-domain and inter-
domain. domain.
An important point to observe is that all of the above intra-domain An important point to observe is that all of the above intra-domain
applications are based on the use of reflector planes but are also applications are based on the use of reflector planes but are also
applicable in the inter-domain Internet exchange point examples. As applicable in the inter-domain Internet exchange point examples. As
discussed in section 4.3 an internet exchange can conceptually deploy discussed in section 4.3 an internet exchange can conceptually deploy
shadow route server planes each responsible for distribution of an shadow route server planes each responsible for distribution of an
Nth best path to its EBGP peers. In practice it may just amount to Nth best path to its EBGP peers. In practice it may just amount to
a short new configuration and establishment of new BGP sessions to IX a short new configuration and establishment of new BGP sessions to IX
peers. peers.
9. Security considerations 9. Security considerations
The new mechanism for diverse BGP path dissemination proposed in this The new mechanism for diverse BGP path dissemination proposed in this
document does not introduce any new security concerns as compared to document does not introduce any new security concerns as compared to
base BGP4 specification [RFC4271]. base BGP4 specification [RFC4271] especially when compared against
full iBGP mesh topology.
In addition, the authors observe that all BGP security issues described
in [RFC4272] apply to the additional BGP session or sessions
recommended by this specification. Therefore all recommended
mitigation techniques for BGP security are applicable here.
10. IANA Considerations 10. IANA Considerations
The new mechanism for diverse BGP path dissemination does not require Following [RFC5226] authors declare that the new mechanism for
any new allocations from IANA. diverse BGP path dissemination does not require any new allocations
from IANA.
11. Contributors 11. Contributors
The following people contributed significantly to the content of the The following people contributed significantly to the content of the
document: document:
Selma Yilmaz Selma Yilmaz
Cisco Systems Cisco Systems
170 West Tasman Drive 170 West Tasman Drive
San Jose, CA 95134 San Jose, CA 95134
US US
Email: seyilmaz@cisco.com Email: seyilmaz@cisco.com
Satish Mynam Satish Mynam
Cisco Systems Juniper Networks
170 West Tasman Drive 1194 N. Mathilda Ave
San Jose, CA 95134 Sunnyvale, CA 94089
US US
Email: mynam@cisco.com Email: smynam@juniper.net
Isidor Kouvelas Isidor Kouvelas
Cisco Systems Cisco Systems
170 West Tasman Drive 170 West Tasman Drive
San Jose, CA 95134 San Jose, CA 95134
US US
Email: kouvelas@cisco.com Email: kouvelas@cisco.com
12. Acknowledgments 12. Acknowledgments
The authors would like to thank Bruno Decraene, Bart Peirens, Eric The authors would like to thank Bruno Decraene, Bart Peirens, Eric
Rosen, Jim Uttaro, Renwei Li and Wes George for their valuable input. Rosen, Jim Uttaro, Renwei Li, Wes George and Adrian Farrel for their
valuable input.
The authors would also like to express special thanks to a number of The authors would also like to express special thanks to a number of
operators who helped to optimize the provided solution to be as close operators who helped to optimize the provided solution to be as close
as possible to their daily operational practices. Especially many as possible to their daily operational practices. Especially many
thanks go to Ted Seely, Shan Amante, Benson Schliesser and Seiichi thanks go to Ted Seely, Shan Amante, Benson Schliesser and Seiichi
Kawamura. Kawamura.
13. References 13. References
13.1. Normative References 13.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway [RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
Protocol 4 (BGP-4)", RFC 4271, January 2006. Protocol 4 (BGP-4)", RFC 4271, January 2006.
[RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route
Reflection: An Alternative to Full Mesh Internal BGP
(IBGP)", RFC 4456, April 2006.
[RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter, [RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
"Multiprotocol Extensions for BGP-4", RFC 4760, "Multiprotocol Extensions for BGP-4", RFC 4760,
January 2007. January 2007.
[RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an
IANA Considerations Section in RFCs", BCP 26, RFC 5226, IANA Considerations Section in RFCs", BCP 26, RFC 5226,
May 2008. May 2008.
13.2. Informative References 13.2. Informative References
[I-D.decraene-bgp-graceful-shutdown-requirements] [I-D.decraene-bgp-graceful-shutdown-requirements]
Decraene, B., Francois, P., pelsser, c., Ahmad, Z., and A. Decraene, B., Francois, P., pelsser, c., Ahmad, Z., and A.
Armengol, "Requirements for the graceful shutdown of BGP Armengol, "Requirements for the graceful shutdown of BGP
sessions", sessions",
draft-decraene-bgp-graceful-shutdown-requirements-01 (work draft-decraene-bgp-graceful-shutdown-requirements-01 (work
in progress), March 2009. in progress), March 2009.
[I-D.ietf-idr-add-paths] [I-D.ietf-idr-add-paths]
Walton, D., Chen, E., Retana, A., and J. Scudder, Walton, D., Chen, E., Retana, A., and J. Scudder,
"Advertisement of Multiple Paths in BGP", "Advertisement of Multiple Paths in BGP",
draft-ietf-idr-add-paths-06 (work in progress), draft-ietf-idr-add-paths-07 (work in progress), June 2012.
September 2011.
[I-D.ietf-idr-best-external] [I-D.ietf-idr-best-external]
Marques, P., Fernando, R., Chen, E., Mohapatra, P., and H. Marques, P., Fernando, R., Chen, E., Mohapatra, P., and H.
Gredler, "Advertisement of the best external route in Gredler, "Advertisement of the best external route in
BGP", draft-ietf-idr-best-external-05 (work in progress), BGP", draft-ietf-idr-best-external-05 (work in progress),
January 2012. January 2012.
[I-D.ietf-idr-route-oscillation]
McPherson, D., "BGP Persistent Route Oscillation
Condition", draft-ietf-idr-route-oscillation-01 (work in
progress), February 2002.
[I-D.pmohapat-idr-fast-conn-restore] [I-D.pmohapat-idr-fast-conn-restore]
Mohapatra, P., Fernando, R., Filsfils, C., and R. Raszuk, Mohapatra, P., Fernando, R., Filsfils, C., and R. Raszuk,
"Fast Connectivity Restoration Using BGP Add-path", "Fast Connectivity Restoration Using BGP Add-path",
draft-pmohapat-idr-fast-conn-restore-02 (work in draft-pmohapat-idr-fast-conn-restore-02 (work in
progress), October 2011. progress), October 2011.
[I-D.raszuk-idr-ibgp-auto-mesh] [I-D.raszuk-idr-ibgp-auto-mesh]
Raszuk, R., "IBGP Auto Mesh", Raszuk, R., "IBGP Auto Mesh",
draft-raszuk-idr-ibgp-auto-mesh-00 (work in progress), draft-raszuk-idr-ibgp-auto-mesh-00 (work in progress),
June 2003. June 2003.
[RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route [RFC3345] McPherson, D., Gill, V., Walton, D., and A. Retana,
Reflection: An Alternative to Full Mesh Internal BGP "Border Gateway Protocol (BGP) Persistent Route
(IBGP)", RFC 4456, April 2006. Oscillation Condition", RFC 3345, August 2002.
[RFC4984] Meyer, D., Zhang, L., and K. Fall, "Report from the IAB [RFC4272] Murphy, S., "BGP Security Vulnerabilities Analysis",
Workshop on Routing and Addressing", RFC 4984, RFC 4272, January 2006.
September 2007.
[RFC5065] Traina, P., McPherson, D., and J. Scudder, "Autonomous [RFC5065] Traina, P., McPherson, D., and J. Scudder, "Autonomous
System Confederations for BGP", RFC 5065, August 2007. System Confederations for BGP", RFC 5065, August 2007.
Authors' Addresses Authors' Addresses
Robert Raszuk (editor) Robert Raszuk (editor)
NTT MCL NTT MCL
101 S Ellsworth Avenue Suite 350 101 S Ellsworth Avenue Suite 350
San Mateo, CA 94401 San Mateo, CA 94401
 End of changes. 53 change blocks. 
110 lines changed or deleted 122 lines changed or added
