draft-ietf-grow-diverse-bgp-path-dist-08.txt   rfc6774.txt 
Internet Engineering Task Force (IETF)                    R. Raszuk, Ed.
Request for Comments: 6774                                       NTT MCL
Category: Informational                                      R. Fernando
ISSN: 2070-1721                                                 K. Patel
                                                           Cisco Systems
                                                            D. McPherson
                                                                Verisign
                                                               K. Kumaki
                                                        KDDI Corporation
                                                           November 2012

                   Distribution of Diverse BGP Paths
Abstract

The BGP4 protocol specifies the selection and propagation of a single
best path for each prefix. As defined and widely deployed today, BGP
has no mechanisms to distribute alternate paths that are not
considered best path between its speakers. This behavior results in
a number of disadvantages for new applications and services.

The main objective of this document is to observe that by simply
adding a new session between a route reflector and its client, the
Nth best path can be distributed. This document also compares
existing solutions and proposed ideas that enable distribution of
more paths than just the best path.

This proposal does not specify any changes to the BGP protocol
definition. It does not require a software upgrade of provider edge
(PE) routers acting as route reflector clients.
Status of This Memo

This document is not an Internet Standards Track specification; it is
published for informational purposes.

This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by the
Internet Engineering Steering Group (IESG). Not all documents
approved by the IESG are a candidate for any level of Internet
Standard; see Section 2 of RFC 5741.

Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
http://www.rfc-editor.org/info/rfc6774.
Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents

1. Introduction
2. History
   2.1. BGP Add-Paths Proposal
3. Goals
4. Multi-Plane Route Reflection
   4.1. Co-located Best- and Backup-Path RRs
   4.2. Randomly Located Best- and Backup-Path RRs
   4.3. Multi-Plane Route Servers for Internet Exchanges
5. Discussion on Current Models of IBGP Route Distribution
   5.1. Full Mesh
   5.2. Confederations
   5.3. Route Reflectors
6. Deployment Considerations
7. Summary of Benefits
8. Applications
9. Security Considerations
10. Contributors
11. Acknowledgments
12. References
    12.1. Normative References
    12.2. Informative References
1. Introduction

The current BGP4 protocol specification [RFC4271] allows for the
selection and propagation of only one best path for each prefix. As
defined today, the BGP protocol has no mechanism to distribute paths
other than best path between its speakers. This behavior results in
a number of problems in the deployment of new applications and
services.

This document presents a mechanism for solving the problem based on
the conceptual creation of parallel route-reflector planes. It also
compares existing solutions and proposes ideas that enable
distribution of more paths than just the best path. The parallel
route-reflector planes solution brings very significant benefits at a
negligible capex and opex deployment price as compared to the
alternative techniques (full BGP mesh or add-paths [ADD-PATHS]) and
is being considered by a number of network operators for deployment
in their networks.

This proposal does not specify any changes to the BGP protocol
definition. It does not require upgrades to provider edge or core
routers, nor does it need network-wide upgrades. The only upgrade
required is the new functionality on the new or current route
reflectors.
2. History

The need to disseminate more paths than just the best path is
primarily driven by three issues. The first is the problem of BGP
oscillations [RFC3345]. The second is the desire for faster
reachability restoration in the event of failure of the network link
or network element. The third is a need to enhance BGP load-
balancing capabilities. These issues have led to the proposal of BGP
add-paths [ADD-PATHS].
2.1. BGP Add-Paths Proposal

As it has been proven that distribution of only the best path of a
route is not sufficient to meet the needs of the continuously growing
number of services carried over BGP, the add-paths proposal was
submitted in 2002 to enable BGP to distribute more than one path.
This is achieved by including an additional four-octet value called
the "Path Identifier" as a part of the Network Layer Reachability
Information (NLRI).

The implication of this change on a BGP implementation is that it
must now maintain per-path, instead of per-prefix, peer
advertisement state to track the peers to which a given path was
advertised. This new requirement comes with its own memory and
processing cost.
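To make the encoding change and the per-path state concrete, the following is a minimal Python sketch. It assumes the wire layout later standardized for add-paths (a four-octet Path Identifier placed in front of the usual length and prefix octets) and handles IPv4 only. The names encode_nlri, advertised, and mark_advertised are hypothetical and do not come from this document.

```python
import ipaddress
import struct


def encode_nlri(prefix, path_id=None):
    """Encode one IPv4 NLRI entry as bytes.

    Assumption: with add-paths, a 4-octet Path Identifier is prepended
    to the classic (prefix length, prefix octets) encoding.
    """
    net = ipaddress.ip_network(prefix)
    significant = (net.prefixlen + 7) // 8      # only significant octets are sent
    body = bytes([net.prefixlen]) + net.network_address.packed[:significant]
    if path_id is None:
        return body                             # classic: (length, prefix)
    return struct.pack("!I", path_id) + body    # add-paths: (path id, length, prefix)


# Per-path advertisement state: the key is (prefix, path_id), not the prefix alone.
advertised = {}


def mark_advertised(prefix, path_id, peer):
    """Record that the given path of a prefix was advertised to a peer."""
    advertised.setdefault((prefix, path_id), set()).add(peer)


mark_advertised("10.1.0.0/16", 1, "ASBR3")
mark_advertised("10.1.0.0/16", 2, "ASBR3")
```

In this sketch one prefix with two paths produces two state entries per peer, which illustrates where the additional memory and processing cost comes from.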
An important observation is that distribution of more than one best
path by the Autonomous System Border Routers (ASBRs) with multiple
External BGP (EBGP) peers attached where no "next-hop self" is set
may result in inconsistent best-path selection within the autonomous
system. Therefore, it is also required to attach the possible
tiebreakers in the form of a new attribute and propagate those within
the domain. An example of such an attribute for the purpose of fast
connectivity restoration, addressing that very case of an ASBR
injecting multiple external paths into the Internal BGP (IBGP) mesh,
has been presented and discussed in "Advertisement of Multiple Paths
in BGP" [ADD-PATHS]. Based on the additionally propagated
information, best-path selection is recommended to be modified to
make sure that best- and backup-path selection within the domain
stays consistent. More discussion on this particular point is
contained in Section 6, "Deployment Considerations". In the solution
proposed in this document, we observe that to address most of the
applications, only the use of the best external advertisement is
required. For ASBRs that are peering to multiple upstream domains,
setting "next-hop self" is recommended.
The add-paths protocol extensions have to be implemented by all the
routers within an Autonomous System (AS) in order for the system to
work correctly. Analyzing the benefits or risks associated with
partial add-paths deployments remains quite a topic for research.
The risk becomes even greater in networks not using some form of
edge-to-edge encapsulation.

The required code modifications can offer the foundation for
enhancements, such as the "Fast Connectivity Restoration Using BGP
Add-path" [FAST-CONN]. The deployment of such technology in an
entire service-provider network requires software, and perhaps
sometimes, in the case of End-of-Engineering or End-of-Life
equipment, even hardware upgrades. Such an operation may or may not
be economically feasible. Even if add-path functionality was
available today on all commercial routing equipment and across all
vendors, experience indicates that it may easily take years to
achieve 100% deployment coverage within any medium or large global
network.
While it needs to be clearly acknowledged that the add-path mechanism
provides the most general way to address the problem of distributing
many paths between BGP speakers, this document provides a solution
that is much easier to deploy and requires no modification to the BGP
protocol where only a few additional paths may be required. The
alternative method presented is capable of addressing critical
service-provider requirements for disseminating more than a single
path across an AS with a significantly lower deployment cost. That,
in light of the number of general network scaling concerns documented
in RFC 4984 [RFC4984], "Report from the IAB Workshop on Routing and
Addressing", may provide a significant advantage.
3. Goals

The proposal described in this document is not intended to compete
with add-paths. It provides an interim solution until add-paths are
standardized and implemented and until support for that function can
be deployed across the network.

It is presented to network operators as a possible choice and gives
those operators who need additional paths today an alternative to
transitioning to a full mesh. The Nth best path describes a set of N
paths with different BGP next hops with no implication of ordering or
preference among said N paths.

It is intended as a way to buy more time, allowing for a smoother and
gradual migration where router upgrades will be required for,
perhaps, different reasons. It will also allow the time required so
that standard RP/RE memory size can easily accommodate the associated
overhead with other techniques without any compromises.
4. Multi-Plane Route Reflection

The idea contained in the proposal assumes the use of route
reflection within the network.

Let's observe today's picture of a simple route-reflected domain:

             ASBR3
              ***
             *   *
+------------*   *-----------+
| AS1        *   *           |
|             ***            |
|                            |
|                            |
|                            |
|            ...             |
|            IBGP            |
|                            |
|                            |
|      ***           ***     |
|     *   *         *   *    |
+-----*   *---------*   *----+
      *   *         *   *
       ***           ***
      ASBR1         ASBR2
      EBGP          EBGP

Figure 1: Simple route reflection

Abbreviations used:
RR - Route Reflector
P - Core router
Figure 1 shows an AS that is connected via EBGP peering at ASBR1 and
ASBR2 to an upstream AS or set of ASes. For a given destination "D",
ASBR1 and ASBR2 may have an external path P1 and P2, respectively.
The AS network uses two route reflectors, RR1 and RR2, for redundancy
reasons. The route reflectors propagate the single BGP best path for
each route to all clients. All ASBRs are clients of RR1 and RR2.

Following are the possible cases of the path information that ASBR3
may receive from route reflectors RR1 and RR2:
1. When the best-path tiebreaker is the IGP distance: When paths P1
   and P2 are considered to be equally good best-path candidates,
   the selection will depend on the distance of the path's next hops
   from the route reflector making the decision. Depending on the
   positioning of the route reflectors in the IGP topology, they may
   choose the same best path or a different one. In such a case,
   ASBR3 may receive either the same path or different paths from
   each of the route reflectors.

2. When the best-path tiebreaker is MULTI_EXIT_DISC (MED) or
   LOCAL_PREF: In this case, only one path from the preferred exit
   point ASBR will be available to RRs since the other peering ASBR
   will consider the IBGP path as best and will not announce (or if
   already announced will withdraw) its own external path. The
   exception here is the use of the BGP Best-External proposal
   [EXT-PATH], which will allow that ASBR to still propagate its own
   external path to the RRs. Unfortunately, RRs will not be able to
   distribute it any further to other clients, as only the overall
   best path will be reflected.
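The two cases above can be illustrated with a small Python sketch. It is a deliberately simplified stand-in for the BGP decision process: only LOCAL_PREF, MED, and the IGP distance to the next hop are compared, and MED is compared across all paths. The Path class, the best_path function, and the distance values are hypothetical illustration values, not part of this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    name: str
    next_hop: str
    local_pref: int = 100
    med: int = 0


def best_path(paths, igp_distance):
    """Simplified selection: higher LOCAL_PREF, then lower MED,
    then lower IGP distance from this reflector to the next hop."""
    return min(paths, key=lambda p: (-p.local_pref, p.med, igp_distance[p.next_hop]))


# IGP distances as seen from each route reflector (assumed values).
rr1_dist = {"ASBR1": 10, "ASBR2": 20}
rr2_dist = {"ASBR1": 20, "ASBR2": 10}

# Case 1: P1 and P2 tie until the IGP distance, so RR1 and RR2 may disagree.
p1 = Path("P1", "ASBR1")
p2 = Path("P2", "ASBR2")
case1 = (best_path([p1, p2], rr1_dist).name, best_path([p1, p2], rr2_dist).name)

# Case 2: LOCAL_PREF decides. Even if best-external delivers P2 to the RRs,
# both reflect only the overall best path P1.
p1_preferred = Path("P1", "ASBR1", local_pref=200)
case2 = (best_path([p1_preferred, p2], rr1_dist).name,
         best_path([p1_preferred, p2], rr2_dist).name)
```

With these assumed distances, ASBR3 learns both paths in case 1 only because the reflectors happen to sit at different points of the IGP topology. In case 2 it learns P1 from both reflectors and never sees P2.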
There is no requirement of path ordering. The "Nth best path" really
describes a set of N paths with different BGP next hops.

The proposed solution is based on the use of additional route
reflectors or new functionality enabled on the existing route
reflectors that, instead of distributing the best path for each
route, will distribute an alternative path other than best. The
best-path (main) reflector plane distributes the best path for each
route as it does today. The second plane distributes the second best
path for each route, and so on. Distribution of N paths for each
route can be achieved by using N reflector planes.
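One way to picture what the Nth reflector plane advertises is the following Python sketch. It is not a specified algorithm: it assumes candidate paths are ranked by some caller-supplied key and that paths sharing a BGP next hop with an already chosen path are skipped. The names diverse_paths, plane_advertisement, by_cost, and the cost values are hypothetical.

```python
def diverse_paths(paths, n, rank):
    """Return up to n paths with pairwise different BGP next hops, best first."""
    chosen = []
    seen_next_hops = set()
    for path in sorted(paths, key=rank):
        if path["next_hop"] in seen_next_hops:
            continue                      # same exit point, adds no diversity
        chosen.append(path)
        seen_next_hops.add(path["next_hop"])
        if len(chosen) == n:
            break
    return chosen


def plane_advertisement(paths, plane, rank):
    """Path reflected by the given plane (1 = best-path plane), or None."""
    picks = diverse_paths(paths, plane, rank)
    return picks[plane - 1] if len(picks) == plane else None


def by_cost(path):
    return path["cost"]


candidates = [
    {"name": "P1", "next_hop": "ASBR1", "cost": 10},
    {"name": "P1b", "next_hop": "ASBR1", "cost": 15},
    {"name": "P2", "next_hop": "ASBR2", "cost": 20},
]

first = plane_advertisement(candidates, 1, by_cost)
second = plane_advertisement(candidates, 2, by_cost)
third = plane_advertisement(candidates, 3, by_cost)
```

Here the second plane reflects P2 rather than P1b, because P1b shares its next hop with the best path. A third plane has nothing to advertise, since only two distinct next hops exist.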
As diverse-path functionality may be enabled on a per-peer basis, one
of the deployment models can be realized to continue advertisement of
the overall best path from both route reflectors, while in addition a
new session can be provisioned to get an additional path. This will
allow the uninterrupted use of the best path, even if one of the RRs
goes down, provided that the overall best path is still a valid one.

Each plane of the route reflectors is a logical entity and may or may
not be co-located with the existing best-path route reflectors.
Adding a route-reflector plane to a network may be as easy as
enabling a logical router partition, new BGP process, or just a new
configuration knob on an existing route reflector and configuring an
additional IBGP session from the current clients if required. There
are no code changes required on the route-reflector clients for this
mechanism to work. It is easy to observe that the installation of
one or more additional route-reflector control planes is much cheaper
and is easier than upgrading hundreds of route-reflector clients in
the entire network to support different BGP protocol encoding.
Diverse-path route reflectors need the new ability to calculate and
propagate the Nth best path instead of the overall best path. An
implementation is encouraged to enable this new functionality on a
per-neighbor basis.

While this is an implementation detail, the code to calculate the Nth
best path is also required by other BGP solutions. For example, in
the application of fast connectivity restoration, BGP must calculate
a backup path for installation into the Routing Information Base
(RIB) and Forwarding Information Base (FIB) ahead of the actual
failure.

To address the problem of external paths not being available to route
reflectors due to LOCAL_PREF or MED factors, it is recommended that
ASBRs enable [EXT-PATH] functionality in order to always inject their
external paths to the route reflectors.
4.1. Co-located Best- and Backup-Path RRs

To simplify the description, let's assume that we only use two route-
reflector planes (N=2). When co-located, the additional second-best-
path reflectors are connected to the network at the same points from
the perspective of the IGP as the existing best-path RRs. Let's also
assume that best-external functionality is enabled on all ASBRs.

             ASBR3
              ***
             *   *
+------------*   *-----------+
| AS1        *   *           |
|             ***            |
|                            |
|  RR1                  RR2  |
|  ***                  ***  |
|            ...             |
|                            |
|      ***           ***     |
|     *   *         *   *    |
+-----*   *---------*   *----+
      *   *         *   *
       ***           ***
      ASBR1         ASBR2
      EBGP          EBGP

Figure 2: Co-located Second-Best-Path RR Plane
The following is a list of configuration changes required to enable
the second-best-path route-reflector plane:

1. Unless the same RR1/RR2 platform is being used, adding RR1' and
   RR2' either as the logical or physical new control-plane RRs in
   the same IGP points as RR1 and RR2, respectively.

2. Enabling best-external functionality on ASBRs.

3. Enabling RR1' and RR2' for second plane route reflection.
   Alternatively, instructing existing RR1 and RR2 to calculate the
   second-best path also.

4. Unless one of the existing RRs is set to advertise only diverse
   path to its current clients, configuring new ASBRs-RR' IBGP
   sessions.
The expected behavior is that under any BGP condition, the ASBR3 and
P routers will receive both paths P1 and P2 for destination D. The
availability of both paths will allow them to implement a number of
new services as listed in Section 8 ("Applications").
As an alternative to fully meshing all RRs and RRs' an operator who As an alternative to fully meshing all RRs and RRs', an operator that
has a large number of reflectors deployed today may choose to peer has a large number of reflectors deployed today may choose to peer
newly introduced RRs' to a hierarchical RR' which would be an IBGP newly introduced RRs' to a hierarchical RR', which would be an IBGP
interconnect point within the 2nd plane as well as between planes. interconnect point within the second plane as well as between planes.
One of the deployment model of this scenario can be achieved by One deployment model of this scenario can be achieved by simply
simple upgrade of the existing route reflectors without the need to upgrading the existing route reflectors without deploying any new
deploy any new logical or physical platforms. Such upgrade would logical or physical platforms. Such an upgrade would allow route
allow route reflectors to service both upgraded to add-paths peers as reflectors to service both peers that have upgraded to add-paths, as
well as those peers which can not be immediately upgraded while in well as those peers that cannot be immediately upgraded while at the
the same time allowing to distribute more than single best path. The same time allowing distribution of more than a single best path. The
obvious protocol benefit of using existing RRs to distribute towards obvious protocol benefit of using existing RRs to distribute best and
their clients best and diverse bgp paths over different IBGP session diverse BGP paths towards their clients over different IBGP
is the automatic assurance that such client would always get sessions is the automatic assurance that such a client would always
different paths with their next hop being different. get different paths with their next hop being different.
The way to accomplish this would be to create a separate IBGP session The way to accomplish this would be to create a separate IBGP session
for each N-th BGP path. Such session should be preferably terminated for each Nth BGP path. Such a session should be preferably
at a different loopback address of the route reflector. At the BGP terminated at a different loopback address of the route reflector.
OPEN stage of each such session a different bgp_router_id may be At the BGP OPEN stage of each such session, a different bgp_router_id
used. Correspondingly route reflector should also allow its clients may be used. Correspondingly, the route reflector should also allow
to use the same bgp_router_id on each such session. its clients to use the same bgp_router_id on each such session.
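The per-path session layout described above can be sketched as a small planning helper. This is only an illustration under stated assumptions: the function name plan_sessions, the dictionary keys, and the example addresses (taken from the 192.0.2.0/24 documentation range) are hypothetical and do not come from this document or from any BGP implementation.

```python
from ipaddress import IPv4Address


def plan_sessions(client, rr_loopbacks):
    """Return one IBGP session descriptor per plane (hypothetical helper).

    Plane N (1 = best path, 2 = second-best path, ...) is terminated at the
    N-th loopback address of the route reflector, so that every Nth path
    travels over its own session.
    """
    addresses = [IPv4Address(a) for a in rr_loopbacks]
    # Each plane should terminate at a different loopback address.
    if len(set(addresses)) != len(addresses):
        raise ValueError("each plane needs its own loopback address")
    return [
        {"client": client, "plane": n, "rr_address": addr}
        for n, addr in enumerate(addresses, start=1)
    ]


# Example: one client, two planes on the same route reflector.
for session in plan_sessions("ASBR3", ["192.0.2.1", "192.0.2.2"]):
    print(session["client"], session["plane"], session["rr_address"])
```

The sketch only models addressing; whether a different bgp_router_id is sent in each OPEN message is left to the implementation, as the text above allows.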
4.2. Randomly located best and backup path RRs 4.2. Randomly Located Best- and Backup-Path RRs
Now let's consider a deployment case where an operator wishes to Now let's consider a deployment case in which an operator wishes to
enable a 2nd RR' plane using only a single additional router in a enable a second RR' plane using only a single additional router in a
different network location to his current route reflectors. This different network location from his current route reflectors. This
model would be of particular use in networks where some form of end- model would be of particular use in networks in which some form of
to-end encapsulation (IP or MPLS) is enabled between provider edge end-to-end encapsulation (IP or MPLS) is enabled between provider-
routers. edge routers.
Note that this model of operation assumes that the present best path Note that this model of operation assumes that the present best-path
route reflectors are only control plane devices. If the route route reflectors are only control-plane devices. If the route
reflector is in the data forwarding path then the implementation must reflector is in the data-forwarding path, then the implementation
be able to clearly separate the Nth best-path selection from the must be able to clearly separate the Nth best-path selection from the
selection of the paths to be used for data forwarding. The basic selection of the paths to be used for data forwarding. The basic
premise of this mode of deployment assumes that all reflector planes premise of this mode of deployment assumes that all reflector planes
have the same information to choose from which includes the same set have the same information to choose from, which includes the same set
of BGP paths. It also requires the ability to ignore the step of of BGP paths. It also requires the ability to ignore the step of
comparison of the IGP metric to reach the bgp next hop during best- comparison of the IGP metric to reach the BGP next hop during best-
path calculation. path calculation.
ASBR3 ASBR3
*** ***
* * * *
+------------* *-----------+ +------------* *-----------+
| AS1 * * | | AS1 * * |
| IBGP *** | | IBGP *** |
| | | |
| *** | | *** |
skipping to change at page 12, line 32 skipping to change at page 11, line 32
| *** | | *** |
| *** *** | | *** *** |
| * * * * | | * * * * |
+-----* *---------* *----+ +-----* *---------* *----+
* * * * * * * *
*** *** *** ***
ASBR1 ASBR2 ASBR1 ASBR2
EBGP EBGP
Figure3: Experimental deployment of 2nd best RR Figure 3: Experimental Deployment of Second-Best-Path RR Plane
The following is a list of configuration changes required to enable The following is a list of configuration changes required to enable
the 2nd best path route reflector RR' as a single platform or to the second-best-path route reflector RR' as a single platform or to
enable one of the existing control plane RRs for diverse-path enable one of the existing control-plane RRs for diverse-path
functionality: functionality:
1. If needed adding RR' logical or physical as new route reflector 1. If needed, adding RR' logical or physical as a new route
anywhere in the network reflector anywhere in the network.
2. Enabling best-external on ASBRs 2. Enabling best-external functionality on ASBRs.
3. Disabling IGP metric check in BGP best path on all route 3. Disabling IGP metric check in BGP best path on all route
reflectors. reflectors.
4. Enabling RR' or any of the existing RR for 2nd plane path 4. Enabling RR' or any of the existing RR for second plane path
calculation calculation.
5. If required fully meshing newly added RRs' with the all other 5. If required, fully meshing newly added RRs' with all the other
reflectors in both planes. That condition does not apply if the reflectors in both planes. This condition does not apply if the
newly added RR'(s) already have peering to all ASBRs/PEs. newly added RR'(s) already have peering to all ASBRs/PEs.
6. Unless one of the existing RRs is turned to advertise only 6. Configuring new IBGP sessions between ASBRs and RR' (unless one of
diverse path to its current clients configuring new ASBRs-RR' the existing RRs is set to advertise only diverse path to its
IBGP sessions current clients).
In this scenario the operator has the flexibility to introduce the In this scenario, the operator has the flexibility to introduce the
new additional route reflector functionality on any existing or new new additional route-reflector functionality on any existing or new
hardware in the network. Any of the existing routers that are not hardware in the network. Any existing routers that are not already
already members of the best path route reflector plane can be easily members of the best-path route-reflector plane can be easily
configured to serve the 2nd plane either via using a logical / configured to serve the second plane either by using a
virtual router partition or by having their bgp implementation logical/virtual router partition or by having their BGP
compliant to this specification. implementation compliant to this specification.
Even if the IGP metric is not taken into consideration when comparing Even if the IGP metric is not taken into consideration when comparing
paths during the bestpath calculation, an implementation still has to paths during the best-path calculation, an implementation still has
consider paths with unreachable nexthops as invalid. It is worth to consider paths with unreachable next hops invalid. It is worth
pointing out that some implementations today already allow for pointing out that some implementations today already allow for
configuration which results in no IGP metric comparison during the configuration that results in no IGP metric comparison during the
best path calculation. best-path calculation.
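A minimal sketch of this rule follows, assuming a heavily simplified decision process (LOCAL_PREF, then AS_PATH length, then IGP metric, then a router-ID tie-break). The Path fields, the best_path function, and the ignore_igp_metric flag are hypothetical names used only for illustration. The full decision process of [RFC4271] has more steps (for example, MED and origin) that are omitted here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Path:
    next_hop: str
    local_pref: int
    as_path_len: int
    igp_metric: Optional[int]  # None models a next hop unreachable in the IGP
    router_id: str


def best_path(paths: List[Path], ignore_igp_metric: bool = True) -> Optional[Path]:
    # Paths with an unreachable next hop are invalid even when the IGP
    # metric comparison itself is skipped.
    valid = [p for p in paths if p.igp_metric is not None]
    if not valid:
        return None

    def rank(p: Path):
        # When the IGP step is ignored, every path gets the same value there.
        igp = 0 if ignore_igp_metric else p.igp_metric
        return (-p.local_pref, p.as_path_len, igp, p.router_id)

    return min(valid, key=rank)


paths = [
    Path("ASBR1", 100, 3, 50, "1"),
    Path("ASBR2", 100, 3, 10, "2"),
    Path("ASBR3", 200, 1, None, "3"),  # best attributes, but unreachable
]
print(best_path(paths, ignore_igp_metric=True).next_hop)   # ASBR1
print(best_path(paths, ignore_igp_metric=False).next_hop)  # ASBR2
```

With the IGP step ignored, reflectors in different network locations that hold the same set of paths select the same result, which is the premise of this deployment model.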
The additional planes of route reflectors do not need to be fully The additional planes of route reflectors do not need to be as fully
redundant as the primary one does. If we are preparing for a single redundant as the primary plane. If we are preparing for a
network failure event, a failure of a non backed up N-th best-path single network failure event, a failure of a non-backed-up Nth best-
route reflector would not result in an connectivity outage of the path route reflector would not result in a connectivity outage of the
actual data plane. The reason is that this would at most affect the actual data plane. The reason is that this would, at most, affect
presence of a backup path (not an active one) on same parts of the the presence of a backup path (not an active one) on some parts
network. If the operator chooses to create the N-th best path plane of the network. If the operator chooses to create the Nth best-path
redundantly by installing not one, but two or more route reflectors plane redundantly by installing not one, but two or more route
serving each additional plane the additional robustness will be reflectors serving each additional plane, the additional robustness
achieved. will be achieved.
As a result of this solution ASBR3 and other ASBRs peering to RR' As a result of this solution, ASBR3 and other ASBRs peering to RR'
will be receiving the 2nd best path. will be receiving the second best path.
Similarly to section 4.1 as an alternative to fully meshing all RRs & Similarly to Section 4.1, as an alternative to fully meshing all RRs
RRs' an operator who may have a large number of reflectors already and diverse path RRs', operators may choose to peer newly introduced
deployed today may choose to peer newly introduced RRs' to a RRs' to a hierarchical RR', which would be an IBGP interconnect point
hierarchical RR' which would be an IBGP interconnect point between between planes.
planes.
It is recommended that an implementation will advertise overall best It is recommended that an implementation advertise the overall best
path over Nth diverse-path session if there is no other BGP path with path over the Nth diverse-path session if there is no other BGP path
different next hop present. That is equivalent to today's case where with a different next hop present. This is equivalent to today's
client is connected to more than one RR. case where the client is connected to more than one RR.
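The recommendation above can be sketched as follows. The sketch assumes that the Nth diverse path is the Nth entry once only the first (best-ranked) path per distinct next hop is kept; that reading, the function name nth_diverse_path, and the (next_hop, label) tuples are assumptions made for illustration.

```python
def nth_diverse_path(ranked_paths, n):
    """Pick the path for the Nth plane session (n starts at 1).

    ranked_paths is a list of (next_hop, label) tuples already sorted from
    best to worst by the normal best-path calculation.
    """
    if not ranked_paths:
        return None
    diverse = []
    seen_next_hops = set()
    for path in ranked_paths:
        next_hop = path[0]
        # Keep only the best-ranked path for each distinct next hop.
        if next_hop not in seen_next_hops:
            seen_next_hops.add(next_hop)
            diverse.append(path)
    # Fall back to the overall best path when no Nth diverse path exists.
    return diverse[n - 1] if n <= len(diverse) else ranked_paths[0]


ranked = [("10.0.0.1", "P1"), ("10.0.0.1", "P1b"), ("10.0.0.2", "P2")]
print(nth_diverse_path(ranked, 2))                   # ('10.0.0.2', 'P2')
print(nth_diverse_path([("10.0.0.1", "P1")], 2))     # ('10.0.0.1', 'P1')
```

The fallback in the second call mirrors the case of a client connected to more than one RR today: it receives the same best path on both sessions.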
4.3. Multi plane route servers for Internet Exchanges 4.3. Multi-Plane Route Servers for Internet Exchanges
Another group of devices where the proposed multi-plane architecture Another group of devices in which the proposed multi-plane
may be of particular applicability are EBGP route servers used at architecture may be of particular applicability is the EBGP route
many of internet exchange points. servers used at many Internet exchange points.
In such cases 100s of ISPs are interconnected on a common LAN. In such cases, hundreds of ISPs are interconnected on a common LAN.
Instead of having 100s of direct EBGP sessions on each exchange Instead of having hundreds of direct EBGP sessions on each exchange
client, a single peering is created to the transparent route server. client, a single peering is created to the transparent route server.
The route server can only propagate a single best path. Mandating The route server can only propagate a single best path. Mandating
the upgrade for 100s of different service providers in order to the upgrade for hundreds of different service providers in order to
implement add-path may be much more difficult as compared to asking implement add-path may be much more difficult as compared to asking
them for provisioning one new EBGP session to an Nth best-path route them to provision one new EBGP session to an Nth best path route
server plane. That will allow to distribute more than the single server plane. This allows the distribution of more than the single
best BGP path from a given route server to such Internet Exchange best BGP path from a given route server to such an Internet exchange
Point (IX) peer. point (IX) peer.
The solution proposed in this document fits very well with the The solution proposed in this document fits very well with the
requirement of having broader EBGP path diversity among the members requirement of having broader EBGP path diversity among the members
of any Internet Exchange Point. of any Internet exchange point.
5. Discussion on current models of IBGP route distribution 5. Discussion on Current Models of IBGP Route Distribution
In today's networks BGP4 operates as specified in [RFC4271] In today's networks, BGP4 operates as specified in [RFC4271].
There are a number of technology choices for intra-AS BGP route There are a number of technology choices for intra-AS BGP route
distribution: distribution:
1. Full mesh 1. Full mesh
2. Confederations 2. Confederations
3. Route reflectors 3. Route reflectors
5.1. Full Mesh 5.1. Full Mesh
A full mesh, the most basic iBGP architecture, exists when all the A full mesh, the most basic IBGP architecture, exists when all BGP
BGP speaking routers within the AS peer directly with all other BGP speaking routers within the AS peer directly with all other BGP
speaking routers within the AS, irrespective of where a given router speaking routers within the AS, irrespective of where a given router
resides within the AS (e.g., P router, PE router, etc..). resides within the AS (e.g., P router, PE router, etc.).
While this is the simplest intra-domain path distribution method, While this is the simplest intra-domain path-distribution method,
historically there have been a number of challenges in realizing such historically, there have been a number of challenges in realizing
an IBGP full mesh in a large scale network. While some of these such an IBGP full mesh in a large-scale network. While some of these
challenges are no longer applicable today some may still apply, to challenges are no longer applicable, the following (as well as
include the following: others) may still apply:
1. Number of TCP sessions: The number of IBGP sessions on a single 1. Number of TCP sessions: The number of IBGP sessions on a single
router in a full mesh topology of a large scale service provider router in a full-mesh topology of a large-scale service provider
can easily reach 100s. While on hardware and software used in can easily reach hundreds. Such numbers could be a concern on
the late 70s, 80s and 90s such numbers could be of concern, today hardware and software used in the late 70s, 80s, and 90s. Today,
customer requirements for the number of BGP sessions per box are customer requirements for the number of BGP sessions per box are
reaching 1000s. This is already an order of magnitude more than reaching thousands. This is already an order of magnitude more
the potential number of IBGP sessions. Advancement in hardware than the potential number of IBGP sessions. Advancements in the
and software used in production routers mean that running a full hardware and software used in production routers mean that
mesh of IBGP sessions should not be dismissed due to the running a full mesh of IBGP sessions should not be dismissed due
resulting number of TCP sessions alone. to the resulting number of TCP sessions alone.
2. Provisioning: When operating and troubleshooting large networks 2. Provisioning: When operating and troubleshooting large networks,
one of the top-most requirements is to keep the design as simple one of the topmost requirements is to keep the design as simple
as possible. When the autonomous systems network is composed of as possible. When the autonomous system's network is composed of
hundreds of nodes it becomes very difficult to manually provision hundreds of nodes, it becomes very difficult to manually
a full mesh of IBGP sessions. Adding or removing a router provision a full mesh of IBGP sessions. Adding or removing a
requires reconfiguration of all the other routers in the AS. router requires reconfiguration of all other routers in the AS.
While this is a real concern today there is already work in While this is a real concern today, there is already work in
progress in the IETF to define IBGP peering automation through an progress in the IETF to define IBGP peering automation through an
IBGP Auto Discovery [I-D.raszuk-idr-ibgp-auto-mesh] mechanism. IBGP Auto Discovery mechanism [AUTO-MESH].
3. Number of paths: Another concern when deploying a full IBGP mesh 3. Number of paths: Another concern when deploying a full IBGP mesh
is the number of BGP paths for each route that have to be stored is the number of BGP paths for each route that have to be stored
at every node. This number is very tightly related to the number at every node. This number is very tightly related to the number
of external peerings of an AS, the use of local preference or of external peerings of an AS, the use of LOCAL_PREF or MED
multi-exit-discriminator techniques and the presence of best- techniques, and the presence of best-external [EXT-PATH]
external [I-D.ietf-idr-best-external] advertisement advertisement configuration. If we make a rough assumption that
configuration. If we make a rough assumption that the BGP4 path the BGP4-path data structure consumes about 80-100 bytes, the
data structure consumes about 80-100 bytes the resulting control resulting control-plane memory requirement for 500,000 IPv4
plane memory requirement for 500,000 IPv4 routes with one routes with one additional external path is 38-48 MB, while for 1
additional external path is 38-48 MB while for 1 million IPv4 million IPv4 routes, it grows linearly to 76-95 MB. It is not
routes it grows linearly to 76-95 MB. It is not possible to possible to reach a general conclusion on whether this condition is
reach a general conclusion if this condition is negligible or if negligible or a show stopper for a full-mesh deployment
it is a show stopper for a full mesh deployment without direct without direct reference to a given network.
reference to a given network.
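The memory figures in item 3 can be reproduced with a short calculation. The sketch assumes that "MB" means 2^20 bytes and that exactly one additional path is stored per route; the function name path_memory_mb is hypothetical.

```python
def path_memory_mb(routes, bytes_per_path, extra_paths=1):
    """Control-plane memory (in units of 2**20 bytes) for the extra paths."""
    return routes * extra_paths * bytes_per_path / 2**20


for routes in (500_000, 1_000_000):
    low = path_memory_mb(routes, 80)
    high = path_memory_mb(routes, 100)
    print(f"{routes} routes: {low:.1f}-{high:.1f} MB")
# 500000 routes: 38.1-47.7 MB
# 1000000 routes: 76.3-95.4 MB
```

The growth is linear in both the number of routes and the number of additional paths kept per route, so a second additional path would double these values.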
To summarize, a full mesh IBGP peering can offer natural To summarize, a full-mesh IBGP peering can offer natural
dissemination of multiple external paths among BGP speakers. When dissemination of multiple external paths among BGP speakers. When
realized with the help of IBGP Auto Discovery peering automation this realized with the help of IBGP Auto Discovery peering automation,
seems like a viable deployment especially in medium and small scale this seems like a viable deployment, especially in medium- and small-
networks. scale networks.
5.2. Confederations 5.2. Confederations
For the purpose of this document let's observe that confederations For the purpose of this document, let's observe that confederations
[RFC5065] can be viewed as a hierarchical full mesh model. [RFC5065] can be viewed as a hierarchical full-mesh model.
Within each sub-AS BGP speakers are fully meshed and as discussed in Within each sub-AS, BGP speakers are fully meshed, and as discussed
section 5.1 all full mesh characteristics (number of TCP sessions, in Section 5.1, all full-mesh characteristics (number of TCP
provisioning and potential concern over number of paths) still apply sessions, provisioning, and potential concern over number of paths)
in the sub-AS scale. still apply at the sub-AS scale.
In addition to the direct peering of all BGP speakers within each In addition to the direct peering of all BGP speakers within each
sub-AS, all sub-AS border routers must also be fully meshed with each sub-AS, all sub-AS border routers must also be fully meshed with each
other. Sub-AS border routers configured with best-external other. Sub-AS border routers configured with best-external
functionality can inject additional exit paths within a sub-AS. functionality can inject additional (diverse) paths within a sub-AS.
To summarize, it is technically sound to use confederations with the To summarize, it is technically sound to use confederations with the
combination of best-external to achieve distribution of more than a combination of best-external to achieve distribution of more than a
single best path per route in large autonomous systems. single best path per route in large autonomous systems.
In topologies where route reflectors are deployed within the In topologies where route reflectors are deployed within the
confederation sub-ASes the technique describe here does apply. confederation sub-ASes, the technique described here applies.
5.3. Route reflectors 5.3. Route Reflectors
The main motivation behind the use of route reflectors [RFC4456] is The main motivation behind the use of route reflectors [RFC4456] is
the avoidance of the full mesh session management problem described the avoidance of the full-mesh session management problem described
above. Route reflectors, for good or for bad, are the most common above. Route reflectors, for good or for bad, are the most common
solution today for interconnecting BGP speakers within an internal solution today for interconnecting BGP speakers within an internal
routing domain. routing domain.
Route reflector peerings follow the advertisement rules defined by Route-reflector peerings follow the advertisement rules defined by
the BGP4 protocol. As a result only a single best path per prefix is the BGP4 protocol. As a result, only a single best path per prefix
sent to client BGP peers. That is the main reason why many current is sent to client BGP peers. This is the main reason many current
networks are exposed to a phenomenon called BGP path starvation which networks are exposed to a phenomenon called BGP path starvation,
essentially results in inability to deliver a number of applications which essentially results in the inability to deliver a number of
discussed later. applications discussed later.
The route reflection equivalent when interconnecting BGP speakers When interconnecting BGP speakers between domains, the route
between domains is popularly called the Route Server and is globally reflection equivalent is popularly called the "Route Server" and is
deployed today in many internet exchange points. globally deployed today in many Internet exchange points.
6. Deployment considerations 6. Deployment Considerations
Distribution of diverse BGP paths proposal allows the dissemination The proposal for distribution of diverse BGP paths allows the
of more paths than just the best-path to route reflector or route dissemination of more paths than just the best path to the route-
server clients of today's BGP4 implementations. As deployment reflector or route-server clients of today's BGP4 implementations.
recommendation it needs to be mentioned that fast connectivity As a deployment recommendation, it needs to be mentioned that fast
restoration as well as majority of intra-domain BGP level load connectivity restoration as well as a majority of intra-domain BGP-
balancing needs can be accommodated with only two paths (overall best level load balancing needs can be accommodated with only two paths
as well as second best). Therefor as deployment recommendation this (overall best and second best). Therefore, as a deployment
document suggests use of N=2 with diverse-path. recommendation, this document suggests use of N=2 with diverse-path.
From the client's point of view receiving additional paths via From the client's point of view, receiving additional paths via
separate IBGP sessions terminated at the new router reflector plane separate IBGP sessions terminated at the new route-reflector plane is
is functionally equivalent to constructing a full mesh peering functionally equivalent to constructing a full-mesh peering without
without the problems that such a full mesh would come with set of the problems such a full mesh would come with, as discussed in
problems as discussed in Section 5.1. Section 5.1.
By precisely defining the number of reflector planes, network By precisely defining the number of reflector planes, network
operators have full control over the number of redundant paths in the operators have full control over the number of redundant paths in the
network. This number can be defined to address the needs of the network. This number can be defined to address the needs of the
service(s) being deployed. service(s) being deployed.
The Nth plane route reflectors should be acting as control plane The Nth-plane route reflectors should act as control-plane network
network entities. While they can be provisioned on the current entities. While they can be provisioned on the current production
production routers selected Nth best BGP paths should not be used routers, selected Nth-best BGP paths should not be used directly in
directly in the data plane with the exception of such paths being BGP the data plane, except when such paths are BGP multipath
multipath eligible and such functionality is enabled. On RRs being eligible and such functionality is enabled. On RRs that are in
in the data plane unless multipath is enabled 2nd best path is the data plane, unless multipath is enabled, the second best path is
expected to be a backup path and should be installed as such into expected to be a backup path and should be installed as such into the
local RIB/FIB. local RIB/FIB.
The use of terminology of "planes" in this document is more of a The use of the term "planes" in this document is conceptual in
conceptual nature. In practice all paths are still kept in the nature. In practice, all paths are still kept in the single table
single table where normal best path is calculated. That means that where normal best path is calculated. This means that tools like the
tools like looking glass should not observe any changes nor impact looking glass should not observe any changes or impact when
when diverse-path has been enabled. diverse-path has been enabled.
The proposed architecture deployed along with the BGP best-external The proposed architecture deployed along with the BGP best-external
functionality covers all three cases where the classic BGP route functionality covers all three cases where the classic BGP route-
reflection paradigm would fail to distribute alternate exit points reflection paradigm would fail to distribute alternate (diverse)
paths. paths. These are
1. ASBRs advertising their single best external paths with no local- 1. ASBRs advertising their single best-external paths with no
preference or multi-exit-discriminator present. LOCAL_PREF or MED present.
2. ASBRs advertising their single best external paths with local- 2. ASBRs advertising their single best-external paths with
preference or multi-exit-discriminator present and with BGP best- LOCAL_PREF or MED present and with BGP best-external
external functionality enabled. functionality enabled.
3. ASBRs with multiple external paths. 3. ASBRs with multiple external paths.
This section focuses on discussion of the 3rd above case in more This section focuses on discussion of case 3 above in more detail.
detail. This describes the scenario of a single ASBR connected to This describes the scenario of a single ASBR connected to multiple
multiple EBGP peers. In practice this peering scenario is quite EBGP peers. In practice, this peering scenario is quite common. It
common. It is mostly due to the geographic location of EBGP peers is mostly due to the geographic location of EBGP peers and the
and the diversity of those peers (for example peering to multiple diversity of those peers (for example, peering to multiple tier-1
tier 1 ISPs etc...). It is not designed for failure recovery ISPs, etc.). It is not designed for failure-recovery scenarios, as
scenarios as single failure of the ASBR would simultaneously result single failure of the ASBR would simultaneously result in loss of
in loss of connectivity to all of the peers. In most medium and connectivity to all of the peers. In most medium and large
large geographically distributed networks there is always another geographically distributed networks, there is always another ASBR or
ASBR or multiple ASBRs providing peering backups, typically in other multiple ASBRs providing peering backups, typically in other
geographically diverse locations in the network. geographically diverse locations in the network.
When an operator uses ASBRs with multiple peerings setting next hop When an operator uses ASBRs with multiple peerings, setting next-hop
self will effectively allow to locally repair the atomic failure of self will effectively allow local repair of the atomic failure of any
any external peer without any compromise to the data plane. The most external peer without any compromise to the data plane.
common reason for not setting next hop self is traditionally the Traditionally, the most common reason for not setting next-hop self
associated drawback of loosing ability to signal the external is the associated drawback of losing the ability to signal the
failures of peering ASBRs or links to those ASBRs by fast IGP external failures of peering ASBRs or links to those ASBRs by fast
flooding. Such potential drawback can be easily avoided by using IGP flooding. Such a potential drawback can be easily avoided by
different peering address from the address used for next hop mapping using a different peering address from the address used for next-hop
as well as removing such next hop from IGP at the last possible BGP mapping and removing the next-hop from the IGP at the last possible
path failure. BGP path failure.
Herein one may correctly observe that in the case of setting next hop Herein, one may correctly observe that in the case of setting next-
self on an ASBR, attributes of other external paths such ASBR is hop self on an ASBR, attributes of other external paths that such an
peering with may be different from the attributes of its best ASBR is peering with may be different from the attributes of its best
external path. Therefore, not injecting all of those external paths external path. Therefore, not injecting all of those external paths
with their corresponding attribute can not be compared to equivalent with their corresponding attributes cannot be compared to equivalent
paths for the same prefix coming from different ASBRs. paths for the same prefix coming from different ASBRs.
While such observation in principle is correct one should put things While such observation, in principle, is correct, one should put
in perspective of the overall goal which is to provide data plane things in perspective of the overall goal, which is to provide data-
connectivity upon a single failure with minimal interruption/packet plane connectivity upon a single failure with minimal
loss. During such transient conditions, using even potentially interruption/packet loss. During such transient conditions, using
suboptimal exit points is reasonable, so long as forwarding even potentially suboptimal exit points is reasonable, so long as
information loops are not introduced. In the meantime BGP control forwarding information loops are not introduced. In the meantime,
plane will on its own re-advertise newly elected best external path, the BGP control plane will on its own re-advertise the newly elected
route reflector planes will calculate their Nth best paths and best external path, and route-reflector planes will calculate their
propagate to their clients. The result is that after seconds even if Nth best paths and propagate them to their clients. The result is that
potential sub-optimality were encountered it will be quickly and after seconds, even if potential suboptimality were encountered, it
naturally healed. will be quickly and naturally healed.
7. Summary of benefits 7. Summary of Benefits
Distribution of diverse BGP paths proposal provides the following Distribution of the diverse-BGP-paths proposal provides the following
benefits when compared to the alternatives: benefits when compared to the alternatives:
1. No modifications to BGP4 protocol. 1. No modifications to the BGP4 protocol.
2. No requirement for upgrades to edge and core routers (as required 2. No requirement for upgrades to edge and core routers (as required
in draft-ietf-idr-add-paths-07). Backward compatible with the in [ADD-PATHS]). It is backward compatible with the existing BGP
existing BGP deployments. deployments.
3. Can be easily enabled by introduction of a new route reflector, 3. Can be easily enabled by the introduction of a new route
route server plane dedicated to the selection and distribution of reflector, a route server plane dedicated to the selection and
Nth best-path or just by new configuration of the upgraded distribution of Nth best-path, or just by new configuration of
current route reflector(s). the upgraded current route reflector(s).
4. Does not require major modification to BGP implementations in the 4. Does not require major modification to BGP implementations in the
entire network which will result in an unnecessary increase of entire network, which would result in an unnecessary increase of
memory and CPU consumption due to the shift from today's per memory and CPU consumption due to the shift from today's per-
prefix to a per path advertisement state tracking. prefix to a per-path advertisement state tracking.
5. Can be safely deployed gradually on a RR cluster basis. 5. Can be safely deployed gradually on an RR cluster basis.
6. The proposed solution is equally applicable to any BGP address 6. The proposed solution is equally applicable to any BGP address
family as described in Multiprotocol Extensions for BGP-4 RFC4760 family as described in "Multiprotocol Extensions for BGP-4"
[RFC4760]. In particular it can be used "as is" without any [RFC4760]. In particular, it can be used "as is" without any
modifications to both IPv4 and IPv6 address families. modifications to both IPv4 and IPv6 address families.
8. Applications 8. Applications
This section lists the most common applications which require This section lists the most common applications that require the
presence of redundant BGP paths: presence of redundant BGP paths:
1. Fast connectivity restoration where backup paths with alternate 1. Fast connectivity restoration in which backup paths with
exit points would be pre-installed as well as pre-resolved in the alternate exit points would be pre-installed as well as
FIB of routers. That would allow for a local action upon pre-resolved in the FIB of routers. This allows for a local
reception of a critical event notification of network / node action upon reception of a critical event notification of
failure. This failure recovery mechanism based on the presence network/node failure. This failure recovery mechanism that is
of backup paths is also suitable for gracefully addressing based on the presence of backup paths is also suitable for
scheduled maintenance requirements as described in gracefully addressing scheduled maintenance requirements as
[I-D.decraene-bgp-graceful-shutdown-requirements]. described in [BGP-SHUTDOWN].
2. Multi-path load balancing for both IBGP and EBGP. 2. Multi-path load balancing for both IBGP and EBGP.
3. BGP control plane churn reduction both intra-domain and inter- 3. BGP control-plane churn reduction for both intra-domain and
domain. inter-domain.
An important point to observe is that all of the above intra-domain An important point to observe is that all of the above intra-domain
applications based on the use of reflector planes but are also applications are based on the use of reflector planes but are also
applicable in the inter-domain Internet exchange point examples. As applicable in the inter-domain Internet exchange point examples. As
discussed in section 4.3 an internet exchange can conceptually deploy discussed in Section 4.3, an Internet exchange can conceptually
shadow route server planes each responsible for distribution of an deploy shadow route server planes, each responsible for distribution
Nth best path to its EBGP peers. In practice it may just equal to of an Nth best path to its EBGP peers. In practice, it may just be
new short configuration and establishment of new BGP sessions to IX equal to a new short configuration and establishment of new BGP
peers. sessions to IX peers.
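As an illustration of application 1 above, the following is a minimal sketch, not taken from either version of the document, of how a client router might pre-install the paths learned over a best-path session and an additional 2nd-best-path session, and then switch locally when a failure is signalled. The names FibEntry, build_fib, and active_next_hop, the ASBR labels, and the prefix are all hypothetical, and the model assumes each reflector plane delivers at most one next hop per prefix.

```python
from dataclasses import dataclass


@dataclass
class FibEntry:
    """One prefix with its exit points in preference order (hypothetical model)."""
    prefix: str
    next_hops: list  # index 0 = best path, index 1 = 2nd best path, ...

    def active_next_hop(self, failed):
        """Return the first pre-installed next hop that is not marked failed."""
        for next_hop in self.next_hops:
            if next_hop not in failed:
                return next_hop
        return None  # no usable path remains


def build_fib(paths_by_plane):
    """Merge per-plane advertisements into one entry per prefix.

    paths_by_plane[0] maps prefix -> next hop for the best-path plane,
    paths_by_plane[1] does the same for the 2nd-best-path plane, and so on.
    """
    fib = {}
    for plane in paths_by_plane:
        for prefix, next_hop in plane.items():
            entry = fib.setdefault(prefix, FibEntry(prefix, []))
            # A plane may repeat a next hop already installed; keep it once.
            if next_hop not in entry.next_hops:
                entry.next_hops.append(next_hop)
    return fib


if __name__ == "__main__":
    best_plane = {"192.0.2.0/24": "ASBR1"}
    second_plane = {"192.0.2.0/24": "ASBR2"}
    fib = build_fib([best_plane, second_plane])

    failed = set()
    print(fib["192.0.2.0/24"].active_next_hop(failed))

    # Local action on a failure notification: no BGP update is awaited.
    failed.add("ASBR1")
    print(fib["192.0.2.0/24"].active_next_hop(failed))
```

In this sketch the switch to the backup exit point is purely a local lookup. The control plane is expected to re-converge afterwards and replace any suboptimal choice, as Section 6 describes.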
9. Security considerations 9. Security Considerations
The new mechanism for diverse BGP path dissemination proposed in this The new mechanism for diverse BGP path dissemination proposed in this
document does not introduce any new security concerns as compared to document does not introduce any new security concerns as compared to
base BGP4 specification [RFC4271] especially when compared against the base BGP4 specification [RFC4271] and especially when compared
full iBGP mesh topology. against full-IBGP-mesh topology.
In addition authors observe that all BGP security issues as described
in [RFC4272] do apply to the additional BGP session or sessions as
recommended by this specification. Therefor all recommended
mitigation techniques to BGP security are applicable here.
10. IANA Considerations
Following [RFC5226] authors declare that the new mechanism for In addition, the authors observe that all BGP security issues as
diverse BGP path dissemination does not require any new allocations described in [RFC4272] apply to the additional BGP session or
from IANA. sessions as recommended by this specification. Therefore, all
recommended mitigation techniques to BGP security are applicable
here.
11. Contributors 10. Contributors
The following people contributed significantly to the content of the The following people contributed significantly to the content of the
document: document:
Selma Yilmaz Selma Yilmaz
Cisco Systems Cisco Systems
170 West Tasman Drive 170 West Tasman Drive
San Jose, CA 95134 San Jose, CA 95134
US US
Email: seyilmaz@cisco.com Email: seyilmaz@cisco.com
Satish Mynam Satish Mynam
Juniper Networks Juniper Networks
1194 N. Mathilda Ave 1194 N. Mathilda Ave
Sunnyvale, CA 94089 Sunnyvale, CA 94089
US US
Email: smynam@juniper.net Email: smynam@juniper.net
Isidor Kouvelas Isidor Kouvelas
Cisco Systems Cisco Systems
170 West Tasman Drive 170 West Tasman Drive
San Jose, CA 95134 San Jose, CA 95134
US US
Email: kouvelas@cisco.com Email: kouvelas@cisco.com
12. Acknowledgments 11. Acknowledgments
The authors would like to thank Bruno Decraene, Bart Peirens, Eric The authors would like to thank Bruno Decraene, Bart Peirens, Eric
Rosen, Jim Uttaro, Renwei Li, Wes George and Adrian Farrel for their Rosen, Jim Uttaro, Renwei Li, Wes George, and Adrian Farrel for their
valuable input. valuable input.
The authors would also like to express special thank you to number of The authors would also like to express a special thank you to a
operators who helped to optimize the provided solution to be as close number of operators who helped optimize the provided solution to be
as possible to their daily operational practices. Especially many as close as possible to their daily operational practices. In
thx goes to Ted Seely, Shan Amante, Benson Schliesser and Seiichi particular, many thanks to Ted Seely, Shane Amante, Benson
Kawamura. Schliesser, and Seiichi Kawamura.
13. References
13.1. Normative References 12. References
[RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway 12.1. Normative References
Protocol 4 (BGP-4)", RFC 4271, January 2006.
[RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
Reflection: An Alternative to Full Mesh Internal BGP Border Gateway Protocol 4 (BGP-4)", RFC 4271, January
(IBGP)", RFC 4456, April 2006. 2006.
[RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter, [RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route
"Multiprotocol Extensions for BGP-4", RFC 4760, Reflection: An Alternative to Full Mesh Internal BGP
January 2007. (IBGP)", RFC 4456, April 2006.
[RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an [RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
IANA Considerations Section in RFCs", BCP 26, RFC 5226, "Multiprotocol Extensions for BGP-4", RFC 4760, January
May 2008. 2007.
13.2. Informative References [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an
IANA Considerations Section in RFCs", BCP 26, RFC 5226,
May 2008.
[I-D.decraene-bgp-graceful-shutdown-requirements] 12.2. Informative References
Decraene, B., Francois, P., pelsser, c., Ahmad, Z., and A.
Armengol, "Requirements for the graceful shutdown of BGP
sessions",
draft-decraene-bgp-graceful-shutdown-requirements-01 (work
in progress), March 2009.
[I-D.ietf-idr-add-paths] [ADD-PATHS] Walton, D., Chen, E., Retana, A., and J. Scudder,
Walton, D., Chen, E., Retana, A., and J. Scudder, "Advertisement of Multiple Paths in BGP", Work in
"Advertisement of Multiple Paths in BGP", Progress, June 2012.
draft-ietf-idr-add-paths-07 (work in progress), June 2012.
[I-D.ietf-idr-best-external] [AUTO-MESH] Raszuk, R., "IBGP Auto Mesh", Work in Progress, January
Marques, P., Fernando, R., Chen, E., Mohapatra, P., and H. 2004.
Gredler, "Advertisement of the best external route in [BGP-SHUTDOWN]
BGP", draft-ietf-idr-best-external-05 (work in progress), Decraene, B., Francois, P., Pelsser, C., Ahmad, Z., and
January 2012. A. Armengol, "Requirements for the Graceful Shutdown of
BGP Sessions", Work in Progress, September 2009.
[I-D.pmohapat-idr-fast-conn-restore] [EXT-PATH] Marques, P., Fernando, R., Chen, E., Mohapatra, P., and
Mohapatra, P., Fernando, R., Filsfils, C., and R. Raszuk, H. Gredler, "Advertisement of the Best External Route in
"Fast Connectivity Restoration Using BGP Add-path", BGP", Work in Progress, January 2012.
draft-pmohapat-idr-fast-conn-restore-02 (work in
progress), October 2011.
[I-D.raszuk-idr-ibgp-auto-mesh] [FAST-CONN] Mohapatra, P., Fernando, R., Filsfils, C., and R. Raszuk,
Raszuk, R., "IBGP Auto Mesh", "Fast Connectivity Restoration Using BGP Add-path", Work
draft-raszuk-idr-ibgp-auto-mesh-00 (work in progress), in Progress), October 2011.
June 2003.
[RFC3345] McPherson, D., Gill, V., Walton, D., and A. Retana, [RFC3345] McPherson, D., Gill, V., Walton, D., and A. Retana,
"Border Gateway Protocol (BGP) Persistent Route "Border Gateway Protocol (BGP) Persistent Route
Oscillation Condition", RFC 3345, August 2002. Oscillation Condition", RFC 3345, August 2002.
[RFC4272] Murphy, S., "BGP Security Vulnerabilities Analysis", [RFC4272] Murphy, S., "BGP Security Vulnerabilities Analysis", RFC
RFC 4272, January 2006. 4272, January 2006.
[RFC5065] Traina, P., McPherson, D., and J. Scudder, "Autonomous [RFC5065] Traina, P., McPherson, D., and J. Scudder, "Autonomous
System Confederations for BGP", RFC 5065, August 2007. System Confederations for BGP", RFC 5065, August 2007.
Authors' Addresses Authors' Addresses
Robert Raszuk (editor) Robert Raszuk (editor)
NTT MCL NTT MCL
101 S Ellsworth Avenue Suite 350 101 S Ellsworth Avenue Suite 350
San Mateo, CA 94401 San Mateo, CA 94401
US United States
Email: robert@raszuk.net EMail: robert@raszuk.net
Rex Fernando Rex Fernando
Cisco Systems Cisco Systems
170 West Tasman Drive 170 West Tasman Drive
San Jose, CA 95134 San Jose, CA 95134
US United States
EMail: rex@cisco.com
Email: rex@cisco.com
Keyur Patel Keyur Patel
Cisco Systems Cisco Systems
170 West Tasman Drive 170 West Tasman Drive
San Jose, CA 95134 San Jose, CA 95134
US United States
Email: keyupate@cisco.com EMail: keyupate@cisco.com
Danny McPherson Danny McPherson
Verisign Verisign, Inc.
21345 Ridgetop Circle 12061 Bluemont Way
Dulles, VA 20166 Reston, VA 20190
US United States
Email: dmcpherson@verisign.com EMail: dmcpherson@verisign.com
Kenji Kumaki Kenji Kumaki
KDDI Corporation KDDI Corporation
Garden Air Tower Garden Air Tower
Iidabashi, Chiyoda-ku, Tokyo 102-8460 Iidabashi, Chiyoda-ku, Tokyo 102-8460
Japan Japan
Email: ke-kumaki@kddi.com EMail: ke-kumaki@kddi.com
 End of changes. 166 change blocks. 
530 lines changed or deleted 515 lines changed or added
