draft-ietf-grow-diverse-bgp-path-dist-01.txt   draft-ietf-grow-diverse-bgp-path-dist-02.txt 
GROW Working Group R. Raszuk, Ed. GROW Working Group R. Raszuk, Ed.
Internet-Draft R. Fernando Internet-Draft R. Fernando
Intended status: Informational K. Patel Intended status: Informational K. Patel
Expires: December 25, 2010 Cisco Systems Expires: January 9, 2011 Cisco Systems
D. McPherson D. McPherson
Arbor Networks Verisign
K. Kumaki K. Kumaki
KDDI Corporation KDDI Corporation
June 23, 2010 July 8, 2010
Distribution of diverse BGP paths. Distribution of diverse BGP paths.
draft-ietf-grow-diverse-bgp-path-dist-01 draft-ietf-grow-diverse-bgp-path-dist-02
Abstract Abstract
The BGP4 protocol specifies the selection and propagation of a single The BGP4 protocol specifies the selection and propagation of a single
best path for each prefix. As defined today BGP has no mechanisms to best path for each prefix. As defined today BGP has no mechanisms to
distribute paths other than best path between it's speakers. This distribute paths other than best path between its speakers. This
behaviour results in a number of disadvantages for new applications and behaviour results in a number of disadvantages for new applications and
services. services.
This document presents an alternative mechanism for solving the This document presents an alternative mechanism for solving the
problem based on the concept of parallel route reflector planes. It problem based on the concept of parallel route reflector planes.
also compares existing solutions and proposed ideas that enable Such planes can be built in parallel or they can co-exist on the
distribution of more paths than just the best path. current route reflection platforms. Document also compares existing
solutions and proposed ideas that enable distribution of more paths
than just the best path.
This proposal does not specify any changes to the BGP protocol This proposal does not specify any changes to the BGP protocol
definition. It does not require upgrades to provider edge or core definition. It does not require upgrades to provider edge or core
routers nor does it need network wide upgrades. The authors believe routers nor does it need network wide upgrades. The authors believe
that the GROW WG would be the best place for this work. that the GROW WG would be the best place for this work.
Status of this Memo Status of this Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 9, 2011.
This Internet-Draft will expire on December 25, 2010.
Copyright Notice Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the Copyright (c) 2010 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
2. History . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. History . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1. BGP Add-Paths Proposal . . . . . . . . . . . . . . . . . . 3 2.1. BGP Add-Paths Proposal . . . . . . . . . . . . . . . . . . 4
3. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 3. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
4. Multi plane route reflection . . . . . . . . . . . . . . . . . 5 4. Multi plane route reflection . . . . . . . . . . . . . . . . . 6
4.1. Co-located best and backup path RRs . . . . . . . . . . . 8 4.1. Co-located best and backup path RRs . . . . . . . . . . . 9
4.2. Randomly located best and backup path RRs . . . . . . . . 9 4.2. Randomly located best and backup path RRs . . . . . . . . 10
4.3. Multi plane route servers for Internet Exchanges . . . . . 11 4.3. Multi plane route servers for Internet Exchanges . . . . . 13
5. Discussion on current models of IBGP route distribution . . . 12 5. Discussion on current models of IBGP route distribution . . . 13
5.1. Full Mesh . . . . . . . . . . . . . . . . . . . . . . . . 12 5.1. Full Mesh . . . . . . . . . . . . . . . . . . . . . . . . 13
5.2. Confederations . . . . . . . . . . . . . . . . . . . . . . 13 5.2. Confederations . . . . . . . . . . . . . . . . . . . . . . 15
5.3. Route reflectors . . . . . . . . . . . . . . . . . . . . . 14 5.3. Route reflectors . . . . . . . . . . . . . . . . . . . . . 15
6. Deployment considerations . . . . . . . . . . . . . . . . . . 14 6. Deployment considerations . . . . . . . . . . . . . . . . . . 15
7. Summary of benefits . . . . . . . . . . . . . . . . . . . . . 16 7. Summary of benefits . . . . . . . . . . . . . . . . . . . . . 17
8. Applications . . . . . . . . . . . . . . . . . . . . . . . . . 16 8. Applications . . . . . . . . . . . . . . . . . . . . . . . . . 18
9. Security considerations . . . . . . . . . . . . . . . . . . . 17 9. Security considerations . . . . . . . . . . . . . . . . . . . 18
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18
11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 17 11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 19
12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 18 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 19
13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 18 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 19
13.1. Normative References . . . . . . . . . . . . . . . . . . . 18 13.1. Normative References . . . . . . . . . . . . . . . . . . . 19
13.2. Informative References . . . . . . . . . . . . . . . . . . 18 13.2. Informative References . . . . . . . . . . . . . . . . . . 20
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 19 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 21
1. Introduction 1. Introduction
Current BGP4 [RFC4271] protocol specification allows for the Current BGP4 [RFC4271] protocol specification allows for the
selection and propagation of only one best path for each prefix. The selection and propagation of only one best path for each prefix. The
BGP protocol as defined today has no mechanism to distribute other BGP protocol as defined today has no mechanism to distribute other
than best path between it's speakers. This behaviour results in a than best path between its speakers. This behaviour results in a
number of problems in the deployment of new applications and number of problems in the deployment of new applications and
services. services.
This document presents an alternative mechanism for solving the This document presents an alternative mechanism for solving the
problem based on the concept of parallel route reflector planes. It problem based on the concept of parallel route reflector planes. It
also compares existing solutions and proposed ideas that enable also compares existing solutions and proposed ideas that enable
distribution of more paths than just the best path. The parallel distribution of more paths than just the best path. The parallel
route reflector planes solution brings very significant benefits at a route reflector planes solution brings very significant benefits at a
negligible capex and opex deployment price as compared to the negligible capex and opex deployment price as compared to the
alternative techniques and is being considered by a number of network alternative techniques and is being considered by a number of network
operators for deployment in their networks. operators for deployment in their networks.
This proposal does not specify any changes to the BGP protocol This proposal does not specify any changes to the BGP protocol
definition. It does not require upgrades to provider edge or core definition. It does not require upgrades to provider edge or core
routers nor does it need network wide upgrades. The authors believe routers nor does it need network wide upgrades. The only upgrade
that the GROW WG would be the best place for this work. required is the new functionality on the new or current route
reflectors. The authors believe that the GROW WG would be the best
place for this work.
2. History 2. History
The need to disseminate more paths than just the best path is The need to disseminate more paths than just the best path is
primarily driven by two requirements. One of them is the problem of primarily driven by three requirements. First is the problem of BGP
BGP oscillations [I-D.ietf-idr-route-oscillation]. The second is the oscillations [I-D.ietf-idr-route-oscillation]. The second is the
desire for reduction of time of reachability restoration in the event desire for reduction of time of reachability restoration in the event
of network or network element's failure. These two reasons have led of network or network element's failure. Third requirement is to
to the proposal of BGP add-paths [I-D.ietf-idr-add-paths]. enhance BGP load balancing capabilities. Those reasons have led to
the proposal of BGP add-paths [I-D.ietf-idr-add-paths].
2.1. BGP Add-Paths Proposal 2.1. BGP Add-Paths Proposal
As it has been proven that distribution of only the best path of a As it has been proven that distribution of only the best path of a
route is not sufficient to meet the needs of continuously growing route is not sufficient to meet the needs of continuously growing
number of services carried over BGP the add-paths proposal was number of services carried over BGP the add-paths proposal was
submitted in 2002 to enable BGP to distribute more than one path. submitted in 2002 to enable BGP to distribute more than one path.
This is achieved by including as a part of the NLRI an additional This is achieved by including as a part of the NLRI an additional
four octet value called the Path Identifier. four octet value called the Path Identifier.
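As a minimal sketch of what carrying a Path Identifier in the NLRI means, the following Python fragment encodes an IPv4 prefix with a four octet identifier in front of it. The field order (identifier, then prefix length, then prefix octets) is an assumption taken from the add-paths proposal, and the function name encode_add_path_nlri is hypothetical. It is an illustration only, not an implementation of any router's code.

```python
import ipaddress
import struct


def encode_add_path_nlri(path_id: int, prefix: str) -> bytes:
    """Encode one IPv4 NLRI entry preceded by a four octet Path Identifier.

    Assumed layout: Path Identifier (4 octets), prefix length in bits
    (1 octet), then only as many prefix octets as the length requires.
    """
    net = ipaddress.ip_network(prefix)
    prefix_octets = (net.prefixlen + 7) // 8
    header = struct.pack("!IB", path_id, net.prefixlen)
    return header + net.network_address.packed[:prefix_octets]


# Two paths to the same prefix differ only in the Path Identifier, which is
# why a speaker has to keep advertisement state per path rather than per prefix.
first = encode_add_path_nlri(1, "10.1.0.0/16")
second = encode_add_path_nlri(2, "10.1.0.0/16")
```

Because the prefix octets are identical in both encodings, the identifier is the only thing that lets a receiver tell the two paths apart.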
The implication of this change on a BGP implementation is that it The implication of this change on a BGP implementation is that it
must now maintain per path, instead of per prefix, peer advertisement must now maintain per path, instead of per prefix, peer advertisement
state to track which of the peers each path was advertised to. This state to track which of the peers each path was advertised to. This
new requirement has it's own memory and processing cost. Suffice to new requirement has its own memory and processing cost. Suffice to
say that by the middle of 2009 none of the commercial BGP say that by the end of 2009 none of the commercial BGP implementations
implementations can claim to support the new add-path behaviour in could claim to support the new add-path behaviour in production code,
production code, in part because of this resource overhead. in part because of this resource overhead.
An important observation is that distribution of more than one best An important observation is that distribution of more than one best
path by Autonomous System Border Routers (ASBRs) with multiple EBGP path by Autonomous System Border Routers (ASBRs) with multiple EBGP
peers attached to it where no "next hop self" is set may result in peers attached to it where no "next hop self" is set may result in
bestpath selection inconsistency within the autonomous system. bestpath selection inconsistency within the autonomous system.
Therefore it is also required to attach in the form of a new Therefore it is also required to attach in the form of a new
attribute the possible tie breakers and propagate those within the attribute the possible tie breakers and propagate those within the
domain. The example of such attribute for the purpose of fast domain. The example of such attribute for the purpose of fast
connectivity restoration to address that very case of ASBR injecting connectivity restoration to address that very case of ASBR injecting
multiple external paths into the IBGP mesh has been presented and multiple external paths into the IBGP mesh has been presented and
skipping to change at page 4, line 27 skipping to change at page 5, line 30
propagated information also best path selection is recommended to be propagated information also best path selection is recommended to be
modified to make sure that best and backup path selection within the modified to make sure that best and backup path selection within the
domain stays consistent. More discussion on this particular point domain stays consistent. More discussion on this particular point
will be contained in the deployment considerations section below. In will be contained in the deployment considerations section below. In
the proposed solution in this document we observe that in order to the proposed solution in this document we observe that in order to
address most of the applications just use of best external address most of the applications just use of best external
advertisement is required. For ASBRs which are peering to multiple advertisement is required. For ASBRs which are peering to multiple
upstream ASs setting "next hop self" is recommended. upstream ASs setting "next hop self" is recommended.
The add paths protocol extensions have to be implemented by all the The add paths protocol extensions have to be implemented by all the
routers within an AS in order for the system to work correctly. The routers within an AS in order for the system to work correctly. It
required code modifications include enhancements such as the Fast remains quite a research topic to analyze benefits or risk associated
with partial add-paths deployments. The risk becomes even greater in
networks not using some form of edge to edge encapsulation.
The required code modifications include enhancements such as the Fast
Connectivity Restoration Using BGP Add-path Connectivity Restoration Using BGP Add-path
[I-D.pmohapat-idr-fast-conn-restore]. The deployment of such [I-D.pmohapat-idr-fast-conn-restore]. The deployment of such
technology in an entire service provider network requires software technology in an entire service provider network requires software
and perhaps sometimes in the cases of End-of-Engineering or End-of- and perhaps sometimes in the cases of End-of-Engineering or End-of-
Life equipment even hardware upgrades. Such an operation may or may Life equipment even hardware upgrades. Such operation may or may not
not be economically feasible. Even if add-path functionality was be economically feasible. Even if add-path functionality was
available today on all commercial routing equipment and across all available today on all commercial routing equipment and across all
vendors, experience indicates that to achieve 100% deployment vendors, experience indicates that to achieve 100% deployment
coverage within any medium or large global network may easily take coverage within any medium or large global network may easily take
years. years.
While it needs to be clearly acknowledged that the add-path mechanism While it needs to be clearly acknowledged that the add-path mechanism
provides the most general way to address the problem of distributing provides the most general way to address the problem of distributing
more then one path between BGP speakers, this document provides a many paths between BGP speakers, this document provides a much easier
much easier to deploy solution that requires no modification to the to deploy solution that requires no modification to the BGP protocol
BGP protocol. The alternative method presented is capable of where only a few additional paths may be required. The alternative
addressing critical service provider requirements for disseminating method presented is capable of addressing critical service provider
more than a single path across an AS with a significantly lower requirements for disseminating more than a single path across an AS
deployment cost. with a significantly lower deployment cost.
3. Goals 3. Goals
The proposal described in this document is not intended to compete The proposal described in this document is not intended to compete
with add-paths. Instead if deployed it is to be used as a very easy with add-paths. Instead if deployed it is to be used as a very easy
method to accommodate the majority of applications which may require method to accommodate the majority of applications which may require
presence of alternative BGP exit points. presence of alternative BGP exit points.
It is presented to network operators as a possible choice and It is presented to network operators as a possible choice and
provides those operators who need additional paths today an provides those operators who need additional paths today an
skipping to change at page 7, line 19 skipping to change at page 8, line 19
ASBR will be available to RRs since the other peering ASBR will ASBR will be available to RRs since the other peering ASBR will
consider the IBGP path as best and will not announce (or if consider the IBGP path as best and will not announce (or if
already announced will withdraw) its own external path. The already announced will withdraw) its own external path. The
exception here is the use of BGP Best-External proposal which exception here is the use of BGP Best-External proposal which
will allow stated ASBR to still propagate to the RRs its own will allow stated ASBR to still propagate to the RRs its own
external path. Unfortunately RRs will not be able to distribute external path. Unfortunately RRs will not be able to distribute
it any further to other clients as only the overall best path it any further to other clients as only the overall best path
will be reflected. will be reflected.
The proposed solution is based on the use of additional route The proposed solution is based on the use of additional route
reflectors or new functionality enabled on the exisiting route reflectors or new functionality enabled on the existing route
reflectors that instead of distributing the best path for each route reflectors that instead of distributing the best path for each route
will distribute an alternative path other than best. The best path will distribute an alternative path other than best. The best path
(main) reflector plane distributes the best path for each route as it (main) reflector plane distributes the best path for each route as it
does today. The second plane distributes the second best path for does today. The second plane distributes the second best path for
each route and so on. Distribution of N paths for each route can be each route and so on. Distribution of N paths for each route can be
achieved by using N reflector planes. achieved by using N reflector planes.
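The plane concept can be sketched as follows, assuming a heavily simplified path ranking (local preference, then AS path length, then next hop as a stand-in tie breaker). The Path fields and the function names rank_paths and path_for_plane are hypothetical. A real implementation would run the full BGP decision process.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Path:
    next_hop: str
    local_pref: int
    as_path_len: int


def rank_paths(paths: List[Path]) -> List[Path]:
    """Order paths from best to worst using a simplified decision process."""
    return sorted(paths, key=lambda p: (-p.local_pref, p.as_path_len, p.next_hop))


def path_for_plane(paths: List[Path], plane: int) -> Optional[Path]:
    """Return the path that reflector plane number `plane` would advertise.

    Plane 1 reflects the best path, plane 2 the second best, and so on.
    A plane with no corresponding path advertises nothing for the route.
    """
    if plane < 1:
        raise ValueError("plane numbers start at 1")
    ranked = rank_paths(paths)
    return ranked[plane - 1] if plane <= len(ranked) else None


p1 = Path(next_hop="ASBR1", local_pref=200, as_path_len=3)
p2 = Path(next_hop="ASBR2", local_pref=100, as_path_len=3)
main_plane = path_for_plane([p1, p2], 1)
second_plane = path_for_plane([p1, p2], 2)
```

A client peering with both planes over ordinary IBGP sessions ends up holding both p1 and p2 without any change to its own code.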
Each plane of route reflectors is a logical entity and may or may not Each plane of route reflectors is a logical entity and may or may not
be co-located with the existing best path route reflectors. Adding a be co-located with the existing best path route reflectors. Adding a
route reflector plane to a network may be as easy as enabling a route reflector plane to a network may be as easy as enabling a
logical router partition, new BGP process or just a new configuration logical router partition, new BGP process or just a new configuration
knob on an existing route reflector and configuring an additional knob on an existing route reflector and configuring an additional
IBGP session from the current clients if required. There are no code IBGP session from the current clients if required. There are no code
changes required on the route reflector clients for this mechanism to changes required on the route reflector clients for this mechanism to
work. It is easy to observe that the installation of one or more work. It is easy to observe that the installation of one or more
additional route reflector control planes is much cheaper and additional route reflector control planes is much cheaper and
easier than the need of upgrading 100s of routers in the entire easier than the need of upgrading 100s of route reflector clients in
network to support different protocol encoding. the entire network to support different protocol encoding.
Diverse path route reflectors need the new ability to calculate and Diverse path route reflectors need the new ability to calculate and
propagate the Nth best path instead of the overall best path. An propagate the Nth best path instead of the overall best path. An
implementation is encouraged to enable this new functionality on a implementation is encouraged to enable this new functionality on a
per neighbor basis. per neighbor basis.
While this is an implementation detail, the code to calculate Nth While this is an implementation detail, the code to calculate Nth
best path is also required by other BGP solutions. For example in best path is also required by other BGP solutions. For example in
the application of fast connectivity restoration BGP must calculate a the application of fast connectivity restoration BGP must calculate a
backup path for installation into the RIB and FIB ahead of the actual backup path for installation into the RIB and FIB ahead of the actual
skipping to change at page 8, line 47 skipping to change at page 9, line 47
*** *** *** ***
ASBR1 ASBR2 ASBR1 ASBR2
EBGP EBGP
Figure2: Co-located 2nd best RR plane Figure2: Co-located 2nd best RR plane
The following is a list of configuration changes required to enable The following is a list of configuration changes required to enable
the 2nd best path route reflector plane: the 2nd best path route reflector plane:
1. Adding RR1' and RR2' either as logical or physical new control 1. Unless same RR1/RR2 platform is being used adding RR1' and RR2'
plane RRs in the same IGP points as RR1 and RR2 respectively either as logical or physical new control plane RRs in the same
IGP points as RR1 and RR2 respectively.
2. Enabling RR1' and RR2' for 2nd plane route reflection 2. Enabling best-external on ASBRs
3. Enabling best-external on ASBRs 3. Enabling RR1' and RR2' for 2nd plane route reflection.
Alternatively instructing existing RR1 and RR2 to calculate also
2nd best path.
4. Configuring ASBR-RR's IBGP sessions 4. Unless one of the existing RRs is turned to advertise only
diverse path to its current clients configuring new ASBRs-RR'
IBGP sessions
The expected behaviour is that under any BGP condition the ASBR3 and The expected behaviour is that under any BGP condition the ASBR3 and
P routers will receive both paths P1 and P2 for destination D. The P routers will receive both paths P1 and P2 for destination D. The
availability of both paths will allow them to implement a number of availability of both paths will allow them to implement a number of
new services as listed in the applications section below. new services as listed in the applications section below.
As an alternative to fully meshing all RRs and RRs' an operator who As an alternative to fully meshing all RRs and RRs' an operator who
has a large number of reflectors deployed today may choose to peer has a large number of reflectors deployed today may choose to peer
newly introduced RRs' to a hierarchical RR' which would be an IBGP newly introduced RRs' to a hierarchical RR' which would be an IBGP
interconnect point within the 2nd plane as well as between planes. interconnect point within the 2nd plane as well as between planes.
One of the deployment models of this scenario can be achieved by One of the deployment models of this scenario can be achieved by
simple upgrade of the existing route reflectors without the need to simple upgrade of the existing route reflectors without the need to
deploy any new logical or physical platforms. Such upgrade would deploy any new logical or physical platforms. Such upgrade would
allow route reflectors to service both upgraded to add-paths peers as allow route reflectors to service both upgraded to add-paths peers as
well as those peers which can not be immediately upgraded while at well as those peers which can not be immediately upgraded while at
the same time allowing to distribute more than a single best path. the same time allowing to distribute more than a single best path. The
obvious protocol benefit of using existing RRs to distribute towards
their clients best and diverse BGP paths over different IBGP sessions
is the automatic assurance that such client would always get
different paths with their next hop being different.
The way to accomplish this would be to create a separate IBGP session The way to accomplish this would be to create a separate IBGP session
for each N-th BGP path. Such session should be preferably terminated for each N-th BGP path. Such session should be preferably terminated
at a different loopback address of the route reflector. At the BGP at a different loopback address of the route reflector. At the BGP
OPEN stage of each such session a different bgp_router_id should be OPEN stage of each such session a different bgp_router_id may be
used. Correspondingly route reflector should also allow its clients used. Correspondingly route reflector should also allow its clients
to use the same bgp_router_id on each such session. to use the same bgp_router_id on each such session.
4.2. Randomly located best and backup path RRs 4.2. Randomly located best and backup path RRs
Now let's consider a deployment case where an operator wishes to Now let's consider a deployment case where an operator wishes to
enable a 2nd RR' plane using only a single additional router in a enable a 2nd RR' plane using only a single additional router in a
different network location to his current route reflectors. different network location to his current route reflectors. This
model would be of particular use in networks where some form of end-
to-end encapsulation (IP or MPLS) is enabled between provider edge
routers.
Note that this model of operation assumes that the present best path Note that this model of operation assumes that the present best path
route reflectors are only control plane devices. If the route route reflectors are only control plane devices. If the route
reflector is in the data forwarding path then the implementation must reflector is in the data forwarding path then the implementation must
be able to clearly separate the Nth best-path selection from the be able to clearly separate the Nth best-path selection from the
selection of the paths to be used for data forwarding. The basic selection of the paths to be used for data forwarding. The basic
premise of this mode of deployment assumes that all reflector planes premise of this mode of deployment assumes that all reflector planes
have the same information to choose from which includes the same set have the same information to choose from which includes the same set
of BGP paths. It also requires the ability to skip the comparison of of BGP paths. It also requires the ability to ignore the step of
the IGP metric to reach the bgp next hop during best-path comparison of the IGP metric to reach the bgp next hop during best-
calculation. path calculation.
ASBR3 ASBR3
*** ***
* * * *
+------------* *-----------+ +------------* *-----------+
| AS1 * * | | AS1 * * |
| IBGP *** | | IBGP *** |
| | | |
| *** | | *** |
| * * | | * * |
skipping to change at page 10, line 35 skipping to change at page 11, line 44
+-----* *---------* *----+ +-----* *---------* *----+
* * * * * * * *
*** *** *** ***
ASBR1 ASBR2 ASBR1 ASBR2
EBGP EBGP
Figure3: Experimental deployment of 2nd best RR Figure3: Experimental deployment of 2nd best RR
The following is a list of configuration changes required to enable The following is a list of configuration changes required to enable
the 2nd best path route reflector RR' as a single platform: the 2nd best path route reflector RR' as a single platform or to
enable one of the existing control plane RRs for diverse-path
functionality:
1. Adding RR' logical or physical as new route reflector anywhere in 1. If needed adding RR' logical or physical as new route reflector
the network anywhere in the network
2. Enabling RR' for 2nd plane route reflection 2. Enabling best-external on ASBRs
3. Enabling best-external on ASBRs 3. Disabling IGP metric check in BGP best path on all route
reflectors.
4. Fully meshing newly added RRs' with all other reflectors in 4. Enabling RR' or any of the existing RR for 2nd plane path
both planes. That condition does not apply if the newly added calculation
RR'(s) already have peering to all ASBRs/PEs.
5. Configuring ASBRs-RR' IBGP sessions 5. If required fully meshing newly added RRs' with all other
reflectors in both planes. That condition does not apply if the
newly added RR'(s) already have peering to all ASBRs/PEs.
6. Disabling IGP metric check in BGP best path on all route 6. Unless one of the existing RRs is turned to advertise only
reflectors. diverse path to its current clients configuring new ASBRs-RR'
IBGP sessions
In this scenario the operator has the flexibility to instroduce the In this scenario the operator has the flexibility to introduce the
new additional route reflector on any existing or new hardware in the new additional route reflector functionality on any existing or new
network. Any of the existing routers that are not already members of hardware in the network. Any of the existing routers that are not
the best path route reflector plane can be easily configured to serve already members of the best path route reflector plane can be easily
the 2nd plane either via using a logical / virtual router partition configured to serve the 2nd plane either via using a logical /
or by local implementation hooks. virtual router partition or by having their bgp implementation
compliant to this specification.
Even if the IGP metric is not taken into consideration when comparing Even if the IGP metric is not taken into consideration when comparing
paths during the bestpath calculation, an implementation still has to paths during the bestpath calculation, an implementation still has to
consider paths with unreachable nexthops as invalid. It is worth consider paths with unreachable nexthops as invalid. It is worth
pointing out that some implementations today already allow for pointing out that some implementations today already allow for
configuration which results in no IGP metric comparison during the configuration which results in no IGP metric comparison during the
best path calculation. best path calculation.
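The selection behaviour described above can be sketched as follows. This is only an illustration, not the decision process of any implementation: the Path fields, the rank_key and nth_best names, and the reduced set of tie-breakers (local preference, AS path length, MED, optional IGP metric, router ID) are assumptions made for the example. MED is compared across all paths here, which real BGP does only for paths from the same neighbouring AS.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class Path:
    """Hypothetical, heavily reduced view of a BGP path."""
    next_hop: str
    local_pref: int = 100
    as_path_len: int = 0
    med: int = 0
    igp_metric: int = 0
    router_id: str = "0.0.0.0"


def rank_key(p: Path, use_igp_metric: bool) -> Tuple:
    """Lower tuples are preferred. The IGP metric step is optional."""
    key = [-p.local_pref, p.as_path_len, p.med]
    if use_igp_metric:
        key.append(p.igp_metric)
    # Final tie-breaker: lowest router ID, compared numerically.
    key.append(tuple(int(octet) for octet in p.router_id.split(".")))
    return tuple(key)


def nth_best(paths: Iterable[Path], n: int, reachable: Set[str],
             use_igp_metric: bool = False) -> Optional[Path]:
    """Return the nth best path (1-based), or None if fewer than n are valid.

    Paths whose next hop is unreachable are always invalid, even when
    the IGP metric comparison is disabled.
    """
    valid = [p for p in paths if p.next_hop in reachable]
    ranked = sorted(valid, key=lambda p: rank_key(p, use_igp_metric))
    return ranked[n - 1] if len(ranked) >= n else None
```

With the IGP metric step disabled, every reflector that sees the same set of valid paths ranks them identically regardless of where it sits in the topology, which is what allows a reflector plane to agree on which path is the 2nd best.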
The additional planes of route reflectors do not need to be fully The additional planes of route reflectors do not need to be fully
redundant as the primary one does. If we are preparing for a single redundant as the primary one does. If we are preparing for a single
skipping to change at page 11, line 36 skipping to change at page 13, line 8
redundantly by installing not one, but two or more route reflectors redundantly by installing not one, but two or more route reflectors
serving each additional plane the additional robustness will be serving each additional plane the additional robustness will be
achieved. achieved.
As a result of this solution ASBR3 and other ASBRs peering to RR' As a result of this solution ASBR3 and other ASBRs peering to RR'
will be receiving the 2nd best path. will be receiving the 2nd best path.
Similarly to section 4.1 as an alternative to fully meshing all RRs & Similarly to section 4.1 as an alternative to fully meshing all RRs &
RRs' an operator who may have a large number of reflectors already RRs' an operator who may have a large number of reflectors already
deployed today may choose to peer newly introduced RRs' to a deployed today may choose to peer newly introduced RRs' to a
hierarchical RR' which would be an IBGP interconnect point within the hierarchical RR' which would be an IBGP interconnect point between
2nd plane as well as between planes. planes.
4.3. Multi plane route servers for Internet Exchanges 4.3. Multi plane route servers for Internet Exchanges
Another group of devices where the proposed multi-plane architecture Another group of devices where the proposed multi-plane architecture
may be of particular applicability are EBGP route servers used at the may be of particular applicability are EBGP route servers used at
majority of internet exchange points. many internet exchange points.
In such cases 100s of ISPs are interconnected on a common LAN. In such cases 100s of ISPs are interconnected on a common LAN.
Instead of having 100s of direct EBGP sessions on each exchange Instead of having 100s of direct EBGP sessions on each exchange
client, a single peering is created to the transparent route server. client, a single peering is created to the transparent route server.
The route server can only propagate a single best path. Mandating The route server can only propagate a single best path. Mandating
the upgrade for 100s of different service providers in order to the upgrade for 100s of different service providers in order to
implement add-path may be much more difficult as compared to asking implement add-path may be much more difficult as compared to asking
them for provisioning one new EBGP session to an Nth best-path route them for provisioning one new EBGP session to an Nth best-path route
server plane. server plane. That will allow distributing more than a single best
BGP path from a given route server to such an IX peer.
The solution proposed in this document fits very well with the The solution proposed in this document fits very well with the
requirement of having broader EBGP path diversity among the members requirement of having broader EBGP path diversity among the members
of any Internet Exchange Point. of any Internet Exchange Point.
5. Discussion on current models of IBGP route distribution 5. Discussion on current models of IBGP route distribution
In today's networks BGP4 operates as specified in [RFC4271] In today's networks BGP4 operates as specified in [RFC4271]
There are a number of technology choices for intra-AS BGP route There are a number of technology choices for intra-AS BGP route
skipping to change at page 14, line 33 skipping to change at page 16, line 8
6. Deployment considerations 6. Deployment considerations
The diverse BGP path dissemination proposal allows the distribution The diverse BGP path dissemination proposal allows the distribution
of more paths than just the best-path to route reflector or route of more paths than just the best-path to route reflector or route
server clients of today's BGP4 implementations. server clients of today's BGP4 implementations.
From the client's point of view receiving additional paths via From the client's point of view receiving additional paths via
separate IBGP sessions terminated at the new router reflector plane separate IBGP sessions terminated at the new router reflector plane
is functionally equivalent to constructing a full mesh peering is functionally equivalent to constructing a full mesh peering
without the problems that such a full mesh would come with (discussed without the set of problems that such a full mesh would come with, as
in section 2.1). discussed in an earlier section.
By precisely defining the number of reflector planes, network By precisely defining the number of reflector planes, network
operators have full control over the number of redundant paths in the operators have full control over the number of redundant paths in the
network. This number can be defined to address the needs of the network. This number can be defined to address the needs of the
service(s) being deployed. service(s) being deployed.
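As a rough illustration of how the number of planes bounds the number of paths a client learns, consider the sketch below. It assumes a hypothetical model in which plane n advertises only the nth best path for a prefix; the function name and the use of plain strings for paths are choices made for this example.

```python
from typing import Dict, List


def paths_per_client(ranked_paths: List[str], num_planes: int) -> Dict[int, str]:
    """Map each plane number (1-based) to the path it would advertise.

    ranked_paths is the ordered list of valid paths for one prefix,
    best first. A plane with no nth path advertises nothing.
    """
    return {
        plane: ranked_paths[plane - 1]
        for plane in range(1, num_planes + 1)
        if plane <= len(ranked_paths)
    }
```

A client peering with two planes therefore learns at most two paths per prefix, and adding a plane beyond the number of available paths adds sessions but no further diversity.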
The Nth plane route reflectors should be acting as control plane The Nth plane route reflectors should be acting as control plane
devices. While they can be provisioned on the current production network entities. While they can be provisioned on the current
routers selected backup BGP paths should not be used directly in the production routers selected Nth best BGP paths should not be used
data plane. Use of the calculated Nth path by the RRs can lead to directly in the data plane with the exception of such paths being BGP
inconsistent best-path selection in the domain. For the purposes of multipath eligible and such functionality is enabled. On RRs being
local RIB / FIB installation, any router (including the RRs) which is in the data plane unless multipath is enabled 2nd best path is
in the data path must use the overall global best and Nth best paths. expected to be a backup path and should be installed as such into
local RIB/FIB.
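The installation rule in the revised text can be sketched as below for a route reflector that also sits in the data plane. This is one reading of the -02 wording, not a statement of how any implementation behaves; the install function, its flags, and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Optional


def install(best: str, second_best: Optional[str],
            multipath_enabled: bool,
            multipath_eligible: bool) -> Dict[str, List[str]]:
    """Decide how the best and 2nd best paths enter the local RIB/FIB.

    The 2nd best path forwards traffic only when multipath is enabled
    and the path is multipath eligible; otherwise it is held as backup.
    """
    entry: Dict[str, List[str]] = {"active": [best], "backup": []}
    if second_best is None:
        return entry
    if multipath_enabled and multipath_eligible:
        entry["active"].append(second_best)
    else:
        entry["backup"].append(second_best)
    return entry
```

Keeping the 2nd best path as a backup rather than an active entry avoids forwarding on a path that the rest of the domain does not consider best.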
The proposed architecture deployed along with the BGP best-external The proposed architecture deployed along with the BGP best-external
functionality covers all three cases where the classic BGP route functionality covers all three cases where the classic BGP route
reflection paradigm would fail to distribute alternate exit points reflection paradigm would fail to distribute alternate exit points
paths. paths.
1. ASBRs advertising their single best external paths with no local- 1. ASBRs advertising their single best external paths with no local-
preference or multi-exit-discriminator present. preference or multi-exit-discriminator present.
2. ASBRs advertising their single best external paths with local- 2. ASBRs advertising their single best external paths with local-
preference or multi-exit-discriminator present and with BGP best- preference or multi-exit-discriminator present and with BGP best-
external functionality enabled. external functionality enabled.
3. ASBRs with multiple external paths. 3. ASBRs with multiple external paths.
Let's discuss the last (3rd) case in more detail. This describes the Let's discuss the 3rd above case in more detail. This describes the
scenario of a single ASBR connected to multiple EBGP peers. In scenario of a single ASBR connected to multiple EBGP peers. In
practice this peering scenario is quite common. It is mostly due to practice this peering scenario is quite common. It is mostly due to
the geographic location of EBGP peers and the diversity of those the geographic location of EBGP peers and the diversity of those
peers (for example peering to multiple tier 1 ISPs etc...). It is peers (for example peering to multiple tier 1 ISPs etc...). It is
not designed for failure recovery scenarios as single failure of the not designed for failure recovery scenarios as single failure of the
ASBR would simultaneously result in loss of connectivity to all of ASBR would simultaneously result in loss of connectivity to all of
the peers. In most medium and large geographically distributed the peers. In most medium and large geographically distributed
networks there is always another ASBR or multiple ASBRs providing networks there is always another ASBR or multiple ASBRs providing
peering backups, typically in other geographically diverse locations peering backups, typically in other geographically diverse locations
in the network. in the network.
skipping to change at page 15, line 40 skipping to change at page 17, line 15
common reason for not setting next hop self is traditionally the common reason for not setting next hop self is traditionally the
associated drawback of losing ability to signal the external associated drawback of losing ability to signal the external
failures of peering ASBRs or links to those ASBRs by fast IGP failures of peering ASBRs or links to those ASBRs by fast IGP
flooding. Such potential drawback can be easily avoided by using flooding. Such potential drawback can be easily avoided by using
different peering address from the address used for next hop mapping different peering address from the address used for next hop mapping
as well as removing such next hop from IGP at the last possible BGP as well as removing such next hop from IGP at the last possible BGP
path failure. path failure.
Herein one may correctly observe that in the case of setting next hop Herein one may correctly observe that in the case of setting next hop
self on an ASBR, attributes of other external paths such ASBR is self on an ASBR, attributes of other external paths such ASBR is
peering with may be different from the attributes of it's best peering with may be different from the attributes of its best
external path. Therefore, not injecting all of those external paths external path. Therefore, not injecting all of those external paths
with their corresponding attribute can not be compared to equivalent with their corresponding attribute can not be compared to equivalent
paths for the same prefix coming from different ASBRs. paths for the same prefix coming from different ASBRs.
While such observation in principle is correct one should put things While such observation in principle is correct one should put things
in perspective of the overall goal which is to provide data plane in perspective of the overall goal which is to provide data plane
connectivity upon a single failure with minimal interruption/packet connectivity upon a single failure with minimal interruption/packet
loss. During such transient conditions, using even potentially loss. During such transient conditions, using even potentially
suboptimal exit points is reasonable, so long as forwarding suboptimal exit points is reasonable, so long as forwarding
information loops are not introduced. In the mean time BGP control information loops are not introduced. In the mean time BGP control
plane will on it's own re-advertise newly elected best external path, plane will on its own re-advertise newly elected best external path,
route reflector planes will calculate their Nth best paths and route reflector planes will calculate their Nth best paths and
propagate to it's clients. The result is that after seconds even if propagate to its clients. The result is that after seconds even if
potential sub-optimality were encountered it will be quickly and potential sub-optimality were encountered it will be quickly and
naturally healed. naturally healed.
7. Summary of benefits 7. Summary of benefits
The diverse BGP path dissemination proposal provides the following The diverse BGP path dissemination proposal provides the following
benefits when compared to the alternatives: benefits when compared to the alternatives:
1. No modifications to BGP4 protocol. 1. No modifications to BGP4 protocol.
2. No requirement for upgrades to edge and core routers. Backward 2. No requirement for upgrades to edge and core routers. Backward
compatible with the existing BGP deployments. compatible with the existing BGP deployments.
3. Can be easily enabled by introduction of a new route reflector / 3. Can be easily enabled by introduction of a new route reflector,
route server plane dedicated to the selection and distribution of route server plane dedicated to the selection and distribution of
Nth best-path. Nth best-path or just by new configuration of the upgraded
current route reflector(s).
4. Does not require major modification to BGP implementations in the 4. Does not require major modification to BGP implementations in the
entire network which will result in an unnecessary increase of entire network which will result in an unnecessary increase of
memory and CPU consumption due to the shift from today's per memory and CPU consumption due to the shift from today's per
prefix to a per path advertisement state tracking. prefix to a per path advertisement state tracking.
5. Can be safely deployed gradually through addition of a single 5. Can be safely deployed gradually on a RR cluster basis.
logical or physical route reflector with the new functionality
described in this document.
6. The proposed solution is equally applicable to any BGP address 6. The proposed solution is equally applicable to any BGP address
family as described in Multiprotocol Extensions for BGP-4 RFC4760 family as described in Multiprotocol Extensions for BGP-4 RFC4760
[RFC4760]. In particular it can be used "as is" without any [RFC4760]. In particular it can be used "as is" without any
modifications to both IPv4 and IPv6 address families. modifications to both IPv4 and IPv6 address families.
8. Applications 8. Applications
This section lists the most common applications which require This section lists the most common applications which require
presence of redundant BGP paths: presence of redundant BGP paths:
skipping to change at page 17, line 12 skipping to change at page 18, line 33
maintenance requirements as described in maintenance requirements as described in
[I-D.decraene-bgp-graceful-shutdown-requirements]. [I-D.decraene-bgp-graceful-shutdown-requirements].
2. Multi-path load balancing for both IBGP and EBGP. 2. Multi-path load balancing for both IBGP and EBGP.
3. BGP control plane churn reduction both intra-domain and inter- 3. BGP control plane churn reduction both intra-domain and inter-
domain. domain.
An important point to observe is that all of the above intra-domain An important point to observe is that all of the above intra-domain
applications are based on the use of reflector planes but are also applications are based on the use of reflector planes but are also
applicable in the inter-domain Internet exchange case. As discussed applicable in the inter-domain Internet exchange point examples. As
in section 4.3 an internet exchange can deploy shadow route server discussed in section 4.3 an internet exchange can conceptually deploy
slices each responsible for distribution of an Nth best path to it's shadow route server planes each responsible for distribution of an
EBGP peers. Nth best path to its EBGP peers. In practice this may amount to just a
short new configuration and the establishment of new BGP sessions to IX
peers.
9. Security considerations 9. Security considerations
The new mechanism for diverse BGP path dissemination proposed in this The new mechanism for diverse BGP path dissemination proposed in this
document does not introduce any new security concerns as compared to document does not introduce any new security concerns as compared to
base BGP4 specification [RFC4271]. base BGP4 specification [RFC4271].
10. IANA Considerations 10. IANA Considerations
The new mechanism for diverse BGP path dissemination does not require The new mechanism for diverse BGP path dissemination does not require
skipping to change at page 18, line 20 skipping to change at page 19, line 33
Isidor Kouvelas Isidor Kouvelas
Cisco Systems Cisco Systems
170 West Tasman Drive 170 West Tasman Drive
San Jose, CA 95134 San Jose, CA 95134
US US
Email: kouvelas@cisco.com Email: kouvelas@cisco.com
12. Acknowledgments 12. Acknowledgments
The authors would like to thank Bruno Decraene, Bart Peirens and Eric The authors would like to thank Bruno Decraene, Bart Peirens, Eric
Rosen for their valuable input. Rosen, Jim Uttaro, Renwei Li and George Wes for their valuable input.
The authors would also like to express special thanks to a number of
operators who helped to optimize the provided solution to be as close
as possible to their daily operational practices. Special
thanks go to Ted Seely, Shan Amante, Benson Schliesser and Seiichi
Kawamura.
13. References 13. References
13.1. Normative References 13.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway [RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
Protocol 4 (BGP-4)", RFC 4271, January 2006. Protocol 4 (BGP-4)", RFC 4271, January 2006.
skipping to change at page 20, line 21 skipping to change at page 21, line 39
Keyur Patel Keyur Patel
Cisco Systems Cisco Systems
170 West Tasman Drive 170 West Tasman Drive
San Jose, CA 95134 San Jose, CA 95134
US US
Email: keyupate@cisco.com Email: keyupate@cisco.com
Danny McPherson Danny McPherson
Arbor Networks Verisign
21345 Ridgetop Circle
Email: danny@arbor.net Dulles, VA 20166
US
Email: dmcpherson@verisign.com
Kenji Kumaki Kenji Kumaki
KDDI Corporation KDDI Corporation
Garden Air Tower Garden Air Tower
Iidabashi, Chiyoda-ku, Tokyo 102-8460 Iidabashi, Chiyoda-ku, Tokyo 102-8460
Japan Japan
Email: ke-kumaki@kddi.com Email: ke-kumaki@kddi.com
 End of changes. 50 change blocks. 
116 lines changed or deleted 153 lines changed or added
