draft-ietf-grow-diverse-bgp-path-dist-05.txt   draft-ietf-grow-diverse-bgp-path-dist-06.txt 
GROW Working Group R. Raszuk, Ed. GROW Working Group R. Raszuk, Ed.
Internet-Draft NTT MCL Internet-Draft NTT MCL
Intended status: Informational R. Fernando Intended status: Informational R. Fernando
Expires: March 18, 2012 K. Patel Expires: May 20, 2012 K. Patel
Cisco Systems Cisco Systems
D. McPherson D. McPherson
Verisign Verisign
K. Kumaki K. Kumaki
KDDI Corporation KDDI Corporation
September 15, 2011 November 17, 2011
Distribution of diverse BGP paths. Distribution of diverse BGP paths.
draft-ietf-grow-diverse-bgp-path-dist-05 draft-ietf-grow-diverse-bgp-path-dist-06
Abstract Abstract
The BGP4 protocol specifies the selection and propagation of a single The BGP4 protocol specifies the selection and propagation of a single
best path for each prefix. As defined today BGP has no mechanisms to best path for each prefix. As defined and widely deployed today BGP
distribute paths other then best path between its speakers. This has no mechanisms to distribute alternate paths which are not
behaviour results in number of disadvantages for new applications and considered best path between its speakers. This behaviour results in
services. number of disadvantages for new applications and services.
This document presents an alternative mechanism for solving the This document presents an alternative mechanism for solving the
problem based on the concept of parallel route reflector planes. problem based on the concept of parallel route reflector planes.
Such planes can be build in parallel or they can co-exit on the Such planes can be built in parallel or they can co-exist on the
current route reflection platforms. This document also compares existing current route reflection platforms. This document also compares existing
solutions and proposed ideas that enable distribution of more paths solutions and proposed ideas that enable distribution of more paths
than just the best path. than just the best path.
This proposal does not specify any changes to the BGP protocol This proposal does not specify any changes to the BGP protocol
definition. It does not require upgrades to provider edge or core definition. It does not require upgrades to provider edge or core
routers nor does it need network wide upgrades. The authors believe routers nor does it need network wide upgrades.
that the GROW WG would be the best place for this work.
Status of this Memo Status of this Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 20, 2012.
This Internet-Draft will expire on March 18, 2012.
Copyright Notice Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 3, line 21 skipping to change at page 3, line 21
4. Multi plane route reflection . . . . . . . . . . . . . . . . . 6 4. Multi plane route reflection . . . . . . . . . . . . . . . . . 6
4.1. Co-located best and backup path RRs . . . . . . . . . . . 9 4.1. Co-located best and backup path RRs . . . . . . . . . . . 9
4.2. Randomly located best and backup path RRs . . . . . . . . 11 4.2. Randomly located best and backup path RRs . . . . . . . . 11
4.3. Multi plane route servers for Internet Exchanges . . . . . 13 4.3. Multi plane route servers for Internet Exchanges . . . . . 13
5. Discussion on current models of IBGP route distribution . . . 14 5. Discussion on current models of IBGP route distribution . . . 14
5.1. Full Mesh . . . . . . . . . . . . . . . . . . . . . . . . 14 5.1. Full Mesh . . . . . . . . . . . . . . . . . . . . . . . . 14
5.2. Confederations . . . . . . . . . . . . . . . . . . . . . . 15 5.2. Confederations . . . . . . . . . . . . . . . . . . . . . . 15
5.3. Route reflectors . . . . . . . . . . . . . . . . . . . . . 16 5.3. Route reflectors . . . . . . . . . . . . . . . . . . . . . 16
6. Deployment considerations . . . . . . . . . . . . . . . . . . 16 6. Deployment considerations . . . . . . . . . . . . . . . . . . 16
7. Summary of benefits . . . . . . . . . . . . . . . . . . . . . 18 7. Summary of benefits . . . . . . . . . . . . . . . . . . . . . 18
8. Applications . . . . . . . . . . . . . . . . . . . . . . . . . 18 8. Applications . . . . . . . . . . . . . . . . . . . . . . . . . 19
9. Security considerations . . . . . . . . . . . . . . . . . . . 19 9. Security considerations . . . . . . . . . . . . . . . . . . . 19
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 19 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 19
11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 19 11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 20
12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 20 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 20
13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 20 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 20
13.1. Normative References . . . . . . . . . . . . . . . . . . . 20 13.1. Normative References . . . . . . . . . . . . . . . . . . . 20
13.2. Informative References . . . . . . . . . . . . . . . . . . 21 13.2. Informative References . . . . . . . . . . . . . . . . . . 21
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 22 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 22
1. Introduction 1. Introduction
Current BGP4 [RFC4271] protocol specification allows for the Current BGP4 [RFC4271] protocol specification allows for the
selection and propagation of only one best path for each prefix. The selection and propagation of only one best path for each prefix. The
skipping to change at page 4, line 27 skipping to change at page 4, line 27
distribution of more paths than just the best path. The parallel distribution of more paths than just the best path. The parallel
route reflector planes solution brings very significant benefits at a route reflector planes solution brings very significant benefits at a
negligible capex and opex deployment price as compared to the negligible capex and opex deployment price as compared to the
alternative techniques and is being considered by a number of network alternative techniques and is being considered by a number of network
operators for deployment in their networks. operators for deployment in their networks.
This proposal does not specify any changes to the BGP protocol This proposal does not specify any changes to the BGP protocol
definition. It does not require upgrades to provider edge or core definition. It does not require upgrades to provider edge or core
routers nor does it need network wide upgrades. The only upgrade routers nor does it need network wide upgrades. The only upgrade
required is the new functionality on the new or current route required is the new functionality on the new or current route
reflectors. The authors believe that the GROW WG would be the best reflectors.
place for this work.
2. History 2. History
The need to disseminate more paths than just the best path is The need to disseminate more paths than just the best path is
primarily driven by three requirements. First is the problem of BGP primarily driven by three requirements. First is the problem of BGP
oscillations [I-D.ietf-idr-route-oscillation]. The second is the oscillations [I-D.ietf-idr-route-oscillation]. The second is the
desire for reduction of time of reachability restoration in the event desire for faster reachability restoration in the event of network or
of network or network element's failure. Third requirement is to network element's failure. Third requirement is to enhance BGP load
enhance BGP load balancing capabilities. Those reasons have led to balancing capabilities. Those reasons have led to the proposal of
the proposal of BGP add-paths [I-D.ietf-idr-add-paths]. BGP add-paths [I-D.ietf-idr-add-paths].
2.1. BGP Add-Paths Proposal 2.1. BGP Add-Paths Proposal
As it has been proven that distribution of only the best path of a As it has been proven that distribution of only the best path of a
route is not sufficient to meet the needs of continuously growing route is not sufficient to meet the needs of continuously growing
number of services carried over BGP the add-paths proposal was number of services carried over BGP the add-paths proposal was
submitted in 2002 to enable BGP to distribute more than one path. submitted in 2002 to enable BGP to distribute more than one path.
This is achieved by including as a part of the NLRI an additional This is achieved by including as a part of the NLRI an additional
four octet value called the Path Identifier. four octet value called the Path Identifier.
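As an illustration only, the effect of the Path Identifier on the wire can be sketched as follows. The sketch assumes the layout described in the add-paths draft, in which the four octet Path Identifier precedes the usual length and prefix octets of an IPv4 NLRI entry. The function name encode_nlri and the example values are hypothetical and are not taken from any implementation.

```python
import ipaddress
import struct
from typing import Optional


def encode_nlri(prefix: str, path_id: Optional[int] = None) -> bytes:
    """Encode one IPv4 NLRI entry, optionally with a Path Identifier.

    Assumed layout (illustrative): [4-octet path id] [1-octet length] [prefix octets].
    """
    net = ipaddress.ip_network(prefix)
    # Only the octets needed to hold the prefix bits are carried.
    nbytes = (net.prefixlen + 7) // 8
    body = bytes([net.prefixlen]) + net.network_address.packed[:nbytes]
    if path_id is None:
        # Classic encoding: one path per prefix, no identifier.
        return body
    # Add-paths style encoding: identifier in network byte order first.
    return struct.pack("!I", path_id) + body


classic = encode_nlri("10.1.0.0/16")
with_id = encode_nlri("10.1.0.0/16", path_id=2)
```

The same prefix encoded with two different identifiers yields two distinct NLRI entries, which is what allows more than one path per prefix to be announced over a single session.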
The implication of this change on a BGP implementation is that it The implication of this change on a BGP implementation is that it
must now maintain per path, instead of per prefix, peer advertisement must now maintain per path, instead of per prefix, peer advertisement
state to track which of the peers each path was advertised to. This state to track which of the peers each path was advertised to. This
new requirement has its own memory and processing cost. Suffice to new requirement has its own memory and processing cost. Suffice to
say that by the end of 2009 none of the commercial BGP implementation say that it took over 9 years for some commercial BGP implementation
could claimed to support the new add-path behaviour in production to support the new add-path behaviour in production code, in major
code, in major part due to this resource overhead. part due to this resource overhead.
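The difference in bookkeeping can be sketched roughly as below. This is a simplified model and not a description of any vendor's implementation: the class names and the use of in-memory dictionaries are assumptions made purely for illustration.

```python
from collections import defaultdict


class PerPrefixAdvertState:
    """Classic BGP: one advertised path per prefix, so state is keyed by prefix."""

    def __init__(self):
        self.peers = defaultdict(set)  # prefix -> peers it was advertised to

    def mark_advertised(self, prefix, peer):
        self.peers[prefix].add(peer)

    def entries(self):
        return len(self.peers)


class PerPathAdvertState:
    """Add-paths style: state is keyed by (prefix, path identifier)."""

    def __init__(self):
        self.peers = defaultdict(set)  # (prefix, path_id) -> peers

    def mark_advertised(self, prefix, path_id, peer):
        self.peers[(prefix, path_id)].add(peer)

    def entries(self):
        return len(self.peers)


per_prefix = PerPrefixAdvertState()
per_prefix.mark_advertised("10.1.0.0/16", "peer-a")

per_path = PerPathAdvertState()
per_path.mark_advertised("10.1.0.0/16", 1, "peer-a")
per_path.mark_advertised("10.1.0.0/16", 2, "peer-a")
```

In this model the number of tracked entries grows with the number of paths per prefix rather than with the number of prefixes alone, which is the resource overhead the text refers to.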
An important observation is that distribution of more than one best An important observation is that distribution of more than one best
path by Autonomous System Border Routers (ASBRs) with multiple EBGP path by Autonomous System Border Routers (ASBRs) with multiple EBGP
peers attached to it where no "next hop self" is set may result in peers attached to it where no "next hop self" is set may result in
bestpath selection inconsistency within the autonomous system. bestpath selection inconsistency within the autonomous system.
Therefore it is also required to attach in the form of a new Therefore it is also required to attach in the form of a new
attribute the possible tie breakers and propagate those within the attribute the possible tie breakers and propagate those within the
domain. The example of such attribute for the purpose of fast domain. The example of such attribute for the purpose of fast
connectivity restoration to address that very case of ASBR injecting connectivity restoration to address that very case of ASBR injecting
multiple external paths into the IBGP mesh has been presented and multiple external paths into the IBGP mesh has been presented and
skipping to change at page 6, line 6 skipping to change at page 6, line 5
coverage within any medium or large global network may easily take coverage within any medium or large global network may easily take
years. years.
While it needs to be clearly acknowledged that the add-path mechanism While it needs to be clearly acknowledged that the add-path mechanism
provides the most general way to address the problem of distributing provides the most general way to address the problem of distributing
many paths between BGP speakers, this document provides a much easier many paths between BGP speakers, this document provides a much easier
to deploy solution that requires no modification to the BGP protocol to deploy solution that requires no modification to the BGP protocol
where only a few additional paths may be required. The alternative where only a few additional paths may be required. The alternative
method presented is capable of addressing critical service provider method presented is capable of addressing critical service provider
requirements for disseminating more than a single path across an AS requirements for disseminating more than a single path across an AS
with a significantly lower deployment cost. with a significantly lower deployment cost, which in light of the
general network scaling concerns documented in RFC 4984 [RFC4984],
"Report from the IAB Workshop on Routing and Addressing", may provide
a significant advantage.
3. Goals 3. Goals
The proposal described in this document is not intended to compete The proposal described in this document is not intended to compete
with add-paths. Instead if deployed it is to be used as a very easy with add-paths. Instead if deployed it is to be used as a very easy
method to accommodate the majority of applications which may require method to accommodate the majority of applications which may require
presence of alternative BGP exit points. presence of alternative BGP exit points.
It is presented to network operators as a possible choice and It is presented to network operators as a possible choice and
provides those operators who need additional paths today an provides those operators who need additional paths today an
skipping to change at page 13, line 31 skipping to change at page 13, line 31
pointing out that some implementations today already allow for pointing out that some implementations today already allow for
configuration which results in no IGP metric comparison during the configuration which results in no IGP metric comparison during the
best path calculation. best path calculation.
The additional planes of route reflectors do not need to be fully The additional planes of route reflectors do not need to be fully
redundant as the primary one is. If we are preparing for a single redundant as the primary one is. If we are preparing for a single
network failure event, a failure of a non backed up N-th best-path network failure event, a failure of a non backed up N-th best-path
route reflector would not result in a connectivity outage of the route reflector would not result in a connectivity outage of the
actual data plane. The reason is that this would at most affect the actual data plane. The reason is that this would at most affect the
presence of a backup path (not an active one) on some parts of the presence of a backup path (not an active one) on some parts of the
network. If the operator chooses to build the N-th best path plane network. If the operator chooses to create the N-th best path plane
redundantly by installing not one, but two or more route reflectors redundantly by installing not one, but two or more route reflectors
serving each additional plane the additional robustness will be serving each additional plane the additional robustness will be
achieved. achieved.
As a result of this solution ASBR3 and other ASBRs peering to RR' As a result of this solution ASBR3 and other ASBRs peering to RR'
will be receiving the 2nd best path. will be receiving the 2nd best path.
Similarly to section 4.1 as an alternative to fully meshing all RRs & Similarly to section 4.1 as an alternative to fully meshing all RRs &
RRs' an operator who may have a large number of reflectors already RRs' an operator who may have a large number of reflectors already
deployed today may choose to peer newly introduced RRs' to a deployed today may choose to peer newly introduced RRs' to a
skipping to change at page 16, line 35 skipping to change at page 16, line 35
discussed later. discussed later.
The route reflection equivalent when interconnecting BGP speakers The route reflection equivalent when interconnecting BGP speakers
between domains is popularly called the Route Server and is globally between domains is popularly called the Route Server and is globally
deployed today in many internet exchange points. deployed today in many internet exchange points.
6. Deployment considerations 6. Deployment considerations
The diverse BGP path dissemination proposal allows the distribution The diverse BGP path dissemination proposal allows the distribution
of more paths than just the best-path to route reflector or route of more paths than just the best-path to route reflector or route
server clients of today's BGP4 implementations. server clients of today's BGP4 implementations. As a deployment
recommendation, it should be noted that fast connectivity
restoration as well as the majority of intra-domain BGP level load
balancing needs can be accommodated with only two paths (the overall
best and the second best). Therefore this document suggests the use
of N=2 with diverse-path.
From the client's point of view receiving additional paths via From the client's point of view receiving additional paths via
separate IBGP sessions terminated at the new route reflector plane separate IBGP sessions terminated at the new route reflector plane
is functionally equivalent to constructing a full mesh peering is functionally equivalent to constructing a full mesh peering
without the set of problems that such a full mesh would come with, without the set of problems that such a full mesh would come with,
as discussed in an earlier section. as discussed in an earlier section.
By precisely defining the number of reflector planes, network By precisely defining the number of reflector planes, network
operators have full control over the number of redundant paths in the operators have full control over the number of redundant paths in the
network. This number can be defined to address the needs of the network. This number can be defined to address the needs of the
skipping to change at page 17, line 9 skipping to change at page 17, line 14
The Nth plane route reflectors should be acting as control plane The Nth plane route reflectors should be acting as control plane
network entities. While they can be provisioned on the current network entities. While they can be provisioned on the current
production routers, selected Nth best BGP paths should not be used production routers, selected Nth best BGP paths should not be used
directly in the data plane, except when such paths are BGP directly in the data plane, except when such paths are BGP
multipath eligible and that functionality is enabled. On RRs that are multipath eligible and that functionality is enabled. On RRs that are
in the data plane, unless multipath is enabled, the 2nd best path is in the data plane, unless multipath is enabled, the 2nd best path is
expected to be a backup path and should be installed as such into expected to be a backup path and should be installed as such into
local RIB/FIB. local RIB/FIB.
The term "planes" in this document is conceptual in nature.
In practice all paths are still kept in a
single table where the normal best path is calculated. This means that
tools such as a looking glass should not observe any change or impact
when diverse-path has been enabled.
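A minimal sketch of this idea follows. It assumes a heavily simplified comparison (higher local preference first, then shorter AS path, then the next hop compared as a plain string) in place of the full BGP decision process. It also assumes that a path sharing its next hop with an already selected path is skipped, so that each plane yields a different exit point. Path, rank and nth_best are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Path:
    next_hop: str
    local_pref: int
    as_path_len: int


def rank(paths: List[Path]) -> List[Path]:
    """Order all paths for one prefix, most preferred first (simplified rules)."""
    return sorted(paths, key=lambda p: (-p.local_pref, p.as_path_len, p.next_hop))


def nth_best(paths: List[Path], n: int) -> Optional[Path]:
    """Return the path a plane-N reflector would advertise, or None if absent.

    Assumption: paths whose next hop was already chosen are skipped so that
    the Nth result points at a different exit than the N-1 better ones.
    """
    seen_next_hops = set()
    for path in rank(paths):
        if path.next_hop in seen_next_hops:
            continue
        seen_next_hops.add(path.next_hop)
        if len(seen_next_hops) == n:
            return path
    return None


# One table holds every path; each plane only differs in which rank it reflects.
table = [
    Path("192.0.2.1", 100, 3),
    Path("192.0.2.2", 100, 2),
    Path("192.0.2.3", 200, 5),
]
best = nth_best(table, 1)
second = nth_best(table, 2)
```

With N=2, the primary plane would reflect nth_best(table, 1) and the additional plane would reflect nth_best(table, 2), both computed from the same table.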
The proposed architecture deployed along with the BGP best-external The proposed architecture deployed along with the BGP best-external
functionality covers all three cases where the classic BGP route functionality covers all three cases where the classic BGP route
reflection paradigm would fail to distribute alternate exit points reflection paradigm would fail to distribute alternate exit points
paths. paths.
1. ASBRs advertising their single best external paths with no local- 1. ASBRs advertising their single best external paths with no local-
preference or multi-exit-discriminator present. preference or multi-exit-discriminator present.
2. ASBRs advertising their single best external paths with local- 2. ASBRs advertising their single best external paths with local-
preference or multi-exit-discriminator present and with BGP best- preference or multi-exit-discriminator present and with BGP best-
skipping to change at page 20, line 29 skipping to change at page 20, line 34
Isidor Kouvelas Isidor Kouvelas
Cisco Systems Cisco Systems
170 West Tasman Drive 170 West Tasman Drive
San Jose, CA 95134 San Jose, CA 95134
US US
Email: kouvelas@cisco.com Email: kouvelas@cisco.com
12. Acknowledgments 12. Acknowledgments
The authors would like to thank Bruno Decraene, Bart Peirens, Eric The authors would like to thank Bruno Decraene, Bart Peirens, Eric
Rosen, Jim Uttaro, Renwei Li and George Wes for their valuable input. Rosen, Jim Uttaro, Renwei Li and Wes George for their valuable input.
The authors would also like to express a special thank you to a number of The authors would also like to express a special thank you to a number of
operators who helped to optimize the provided solution to be as close operators who helped to optimize the provided solution to be as close
as possible to their daily operational practices. Special thanks as possible to their daily operational practices. Special thanks
go to Ted Seely, Shan Amante, Benson Schliesser and Seiichi go to Ted Seely, Shan Amante, Benson Schliesser and Seiichi
Kawamura. Kawamura.
13. References 13. References
13.1. Normative References 13.1. Normative References
skipping to change at page 21, line 22 skipping to change at page 21, line 28
[I-D.decraene-bgp-graceful-shutdown-requirements] [I-D.decraene-bgp-graceful-shutdown-requirements]
Decraene, B., Francois, P., pelsser, c., Ahmad, Z., and A. Decraene, B., Francois, P., pelsser, c., Ahmad, Z., and A.
Armengol, "Requirements for the graceful shutdown of BGP Armengol, "Requirements for the graceful shutdown of BGP
sessions", sessions",
draft-decraene-bgp-graceful-shutdown-requirements-01 (work draft-decraene-bgp-graceful-shutdown-requirements-01 (work
in progress), March 2009. in progress), March 2009.
[I-D.ietf-idr-add-paths] [I-D.ietf-idr-add-paths]
Walton, D., Chen, E., Retana, A., and J. Scudder, Walton, D., Chen, E., Retana, A., and J. Scudder,
"Advertisement of Multiple Paths in BGP", "Advertisement of Multiple Paths in BGP",
draft-ietf-idr-add-paths-05 (work in progress), July 2011. draft-ietf-idr-add-paths-06 (work in progress),
September 2011.
[I-D.ietf-idr-best-external] [I-D.ietf-idr-best-external]
Marques, P., Fernando, R., Chen, E., Mohapatra, P., and H. Marques, P., Fernando, R., Chen, E., Mohapatra, P., and H.
Gredler, "Advertisement of the best external route in Gredler, "Advertisement of the best external route in
BGP", draft-ietf-idr-best-external-04 (work in progress), BGP", draft-ietf-idr-best-external-04 (work in progress),
April 2011. April 2011.
[I-D.ietf-idr-route-oscillation] [I-D.ietf-idr-route-oscillation]
McPherson, D., "BGP Persistent Route Oscillation McPherson, D., "BGP Persistent Route Oscillation
Condition", draft-ietf-idr-route-oscillation-01 (work in Condition", draft-ietf-idr-route-oscillation-01 (work in
progress), February 2002. progress), February 2002.
[I-D.pmohapat-idr-fast-conn-restore] [I-D.pmohapat-idr-fast-conn-restore]
Mohapatra, P., Fernando, R., Filsfils, C., and R. Raszuk, Mohapatra, P., Fernando, R., Filsfils, C., and R. Raszuk,
"Fast Connectivity Restoration Using BGP Add-path", "Fast Connectivity Restoration Using BGP Add-path",
draft-pmohapat-idr-fast-conn-restore-01 (work in draft-pmohapat-idr-fast-conn-restore-02 (work in
progress), March 2011. progress), October 2011.
[I-D.raszuk-idr-ibgp-auto-mesh] [I-D.raszuk-idr-ibgp-auto-mesh]
Raszuk, R., "IBGP Auto Mesh", Raszuk, R., "IBGP Auto Mesh",
draft-raszuk-idr-ibgp-auto-mesh-00 (work in progress), draft-raszuk-idr-ibgp-auto-mesh-00 (work in progress),
June 2003. June 2003.
[RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route [RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route
Reflection: An Alternative to Full Mesh Internal BGP Reflection: An Alternative to Full Mesh Internal BGP
(IBGP)", RFC 4456, April 2006. (IBGP)", RFC 4456, April 2006.
[RFC4984] Meyer, D., Zhang, L., and K. Fall, "Report from the IAB
Workshop on Routing and Addressing", RFC 4984,
September 2007.
[RFC5065] Traina, P., McPherson, D., and J. Scudder, "Autonomous [RFC5065] Traina, P., McPherson, D., and J. Scudder, "Autonomous
System Confederations for BGP", RFC 5065, August 2007. System Confederations for BGP", RFC 5065, August 2007.
Authors' Addresses Authors' Addresses
Robert Raszuk (editor) Robert Raszuk (editor)
NTT MCL NTT MCL
101 S Ellsworth Avenue Suite 350 101 S Ellsworth Avenue Suite 350
San Mateo, CA 94401 San Mateo, CA 94401
US US
 End of changes. 20 change blocks. 
30 lines changed or deleted 46 lines changed or added
