draft-ietf-idr-bgp-optimal-route-reflection-22.txt | draft-ietf-idr-bgp-optimal-route-reflection-23.txt | |||
---|---|---|---|---|
IDR Working Group R. Raszuk, Ed. | IDR Working Group R. Raszuk, Ed. | |||
Internet-Draft NTT Network Innovations | Internet-Draft NTT Network Innovations | |||
Intended status: Standards Track C. Cassar | Intended status: Standards Track C. Cassar | |||
Expires: July 19, 2021 Tesla | Expires: November 13, 2021 Tesla | |||
E. Aman | E. Aman | |||
B. Decraene, Ed. | B. Decraene, Ed. | |||
Orange | Orange | |||
K. Wang | K. Wang | |||
Juniper Networks | Juniper Networks | |||
January 15, 2021 | May 12, 2021 | |||
BGP Optimal Route Reflection (BGP-ORR) | BGP Optimal Route Reflection (BGP-ORR) | |||
draft-ietf-idr-bgp-optimal-route-reflection-22 | draft-ietf-idr-bgp-optimal-route-reflection-23 | |||
Abstract | Abstract | |||
This document defines an extension to BGP route reflectors. On route | This document defines an extension to BGP route reflectors. On route | |||
reflectors, BGP route selection is modified in order to choose the | reflectors, BGP route selection is modified in order to choose the | |||
best path from the standpoint of their clients, rather than from the | best route from the standpoint of their clients, rather than from the | |||
standpoint of the route reflectors. Multiple types of granularity | standpoint of the route reflectors. Depending on the scaling and | |||
are proposed, from a per client BGP route selection or to a per peer | precision requirements, route selection can be specific for one | |||
group, depending on the scaling and precision requirements on route | client, common for a set of clients or common for all clients of a | |||
selection. This solution is particularly applicable in deployments | route reflector. This solution is particularly applicable in | |||
using centralized route reflectors, where choosing the best route | deployments using centralized route reflectors, where choosing the | |||
based on the route reflector IGP location is suboptimal. This | best route based on the route reflector's IGP location is suboptimal. | |||
facilitates, for example, best exit point policy (hot potato | This facilitates, for example, best exit point policy (hot potato | |||
routing). | routing). | |||
The solution relies upon all route reflectors learning all paths | The solution relies upon all route reflectors learning all paths | |||
which are eligible for consideration. Best path selection is | which are eligible for consideration. BGP Route Selection is | |||
performed in each route reflector based on the IGP cost from a | performed in the route reflectors based on the IGP cost from | |||
selected location in the link state IGP. | configured locations in the link state IGP. | |||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on July 19, 2021. | This Internet-Draft will expire on November 13, 2021. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2021 IETF Trust and the persons identified as the | Copyright (c) 2021 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Definitions of Terms Used in This Memo . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
3. Modifications to BGP Best Path selection . . . . . . . . . . 5 | 3. Modifications to BGP Route Selection . . . . . . . . . . . . 4 | |||
3.1. Best Path Selection from a different IGP location . . . . 6 | 3.1. Route Selection from a different IGP location . . . . . . 5 | |||
3.1.1. Restriction when BGP next hop is BGP prefix . . . . . 7 | 3.1.1. Restriction when BGP next hop is a BGP prefix . . . . 6 | |||
3.2. Multiple Best Path Selections . . . . . . . . . . . . . . 7 | 3.2. Multiple Route Selections . . . . . . . . . . . . . . . . 6 | |||
4. Implementation considerations . . . . . . . . . . . . . . . . 7 | 4. Deployment Considerations . . . . . . . . . . . . . . . . . . 6 | |||
4.1. Likely Deployments and need for backup . . . . . . . . . 7 | 5. Security Considerations . . . . . . . . . . . . . . . . . . . 8 | |||
5. CPU and Memory Scalability . . . . . . . . . . . . . . . . . 8 | 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 | |||
6. Advantages and Deployment Considerations . . . . . . . . . . 8 | 7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
7. Security Considerations . . . . . . . . . . . . . . . . . . . 9 | 8. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10 | 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 10 | 9.1. Normative References . . . . . . . . . . . . . . . . . . 8 | |||
10. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 10 | 9.2. Informative References . . . . . . . . . . . . . . . . . 9 | |||
11. References . . . . . . . . . . . . . . . . . . . . . . . . . 11 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 10 | |||
11.1. Normative References . . . . . . . . . . . . . . . . . . 11 | ||||
11.2. Informative References . . . . . . . . . . . . . . . . . 11 | ||||
Appendix A. Appendix: alternative solutions with limited | ||||
applicability . . . . . . . . . . . . . . . . . . . 12 | ||||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 13 | ||||
1. Definitions of Terms Used in This Memo | ||||
NLRI - Network Layer Reachability Information | ||||
RIB - Routing Information Base | ||||
AS - Autonomous System number | ||||
VRF - Virtual Routing and Forwarding instance | ||||
PE - Provider Edge router | ||||
RR - Route Reflector | ||||
POP - Point Of Presence | ||||
L3VPN - Layer 3 Virtual Private Network [RFC4364] | ||||
6PE - IPv6 Provider Edge [RFC4798] | ||||
IGP - Interior Gateway Protocol | ||||
SPT - Shortest Path Tree | ||||
best path - the route chosen by the decision process detailed in | ||||
[RFC4271] section 9.1.2 and its subsections | ||||
best path computation - the decision process detailed in [RFC4271] | ||||
section 9.1.2 and its subsections | ||||
best path algorithm - the decision process detailed in [RFC4271] | ||||
section 9.1.2 and its subsections | ||||
best path selection - the decision process detailed in [RFC4271] | ||||
section 9.1.2 and its subsections | ||||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | ||||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | ||||
"OPTIONAL" in this document are to be interpreted as described in BCP | ||||
14 [RFC2119] [RFC8174] when, and only when, they appear in all | ||||
capitals, as shown here. | ||||
2. Introduction | 1. Introduction | |||
There are three types of BGP deployments within Autonomous Systems | There are three types of BGP deployments within Autonomous Systems | |||
today: full mesh, confederations and route reflection. BGP route | today: full mesh, confederations and route reflection. BGP route | |||
reflection [RFC4456] is the most popular way to distribute BGP routes | reflection [RFC4456] is the most popular way to distribute BGP routes | |||
between BGP speakers belonging to the same Autonomous System. | between BGP speakers belonging to the same Autonomous System. | |||
However, in some situations, this method suffers from non-optimal | However, in some situations, this method suffers from non-optimal | |||
path selection. | path selection. | |||
[RFC4456] asserts that, because the IGP cost to a given point in the | [RFC4456] asserts that, because the IGP cost to a given point in the | |||
network will vary across routers, "the route reflection approach may | network will vary across routers, "the route reflection approach may | |||
skipping to change at page 4, line 18 | skipping to change at page 3, line 22 | |||
dictates otherwise. As a consequence of the route reflection method, | dictates otherwise. As a consequence of the route reflection method, | |||
the choice of exit point for a route reflector and its clients will | the choice of exit point for a route reflector and its clients will | |||
be the exit point that is optimal for the route reflector - not | be the exit point that is optimal for the route reflector - not | |||
necessarily the one that is optimal for its clients. | necessarily the one that is optimal for its clients. | |||
Section 11 of [RFC4456] describes a deployment approach and a set of | Section 11 of [RFC4456] describes a deployment approach and a set of | |||
constraints which, if satisfied, would result in the deployment of | constraints which, if satisfied, would result in the deployment of | |||
route reflection yielding the same results as the IBGP full mesh | route reflection yielding the same results as the IBGP full mesh | |||
approach. This deployment approach makes route reflection compatible | approach. This deployment approach makes route reflection compatible | |||
with the application of hot potato routing policy. In accordance | with the application of hot potato routing policy. In accordance | |||
with these design rules, route reflectors have traditionally often | with these design rules, route reflectors have often been deployed in | |||
been deployed in the forwarding path and carefully placed on the POP | the forwarding path and carefully placed on the POP to core | |||
to core boundaries. | boundaries. | |||
The evolving model of intra-domain network design has enabled | The evolving model of intra-domain network design has enabled | |||
deployments of route reflectors outside of the forwarding path. | deployments of route reflectors outside of the forwarding path. | |||
Initially this model was only employed for new address families, e.g. | Initially this model was only employed for new services, e.g. IP | |||
L3VPNs and L2VPNs, however it has been gradually extended to other | VPNs [RFC4364], however it has been gradually extended to other BGP | |||
BGP address families including IPv4 and IPv6 Internet using either | services including IPv4 and IPv6 Internet. In such environments, hot | |||
native routing or 6PE. In such environments, hot potato routing | potato routing policy remains desirable. | |||
policy remains desirable. | ||||
Route reflectors outside of the forwarding path can be placed on the | Route reflectors outside of the forwarding path can be placed on the | |||
POP to core boundaries, but they are often placed in arbitrary | POP to core boundaries, but they are often placed in arbitrary | |||
locations in the core of large networks. | locations in the core of large networks. | |||
Such deployments suffer from a critical drawback in the context of | Such deployments suffer from a critical drawback in the context of | |||
best path selection: A route reflector with knowledge of multiple | BGP Route Selection: A route reflector with knowledge of multiple | |||
paths for a given prefix will typically pick its best path and only | paths for a given prefix will typically pick its best path and only | |||
advertise that best path to its clients. If the best path for a | advertise that best path to its clients. If the best path for a | |||
prefix is selected on the basis of an IGP tie-break, the path | prefix is selected on the basis of an IGP tie-break, the path | |||
advertised will be the exit point closest to the route reflector. | advertised will be the exit point closest to the route reflector. | |||
However, the clients are in a different place in the network topology | However, the clients are in a different place in the network topology | |||
than the route reflector. In networks where the route reflectors are | than the route reflector. In networks where the route reflectors are | |||
not in the forwarding path, this difference will be even more acute. | not in the forwarding path, this difference will be even more acute. | |||
In addition, there are deployment scenarios where service providers | In addition, there are deployment scenarios where service providers | |||
want to have more control in choosing the exit points for clients | want to have more control in choosing the exit points for clients | |||
based on other factors, such as traffic type, traffic load, etc. | based on other factors, such as traffic type, traffic load, etc. | |||
This further complicates the issue and makes it less likely for the | This further complicates the issue and makes it less likely for the | |||
route reflector to select the best path from the client's | route reflector to select the best path from the client's | |||
perspective. It follows that the best path chosen by the route | perspective. It follows that the best path chosen by the route | |||
reflector is not necessarily the same as the path which would have | reflector is not necessarily the same as the path which would have | |||
been chosen by the client if the client had considered the same set | been chosen by the client if the client had considered the same set | |||
of candidate paths as the route reflector. | of candidate paths as the route reflector. | |||
3. Modifications to BGP Best Path selection | 2. Terminology | |||
This memo makes use of the terms defined in [RFC4271] and [RFC4456]. | ||||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | ||||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | ||||
"OPTIONAL" in this document are to be interpreted as described in BCP | ||||
14 [RFC2119] [RFC8174] when, and only when, they appear in all | ||||
capitals, as shown here. | ||||
3. Modifications to BGP Route Selection | ||||
The core of this solution is the ability for an operator to specify | The core of this solution is the ability for an operator to specify | |||
the IGP location for which the route reflector should calculate | the IGP location for which the route reflector calculates interior | |||
routes. This can be done on a per route reflector basis, per peer/ | cost for the NEXT_HOP. The IGP location is defined as a node in the | |||
update group basis, or per peer basis. This ability enables the | IGP topology and may be configured on a per route reflector basis, | |||
per set of clients, or per client basis. This ability enables the | ||||
route reflector to send to a given set of clients routes with | route reflector to send to a given set of clients routes with | |||
shortest distance to the next hops from the position of the selected | shortest distance to the next hops from the position of the selected | |||
IGP location. This provides for freedom of route reflector physical | IGP location. This provides for freedom of route reflector physical | |||
location, and allows transient or permanent migration of this network | location, and allows transient or permanent migration of this network | |||
control plane function to an arbitrary location. | control plane function to an arbitrary location. | |||
The choice of specific granularity (route reflector, peer/update | The choice of specific granularity (route reflector, set of clients, | |||
group, or peer) is configured by the network operator. An | or client) is configured by the network operator. An implementation | |||
implementation is considered compliant with this document if it | is considered compliant with this document if it supports at least | |||
supports at least one listed grouping of IGP location. | one listed grouping of IGP location. | |||
For purposes of route selection, the perspective of a client can | For purposes of route selection, the perspective of a client can | |||
differ from that of a route reflector or another client in two | differ from that of a route reflector or another client in two | |||
distinct ways: | distinct ways: | |||
o it can, and usually will, have a different position in the IGP | o it has a different position in the IGP topology, and | |||
topology, and | ||||
o it can have a different routing policy. | o it can have a different routing policy. | |||
These factors correspond to the issues described earlier. | These factors correspond to the issues described earlier. | |||
This document defines, on BGP Route Reflectors [RFC4456], two changes | This document defines, for BGP Route Reflectors [RFC4456], two | |||
to the BGP Best Path selection algorithm: | changes to the BGP Route Selection algorithm: | |||
o The first change, introduced in Section 3.1, is related to the IGP | o The first change, introduced in Section 3.1, is related to the IGP | |||
cost to the BGP Next Hop in the BGP decision process. The change | cost to the BGP Next Hop in the BGP decision process. The change | |||
consists in using the IGP cost from a different IGP location than | consists in using the IGP cost from a different IGP location than | |||
the route reflector itself. | the route reflector itself. | |||
o The second change, introduced in Section 3.2, is to extend the | o The second change, introduced in Section 3.2, is to extend the | |||
granularity of the BGP decision process, to allow for running | granularity of the BGP decision process, to allow for running | |||
multiple decision processes using different perspectives or | multiple decision processes using different perspectives or | |||
policies. | policies. | |||
A route reflector can implement either or both of the modifications | ||||
in order to allow it to choose the best path for its clients that the | ||||
clients themselves would have chosen given the same set of candidate | ||||
paths. | ||||
A significant advantage of these approaches is that the route | A significant advantage of these approaches is that the route | |||
reflector clients do not need to run new software or hardware. | reflector clients do not need to be modified. | |||
3.1. Best Path Selection from a different IGP location | 3.1. Route Selection from a different IGP location | |||
In this approach, optimal refers to the decision made during best | In this approach, optimal refers to the decision where the interior | |||
path selection at the IGP metric to BGP next hop comparison step. It | cost of a route is determined during step e) of [RFC4271] section | |||
does not apply to path selection preference based on other policy | 9.1.2.2 "Breaking Ties (Phase 2)". It does not apply to path | |||
steps and provisions. | selection preference based on other policy steps and provisions. | |||
In addition to the change specified in [RFC4456] section 9, the BGP | In addition to the change specified in [RFC4456] section 9, [RFC4271] | |||
Decision Process tie-breaking rules ([RFC4271] section 9.1.2.2) are | section 9.1.2.2 is modified as follows. | |||
modified as follows. | ||||
The below text in step e) | The below text in step e) | |||
e) Remove from consideration any routes with less-preferred | e) Remove from consideration any routes with less-preferred | |||
interior cost. The interior cost of a route is determined by | interior cost. The interior cost of a route is determined by | |||
calculating the metric to the NEXT_HOP for the route using the | calculating the metric to the NEXT_HOP for the route using the | |||
Routing Table. | Routing Table. | |||
...is replaced by this new text: | ...is replaced by this new text: | |||
skipping to change at page 6, line 40 | skipping to change at page 5, line 46 | |||
calculating the metric from the selected IGP location to the | calculating the metric from the selected IGP location to the | |||
NEXT_HOP for the route using the shortest IGP path tree rooted at | NEXT_HOP for the route using the shortest IGP path tree rooted at | |||
the selected IGP location. | the selected IGP location. | |||
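
The modified step e) can be illustrated with a minimal sketch. It assumes the route reflector already holds the link state topology as an adjacency map. The names `spt_costs`, `step_e`, and `TOPOLOGY`, and the node names and metrics, are hypothetical and chosen only for illustration; they are not defined by the draft. The sketch computes a shortest path tree rooted at the selected IGP location and keeps only the next hops with the lowest interior cost from that root.

```python
import heapq


def spt_costs(topology, root):
    """Dijkstra: cost of the shortest IGP path from root to each reachable node."""
    costs = {root: 0}
    heap = [(0, root)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > costs.get(node, float("inf")):
            continue  # stale heap entry, a cheaper path was already found
        for neighbor, metric in topology.get(node, {}).items():
            new_cost = cost + metric
            if new_cost < costs.get(neighbor, float("inf")):
                costs[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return costs


def step_e(candidate_next_hops, topology, igp_location):
    """Keep the next hops with the lowest interior cost as seen from igp_location."""
    costs = spt_costs(topology, igp_location)
    reachable = [nh for nh in candidate_next_hops if nh in costs]
    if not reachable:
        return []
    best = min(costs[nh] for nh in reachable)
    return [nh for nh in reachable if costs[nh] == best]


# Hypothetical topology: the route reflector RR sits next to exit PE1,
# while client C1 sits next to exit PE2.
TOPOLOGY = {
    "RR": {"PE1": 5, "C1": 20},
    "PE1": {"RR": 5},
    "PE2": {"C1": 5},
    "C1": {"RR": 20, "PE2": 5},
}

print(step_e(["PE1", "PE2"], TOPOLOGY, "RR"))  # rooted at the route reflector
print(step_e(["PE1", "PE2"], TOPOLOGY, "C1"))  # rooted at the client's location
```

Rooting the tree at RR selects PE1, whereas rooting it at C1 selects PE2, which is the exit the client itself would have preferred.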
In order to be able to compute the shortest path tree rooted at the | In order to be able to compute the shortest path tree rooted at the | |||
selected IGP locations, knowledge of the IGP topology for the area/ | selected IGP locations, knowledge of the IGP topology for the area/ | |||
level that includes each of those locations is needed. This | level that includes each of those locations is needed. This | |||
knowledge can be gained with the use of the link state IGP such as | knowledge can be gained with the use of the link state IGP such as | |||
IS-IS [ISO10589] or OSPF [RFC2328] [RFC5340] or via BGP-LS [RFC7752]. | IS-IS [ISO10589] or OSPF [RFC2328] [RFC5340] or via BGP-LS [RFC7752]. | |||
The configuration of the IGP location is outside of the scope of this | The way the IGP location is configured is outside the scope of this | |||
document. The operator may configure it manually, an implementation | document. The operator may configure it manually, an implementation | |||
may automate it based on heuristics, or it can be computed centrally | may automate it based on heuristics, or it can be computed centrally | |||
and configured by an external system. | and configured by an external system. One or more backup locations | |||
SHOULD be allowed to be specified for redundancy. | ||||
This solution does not require any change (BGP or IGP) on the | ||||
clients, as all required changes are limited to the route reflector. | ||||
This solution applies to NLRIs of all address families that can be | ||||
route reflected. | ||||
3.1.1. Restriction when BGP next hop is BGP prefix | 3.1.1. Restriction when BGP next hop is a BGP prefix | |||
In situations where the BGP next hop is a BGP prefix itself, the IGP | In situations where the BGP next hop is a BGP prefix itself, the IGP | |||
metric of a route used for its resolution SHOULD be the final IGP | metric of a route used for its resolution SHOULD be the final IGP | |||
cost to reach such next hop. Implementations which cannot inform | cost to reach such next hop. Implementations which cannot inform | |||
BGP of the final IGP metric to a recursive next hop SHOULD treat such | BGP of the final IGP metric to a recursive next hop MUST treat such | |||
paths as least preferred during next hop metric comparison. However | paths as least preferred during next hop metric comparison. However | |||
such paths SHOULD still be considered valid for best path selection. | such paths MUST still be considered valid for BGP Phase 2 Route | |||
Selection. | ||||
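
One possible reading of this restriction is sketched below, assuming the implementation exposes the final IGP metric per next hop as a simple mapping. The names `interior_cost` and `rank_by_interior_cost` and the sample addresses are hypothetical. A next hop whose final metric is unknown is given an infinite cost, so it ranks last in the metric comparison but remains in the candidate list.

```python
import math


def interior_cost(next_hop, final_igp_metric):
    """Return the final IGP cost to next_hop, or infinity when it is unknown."""
    return final_igp_metric.get(next_hop, math.inf)


def rank_by_interior_cost(paths, final_igp_metric):
    """Order paths by interior cost; unknown-metric paths sort last but are kept."""
    return sorted(
        paths, key=lambda p: interior_cost(p["next_hop"], final_igp_metric)
    )


# Hypothetical data: the metric to 203.0.113.1 (a recursive next hop) is unknown.
paths = [
    {"id": "a", "next_hop": "203.0.113.1"},
    {"id": "b", "next_hop": "192.0.2.1"},
]
final_igp_metric = {"192.0.2.1": 40}

ranked = rank_by_interior_cost(paths, final_igp_metric)
print([p["id"] for p in ranked])
```

Path "a" is still returned, so it stays eligible if path "b" is later withdrawn.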
3.2. Multiple Best Path Selections | 3.2. Multiple Route Selections | |||
BGP Route Reflector as per [RFC4456] runs a single best path | BGP Route Reflector as per [RFC4456] runs a single BGP Decision | |||
selection. Optimal route reflection may require calculation of | Process. Optimal route reflection may require multiple BGP Decision | |||
multiple best path selections or subsets of best path selection in | Processes or subsets of the Decision Process in order to consider | |||
order to consider different IGP locations or BGP policies for | different IGP locations or BGP policies for different sets of | |||
different sets of clients. | clients. | |||
If the required routing optimization is limited to the IGP cost to | If the required routing optimization is limited to the IGP cost to | |||
the BGP Next-Hop, only step e) as defined in [RFC4271] section 9.1.2.2, | the BGP Next-Hop, only step e) and below as defined in [RFC4271] section | |||
needs to be duplicated. | 9.1.2.2, needs to be run multiple times. | |||
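
As a rough sketch of this case, the steps before e) can be run once and only the interior cost comparison repeated for each set of clients. The function `select_per_client_set`, the set names, and the cost tables below are hypothetical; the per-location costs are assumed to have been precomputed from the IGP topology. The tie-breaking steps after e) are omitted for brevity, so on equal cost the first candidate is returned.

```python
def select_per_client_set(survivors, client_sets, cost_from):
    """Repeat the interior cost comparison once per set of clients.

    survivors:   next hops remaining after the steps before e), computed once
    client_sets: maps a client set name to its configured IGP location
    cost_from:   maps an IGP location to {next_hop: interior cost}
    """
    result = {}
    for name, location in client_sets.items():
        costs = cost_from[location]
        usable = [nh for nh in survivors if nh in costs]
        result[name] = min(usable, key=lambda nh: costs[nh]) if usable else None
    return result


# Hypothetical example with two sets of clients and two exit points.
survivors = ["PE1", "PE2"]
client_sets = {"west": "ABR-W", "east": "ABR-E"}
cost_from = {
    "ABR-W": {"PE1": 10, "PE2": 50},
    "ABR-E": {"PE1": 50, "PE2": 10},
}

print(select_per_client_set(survivors, client_sets, cost_from))
```

Each set of clients receives the exit point closest to its own IGP location, while the earlier, more expensive steps of the decision process are shared.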
If the routing optimization requires the use of different BGP | If the routing optimization requires the use of different BGP | |||
policies for different sets of clients, a larger part of the decision | policies for different sets of clients, a larger part of the decision | |||
process needs to be duplicated, up to the whole decision process as | process needs to be run multiple times, up to the whole decision | |||
defined in section 9.1 of [RFC4271]. This is for example the case | process as defined in section 9.1 of [RFC4271]. This is for example | |||
when there is a need to use different policies to compute different | the case when there is a need to use different policies to compute | |||
degrees of preference during Phase 1. This is needed for use cases | different degrees of preference during Phase 1. This is needed for | |||
involving traffic engineering or dedicating certain exit points for | use cases involving traffic engineering or dedicating certain exit | |||
certain clients. In the latter case, the user MAY specify and apply | points for certain clients. In the latter case, the user may specify | |||
a general policy on the route reflector for a set of clients. For a | and apply a general policy on the route reflector for a set of | |||
given set of clients, the policy SHOULD in that case allow the | clients. Regular path selection, including IGP perspective for a set | |||
operator to select different candidate exit points for different | of clients as per Section 3.1, is then applied to the candidate paths | |||
address families. Regular path selection, including IGP perspective | to select the final paths to advertise to the clients. | |||
for a set of clients as per Section 3.1, is then applied to the | ||||
candidate paths to select the final paths to advertise to the | ||||
clients. | ||||
4. Implementation considerations | ||||
4.1. Likely Deployments and need for backup | ||||
With IGP based optimal route reflection, even though the IGP location | ||||
could be specified on a per route reflector basis or per peer/update | ||||
group basis or per peer basis, in reality, it's most likely to be | ||||
specified per peer/update group basis. All clients with the same or | ||||
similar IGP location can be grouped into the same peer/update group. | ||||
An IGP location is then specified for the peer/update group. The | ||||
location is usually specified as the location of one of the clients | ||||
from the peer group or an ABR to the area where clients are located. | ||||
Also, one or more backup locations SHOULD be allowed to be specified | ||||
for redundancy. Implementations may wish to take advantage of peer | ||||
group mechanisms in order to provide for better scalability of | ||||
optimal route reflector client groups with similar properties. | ||||
5. CPU and Memory Scalability | ||||
For IGP based optimal route reflection, determining the shortest path | ||||
and associated cost between any two arbitrary points in a network | ||||
based on the IGP topology learned by a router is expected to add some | ||||
extra cost in terms of CPU resources. However, current SPF tree | ||||
generation code is implemented efficiently in a number of | ||||
implementations, and therefore this is not expected to be a major | ||||
drawback. The number of SPTs computed is expected to be of the order | ||||
of the number of clients of a route reflector whenever a topology | ||||
change is detected. It is expected to be higher but comparable to | ||||
some existing deployed features such as (Remote) Loop Free Alternate | ||||
which computes a (r)SPT per IGP neighbor. | ||||
For policy based optimal route reflection, there will be some | ||||
overhead to apply the policy to select the candidate paths. This | ||||
overhead is comparable to existing BGP export policies and therefore | ||||
should be manageable. | ||||
By the nature of route reflection, the number of clients can be split | ||||
arbitrarily by the deployment of more route reflectors for a given | ||||
number of clients. While this is not expected to be necessary in | ||||
existing networks with best in class route reflectors available | ||||
today, this avenue to scaling up the route reflection infrastructure | ||||
is available. | ||||
If we consider the overall network wide cost/benefit factor, the only | ||||
alternative to achieve the same level of optimality would require | ||||
significantly increasing state on the edges of the network. This | ||||
will consume CPU and memory resources on all BGP speakers in the | ||||
network. Building this client perspective into the route reflectors | ||||
seems appropriate. | ||||
6. Advantages and Deployment Considerations | A route reflector can implement either or both of the modifications | |||
in order to allow it to choose the best path for its clients that the | ||||
clients themselves would have chosen given the same set of candidate | ||||
paths. | ||||
The solutions described provide a model for integrating the client | 4. Deployment Considerations | |||
perspective into the best path computation for route reflectors. | ||||
More specifically, the choice of BGP path factors in either the IGP | ||||
cost between the client and the nexthop (rather than the IGP cost | ||||
from the route reflector to the nexthop) or other user configured | ||||
policies. | ||||
The achievement of optimal routing relies upon all route reflectors | BGP Optimal Route Reflection provides a model for integrating the | |||
learning all paths that are eligible for consideration. In order to | client perspective into the BGP Route Selection decision function for | |||
satisfy this requirement, path diversity enhancing mechanisms such as | route reflectors. More specifically, the choice of BGP path factors | |||
BGP add-path [RFC7911] may need to be deployed between route | in either the IGP cost between the client and the NEXT_HOP (rather | |||
reflectors. | than the IGP cost from the route reflector to the NEXT_HOP) or other | |||
user configured policies. | ||||
Implementations considered compliant with this document allow the | The achievement of optimal routing between clients of different | |||
configuration of a logical location from which the best path will be | clusters relies upon all route reflectors learning all paths that are | |||
computed, on the basis of either a peer, a peer group, or an entire | eligible for consideration. In order to satisfy this requirement, | |||
routing instance. | BGP add-path [RFC7911] needs to be deployed between route reflectors. | |||
These solutions can be deployed in traditional hop-by-hop forwarding | This solution can be deployed in traditional hop-by-hop forwarding | |||
networks as well as in end-to-end tunneled environments. In networks | networks as well as in end-to-end tunneled environments. In networks | |||
where there are multiple route reflectors and hop-by-hop forwarding | where there are multiple route reflectors and hop-by-hop forwarding | |||
without encapsulation, such optimizations SHOULD be enabled in a | without encapsulation, such optimizations SHOULD be enabled in a | |||
consistent way on all route reflectors. Otherwise, clients may | consistent way on all route reflectors. Otherwise, clients may | |||
receive an inconsistent view of the network, in turn leading to | receive an inconsistent view of the network, in turn leading to | |||
intra-domain forwarding loops. | intra-domain forwarding loops. | |||
As discussed in section 11 of [RFC4456], the IGP locations of BGP | ||||
route reflectors are important and have routing implications. This | ||||
equally applies to the choice of the IGP locations configured on | ||||
optimal route reflectors. After selecting suitable IGP locations, an | ||||
operator may let one or multiple route reflectors handle route | ||||
selection for all of them. The operator may alternatively deploy one | ||||
or multiple route reflectors for each IGP location or create any | ||||
design in between. This choice may depend on operational model | ||||
(centralized vs per region), acceptable blast radius in case of | ||||
failure, acceptable number of IBGP sessions for the mesh between the | ||||
route reflectors, performance and configuration granularity of the | ||||
equipment. | ||||
With this approach, an ISP can effect a hot potato routing policy | With this approach, an ISP can effect a hot potato routing policy | |||
even if route reflection has been moved out of the forwarding plane, | even if route reflection has been moved out of the forwarding plane, | |||
and hop-by-hop switching has been replaced by end-to-end MPLS or IP | and hop-by-hop switching has been replaced by end-to-end MPLS or IP | |||
encapsulation. | encapsulation. Compared with a deployment of ADD-PATH on all | |||
routers, BGP ORR reduces the amount of state which needs to be pushed | ||||
As per above, these approaches reduce the amount of state which needs | to the edge of the network in order to perform hot potato routing. | |||
to be pushed to the edge of the network in order to perform hot | ||||
potato routing. The memory and CPU resources required at the edge of | ||||
the network to provide hot potato routing using these approaches are | ||||
lower than what would be required to achieve the same level of | ||||
optimality by pushing and retaining all available paths (potentially | ||||
10s) per each prefix at the edge. | ||||
The solutions above allow for a fast and safe transition to a BGP | Modifying the IGP location of BGP ORR does not interfere with | |||
control plane using centralized route reflection, without | policies enforced before IGP tie-breaking (step e) in the BGP | |||
compromising an operator's closest exit operational principle. This | Decision Process Route. | |||
enables edge-to-edge LSP/IP encapsulation for traffic to IPv4 and | ||||
IPv6 prefixes. | ||||
Regarding Best Path Selection from a different IGP location, it | Calculating routes for different IGP locations requires multiple SPF | |||
should be self evident that this solution does not interfere with | calculations and multiple (subsets of) BGP Decision Processes, which | |||
policies enforced above IGP tie-breaking in the BGP best path | requires more computing resources. This document allows for | |||
algorithm. | different granularity such as one Decision Process per route | |||
reflector, per set of clients or per client. A more fine grained | ||||
granularity may translate into more optimal hot potato routing at the | ||||
cost of more computing power. The ability to run fine grained | ||||
computations depends on the platform/hardware deployed, the number of | ||||
clients, the number of BGP routes and the size of the IGP topology. | ||||
In essence, sizing considerations are similar to the deployments of | ||||
BGP Route Reflector. | ||||
7. Security Considerations | 5. Security Considerations | |||
Similarly to [RFC4456], this extension to BGP does not change the | Similarly to [RFC4456], this extension to BGP does not change the | |||
underlying security issues inherent in the existing IBGP [RFC4456]. | underlying security issues inherent in the existing IBGP. | |||
It however enables the deployment of base BGP Route Reflection as | ||||
described in [RFC4456] to be possible using virtual compute | ||||
environments without any negative consequence on the BGP routing path | ||||
optimality. | ||||
This document does not introduce requirements for any new protection | This document does not introduce requirements for any new protection | |||
measures, but it also does not relax best operational practices for | measures. | |||
keeping the IGP network stable or to pace rate of policy based IGP | ||||
cost to next hops such that it does not have any substantial effect | ||||
on BGP path changes and their propagation to route reflection | ||||
clients. | ||||
8. IANA Considerations | 6. IANA Considerations | |||
This document does not request any IANA allocations. | This document does not request any IANA allocations. | |||
9. Acknowledgments | 7. Acknowledgments | |||
Authors would like to thank Keyur Patel, Eric Rosen, Clarence | Authors would like to thank Keyur Patel, Eric Rosen, Clarence | |||
Filsfils, Uli Bornhauser, Russ White, Jakob Heitz, Mike Shand, Jon | Filsfils, Uli Bornhauser, Russ White, Jakob Heitz, Mike Shand, Jon | |||
Mitchell, John Scudder, Jeff Haas, Martin Djernaes, Daniele | Mitchell, John Scudder, Jeff Haas, Martin Djernaes, Daniele | |||
Ceccarelli, Kieran Milne, Job Snijders and Randy Bush for their | Ceccarelli, Kieran Milne, Job Snijders and Randy Bush for their | |||
valuable input. | valuable input. | |||
10. Contributors | 8. Contributors | |||
Following persons substantially contributed to the current format of | Following persons substantially contributed to the current format of | |||
the document: | the document: | |||
Stephane Litkowski | Stephane Litkowski | |||
Cisco System | Cisco System | |||
slitkows.ietf@gmail.com | slitkows.ietf@gmail.com | |||
Adam Chappell | Adam Chappell | |||
GTT Communications, Inc. | GTT Communications, Inc. | |||
Aspira Business Centre | Aspira Business Centre | |||
Bucharova 2928/14a | Bucharova 2928/14a | |||
158 00 Prague 13 Stodulky | 158 00 Prague 13 Stodulky | |||
Czech Republic | Czech Republic | |||
adam.chappell@gtt.net | adam.chappell@gtt.net | |||
11. References | 9. References | |||
11.1. Normative References | 9.1. Normative References | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
[RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A | [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A | |||
Border Gateway Protocol 4 (BGP-4)", RFC 4271, | Border Gateway Protocol 4 (BGP-4)", RFC 4271, | |||
DOI 10.17487/RFC4271, January 2006, | DOI 10.17487/RFC4271, January 2006, | |||
<https://www.rfc-editor.org/info/rfc4271>. | <https://www.rfc-editor.org/info/rfc4271>. | |||
[RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route | [RFC4456] Bates, T., Chen, E., and R. Chandra, "BGP Route | |||
Reflection: An Alternative to Full Mesh Internal BGP | Reflection: An Alternative to Full Mesh Internal BGP | |||
(IBGP)", RFC 4456, DOI 10.17487/RFC4456, April 2006, | (IBGP)", RFC 4456, DOI 10.17487/RFC4456, April 2006, | |||
<https://www.rfc-editor.org/info/rfc4456>. | <https://www.rfc-editor.org/info/rfc4456>. | |||
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
11.2. Informative References | 9.2. Informative References | |||
[ISO10589] | [ISO10589] | |||
International Organization for Standardization, | International Organization for Standardization, | |||
"Intermediate system to Intermediate system intra-domain | "Intermediate system to Intermediate system intra-domain | |||
routeing information exchange protocol for use in | routeing information exchange protocol for use in | |||
conjunction with the protocol for providing the | conjunction with the protocol for providing the | |||
connectionless-mode Network Service (ISO 8473)", ISO/ | connectionless-mode Network Service (ISO 8473)", ISO/ | |||
IEC 10589:2002, Second Edition, Nov 2002. | IEC 10589:2002, Second Edition, Nov 2002. | |||
[RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, | [RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, | |||
DOI 10.17487/RFC2328, April 1998, | DOI 10.17487/RFC2328, April 1998, | |||
<https://www.rfc-editor.org/info/rfc2328>. | <https://www.rfc-editor.org/info/rfc2328>. | |||
[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private | [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private | |||
Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February | Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February | |||
2006, <https://www.rfc-editor.org/info/rfc4364>. | 2006, <https://www.rfc-editor.org/info/rfc4364>. | |||
[RFC4798] De Clercq, J., Ooms, D., Prevost, S., and F. Le Faucheur, | ||||
"Connecting IPv6 Islands over IPv4 MPLS Using IPv6 | ||||
Provider Edge Routers (6PE)", RFC 4798, | ||||
DOI 10.17487/RFC4798, February 2007, | ||||
<https://www.rfc-editor.org/info/rfc4798>. | ||||
[RFC5340] Coltun, R., Ferguson, D., Moy, J., and A. Lindem, "OSPF | [RFC5340] Coltun, R., Ferguson, D., Moy, J., and A. Lindem, "OSPF | |||
for IPv6", RFC 5340, DOI 10.17487/RFC5340, July 2008, | for IPv6", RFC 5340, DOI 10.17487/RFC5340, July 2008, | |||
<https://www.rfc-editor.org/info/rfc5340>. | <https://www.rfc-editor.org/info/rfc5340>. | |||
[RFC6774] Raszuk, R., Ed., Fernando, R., Patel, K., McPherson, D., | ||||
and K. Kumaki, "Distribution of Diverse BGP Paths", | ||||
RFC 6774, DOI 10.17487/RFC6774, November 2012, | ||||
<https://www.rfc-editor.org/info/rfc6774>. | ||||
[RFC7752] Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and | [RFC7752] Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and | |||
S. Ray, "North-Bound Distribution of Link-State and | S. Ray, "North-Bound Distribution of Link-State and | |||
Traffic Engineering (TE) Information Using BGP", RFC 7752, | Traffic Engineering (TE) Information Using BGP", RFC 7752, | |||
DOI 10.17487/RFC7752, March 2016, | DOI 10.17487/RFC7752, March 2016, | |||
<https://www.rfc-editor.org/info/rfc7752>. | <https://www.rfc-editor.org/info/rfc7752>. | |||
[RFC7911] Walton, D., Retana, A., Chen, E., and J. Scudder, | [RFC7911] Walton, D., Retana, A., Chen, E., and J. Scudder, | |||
"Advertisement of Multiple Paths in BGP", RFC 7911, | "Advertisement of Multiple Paths in BGP", RFC 7911, | |||
DOI 10.17487/RFC7911, July 2016, | DOI 10.17487/RFC7911, July 2016, | |||
<https://www.rfc-editor.org/info/rfc7911>. | <https://www.rfc-editor.org/info/rfc7911>. | |||
Appendix A. Appendix: alternative solutions with limited applicability | ||||
One possible valid solution or workaround to the best path selection | ||||
problem requires sending all domain external paths from the route | ||||
reflector to all its clients. This approach suffers the significant | ||||
drawback of pushing a large amount of BGP state and churn to all edge | ||||
routers. Many networks receive full Internet routing information in | ||||
a large number of locations. This could easily result in tens of | ||||
paths for each prefix that would need to be distributed to clients. | ||||
Notwithstanding this drawback, there are a number of reasons for | ||||
sending more than just the single best path to the clients. Improved | ||||
path diversity at the edge is a requirement for fast connectivity | ||||
restoration, and a requirement for effective BGP level load | ||||
balancing. | ||||
In practical terms, add/diverse path deployments [RFC7911] [RFC6774] | ||||
are expected to result in the distribution of 2, 3, or n (where n is | ||||
a small number) good paths rather than all domain external paths. | ||||
When the route reflector chooses one set of n paths and distributes | ||||
them to all its route reflector clients, those n paths may not be the | ||||
right n paths for all clients. In the context of the problem | ||||
described above, those n paths will not necessarily include the | ||||
closest exit point out of the network for each route reflector | ||||
client. The mechanisms proposed in this document are likely to be | ||||
complementary to mechanisms aimed at improving path diversity. | ||||
Another possibility to optimize exit point selection is the | ||||
implementation of distributed route reflector functionality at key | ||||
IGP locations in order to ensure that these locations see their | ||||
viewpoints respected in exit selection. Typically, however, this | ||||
requires the installation of physical nodes to implement the | ||||
reflection, and if exit policy subsequently changes, the reflector | ||||
placement and position can become inappropriate. | ||||
To counter the burden of physical installation, it is possible to | ||||
build a logical overlay of tunnels with appropriate IGP metrics in | ||||
order to simulate closeness to key locations required to implement | ||||
exit policy. There is significant complexity overhead in this | ||||
approach, however, enough so to typically make it undesirable. | ||||
Trends in control plane decoupling are causing a shift from | ||||
traditional routers to compute virtualization platforms, or even | ||||
third-party cloud platforms. As a result, without this proposal, | ||||
operators are left with a difficult choice for the distribution and | ||||
reflection of address families with significant exit diversity: | ||||
o centralized path selection, and tolerate the associated suboptimal | ||||
paths, or | ||||
o defer selection to end clients, but lose potential route scale | ||||
capacity | ||||
The latter can be a viable option, but it is clearly a decision that | ||||
needs to be made on an application and address family basis, with | ||||
strong consideration for the number of available paths per prefix | ||||
(which may even vary per prefix range, depending on peering policy, | ||||
e.g. consider bilateral peerings versus onward transit arrangements). | ||||
Authors' Addresses | Authors' Addresses | |||
Robert Raszuk (editor) | Robert Raszuk (editor) | |||
NTT Network Innovations | NTT Network Innovations | |||
Email: robert@raszuk.net | Email: robert@raszuk.net | |||
Christian Cassar | Christian Cassar | |||
Tesla | Tesla | |||
43 Avro Way | 43 Avro Way | |||
End of changes. 52 change blocks. | ||||
318 lines changed or deleted | 150 lines changed or added | |||
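
The Deployment Considerations text above says that the choice of BGP path factors in the IGP cost between the client and the NEXT_HOP, rather than the IGP cost from the route reflector to the NEXT_HOP, and that calculating routes for different IGP locations requires multiple SPF calculations. The following is a minimal sketch of that idea, not the draft's algorithm. The topology, the node names, the function names `spf_costs` and `select_next_hop`, and the tie-break on node name are all assumptions made for illustration. It also assumes that every candidate path is already equal on all steps that precede IGP tie-breaking.

```python
import heapq

# Hypothetical IGP topology: node -> {neighbor: metric}. Links are symmetric.
IGP = {
    "RR":    {"ASBR1": 1, "ASBR2": 5},
    "ASBR1": {"RR": 1, "PE1": 10, "PE2": 3},
    "ASBR2": {"RR": 5, "PE1": 2, "PE2": 9},
    "PE1":   {"ASBR1": 10, "ASBR2": 2},
    "PE2":   {"ASBR1": 3, "ASBR2": 9},
}


def spf_costs(graph, root):
    """Run one SPF (Dijkstra) and return the IGP cost from root to each node."""
    costs = {root: 0}
    queue = [(0, root)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > costs.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbor, metric in graph.get(node, {}).items():
            new_cost = cost + metric
            if new_cost < costs.get(neighbor, float("inf")):
                costs[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return costs


def select_next_hop(graph, igp_location, next_hops):
    """Pick the NEXT_HOP with the lowest IGP cost as seen from igp_location.

    Assumption: candidates are tied on every earlier decision step, and a
    remaining tie is broken on node name purely to keep the sketch deterministic.
    """
    costs = spf_costs(graph, igp_location)  # one SPF per IGP location
    reachable = [nh for nh in next_hops if nh in costs]
    if not reachable:
        return None
    return min(reachable, key=lambda nh: (costs[nh], nh))


if __name__ == "__main__":
    exits = ["ASBR1", "ASBR2"]
    for location in ("RR", "PE1", "PE2"):
        print(location, "->", select_next_hop(IGP, location, exits))
```

In this assumed topology, selection from the route reflector's own location picks ASBR1, while selection from PE1's location picks ASBR2, which is the exit PE1 would have chosen itself. Each distinct IGP location costs one additional SPF run, which is the computing trade-off the text describes for per-client versus per-set-of-clients granularity.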