draft-ietf-grow-bgp-med-considerations-05.txt | rfc4451.txt | |||
---|---|---|---|---|
INTERNET-DRAFT Danny McPherson | ||||
Arbor Networks, Inc. | Network Working Group D. McPherson | |||
Vijay Gill | Request for Comments: 4451 Arbor Networks, Inc. | |||
Category: Informational V. Gill | ||||
AOL | AOL | |||
Category Informational | March 2006 | |||
Expires: June 2006 December 2005 | ||||
BGP MULTI_EXIT_DISC (MED) Considerations | BGP MULTI_EXIT_DISC (MED) Considerations | |||
<draft-ietf-grow-bgp-med-considerations-05.txt> | ||||
Status of this Memo | ||||
By submitting this Internet-Draft, each author represents that any | Status of This Memo | |||
applicable patent or other IPR claims of which he or she is aware have | ||||
been or will be disclosed, and any of which he or she becomes aware | ||||
will be disclosed, in accordance with Section 6 of BCP 79. | ||||
Internet-Drafts are working documents of the Internet Engineering | ||||
Task Force (IETF), its areas, and its working groups. Note that other | ||||
groups may also distribute working documents as Internet-Drafts. | ||||
Internet-Drafts are draft documents valid for a maximum of six | ||||
months and may be updated, replaced, or obsoleted by other documents | ||||
at any time. It is inappropriate to use Internet-Drafts as reference | ||||
material or to cite them other than as "work in progress". | ||||
The list of current Internet-Drafts can be accessed at | ||||
http://www.ietf.org/1id-abstracts.html | ||||
The list of Internet-Draft Shadow Directories can be accessed at | This memo provides information for the Internet community. It does | |||
http://www.ietf.org/shadow.html. | not specify an Internet standard of any kind. Distribution of this | |||
memo is unlimited. | ||||
Copyright Notice | Copyright Notice | |||
Copyright (C) The Internet Society (2005). All Rights Reserved. | Copyright (C) The Internet Society (2006). | |||
Abstract | Abstract | |||
The BGP MULTI_EXIT_DISC (MED) attribute provides a mechanism for BGP | The BGP MULTI_EXIT_DISC (MED) attribute provides a mechanism for BGP | |||
speakers to convey to an adjacent AS the optimal entry point into the | speakers to convey to an adjacent AS the optimal entry point into the | |||
local AS. While BGP MEDs function correctly in many scenarios, there | local AS. While BGP MEDs function correctly in many scenarios, a | |||
are a number of issues which may arise when utilizing MEDs in dynamic | number of issues may arise when utilizing MEDs in dynamic or complex | |||
or complex topologies. | topologies. | |||
This document discusses implementation and deployment considerations | This document discusses implementation and deployment considerations | |||
regarding BGP MEDs and provides information which implementors and | regarding BGP MEDs and provides information with which implementers | |||
network operators should be familiar with. | and network operators should be familiar. | |||
Table of Contents | Table of Contents | |||
1. Introduction | 1. Introduction | |||
2. Specification of Requirements | 2. Specification of Requirements | |||
2.1. About the MULTI_EXIT_DISC (MED) Attribute | 2.1. About the MULTI_EXIT_DISC (MED) Attribute | |||
2.2. MEDs and Potatos | 2.2. MEDs and Potatoes | |||
3. Implementation and Protocol Considerations | 3. Implementation and Protocol Considerations | |||
3.1. MULTI_EXIT_DISC is a Optional Non-Transitive | 3.1. MULTI_EXIT_DISC Is an Optional Non-Transitive Attribute | |||
Attribute | 3.2. MED Values and Preferences | |||
3.2. MED Values and Preferences | 3.3. Comparing MEDs between Different Autonomous Systems | |||
3.3. Comparing MEDs Between Different Autonomous | 3.4. MEDs, Route Reflection, and AS Confederations for BGP | |||
Systems | 3.5. Route Flap Damping and MED Churn | |||
3.4. MEDs, Route Reflection and AS Confederations | 3.6. Effects of MEDs on Update Packing Efficiency | |||
for BGP | 3.7. Temporal Route Selection | |||
3.5. Route Flap Damping and MED Churn | 4. Deployment Considerations | |||
3.6. Effects of MEDs on Update Packing Efficiency | 4.1. Comparing MEDs between Different Autonomous Systems | |||
3.7. Temporal Route Selection | 4.2. Effects of Aggregation on MEDs | |||
4. Deployment Considerations | 5. Security Considerations | |||
4.1. Comparing MEDs Between Different Autonomous | 6. Acknowledgements | |||
Systems | 7. References | |||
4.2. Effects of Aggregation on MEDs | 7.1. Normative References | |||
5. IANA Considerations | 7.2. Informative References | |||
6. Security Considerations | ||||
7. Acknowledgments | ||||
8. References | ||||
8.1. Normative References | ||||
8.2. Informative References | ||||
9. Authors' Addresses | ||||
1. Introduction | 1. Introduction | |||
The BGP MED attribute provides a mechanism for BGP speakers to convey | The BGP MED attribute provides a mechanism for BGP speakers to convey | |||
to an adjacent AS the optimal entry point into the local AS. While | to an adjacent AS the optimal entry point into the local AS. While | |||
BGP MEDs function correctly in many scenarios, there are a number of | BGP MEDs function correctly in many scenarios, a number of issues may | |||
issues which may arise when utilizing MEDs in dynamic or complex | arise when utilizing MEDs in dynamic or complex topologies. | |||
topologies. | ||||
While reading this document it's important to keep in mind that the | While reading this document, note that the goal is to discuss both | |||
goal is to discuss both implementation and deployment considerations | implementation and deployment considerations regarding BGP MEDs. In | |||
regarding BGP MEDs. In addition, the intention is to provide | addition, the intention is to provide guidance that both implementors | |||
guidance that both implementors and network operators should be | and network operators should be familiar with. In some instances, | |||
familiar with. In some instances implementation advice varies from | implementation advice varies from deployment advice. | |||
deployment advice. | ||||
2. Specification of Requirements | 2. Specification of Requirements | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
document are to be interpreted as described in [RFC 2119]. | document are to be interpreted as described in [RFC 2119]. | |||
2.1. About the MULTI_EXIT_DISC (MED) Attribute | 2.1. About the MULTI_EXIT_DISC (MED) Attribute | |||
The BGP MULTI_EXIT_DISC (MED) attribute, formerly known as the | The BGP MULTI_EXIT_DISC (MED) attribute, formerly known as the | |||
INTER_AS_METRIC, is currently defined in section 5.1.4 of [BGP4], as | INTER_AS_METRIC, is currently defined in section 5.1.4 of [BGP4], as | |||
follows: | follows: | |||
The MULTI_EXIT_DISC is an optional non-transitive attribute which | The MULTI_EXIT_DISC is an optional non-transitive attribute that | |||
is intended to be used on external (inter-AS) links to discriminate | is intended to be used on external (inter-AS) links to | |||
among multiple exit or entry points to the same neighboring AS. | discriminate among multiple exit or entry points to the same | |||
The value of the MULTI_EXIT_DISC attribute is a four octet unsigned | neighboring AS. The value of the MULTI_EXIT_DISC attribute is a | |||
number which is called a metric. All other factors being equal, the | four-octet unsigned number, called a metric. All other factors | |||
exit point with lower metric SHOULD be preferred. If received over | being equal, the exit point with the lower metric SHOULD be | |||
EBGP, the MULTI_EXIT_DISC attribute MAY be propagated over IBGP to | preferred. If received over External BGP (EBGP), the | |||
other BGP speakers within the same AS (see also 9.1.2.2). The | MULTI_EXIT_DISC attribute MAY be propagated over Internal BGP | |||
MULTI_EXIT_DISC attribute received from a neighboring AS MUST NOT | (IBGP) to other BGP speakers within the same AS (see also | |||
be propagated to other neighboring ASs. | 9.1.2.2). The MULTI_EXIT_DISC attribute received from a | |||
neighboring AS MUST NOT be propagated to other neighboring ASes. | ||||
A BGP speaker MUST implement a mechanism based on local | A BGP speaker MUST implement a mechanism (based on local | |||
configuration which allows the MULTI_EXIT_DISC attribute to be | configuration) that allows the MULTI_EXIT_DISC attribute to be | |||
removed from a route. If a BGP speaker is configured to remove the | removed from a route. If a BGP speaker is configured to remove | |||
MULTI_EXIT_DISC attribute from a route, then this removal MUST be | the MULTI_EXIT_DISC attribute from a route, then this removal MUST | |||
done prior to determining the degree of preference of the route and | be done prior to determining the degree of preference of the route | |||
performing route selection (Decision Process phases 1 and 2). | and prior to performing route selection (Decision Process phases 1 | |||
and 2). | ||||
An implementation MAY also (based on local configuration) alter the | An implementation MAY also (based on local configuration) alter | |||
value of the MULTI_EXIT_DISC attribute received over EBGP. If a | the value of the MULTI_EXIT_DISC attribute received over EBGP. If | |||
BGP speaker is configured to alter the value of the MULTI_EXIT_DISC | a BGP speaker is configured to alter the value of the | |||
attribute received over EBGP, then altering the value MUST be done | MULTI_EXIT_DISC attribute received over EBGP, then altering the | |||
prior to determining the degree of preference of the route and | value MUST be done prior to determining the degree of preference | |||
performing route selection (Decision Process phases 1 and 2). See | of the route and prior to performing route selection (Decision | |||
Section 9.1.2.2 of BGP4] for necessary restrictions on this. | Process phases 1 and 2). See Section 9.1.2.2 for necessary | |||
restrictions on this. | ||||
Section 9.1.2.2 (c) of [BGP4] defines the following route selection | Section 9.1.2.2 (c) of [BGP4] defines the following route selection | |||
criteria regarding MEDs: | criteria regarding MEDs: | |||
c) Remove from consideration routes with less-preferred | c) Remove from consideration routes with less-preferred | |||
MULTI_EXIT_DISC attributes. MULTI_EXIT_DISC is only comparable | MULTI_EXIT_DISC attributes. MULTI_EXIT_DISC is only comparable | |||
between routes learned from the same neighboring AS (the neighbor- | between routes learned from the same neighboring AS (the | |||
ing AS is determined from the AS_PATH attribute). Routes which do | neighboring AS is determined from the AS_PATH attribute). | |||
not have the MULTI_EXIT_DISC attribute are considered to have the | Routes that do not have the MULTI_EXIT_DISC attribute are | |||
lowest possible MULTI_EXIT_DISC value. | considered to have the lowest possible MULTI_EXIT_DISC value. | |||
This is also described in the following procedure: | This is also described in the following procedure: | |||
for m = all routes still under consideration | for m = all routes still under consideration | |||
for n = all routes still under consideration | for n = all routes still under consideration | |||
if (neighborAS(m) == neighborAS(n)) and (MED(n) < MED(m)) | if (neighborAS(m) == neighborAS(n)) and (MED(n) < MED(m)) | |||
remove route m from consideration | remove route m from consideration | |||
In the pseudo-code above, MED(n) is a function which returns the | In the pseudo-code above, MED(n) is a function that returns the | |||
value of route n's MULTI_EXIT_DISC attribute. If route n has no | value of route n's MULTI_EXIT_DISC attribute. If route n has | |||
MULTI_EXIT_DISC attribute, the function returns the lowest possi- | no MULTI_EXIT_DISC attribute, the function returns the lowest | |||
ble MULTI_EXIT_DISC value, i.e. 0. | possible MULTI_EXIT_DISC value (i.e., 0). | |||
If a MULTI_EXIT_DISC attribute is removed before re-advertising a | Similarly, neighborAS(n) is a function that returns the | |||
route into IBGP, then comparison based on the received EBGP | neighbor AS from which the route was received. If the route is | |||
MULTI_EXIT_DISC attribute MAY still be performed. If an | learned via IBGP, and the other IBGP speaker didn't originate | |||
implementation chooses to remove MULTI_EXIT_DISC, then the optional | the route, it is the neighbor AS from which the other IBGP | |||
comparison on MULTI_EXIT_DISC if performed at all MUST be performed | speaker learned the route. If the route is learned via IBGP, | |||
only among EBGP learned routes. The best EBGP learned route may | and the other IBGP speaker either (a) originated the route, or | |||
then be compared with IBGP learned routes after the removal of the | (b) created the route by aggregation and the AS_PATH attribute | |||
MULTI_EXIT_DISC attribute. If MULTI_EXIT_DISC is removed from a | of the aggregate route is either empty or begins with an | |||
subset of EBGP learned routes and the selected "best" EBGP learned | AS_SET, it is the local AS. | |||
route will not have MULTI_EXIT_DISC removed, then the | ||||
MULTI_EXIT_DISC must be used in the comparison with IBGP learned | ||||
routes. For IBGP learned routes the MULTI_EXIT_DISC MUST be used in | ||||
route comparisons which reach this step in the Decision Process. | ||||
Including the MULTI_EXIT_DISC of an EBGP learned route in the | If a MULTI_EXIT_DISC attribute is removed before re-advertising | |||
comparison with an IBGP learned route, then removing the | a route into IBGP, then comparison based on the received EBGP | |||
MULTI_EXIT_DISC attribute and advertising the route has been proven | MULTI_EXIT_DISC attribute MAY still be performed. If an | |||
to cause route loops. | implementation chooses to remove MULTI_EXIT_DISC, then the | |||
optional comparison on MULTI_EXIT_DISC, if performed, MUST be | ||||
performed only among EBGP-learned routes. The best EBGP- | ||||
learned route may then be compared with IBGP-learned routes | ||||
after the removal of the MULTI_EXIT_DISC attribute. If | ||||
MULTI_EXIT_DISC is removed from a subset of EBGP-learned | ||||
routes, and the selected "best" EBGP-learned route will not | ||||
have MULTI_EXIT_DISC removed, then the MULTI_EXIT_DISC must be | ||||
used in the comparison with IBGP-learned routes. For IBGP- | ||||
learned routes, the MULTI_EXIT_DISC MUST be used in route | ||||
comparisons that reach this step in the Decision Process. | ||||
Including the MULTI_EXIT_DISC of an EBGP-learned route in the | ||||
comparison with an IBGP-learned route, then removing the | ||||
MULTI_EXIT_DISC attribute, and advertising the route has been | ||||
proven to cause route loops. | ||||
2.2. MEDs and Potatos | 2.2. MEDs and Potatoes | |||
In a situation where traffic flows between a pair of hosts, each | Let's consider a situation where traffic flows between a pair of | |||
connected to different transit networks, which are themselves | hosts, each connected to a different transit network, which is in | |||
interconnected at two or more locations, each transit network has the | itself interconnected at two or more locations. Each transit network | |||
choice of either sending traffic to the closest peering to the | has the choice of either sending traffic to the closest peering to | |||
adjacent transit network or passing traffic to the interconnection | the adjacent transit network or passing traffic to the | |||
location which advertises the least cost path to the destination | interconnection location that advertises the least-cost path to the | |||
host. | destination host. | |||
The former method is called "hot potato routing" (or closest-exit) | The former method is called "hot potato routing" (or closest-exit) | |||
because like a hot potato held in bare hands, whoever has it tries to | because like a hot potato held in bare hands, whoever has it tries to | |||
get rid of it quickly. Hot potato routing is accomplished by not | get rid of it quickly. Hot potato routing is accomplished by not | |||
passing the EGBP learned MED into IBGP. This minimizes transit | passing the EBGP-learned MED into IBGP. This minimizes transit | |||
traffic for the provider routing the traffic. Far less common is | traffic for the provider routing the traffic. Far less common is | |||
"cold potato routing" (or best-exit) where the transit provider uses | "cold potato routing" (or best-exit) where the transit provider uses | |||
their own transit capacity to get the traffic to the point that | its own transit capacity to get the traffic to the point that | |||
adjacent transit provider advertised as being closest to the | adjacent transit provider advertised as being closest to the | |||
destination. Cold potato routing is accomplished by passing the EBGP | destination. Cold potato routing is accomplished by passing the | |||
learned MED into IBGP. | EBGP-learned MED into IBGP. | |||
If one transit provider uses hot potato routing and another uses cold | If one transit provider uses hot potato routing and another uses cold | |||
potato, traffic between the two tends to be more symmetric. However, | potato, traffic between the two tends to be more symmetric. However, | |||
if both providers employ cold potato routing, or both providers | if both providers employ cold potato routing or hot potato routing | |||
employ hot potato routing between their networks, it's likely that a | between their networks, it's likely that a larger amount of asymmetry | |||
larger amount of asymmetry would exist. | would exist. | |||
Depending on the business relationships, if one provider has more | Depending on the business relationships, if one provider has more | |||
capacity or a significantly less congested backbone network, then | capacity or a significantly less congested backbone network, then | |||
that provider may use cold potato routing. An example of widespread | that provider may use cold potato routing. An example of widespread | |||
use of cold potato routing was the NSF funded NSFNET backbone and NSF | use of cold potato routing was the NSF-funded NSFNET backbone and | |||
funded regional networks in the mid 1990s. | NSF-funded regional networks in the mid-1990s. | |||
In some cases a provider may use hot potato routing for some | In some cases, a provider may use hot potato routing for some | |||
destinations for a given peer AS and cold potato routing for others. | destinations for a given peer AS and cold potato routing for others. | |||
An example of this is the different treatment of commercial and | An example of this is the different treatment of commercial and | |||
research traffic in the NSFNET in the mid 1990s. Today many | research traffic in the NSFNET in the mid-1990s. Today, many | |||
commercial networks exchange MEDs with customers but not bilateral | commercial networks exchange MEDs with customers but not with | |||
peers. However, commercial use of MEDs varies widely, from | bilateral peers. However, commercial use of MEDs varies widely, from | |||
ubiquitous use of MEDs to no use of MEDs at all. | ubiquitous use to none at all. | |||
In addition, many deployments of MEDs today are likely behaving | In addition, many deployments of MEDs today are likely behaving | |||
differently (e.g., resulting is sub-optimal routing) than the network | differently (e.g., resulting in sub-optimal routing) than the network | |||
operator intended, thereby resulting not in hot or cold potatos, but | operator intended, which results not in hot or cold potatoes, but | |||
mashed potatos! More information on unintended behavior resulting | mashed potatoes! More information on unintended behavior resulting | |||
from MEDs is provided throughout this document. | from MEDs is provided throughout this document. | |||
3. Implementation and Protocol Considerations | 3. Implementation and Protocol Considerations | |||
There are a number of implementation and protocol peculiarities | There are a number of implementation and protocol peculiarities | |||
relating to MEDs that have been discovered that may affect network | relating to MEDs that have been discovered that may affect network | |||
behavior. The following sections provide information on these | behavior. The following sections provide information on these | |||
issues. | issues. | |||
3.1. MULTI_EXIT_DISC is a Optional Non-Transitive Attribute | 3.1. MULTI_EXIT_DISC Is an Optional Non-Transitive Attribute | |||
MULTI_EXIT_DISC is a non-transitive optional attribute whose | MULTI_EXIT_DISC is a non-transitive optional attribute whose | |||
advertisement to both IBGP and EBGP peers is discretionary. As a | advertisement to both IBGP and EBGP peers is discretionary. As a | |||
result, some implementations enable sending of MEDs to IBGP peers by | result, some implementations enable sending of MEDs to IBGP peers by | |||
default, while others do not. This behavior may result in sub- | default, while others do not. This behavior may result in sub- | |||
optimal route selection within an AS. In addition, some | optimal route selection within an AS. In addition, some | |||
implementations send MEDs to EBGP peers by default, while others do | implementations send MEDs to EBGP peers by default, while others do | |||
not. This behavior may result in sub-optimal inter-domain route | not. This behavior may result in sub-optimal inter-domain route | |||
selection. | selection. | |||
3.2. MED Values and Preferences | 3.2. MED Values and Preferences | |||
Some implementations consider an MED value of zero as less preferable | Some implementations consider an MED value of zero less preferable | |||
than no MED value. This behavior resulted in path selection | than no MED value. This behavior resulted in path selection | |||
inconsistencies within an AS. The current draft version of the BGP | inconsistencies within an AS. The current version of the BGP | |||
specification [BGP4] removes ambiguities that existed in [RFC 1771] | specification [BGP4] removes ambiguities that existed in [RFC1771] by | |||
by stating that if route n has no MULTI_EXIT_DISC attribute, the | stating that if route n has no MULTI_EXIT_DISC attribute, the lowest | |||
lowest possible MULTI_EXIT_DISC value (i.e. 0) should be assigned to | possible MULTI_EXIT_DISC value (i.e., 0) should be assigned to the | |||
the attribute. | attribute. | |||
It is apparent that different implementations and different versions | It is apparent that different implementations and different versions | |||
of the BGP draft specification have been all over the map with | of the BGP specification have been all over the map with | |||
interpretation of missing-MED. For example, earlier versions of the | interpretation of missing-MED. For example, earlier versions of the | |||
specification called for a missing MED to be assigned the highest | specification called for a missing MED to be assigned the highest | |||
possible MED value (i.e., 2^32-1). | possible MED value (i.e., 2^32-1). | |||
In addition, some implementations have been shown to internally | In addition, some implementations have been shown to internally | |||
employ a maximum possible MED value (2^32-1) as an "infinity" metric | employ a maximum possible MED value (2^32-1) as an "infinity" metric | |||
(i.e., the MED value is used to tag routes as unfeasible), and would | (i.e., the MED value is used to tag routes as unfeasible); upon | |||
upon on receiving an update with an MED value of 2^32-1 rewrite the | receiving an update with an MED value of 2^32-1, they would rewrite | |||
value to 2^32-2. Subsequently, the new MED value would be propagated | the value to 2^32-2. Subsequently, the new MED value would be | |||
and could result in routing inconsistencies or unintended path | propagated and could result in routing inconsistencies or unintended | |||
selections. | path selections. | |||
As a result of implementation inconsistencies and protocol revision | As a result of implementation inconsistencies and protocol revision | |||
variances, many network operators today explicitly reset (i.e., set | variances, many network operators today explicitly reset (i.e., set | |||
to zero or some other 'fixed' value) all MED values on ingress to | to zero or some other 'fixed' value) all MED values on ingress to | |||
conform to their internal routing policies (i.e., to include policy | conform to their internal routing policies (i.e., to include policy | |||
that requires that MED values of 0 and 2^32-1 not be used in | that requires that MED values of 0 and 2^32-1 not be used in | |||
configurations, whether the MEDs are directly computed or | configurations, whether the MEDs are directly computed or | |||
configured), so as to not have to rely on all their routers having | configured), so as not to have to rely on all their routers having | |||
the same missing-MED behavior. | the same missing-MED behavior. | |||
Because implementations don't normally provide a mechanism to disable | Because implementations don't normally provide a mechanism to disable | |||
MED comparisons in the decision algorithm, "not using MEDs" usually | MED comparisons in the decision algorithm, "not using MEDs" usually | |||
entails explicitly setting all MEDs to some fixed value upon ingress | entails explicitly setting all MEDs to some fixed value upon ingress | |||
to the routing domain. By assigning a fixed MED value consistently | to the routing domain. By assigning a fixed MED value consistently | |||
to all routes across the network, MEDs are effectively a non-issue | to all routes across the network, MEDs are effectively a non-issue | |||
in the decision algorithm. | in the decision algorithm. | |||
3.3. Comparing MEDs Between Different Autonomous Systems | 3.3. Comparing MEDs between Different Autonomous Systems | |||
The MED was intended to be used on external (inter-AS) links to | The MED was intended to be used on external (inter-AS) links to | |||
discriminate among multiple exit or entry points to the same | discriminate among multiple exit or entry points to the same | |||
neighboring AS. However, a large number of MED applications now | neighboring AS. However, a large number of MED applications now | |||
employ MEDs for the purpose of determining route preference between | employ MEDs for the purpose of determining route preference between | |||
like routes received from different autonomous systems. | like routes received from different autonomous systems. | |||
A large number of implementations provide the capability to enable | A large number of implementations provide the capability to enable | |||
comparison of MEDs between routes received from different neighboring | comparison of MEDs between routes received from different neighboring | |||
autonomous systems. While this capability has demonstrated some | autonomous systems. While this capability has demonstrated some | |||
benefit (e.g., that described in [RFC 3345]), operators should be | benefit (e.g., that described in [RFC3345]), operators should be wary | |||
wary of the potential side effects with enabling such a function. | of the potential side effects of enabling such a function. The | |||
The deployment section below provides some examples as to why this | deployment section below provides some examples as to why this may | |||
may result in undesirable behavior. | result in undesirable behavior. | |||
3.4. MEDs, Route Reflection and AS Confederations for BGP | 3.4. MEDs, Route Reflection, and AS Confederations for BGP | |||
In particular configurations, the BGP scaling mechanisms defined in | In particular configurations, the BGP scaling mechanisms defined in | |||
"BGP Route Reflection - An Alternative to Full Mesh IBGP" [RFC 2796] | "BGP Route Reflection - An Alternative to Full Mesh IBGP" [RFC 2796] | |||
and "Autonomous System Confederations for BGP" [RFC 3065] will | and "Autonomous System Confederations for BGP" [RFC 3065] will | |||
introduce persistent BGP route oscillation [RFC 3345]. The problem | introduce persistent BGP route oscillation [RFC3345]. The problem is | |||
is inherent in the way BGP works: a conflict exists between | inherent in the way BGP works: a conflict exists between information | |||
information hiding/hierarchy and the non-hierarchical selection | hiding/hierarchy and the non-hierarchical selection process imposed | |||
process imposed by lack of total ordering caused by the MED rules. | by lack of total ordering caused by the MED rules. Given current | |||
Given current practices, we see the problem most frequently manifest | practices, we see the problem manifest itself most frequently in the | |||
itself in the context of MED + route reflectors or confederations. | context of MED + route reflectors or confederations. | |||
One potential way to avoid this is by configuring inter-Member-AS or | One potential way to avoid this is by configuring inter-Member-AS or | |||
inter-cluster IGP metrics higher than intra-Member-AS IGP metrics | inter-cluster IGP metrics higher than intra-Member-AS IGP metrics | |||
and/or using other tie breaking policies to avoid BGP route selection | and/or using other tie-breaking policies to avoid BGP route selection | |||
based on incomparable MEDs. Of course, IGP metric constraints may be | based on incomparable MEDs. Of course, IGP metric constraints may be | |||
unreasonably onerous for some applications. | unreasonably onerous for some applications. | |||
Not comparing MEDs between multiple paths for a prefix learned from | Not comparing MEDs between multiple paths for a prefix learned from | |||
different adjacent autonomous systems, as discussed in section 3.3), | different adjacent autonomous systems, as discussed in section 3.3, | |||
or not utilizing MEDs at all, significantly decreases the probability | or not utilizing MEDs at all, significantly decreases the probability | |||
of introducing potential route oscillation conditions into the | of introducing potential route oscillation conditions into the | |||
network. | network. | |||
Although perhaps "legal" as far as current specifications are | Although perhaps "legal" as far as current specifications are | |||
concerned, modifying MED attributes received on any type of IBGP | concerned, modifying MED attributes received on any type of IBGP | |||
session (e.g., standard IBGP, AS confederations EIBGP, route | session (e.g., standard IBGP, EBGP sessions between Member-ASes of a | |||
reflection, etc..) is not recommended. | BGP confederation, route reflection, etc.) is not recommended. | |||
3.5. Route Flap Damping and MED Churn | 3.5. Route Flap Damping and MED Churn | |||
MEDs are often derived dynamically from IGP metrics or additive costs | MEDs are often derived dynamically from IGP metrics or additive costs | |||
associated with an IGP metric to a given BGP NEXT_HOP. This | associated with an IGP metric to a given BGP NEXT_HOP. This | |||
typically provides an efficient model for ensuring that the BGP MED | typically provides an efficient model for ensuring that the BGP MED | |||
advertised to peers used to represent the best path to a given | advertised to peers, used to represent the best path to a given | |||
destination within the network is aligned with that of the IGP within | destination within the network, is aligned with that of the IGP | |||
a given AS. | within a given AS. | |||
The consequence with dynamically derived IGP-based MEDs is that | The consequence with dynamically derived IGP-based MEDs is that | |||
instability within an AS, or even on a single given link within the | instability within an AS, or even on a single given link within the | |||
AS, can result in wide-spread BGP instability or BGP route | AS, can result in widespread BGP instability or BGP route | |||
advertisement churn that propagates across multiple domains. In | advertisement churn that propagates across multiple domains. In | |||
short, if your MED "flaps" every time your IGP metric flaps, you're | short, if your MED "flaps" every time your IGP metric flaps, your | |||
routes are likely going to be suppressed as a result of BGP Route | routes are likely going to be suppressed as a result of BGP Route | |||
Flap Damping [RFC 2439]. | Flap Damping [RFC 2439]. | |||
Employment of MEDs may compound the adverse effects of BGP flap | Employment of MEDs may compound the adverse effects of BGP flap- | |||
dampening behavior because it many cause routes to be re- advertised | dampening behavior because it may cause routes to be re-advertised | |||
solely to reflect an internal topology change. | solely to reflect an internal topology change. | |||
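
As a rough illustration of how IGP-driven MED changes can lead to
suppression, the sketch below models a simplified, exponentially
decaying penalty in the spirit of the figure of merit described in
RFC 2439. The class and function names, the per-change cost of 500,
the suppress and reuse thresholds of 2000 and 750, and the 15-minute
half-life are all illustrative assumptions, not values taken from this
document or from any particular implementation.

```python
class DampingState:
    """Simplified per-route flap-damping state (illustrative only)."""

    def __init__(self, half_life=900.0, suppress=2000.0, reuse=750.0):
        # All three parameters are assumed values chosen for the example.
        self.half_life = half_life
        self.suppress = suppress
        self.reuse = reuse
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now):
        # The penalty halves once per half-life of quiet time.
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now

    def record_change(self, now, cost=500.0):
        # Called when the route is re-advertised, e.g. with a new MED.
        self._decay(now)
        self.penalty += cost
        if self.penalty >= self.suppress:
            self.suppressed = True
        return self.suppressed

    def is_suppressed(self, now):
        # A suppressed route is reused once the penalty decays far enough.
        self._decay(now)
        if self.suppressed and self.penalty < self.reuse:
            self.suppressed = False
        return self.suppressed


def med_updates_until_suppressed(interval, max_events=100):
    """Count MED changes, one every `interval` seconds, until suppression."""
    state = DampingState()
    for n in range(1, max_events + 1):
        if state.record_change(now=n * interval):
            return n
    return None  # never suppressed within max_events changes
```

Under these assumed parameters, a MED that changes once a minute is
suppressed on the fifth change, whereas a MED that changes once an
hour is never suppressed.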
Many implementations don't have a practical problem with IGP | Many implementations don't have a practical problem with IGP | |||
flapping, they either latch their IGP metric upon first advertisement | flapping; they either latch their IGP metric upon first advertisement | |||
or they employ some internal suppression mechanism. Some | or employ some internal suppression mechanism. Some implementations | |||
implementations regard BGP attribute changes as less significant than | regard BGP attribute changes as less significant than route | |||
route withdrawals and announcements to attempt to mitigate the impact | withdrawals and announcements to attempt to mitigate the impact of | |||
of this type of event. | this type of event. | |||
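
The "latch on first advertisement" behavior can be pictured with a
minimal sketch such as the following. The class name, its methods, and
the choice to release the latch on withdrawal are assumptions made for
illustration; the text above does not say how any given implementation
stores or releases the latched value.

```python
class LatchedMed:
    """Hold the MED at the IGP metric seen when a prefix was first advertised."""

    def __init__(self):
        self._advertised = {}

    def med_for(self, prefix, current_igp_metric):
        # setdefault stores the metric only the first time the prefix is
        # seen; later IGP metric changes do not alter the advertised MED.
        return self._advertised.setdefault(prefix, current_igp_metric)

    def withdraw(self, prefix):
        # Assumption: withdrawing the route releases the latch.
        self._advertised.pop(prefix, None)
```

With this model, an IGP metric that moves from 10 to 50 and back
produces no new advertisement, because the MED stays at 10 until the
route is withdrawn.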
3.6. Effects of MEDs on Update Packing Efficiency | 3.6. Effects of MEDs on Update Packing Efficiency | |||
Multiple unfeasible routes can be advertised in a single BGP Update | Multiple unfeasible routes can be advertised in a single BGP Update | |||
message. The BGP4 protocol also permits advertisement of multiple | message. The BGP4 protocol also permits advertisement of multiple | |||
prefixes with a common set of path attributes to be advertised in a | prefixes with a common set of path attributes to be advertised in a | |||
single update message, this is commonly referred to as "update | single update message. This is commonly referred to as "update | |||
packing". When possible, update packing is recommended as it | packing". When possible, update packing is recommended as it | |||
provides a mechanism for more efficient behavior in a number of | provides a mechanism for more efficient behavior in a number of | |||
areas, to include: | areas, including the following: | |||
o Reduction in system overhead due to generation or receipt of | o Reduction in system overhead due to generation or receipt of | |||
fewer Update messages. | fewer Update messages. | |||
o Reduction in network overhead as a result of fewer packets and | o Reduction in network overhead as a result of fewer packets and | |||
lower bandwidth consumption. | lower bandwidth consumption. | |||
o Allows processing of path attributes and searches for matching | o Less frequent processing of path attributes and searches for | |||
sets in your AS_PATH database (if you have one) less frequently. | matching sets in your AS_PATH database (if you have one). | |||
Consistent ordering of the path attributes allows for ease of | Consistent ordering of the path attributes allows for ease of | |||
matching in the database as you don't have different | matching in the database as you don't have different | |||
representations | representations of the same data. | |||
of the same data. | ||||
Update packing requires that all feasible routes within a single | Update packing requires that all feasible routes within a single | |||
update message share a common attribute set, to include a common | update message share a common attribute set, to include a common | |||
MULTI_EXIT_DISC value. As such, potential wide-scale variance in MED | MULTI_EXIT_DISC value. As such, potential wide-scale variance in MED | |||
values introduces another variable and may resulted in a marked | values introduces another variable and may result in a marked | |||
decrease in update packing efficiency. | decrease in update packing efficiency. | |||
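
The effect of MED variance on packing can be sketched by grouping
prefixes on their full attribute set, where each distinct set requires
its own Update message. The function name and the dictionary-based
attribute representation are assumptions made for the example and do
not reflect any particular implementation.

```python
from collections import defaultdict


def pack_updates(routes):
    """Group prefixes that share an identical attribute set.

    `routes` is an iterable of (prefix, attrs) pairs, where attrs is a
    dict of hashable attribute values. Each returned list of prefixes
    could be carried in a single Update message.
    """
    groups = defaultdict(list)
    for prefix, attrs in routes:
        # Sort by attribute name so equal sets always yield the same key.
        key = tuple(sorted(attrs.items()))
        groups[key].append(prefix)
    return list(groups.values())


prefixes = ["192.0.2.0/26", "192.0.2.64/26", "192.0.2.128/26", "192.0.2.192/26"]
common = {"as_path": (64500,), "next_hop": "198.51.100.1"}

# Same MED on every prefix: everything packs into one Update.
same_med = [(p, {**common, "med": 10}) for p in prefixes]

# A distinct MED per prefix: no two prefixes can share an Update.
varied_med = [(p, {**common, "med": 10 * (i + 1)}) for i, p in enumerate(prefixes)]
```

Here `pack_updates(same_med)` yields a single group of four prefixes,
while `pack_updates(varied_med)` yields four groups of one prefix each.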
3.7. Temporal Route Selection | 3.7. Temporal Route Selection | |||
Some implementations have had bugs which lead to temporal behavior in | Some implementations had bugs that led to temporal behavior in | |||
MED-based best path selection. These usually involved methods used | MED-based best path selection. These usually involved methods to | |||
to store the oldest route along with ordering routes for MED in | store the oldest route and to order routes for MED, which caused | |||
earlier implementations that cause non-deterministic behavior on | non-deterministic behavior as to whether or not the oldest route | |||
whether the oldest route would truly be selected or not. | would truly be selected. | |||
The reasoning for this is that older paths are presumably more | The reasoning for this is that older paths are presumably more | |||
stable, and thus more preferable. However, temporal behavior in | stable, and thus preferable. However, temporal behavior in route | |||
route selection results in non-deterministic behavior, and as such, | selection results in non-deterministic behavior and, as such, is | |||
is often undesirable. | often undesirable. | |||
4. Deployment Considerations | 4. Deployment Considerations | |||
It has been discussed that accepting MEDs from other autonomous | It has been discussed that accepting MEDs from other autonomous | |||
systems have the potential to cause traffic flow churns in the | systems has the potential to cause traffic flow churns in the | |||
network. Some implementations only ratchet down the MED and never | network. Some implementations only ratchet down the MED and never | |||
move it back up to prevent excessive churn. | move it back up to prevent excessive churn. | |||
However, if a session is reset, the MEDs being advertised have the | However, if a session is reset, the MEDs being advertised have the | |||
potential of changing. If an network is relying on received MEDs to | potential of changing. If a network is relying on received MEDs to | |||
route traffic properly, the traffic patterns have the potential for | route traffic properly, the traffic patterns have the potential for | |||
changing dramatically, potentially resulting in congestion on the | changing dramatically, potentially resulting in congestion on the | |||
network. Essentially, accepting and routing traffic based on MEDs | network. Essentially, accepting and routing traffic based on MEDs | |||
allows other people to traffic engineer your network. This may or may | allows other people to traffic engineer your network. This may or | |||
not be acceptable to you. | may not be acceptable to you. | |||
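
The ratchet-down behavior mentioned above, and the way a session reset
can change the effective MED, can be sketched as follows. The class
and method names are hypothetical, and clearing state on reset is an
assumption used to illustrate the point; real implementations may
behave differently.

```python
class RatchetedMed:
    """Track the lowest MED seen per route; never move it back up."""

    def __init__(self):
        self._lowest = {}

    def accept(self, route_key, received_med):
        # Keep the smaller of the new MED and the lowest one seen so far.
        best = min(received_med, self._lowest.get(route_key, received_med))
        self._lowest[route_key] = best
        return best

    def session_reset(self):
        # Assumption: ratchet state does not survive a session reset, so
        # the next MED received is taken at face value.
        self._lowest.clear()
```

In this model, a route received with MEDs of 50, 20, and then 80 is
held at 20, but after a reset the same route received with a MED of 80
is used at 80, which may shift traffic.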
As previously discussed, many network operators choose to reset MED | As previously discussed, many network operators choose to reset MED | |||
values on ingress. In addition, many operators explicitly do not | values on ingress. In addition, many operators explicitly do not | |||
employ MED values of 0 or 2^32-1 in order to avoid inconsistencies | employ MED values of 0 or 2^32-1 in order to avoid inconsistencies | |||
with implementations and various revisions of the BGP specification. | with implementations and various revisions of the BGP specification. | |||
4.1. Comparing MEDs Between Different Autonomous Systems | 4.1. Comparing MEDs between Different Autonomous Systems | |||
Although the MED was meant to only be used when comparing paths | Although the MED was meant to be used only when comparing paths | |||
received from different external peers in the same AS, many | received from different external peers in the same AS, many | |||
implementations provide the capability to compare MEDs between | implementations provide the capability to compare MEDs between | |||
different autonomous systems as well. AS operators often use | different autonomous systems as well. AS operators often use | |||
LOCAL_PREF to select the external preferences (primary, secondary | LOCAL_PREF to select the external preferences (primary, secondary | |||
upstreams, peers, customers, etc.), using MED instead of LOCAL_PREF | upstreams, peers, customers, etc.), using MED instead of LOCAL_PREF | |||
would possibility lead to an inconsistent distribution of best routes | would possibly lead to an inconsistent distribution of best routes, | |||
as MED is compared only after the AS_PATH length. | as MED is compared only after the AS_PATH length. | |||
Though this may seem a fine idea for some configurations, care must | Though this may seem like a fine idea for some configurations, care | |||
be taken when comparing MEDs between different autonomous systems. | must be taken when comparing MEDs between different autonomous | |||
BGP speakers often derive MED values by obtaining the IGP metric | systems. BGP speakers often derive MED values by obtaining the IGP | |||
associated with reaching a given BGP NEXT_HOP within the local AS. | metric associated with reaching a given BGP NEXT_HOP within the local | |||
This allows MEDs to reasonably reflect IGP topologies when | AS. This allows MEDs to reasonably reflect IGP topologies when | |||
advertising routes to peers. While this is fine when comparing MEDs | advertising routes to peers. While this is fine when comparing MEDs | |||
between multiple paths learned from a single AS, it can result in | between multiple paths learned from a single AS, it can result in | |||
potentially "weighted" decisions when comparing MEDs between | potentially "weighted" decisions when comparing MEDs between | |||
different autonomous systems. This is most typically the case when | different autonomous systems. This is most typically the case when | |||
the autonomous systems use different mechanisms to derive IGP | the autonomous systems use different mechanisms to derive IGP metrics | |||
metrics, BGP MEDs, or perhaps even use different IGP protocols with | or BGP MEDs, or when they perhaps even use different IGP protocols | |||
vastly contrasting metric spaces (e.g., OSPF v. traditional metric | with vastly contrasting metric spaces (e.g., OSPF vs. traditional | |||
space in IS-IS). | metric space in IS-IS). | |||
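
The difference between the two comparison modes can be sketched as a
single MED step applied to a set of candidate paths. This is a
simplified model, not the full BGP decision process: the `Path` type,
the `med_step` function, and the `always_compare` flag are names
invented for this example, and the MED values are made up to represent
two autonomous systems with very different metric scales.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    name: str
    neighbor_as: int
    med: int


def med_step(paths, always_compare=False):
    """Return the paths that survive the MED comparison step."""
    if not paths:
        return []
    if always_compare:
        # MEDs are compared across all neighboring autonomous systems.
        lowest = min(p.med for p in paths)
        return [p for p in paths if p.med == lowest]
    survivors = []
    for p in paths:
        # MEDs are compared only among paths from the same neighbor AS.
        same_as = [q.med for q in paths if q.neighbor_as == p.neighbor_as]
        if p.med == min(same_as):
            survivors.append(p)
    return survivors


candidates = [
    Path("a", neighbor_as=64500, med=10),
    Path("b", neighbor_as=64500, med=20),
    # AS 64501 is assumed to derive MEDs from a much larger metric space.
    Path("c", neighbor_as=64501, med=5000),
]
```

With per-AS comparison, paths "a" and "c" both survive and later
tie-breakers decide between them. With `always_compare=True`, only "a"
survives, and "c" is eliminated purely because its AS uses a larger
metric scale.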
4.2. Effects of Aggregation on MEDs` | 4.2. Effects of Aggregation on MEDs | |||
Another MED deployment consideration involves the impact that | Another MED deployment consideration involves the impact that | |||
aggregation of BGP routing information has on MEDs. Aggregates are | aggregation of BGP routing information has on MEDs. Aggregates are | |||
often generated from multiple locations in an AS in order to | often generated from multiple locations in an AS in order to | |||
accommodate stability, redundancy and other network design goals. | accommodate stability, redundancy, and other network design goals. | |||
When MEDs are derived from IGP metrics associated with said | When MEDs are derived from IGP metrics associated with said | |||
aggregates the MED value advertised to peers can result in very | aggregates, the MED value advertised to peers can result in very | |||
suboptimal routing. | suboptimal routing. | |||
5. IANA Considerations | 5. Security Considerations | |||
This document introduces no new IANA considerations. | ||||
6. Security Considerations | ||||
The MED was purposely designed to be a "weak" metric that would only | The MED was purposely designed to be a "weak" metric that would only | |||
be used late in the best-path decision process. The BGP working | be used late in the best-path decision process. The BGP working | |||
group was concerned that any metric specified by a remote operator | group was concerned that any metric specified by a remote operator | |||
would only affect routing in a local AS IF no other preference was | would only affect routing in a local AS if no other preference was | |||
specified. A paramount goal of the design of the MED was to ensure | specified. A paramount goal of the design of the MED was to ensure | |||
that peers could not "shed" or "absorb" traffic for networks that | that peers could not "shed" or "absorb" traffic for networks that | |||
they advertise. As such, accepting MEDs from peers may in some sense | they advertise. As such, accepting MEDs from peers may in some sense | |||
increase a network's susceptibility to exploitation by peers. | increase a network's susceptibility to exploitation by peers. | |||
7. Acknowledgments | 6. Acknowledgements | |||
Thanks to John Scudder for applying his usual keen eye and | Thanks to John Scudder for applying his usual keen eye and | |||
constructive insight. Also, thanks to Curtis Villamizar, JR Mitchell | constructive insight. Also, thanks to Curtis Villamizar, JR | |||
and Pekka Savola for their valuable feedback. | Mitchell, and Pekka Savola for their valuable feedback. | |||
8. References | ||||
8.1. Normative References | 7. References | |||
[RFC 1519] Fuller, V., Li. T., Yu J., and K. Varadhan, "Classless | 7.1. Normative References | |||
Inter-Domain Routing (CIDR): an Address Assignment and | ||||
Aggregation Strategy", RFC 1519, September 1993. | ||||
[RFC 1771] Rekhter, Y., and T. Li, "A Border Gateway Protocol 4 | [RFC1771] Rekhter, Y. and T. Li, "A Border Gateway Protocol 4 (BGP- | |||
(BGP-4)", RFC 1771, March 1995. | 4)", RFC 1771, March 1995. | |||
[RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
[RFC 2796] Bates, T., Chandra, R., Chen, E., "BGP Route Reflection | [RFC2796] Bates, T., Chandra, R., and E. Chen, "BGP Route Reflection | |||
- An Alternative to Full Mesh IBGP", RFC 2796, April | - An Alternative to Full Mesh IBGP", RFC 2796, April 2000. | |||
2000. | ||||
[RFC 3065] Traina, P., McPherson, D., Scudder, J.. "Autonomous System | [RFC3065] Traina, P., McPherson, D., and J. Scudder, "Autonomous | |||
Confederations for BGP", RFC 3065, February 2001. | System Confederations for BGP", RFC 3065, February 2001. | |||
[BGP4] Rekhter, Y., T. Li., and Hares. S, Editors, "A Border | [BGP4] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway | |||
Gateway Protocol 4 (BGP-4)", BGP Draft, Work in Progress. | Protocol 4 (BGP-4)", RFC 4271, January 2006. | |||
8.2. Informative References | 7.2. Informative References | |||
[RFC 2439] Villamizar, C. and Chandra, R., "BGP Route Flap Damping", | [RFC2439] Villamizar, C., Chandra, R., and R. Govindan, "BGP Route | |||
RFC 2439, November 1998. | Flap Damping", RFC 2439, November 1998. | |||
[RFC 3345] McPherson, D., Gill, V., Walton, D., and Retana, A, "BGP | [RFC3345] McPherson, D., Gill, V., Walton, D., and A. Retana, | |||
Persistent Route Oscillation Condition", RFC 3345, | "Border Gateway Protocol (BGP) Persistent Route | |||
August 2002. | Oscillation Condition", RFC 3345, August 2002. | |||
9. Authors' Addresses | Authors' Addresses | |||
Danny McPherson | Danny McPherson | |||
Arbor Networks | Arbor Networks | |||
Email: danny@arbor.net | ||||
EMail: danny@arbor.net | ||||
Vijay Gill | Vijay Gill | |||
AOL | AOL | |||
Email: VijayGill9@aol.com | ||||
Intellectual Property Statement | EMail: VijayGill9@aol.com | |||
Full Copyright Statement | ||||
Copyright (C) The Internet Society (2006). | ||||
This document is subject to the rights, licenses and restrictions | ||||
contained in BCP 78, and except as set forth therein, the authors | ||||
retain all their rights. | ||||
This document and the information contained herein are provided on an | ||||
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS | ||||
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET | ||||
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, | ||||
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE | ||||
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED | ||||
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. | ||||
Intellectual Property | ||||
The IETF takes no position regarding the validity or scope of any | The IETF takes no position regarding the validity or scope of any | |||
Intellectual Property Rights or other rights that might be claimed to | Intellectual Property Rights or other rights that might be claimed to | |||
pertain to the implementation or use of the technology described in | pertain to the implementation or use of the technology described in | |||
this document or the extent to which any license under such rights | this document or the extent to which any license under such rights | |||
might or might not be available; nor does it represent that it has | might or might not be available; nor does it represent that it has | |||
made any independent effort to identify any such rights. Information | made any independent effort to identify any such rights. Information | |||
on the procedures with respect to rights in RFC documents can be | on the procedures with respect to rights in RFC documents can be | |||
found in BCP 78 and BCP 79. | found in BCP 78 and BCP 79. | |||
skipping to change at page 17, line 7 | skipping to change at page 13, line 45 | |||
such proprietary rights by implementers or users of this | such proprietary rights by implementers or users of this | |||
specification can be obtained from the IETF on-line IPR repository at | specification can be obtained from the IETF on-line IPR repository at | |||
http://www.ietf.org/ipr. | http://www.ietf.org/ipr. | |||
The IETF invites any interested party to bring to its attention any | The IETF invites any interested party to bring to its attention any | |||
copyrights, patents or patent applications, or other proprietary | copyrights, patents or patent applications, or other proprietary | |||
rights that may cover technology that may be required to implement | rights that may cover technology that may be required to implement | |||
this standard. Please address the information to the IETF at | this standard. Please address the information to the IETF at | |||
ietf-ipr@ietf.org. | ietf-ipr@ietf.org. | |||
Disclaimer of Validity | Acknowledgement | |||
This document and the information contained herein are provided on an | ||||
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS | ||||
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET | ||||
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, | ||||
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE | ||||
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED | ||||
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. | ||||
Copyright Statement | ||||
Copyright (C) The Internet Society (2005). This document is subject | ||||
to the rights, licenses and restrictions contained in BCP 78, and | ||||
except as set forth therein, the authors retain all their rights. | ||||
Acknowledgment | ||||
Funding for the RFC Editor function is currently provided by the | Funding for the RFC Editor function is provided by the IETF | |||
Internet Society. | Administrative Support Activity (IASA). | |||
End of changes. 84 change blocks. | ||||
268 lines changed or deleted | 248 lines changed or added | |||
This html diff was produced by rfcdiff 1.29, available from http://www.levkowetz.com/ietf/tools/rfcdiff/ |