draft-ietf-grow-bgp-med-considerations-05.txt   rfc4451.txt 
INTERNET-DRAFT Danny McPherson
Arbor Networks, Inc. Network Working Group D. McPherson
Vijay Gill Request for Comments: 4451 Arbor Networks, Inc.
Category: Informational V. Gill
AOL AOL
Category Informational March 2006
Expires: June 2006 December 2005
BGP MULTI_EXIT_DISC (MED) Considerations BGP MULTI_EXIT_DISC (MED) Considerations
<draft-ietf-grow-bgp-med-considerations-05.txt>
Status of this Memo
By submitting this Internet-Draft, each author represents that any Status of This Memo
applicable patent or other IPR claims of which he or she is aware have
been or will be disclosed, and any of which he or she becomes aware
will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress".
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html
The list of Internet-Draft Shadow Directories can be accessed at This memo provides information for the Internet community. It does
http://www.ietf.org/shadow.html. not specify an Internet standard of any kind. Distribution of this
memo is unlimited.
Copyright Notice Copyright Notice
Copyright (C) The Internet Society (2005). All Rights Reserved. Copyright (C) The Internet Society (2006).
Abstract Abstract
The BGP MULTI_EXIT_DISC (MED) attribute provides a mechanism for BGP The BGP MULTI_EXIT_DISC (MED) attribute provides a mechanism for BGP
speakers to convey to an adjacent AS the optimal entry point into the speakers to convey to an adjacent AS the optimal entry point into the
local AS. While BGP MEDs function correctly in many scenarios, there local AS. While BGP MEDs function correctly in many scenarios, a
are a number of issues which may arise when utilizing MEDs in dynamic number of issues may arise when utilizing MEDs in dynamic or complex
or complex topologies. topologies.
This document discusses implementation and deployment considerations This document discusses implementation and deployment considerations
regarding BGP MEDs and provides information which implementors and regarding BGP MEDs and provides information with which implementers
network operators should be familiar with. and network operators should be familiar.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 1. Introduction ....................................................3
2. Specification of Requirements. . . . . . . . . . . . . . . . . 4 2. Specification of Requirements ...................................3
2.1. About the MULTI_EXIT_DISC (MED) Attribute . . . . . . . . . 4 2.1. About the MULTI_EXIT_DISC (MED) Attribute ..................3
2.2. MEDs and Potatos. . . . . . . . . . . . . . . . . . . . . . 6 2.2. MEDs and Potatoes ..........................................5
3. Implementation and Protocol Considerations . . . . . . . . . . 7 3. Implementation and Protocol Considerations ......................6
3.1. MULTI_EXIT_DISC is a Optional Non-Transitive 3.1. MULTI_EXIT_DISC Is an Optional Non-Transitive Attribute ....6
Attribute. . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 3.2. MED Values and Preferences .................................6
3.2. MED Values and Preferences. . . . . . . . . . . . . . . . . 7 3.3. Comparing MEDs between Different Autonomous Systems ........7
3.3. Comparing MEDs Between Different Autonomous 3.4. MEDs, Route Reflection, and AS Confederations for BGP ......7
Systems. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 3.5. Route Flap Damping and MED Churn ...........................8
3.4. MEDs, Route Reflection and AS Confederations 3.6. Effects of MEDs on Update Packing Efficiency ...............9
for BGP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 3.7. Temporal Route Selection ...................................9
3.5. Route Flap Damping and MED Churn. . . . . . . . . . . . . . 9 4. Deployment Considerations ......................................10
3.6. Effects of MEDs on Update Packing Efficiency. . . . . . . . 10 4.1. Comparing MEDs between Different Autonomous Systems .......10
3.7. Temporal Route Selection. . . . . . . . . . . . . . . . . . 11 4.2. Effects of Aggregation on MEDs ............................11
4. Deployment Considerations. . . . . . . . . . . . . . . . . . . 11 5. Security Considerations ........................................11
4.1. Comparing MEDs Between Different Autonomous 6. Acknowledgements ...............................................11
Systems. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 7. References .....................................................12
4.2. Effects of Aggregation on MEDs` . . . . . . . . . . . . . . 12 7.1. Normative References ......................................12
5. IANA Considerations. . . . . . . . . . . . . . . . . . . . . . 12 7.2. Informative References ....................................12
6. Security Considerations. . . . . . . . . . . . . . . . . . . . 12
7. Acknowledgments. . . . . . . . . . . . . . . . . . . . . . . . 13
8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 14
8.1. Normative References. . . . . . . . . . . . . . . . . . . . 15
8.2. Informative References. . . . . . . . . . . . . . . . . . . 16
9. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 16
1. Introduction 1. Introduction
The BGP MED attribute provides a mechanism for BGP speakers to convey The BGP MED attribute provides a mechanism for BGP speakers to convey
to an adjacent AS the optimal entry point into the local AS. While to an adjacent AS the optimal entry point into the local AS. While
BGP MEDs function correctly in many scenarios, there are a number of BGP MEDs function correctly in many scenarios, a number of issues may
issues which may arise when utilizing MEDs in dynamic or complex arise when utilizing MEDs in dynamic or complex topologies.
topologies.
While reading this document it's important to keep in mind that the While reading this document, note that the goal is to discuss both
goal is to discuss both implementation and deployment considerations implementation and deployment considerations regarding BGP MEDs. In
regarding BGP MEDs. In addition, the intention is to provide addition, the intention is to provide guidance that both implementors
guidance that both implementors and network operators should be and network operators should be familiar with. In some instances,
familiar with. In some instances implementation advice varies from implementation advice varies from deployment advice.
deployment advice.
2. Specification of Requirements 2. Specification of Requirements
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC 2119]. document are to be interpreted as described in [RFC 2119].
2.1. About the MULTI_EXIT_DISC (MED) Attribute 2.1. About the MULTI_EXIT_DISC (MED) Attribute
The BGP MULTI_EXIT_DISC (MED) attribute, formerly known as the The BGP MULTI_EXIT_DISC (MED) attribute, formerly known as the
INTER_AS_METRIC, is currently defined in section 5.1.4 of [BGP4], as INTER_AS_METRIC, is currently defined in section 5.1.4 of [BGP4], as
follows: follows:
The MULTI_EXIT_DISC is an optional non-transitive attribute which The MULTI_EXIT_DISC is an optional non-transitive attribute that
is intended to be used on external (inter-AS) links to discriminate is intended to be used on external (inter-AS) links to
among multiple exit or entry points to the same neighboring AS. discriminate among multiple exit or entry points to the same
The value of the MULTI_EXIT_DISC attribute is a four octet unsigned neighboring AS. The value of the MULTI_EXIT_DISC attribute is a
number which is called a metric. All other factors being equal, the four-octet unsigned number, called a metric. All other factors
exit point with lower metric SHOULD be preferred. If received over being equal, the exit point with the lower metric SHOULD be
EBGP, the MULTI_EXIT_DISC attribute MAY be propagated over IBGP to preferred. If received over External BGP (EBGP), the
other BGP speakers within the same AS (see also 9.1.2.2). The MULTI_EXIT_DISC attribute MAY be propagated over Internal BGP
MULTI_EXIT_DISC attribute received from a neighboring AS MUST NOT (IBGP) to other BGP speakers within the same AS (see also
be propagated to other neighboring ASs. 9.1.2.2). The MULTI_EXIT_DISC attribute received from a
neighboring AS MUST NOT be propagated to other neighboring ASes.
A BGP speaker MUST implement a mechanism based on local A BGP speaker MUST implement a mechanism (based on local
configuration which allows the MULTI_EXIT_DISC attribute to be configuration) that allows the MULTI_EXIT_DISC attribute to be
removed from a route. If a BGP speaker is configured to remove the removed from a route. If a BGP speaker is configured to remove
MULTI_EXIT_DISC attribute from a route, then this removal MUST be the MULTI_EXIT_DISC attribute from a route, then this removal MUST
done prior to determining the degree of preference of the route and be done prior to determining the degree of preference of the route
performing route selection (Decision Process phases 1 and 2). and prior to performing route selection (Decision Process phases 1
and 2).
An implementation MAY also (based on local configuration) alter the An implementation MAY also (based on local configuration) alter
value of the MULTI_EXIT_DISC attribute received over EBGP. If a the value of the MULTI_EXIT_DISC attribute received over EBGP. If
BGP speaker is configured to alter the value of the MULTI_EXIT_DISC a BGP speaker is configured to alter the value of the
attribute received over EBGP, then altering the value MUST be done MULTI_EXIT_DISC attribute received over EBGP, then altering the
prior to determining the degree of preference of the route and value MUST be done prior to determining the degree of preference
performing route selection (Decision Process phases 1 and 2). See of the route and prior to performing route selection (Decision
Section 9.1.2.2 of BGP4] for necessary restrictions on this. Process phases 1 and 2). See Section 9.1.2.2 for necessary
restrictions on this.
Section 9.1.2.2 (c) of [BGP4] defines the following route selection Section 9.1.2.2 (c) of [BGP4] defines the following route selection
criteria regarding MEDs: criteria regarding MEDs:
c) Remove from consideration routes with less-preferred c) Remove from consideration routes with less-preferred
MULTI_EXIT_DISC attributes. MULTI_EXIT_DISC is only comparable MULTI_EXIT_DISC attributes. MULTI_EXIT_DISC is only comparable
between routes learned from the same neighboring AS (the neighbor- between routes learned from the same neighboring AS (the
ing AS is determined from the AS_PATH attribute). Routes which do neighboring AS is determined from the AS_PATH attribute).
not have the MULTI_EXIT_DISC attribute are considered to have the Routes that do not have the MULTI_EXIT_DISC attribute are
lowest possible MULTI_EXIT_DISC value. considered to have the lowest possible MULTI_EXIT_DISC value.
This is also described in the following procedure: This is also described in the following procedure:
for m = all routes still under consideration for m = all routes still under consideration
for n = all routes still under consideration for n = all routes still under consideration
if (neighborAS(m) == neighborAS(n)) and (MED(n) < MED(m)) if (neighborAS(m) == neighborAS(n)) and (MED(n) < MED(m))
remove route m from consideration remove route m from consideration
In the pseudo-code above, MED(n) is a function which returns the In the pseudo-code above, MED(n) is a function that returns the
value of route n's MULTI_EXIT_DISC attribute. If route n has no value of route n's MULTI_EXIT_DISC attribute. If route n has
MULTI_EXIT_DISC attribute, the function returns the lowest possi- no MULTI_EXIT_DISC attribute, the function returns the lowest
ble MULTI_EXIT_DISC value, i.e. 0. possible MULTI_EXIT_DISC value (i.e., 0).
If a MULTI_EXIT_DISC attribute is removed before re-advertising a Similarly, neighborAS(n) is a function that returns the
route into IBGP, then comparison based on the received EBGP neighbor AS from which the route was received. If the route is
MULTI_EXIT_DISC attribute MAY still be performed. If an learned via IBGP, and the other IBGP speaker didn't originate
implementation chooses to remove MULTI_EXIT_DISC, then the optional the route, it is the neighbor AS from which the other IBGP
comparison on MULTI_EXIT_DISC if performed at all MUST be performed speaker learned the route. If the route is learned via IBGP,
only among EBGP learned routes. The best EBGP learned route may and the other IBGP speaker either (a) originated the route, or
then be compared with IBGP learned routes after the removal of the (b) created the route by aggregation and the AS_PATH attribute
MULTI_EXIT_DISC attribute. If MULTI_EXIT_DISC is removed from a of the aggregate route is either empty or begins with an
subset of EBGP learned routes and the selected "best" EBGP learned AS_SET, it is the local AS.
route will not have MULTI_EXIT_DISC removed, then the
MULTI_EXIT_DISC must be used in the comparison with IBGP learned
routes. For IBGP learned routes the MULTI_EXIT_DISC MUST be used in
route comparisons which reach this step in the Decision Process.
Including the MULTI_EXIT_DISC of an EBGP learned route in the If a MULTI_EXIT_DISC attribute is removed before re-advertising
comparison with an IBGP learned route, then removing the a route into IBGP, then comparison based on the received EBGP
MULTI_EXIT_DISC attribute and advertising the route has been proven MULTI_EXIT_DISC attribute MAY still be performed. If an
to cause route loops. implementation chooses to remove MULTI_EXIT_DISC, then the
optional comparison on MULTI_EXIT_DISC, if performed, MUST be
performed only among EBGP-learned routes. The best EBGP-
learned route may then be compared with IBGP-learned routes
after the removal of the MULTI_EXIT_DISC attribute. If
MULTI_EXIT_DISC is removed from a subset of EBGP-learned
routes, and the selected "best" EBGP-learned route will not
have MULTI_EXIT_DISC removed, then the MULTI_EXIT_DISC must be
used in the comparison with IBGP-learned routes. For IBGP-
learned routes, the MULTI_EXIT_DISC MUST be used in route
comparisons that reach this step in the Decision Process.
Including the MULTI_EXIT_DISC of an EBGP-learned route in the
comparison with an IBGP-learned route, then removing the
MULTI_EXIT_DISC attribute, and advertising the route has been
proven to cause route loops.
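
The route selection procedure quoted above can be sketched in Python. This is a minimal illustration, not an implementation taken from [BGP4]: the Route class, its field names, and the AS numbers are hypothetical, and a missing MED is treated as 0, as the quoted text specifies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    name: str                  # hypothetical label, for illustration only
    neighbor_as: int           # neighboring AS taken from the AS_PATH
    med: Optional[int] = None  # None models a missing MULTI_EXIT_DISC


def med(route: Route) -> int:
    # A route without MULTI_EXIT_DISC gets the lowest possible value, 0.
    return 0 if route.med is None else route.med


def remove_less_preferred(routes: List[Route]) -> List[Route]:
    # Drop route m if some route n from the same neighboring AS has a
    # strictly lower MED. Comparing against the full input list yields the
    # same survivors as removing routes one at a time, because the lowest
    # MED per neighboring AS is never removed.
    return [
        m for m in routes
        if not any(
            n.neighbor_as == m.neighbor_as and med(n) < med(m)
            for n in routes
        )
    ]


routes = [
    Route("a", 65001, 10),
    Route("b", 65001, 5),
    Route("c", 65002, 50),
    Route("d", 65002),  # no MED, so it compares as 0
]
survivors = remove_less_preferred(routes)
print([r.name for r in survivors])  # ['b', 'd']
```

Routes "b" and "d" are never compared with each other on MED, since they were learned from different neighboring ASes; later steps of the Decision Process would choose between them.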
2.2. MEDs and Potatos 2.2. MEDs and Potatoes
In a situation where traffic flows between a pair of hosts, each Let's consider a situation where traffic flows between a pair of
connected to different transit networks, which are themselves hosts, each connected to a different transit network, which is in
interconnected at two or more locations, each transit network has the itself interconnected at two or more locations. Each transit network
choice of either sending traffic to the closest peering to the has the choice of either sending traffic to the closest peering to
adjacent transit network or passing traffic to the interconnection the adjacent transit network or passing traffic to the
location which advertises the least cost path to the destination interconnection location that advertises the least-cost path to the
host. destination host.
The former method is called "hot potato routing" (or closest-exit) The former method is called "hot potato routing" (or closest-exit)
because like a hot potato held in bare hands, whoever has it tries to because like a hot potato held in bare hands, whoever has it tries to
get rid of it quickly. Hot potato routing is accomplished by not get rid of it quickly. Hot potato routing is accomplished by not
passing the EGBP learned MED into IBGP. This minimizes transit passing the EBGP-learned MED into IBGP. This minimizes transit
traffic for the provider routing the traffic. Far less common is traffic for the provider routing the traffic. Far less common is
"cold potato routing" (or best-exit) where the transit provider uses "cold potato routing" (or best-exit) where the transit provider uses
their own transit capacity to get the traffic to the point that its own transit capacity to get the traffic to the point that
adjacent transit provider advertised as being closest to the adjacent transit provider advertised as being closest to the
destination. Cold potato routing is accomplished by passing the EBGP destination. Cold potato routing is accomplished by passing the
learned MED into IBGP. EBGP-learned MED into IBGP.
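
The difference between the two methods can be sketched as an exit choice. The following is a simplified illustration that assumes all exits lead to the same neighboring AS and that every other path attribute is equal; the Exit class, the exit names, and the cost values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Exit:
    name: str      # hypothetical interconnection location
    igp_cost: int  # local IGP cost to reach this interconnection
    med: int       # MED the adjacent provider advertised here


def hot_potato(exits: List[Exit]) -> Exit:
    # MED is not carried into IBGP, so the closest exit wins.
    return min(exits, key=lambda e: e.igp_cost)


def cold_potato(exits: List[Exit]) -> Exit:
    # MED is carried into IBGP, so the lowest MED wins and
    # IGP cost only breaks ties.
    return min(exits, key=lambda e: (e.med, e.igp_cost))


exits = [Exit("east", igp_cost=5, med=200), Exit("west", igp_cost=40, med=10)]
print(hot_potato(exits).name)   # east
print(cold_potato(exits).name)  # west
```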
If one transit provider uses hot potato routing and another uses cold If one transit provider uses hot potato routing and another uses cold
potato, traffic between the two tends to be more symmetric. However, potato, traffic between the two tends to be more symmetric. However,
if both providers employ cold potato routing, or both providers if both providers employ cold potato routing or hot potato routing
employ hot potato routing between their networks, it's likely that a between their networks, it's likely that a larger amount of asymmetry
larger amount of asymmetry would exist. would exist.
Depending on the business relationships, if one provider has more Depending on the business relationships, if one provider has more
capacity or a significantly less congested backbone network, then capacity or a significantly less congested backbone network, then
that provider may use cold potato routing. An example of widespread that provider may use cold potato routing. An example of widespread
use of cold potato routing was the NSF funded NSFNET backbone and NSF use of cold potato routing was the NSF-funded NSFNET backbone and
funded regional networks in the mid 1990s. NSF-funded regional networks in the mid-1990s.
In some cases a provider may use hot potato routing for some In some cases, a provider may use hot potato routing for some
destinations for a given peer AS and cold potato routing for others. destinations for a given peer AS and cold potato routing for others.
An example of this is the different treatment of commercial and An example of this is the different treatment of commercial and
research traffic in the NSFNET in the mid 1990s. Today many research traffic in the NSFNET in the mid-1990s. Today, many
commercial networks exchange MEDs with customers but not bilateral commercial networks exchange MEDs with customers but not with
peers. However, commercial use of MEDs varies widely, from bilateral peers. However, commercial use of MEDs varies widely, from
ubiquitous use of MEDs to no use of MEDs at all. ubiquitous use to none at all.
In addition, many deployments of MEDs today are likely behaving In addition, many deployments of MEDs today are likely behaving
differently (e.g., resulting is sub-optimal routing) than the network differently (e.g., resulting in sub-optimal routing) than the network
operator intended, thereby resulting not in hot or cold potatos, but operator intended, which results not in hot or cold potatoes, but
mashed potatos! More information on unintended behavior resulting mashed potatoes! More information on unintended behavior resulting
from MEDs is provided throughout this document. from MEDs is provided throughout this document.
3. Implementation and Protocol Considerations 3. Implementation and Protocol Considerations
A number of implementation and protocol peculiarities relating to A number of implementation and protocol peculiarities relating to
MEDs have been discovered that may affect network behavior. The MEDs have been discovered that may affect network behavior. The
following sections provide information on these issues. following sections provide information on these issues.
3.1. MULTI_EXIT_DISC is a Optional Non-Transitive Attribute 3.1. MULTI_EXIT_DISC Is an Optional Non-Transitive Attribute
MULTI_EXIT_DISC is a non-transitive optional attribute whose MULTI_EXIT_DISC is a non-transitive optional attribute whose
advertisement to both IBGP and EBGP peers is discretionary. As a advertisement to both IBGP and EBGP peers is discretionary. As a
result, some implementations enable sending of MEDs to IBGP peers by result, some implementations enable sending of MEDs to IBGP peers by
default, while others do not. This behavior may result in sub- default, while others do not. This behavior may result in sub-
optimal route selection within an AS. In addition, some optimal route selection within an AS. In addition, some
implementations send MEDs to EBGP peers by default, while others do implementations send MEDs to EBGP peers by default, while others do
not. This behavior may result in sub-optimal inter-domain route not. This behavior may result in sub-optimal inter-domain route
selection. selection.
3.2. MED Values and Preferences 3.2. MED Values and Preferences
Some implementations consider an MED value of zero as less preferable Some implementations consider an MED value of zero less preferable
than no MED value. This behavior resulted in path selection than no MED value. This behavior resulted in path selection
inconsistencies within an AS. The current draft version of the BGP inconsistencies within an AS. The current version of the BGP
specification [BGP4] removes ambiguities that existed in [RFC 1771] specification [BGP4] removes ambiguities that existed in [RFC1771] by
by stating that if route n has no MULTI_EXIT_DISC attribute, the stating that if route n has no MULTI_EXIT_DISC attribute, the lowest
lowest possible MULTI_EXIT_DISC value (i.e. 0) should be assigned to possible MULTI_EXIT_DISC value (i.e., 0) should be assigned to the
the attribute. attribute.
It is apparent that different implementations and different versions It is apparent that different implementations and different versions
of the BGP draft specification have been all over the map with of the BGP specification have been all over the map with
interpretation of missing-MED. For example, earlier versions of the interpretation of missing-MED. For example, earlier versions of the
specification called for a missing MED to be assigned the highest specification called for a missing MED to be assigned the highest
possible MED value (i.e., 2^32-1). possible MED value (i.e., 2^32-1).
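
The effect of the two missing-MED interpretations can be sketched as follows. This is an illustration only: the function names and the missing_as_worst flag are hypothetical and do not correspond to any vendor's configuration option, and both paths are assumed to come from the same neighboring AS.

```python
from typing import Dict, Optional

MED_MAX = 2**32 - 1  # highest possible four-octet MED value


def effective_med(med: Optional[int], missing_as_worst: bool = False) -> int:
    # None models a route that carries no MULTI_EXIT_DISC attribute.
    if med is None:
        return MED_MAX if missing_as_worst else 0
    return med


def preferred(paths: Dict[str, Optional[int]],
              missing_as_worst: bool = False) -> str:
    # Return the name of the path with the lowest effective MED.
    return min(paths,
               key=lambda name: effective_med(paths[name], missing_as_worst))


paths = {"with_med": 100, "no_med": None}
print(preferred(paths))                         # no_med
print(preferred(paths, missing_as_worst=True))  # with_med
```

Two routers in the same AS that apply different interpretations would pick different paths from identical input, which is the kind of inconsistency described above.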
In addition, some implementations have been shown to internally In addition, some implementations have been shown to internally
employ a maximum possible MED value (2^32-1) as an "infinity" metric employ a maximum possible MED value (2^32-1) as an "infinity" metric
(i.e., the MED value is used to tag routes as unfeasible), and would (i.e., the MED value is used to tag routes as unfeasible); upon
upon on receiving an update with an MED value of 2^32-1 rewrite the receiving an update with an MED value of 2^32-1, they would rewrite
value to 2^32-2. Subsequently, the new MED value would be propagated the value to 2^32-2. Subsequently, the new MED value would be
and could result in routing inconsistencies or unintended path propagated and could result in routing inconsistencies or unintended
selections. path selections.
As a result of implementation inconsistencies and protocol revision As a result of implementation inconsistencies and protocol revision
variances, many network operators today explicitly reset (i.e., set variances, many network operators today explicitly reset (i.e., set
to zero or some other 'fixed' value) all MED values on ingress to to zero or some other 'fixed' value) all MED values on ingress to
conform to their internal routing policies (i.e., to include policy conform to their internal routing policies (i.e., to include policy
that requires that MED values of 0 and 2^32-1 not be used in that requires that MED values of 0 and 2^32-1 not be used in
configurations, whether the MEDs are directly computed or configurations, whether the MEDs are directly computed or
configured), so as to not have to rely on all their routers having configured), so as not to have to rely on all their routers having
the same missing-MED behavior. the same missing-MED behavior.
Because implementations don't normally provide a mechanism to disable Because implementations don't normally provide a mechanism to disable
MED comparisons in the decision algorithm, "not using MEDs" usually MED comparisons in the decision algorithm, "not using MEDs" usually
entails explicitly setting all MEDs to some fixed value upon ingress entails explicitly setting all MEDs to some fixed value upon ingress
to the routing domain. By assigning a fixed MED value consistently to the routing domain. By assigning a fixed MED value consistently
to all routes across the network, MEDs are effectively a non-issue to all routes across the network, MEDs are effectively a non-issue
in the decision algorithm. in the decision algorithm.
3.3. Comparing MEDs Between Different Autonomous Systems 3.3. Comparing MEDs between Different Autonomous Systems
The MED was intended to be used on external (inter-AS) links to The MED was intended to be used on external (inter-AS) links to
discriminate among multiple exit or entry points to the same discriminate among multiple exit or entry points to the same
neighboring AS. However, a large number of MED applications now neighboring AS. However, a large number of MED applications now
employ MEDs for the purpose of determining route preference between employ MEDs for the purpose of determining route preference between
like routes received from different autonomous systems. like routes received from different autonomous systems.
A large number of implementations provide the capability to enable A large number of implementations provide the capability to enable
comparison of MEDs between routes received from different neighboring comparison of MEDs between routes received from different neighboring
autonomous systems. While this capability has demonstrated some autonomous systems. While this capability has demonstrated some
benefit (e.g., that described in [RFC 3345]), operators should be benefit (e.g., that described in [RFC3345]), operators should be wary
wary of the potential side effects with enabling such a function. of the potential side effects of enabling such a function. The
The deployment section below provides some examples as to why this deployment section below provides some examples as to why this may
may result in undesirable behavior. result in undesirable behavior.
3.4. MEDs, Route Reflection and AS Confederations for BGP 3.4. MEDs, Route Reflection, and AS Confederations for BGP
In particular configurations, the BGP scaling mechanisms defined in In particular configurations, the BGP scaling mechanisms defined in
"BGP Route Reflection - An Alternative to Full Mesh IBGP" [RFC 2796] "BGP Route Reflection - An Alternative to Full Mesh IBGP" [RFC 2796]
and "Autonomous System Confederations for BGP" [RFC 3065] will and "Autonomous System Confederations for BGP" [RFC 3065] will
introduce persistent BGP route oscillation [RFC 3345]. The problem introduce persistent BGP route oscillation [RFC3345]. The problem is
is inherent in the way BGP works: a conflict exists between inherent in the way BGP works: a conflict exists between information
information hiding/hierarchy and the non-hierarchical selection hiding/hierarchy and the non-hierarchical selection process imposed
process imposed by lack of total ordering caused by the MED rules. by lack of total ordering caused by the MED rules. Given current
Given current practices, we see the problem most frequently manifest practices, we see the problem manifest itself most frequently in the
itself in the context of MED + route reflectors or confederations. context of MED + route reflectors or confederations.
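
The lack of total ordering can be seen with three paths. The sketch below uses a deliberately simplified pairwise rule (MED is compared only between paths from the same neighboring AS, otherwise the lower IGP cost wins); the Path class and all values are hypothetical and are not the topology analyzed in [RFC 3345].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    name: str
    neighbor_as: int
    med: int
    igp_cost: int


def better(a: Path, b: Path) -> Path:
    # MED is only comparable between paths from the same neighboring AS.
    if a.neighbor_as == b.neighbor_as and a.med != b.med:
        return a if a.med < b.med else b
    # Otherwise fall through to the IGP cost to the NEXT_HOP.
    return a if a.igp_cost <= b.igp_cost else b


A = Path("A", neighbor_as=65001, med=1, igp_cost=30)
B = Path("B", neighbor_as=65001, med=2, igp_cost=10)
C = Path("C", neighbor_as=65002, med=0, igp_cost=20)

print(better(A, B).name)  # A: same AS, lower MED
print(better(B, C).name)  # B: different AS, lower IGP cost
print(better(C, A).name)  # C: different AS, lower IGP cost
```

A beats B, B beats C, and C beats A, so no single ordering of the three exists. A router that sees only a subset of these paths, as can happen behind a route reflector or inside a confederation Member-AS, may therefore select differently from one that sees all three.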
One potential way to avoid this is by configuring inter-Member-AS or One potential way to avoid this is by configuring inter-Member-AS or
inter-cluster IGP metrics higher than intra-Member-AS IGP metrics inter-cluster IGP metrics higher than intra-Member-AS IGP metrics
and/or using other tie breaking policies to avoid BGP route selection and/or using other tie-breaking policies to avoid BGP route selection
based on incomparable MEDs. Of course, IGP metric constraints may be based on incomparable MEDs. Of course, IGP metric constraints may be
unreasonably onerous for some applications. unreasonably onerous for some applications.
Not comparing MEDs between multiple paths for a prefix learned from Not comparing MEDs between multiple paths for a prefix learned from
different adjacent autonomous systems, as discussed in section 3.3), different adjacent autonomous systems, as discussed in section 3.3,
or not utilizing MEDs at all, significantly decreases the probability or not utilizing MEDs at all, significantly decreases the probability
of introducing potential route oscillation conditions into the of introducing potential route oscillation conditions into the
network. network.
Although perhaps "legal" as far as current specifications are Although perhaps "legal" as far as current specifications are
concerned, modifying MED attributes received on any type of IBGP concerned, modifying MED attributes received on any type of IBGP
session (e.g., standard IBGP, AS confederations EIBGP, route session (e.g., standard IBGP, EBGP sessions between Member-ASes of a
reflection, etc..) is not recommended. BGP confederation, route reflection, etc.) is not recommended.
3.5. Route Flap Damping and MED Churn 3.5. Route Flap Damping and MED Churn
MEDs are often derived dynamically from IGP metrics or additive costs MEDs are often derived dynamically from IGP metrics or additive costs
associated with an IGP metric to a given BGP NEXT_HOP. This associated with an IGP metric to a given BGP NEXT_HOP. This
typically provides an efficient model for ensuring that the BGP MED typically provides an efficient model for ensuring that the BGP MED
advertised to peers used to represent the best path to a given advertised to peers, used to represent the best path to a given
destination within the network is aligned with that of the IGP within destination within the network, is aligned with that of the IGP
a given AS. within a given AS.
The consequence with dynamically derived IGP-based MEDs is that The consequence with dynamically derived IGP-based MEDs is that
instability within an AS, or even on a single given link within the instability within an AS, or even on a single given link within the
AS, can result in wide-spread BGP instability or BGP route AS, can result in widespread BGP instability or BGP route
advertisement churn that propagates across multiple domains. In advertisement churn that propagates across multiple domains. In
short, if your MED "flaps" every time your IGP metric flaps, you're short, if your MED "flaps" every time your IGP metric flaps, your
routes are likely going to be suppressed as a result of BGP Route routes are likely going to be suppressed as a result of BGP Route
Flap Damping [RFC 2439]. Flap Damping [RFC 2439].
Employment of MEDs may compound the adverse effects of BGP flap Employment of MEDs may compound the adverse effects of BGP flap-
damping behavior because it many cause routes to be re- advertised damping behavior because it may cause routes to be re-advertised
solely to reflect an internal topology change. solely to reflect an internal topology change.
Many implementations don't have a practical problem with IGP Many implementations don't have a practical problem with IGP
flapping, they either latch their IGP metric upon first advertisement flapping; they either latch their IGP metric upon first advertisement
or they employ some internal suppression mechanism. Some or employ some internal suppression mechanism. Some implementations
implementations regard BGP attribute changes as less significant than regard BGP attribute changes as less significant than route
route withdrawals and announcements to attempt to mitigate the impact withdrawals and announcements to attempt to mitigate the impact of
of this type of event. this type of event.
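The difference between a dynamically derived MED and a latched MED can be illustrated with a minimal sketch. It assumes a single BGP NEXT_HOP whose IGP metric oscillates; the function name `advertisements` and the sample metric values are hypothetical and are not taken from any implementation.

```python
def advertisements(igp_metrics, latch):
    """Return the MED values that would be (re-)advertised to a peer.

    igp_metrics: successive IGP metrics toward one BGP NEXT_HOP.
    latch: if True, keep the MED chosen at first advertisement and
           ignore later IGP metric changes (assumed latching behavior).
    """
    sent = []
    current = None
    for metric in igp_metrics:
        if current is None:
            # First advertisement: the MED is always derived from the IGP.
            current = metric
            sent.append(current)
        elif not latch and metric != current:
            # Dynamic MED: every IGP change triggers a re-advertisement.
            current = metric
            sent.append(current)
    return sent


flapping = [10, 50, 10, 50, 10]  # hypothetical flapping IGP metric
dynamic = advertisements(flapping, latch=False)
latched = advertisements(flapping, latch=True)
print(len(dynamic), len(latched))  # 5 1
```

In this sketch, each re-advertisement of the dynamic MED is an update that a damping-enabled peer might count as a flap, whereas the latched MED produces only the initial advertisement.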
3.6. Effects of MEDs on Update Packing Efficiency 3.6. Effects of MEDs on Update Packing Efficiency
Multiple unfeasible routes can be advertised in a single BGP Update Multiple unfeasible routes can be advertised in a single BGP Update
message. The BGP4 protocol also permits advertisement of multiple message. The BGP4 protocol also permits advertisement of multiple
prefixes with a common set of path attributes to be advertised in a prefixes with a common set of path attributes to be advertised in a
single update message, this is commonly referred to as "update single update message. This is commonly referred to as "update
packing". When possible, update packing is recommended as it packing". When possible, update packing is recommended as it
provides a mechanism for more efficient behavior in a number of provides a mechanism for more efficient behavior in a number of
areas, to include: areas, including the following:
o Reduction in system overhead due to generation or receipt of o Reduction in system overhead due to generation or receipt of
fewer Update messages. fewer Update messages.
o Reduction in network overhead as a result of fewer packets and o Reduction in network overhead as a result of fewer packets and
lower bandwidth consumption. lower bandwidth consumption.
o Allows processing of path attributes and searches for matching o Less frequent processing of path attributes and searches for
sets in your AS_PATH database (if you have one) less frequently. matching sets in your AS_PATH database (if you have one).
Consistent ordering of the path attributes allows for ease of Consistent ordering of the path attributes allows for ease of
matching in the database as you don't have different matching in the database as you don't have different
representations representations of the same data.
of the same data.
Update packing requires that all feasible routes within a single Update packing requires that all feasible routes within a single
update message share a common attribute set, to include a common update message share a common attribute set, to include a common
MULTI_EXIT_DISC value. As such, potential wide-scale variance in MED MULTI_EXIT_DISC value. As such, potential wide-scale variance in MED
values introduces another variable and may resulted in a marked values introduces another variable and may result in a marked
decrease in update packing efficiency. decrease in update packing efficiency.
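The effect of MED variance on update packing can be sketched as follows. The sketch models one Update message per distinct attribute set and ignores the maximum message size; the helper `pack_updates`, the prefixes, the AS number, and the next hop are illustrative assumptions only.

```python
from collections import defaultdict


def pack_updates(routes):
    """Group prefixes that share an identical attribute set.

    routes: iterable of (prefix, attrs), where attrs is a hashable tuple,
    here (as_path, next_hop, med). Each resulting group models the
    feasible routes that could share one Update message.
    """
    groups = defaultdict(list)
    for prefix, attrs in routes:
        groups[attrs].append(prefix)
    return dict(groups)


prefixes = ["192.0.2.0/25", "192.0.2.128/25",
            "198.51.100.0/24", "203.0.113.0/24"]
as_path, next_hop = (64500,), "10.0.0.1"  # hypothetical values

# Same MED on every prefix: all prefixes share one attribute set.
uniform_med = [(p, (as_path, next_hop, 100)) for p in prefixes]
# A different MED per prefix: no two prefixes share an attribute set.
varied_med = [(p, (as_path, next_hop, 100 + i))
              for i, p in enumerate(prefixes)]

print(len(pack_updates(uniform_med)))  # 1
print(len(pack_updates(varied_med)))   # 4
```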
3.7. Temporal Route Selection 3.7. Temporal Route Selection
Some implementations have had bugs which lead to temporal behavior in Some implementations had bugs that led to temporal behavior in
MED-based best path selection. These usually involved methods used MED-based best path selection. These usually involved methods to
to store the oldest route along with ordering routes for MED in store the oldest route and to order routes for MED, which caused
earlier implementations that cause non-deterministic behavior on non-deterministic behavior as to whether or not the oldest route
whether the oldest route would truly be selected or not. would truly be selected.
The reasoning for this is that older paths are presumably more The reasoning for this is that older paths are presumably more
stable, and thus more preferable. However, temporal behavior in stable, and thus preferable. However, temporal behavior in route
route selection results in non-deterministic behavior, and as such, selection results in non-deterministic behavior and, as such, is
is often undesirable. often undesirable.
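One way that order-dependent selection can arise is sketched below. This is an assumption for illustration and not necessarily the specific bug referred to above: routes are compared pairwise in arrival order, MED is compared only between routes from the same neighbor AS, and a lower IGP cost breaks remaining ties. The names `Route`, `prefer`, `sequential_best`, and `grouped_best`, and all route values, are hypothetical.

```python
from collections import namedtuple
from itertools import permutations

Route = namedtuple("Route", "name neighbor_as med igp_cost")


def prefer(a, b):
    """Pairwise rule: MED only between routes from the same neighbor AS."""
    if a.neighbor_as == b.neighbor_as and a.med != b.med:
        return a if a.med < b.med else b
    return a if a.igp_cost <= b.igp_cost else b


def sequential_best(routes):
    """Compare routes one by one in arrival order (order-dependent)."""
    best = routes[0]
    for r in routes[1:]:
        best = prefer(best, r)
    return best


def grouped_best(routes):
    """Pick the lowest MED per neighbor AS first, then compare winners."""
    winners = {}
    for r in routes:
        cur = winners.get(r.neighbor_as)
        if cur is None or (r.med, r.igp_cost, r.name) < (cur.med, cur.igp_cost, cur.name):
            winners[r.neighbor_as] = r
    return min(winners.values(), key=lambda r: (r.igp_cost, r.name))


r1 = Route("r1", 100, 20, 10)
r2 = Route("r2", 200, 0, 20)
r3 = Route("r3", 100, 10, 30)

seq = {sequential_best(list(p)).name for p in permutations([r1, r2, r3])}
grp = {grouped_best(list(p)).name for p in permutations([r1, r2, r3])}
print(sorted(seq))  # ['r1', 'r2', 'r3']
print(sorted(grp))  # ['r2']
```

With these values the pairwise preference is not transitive, so the sequential result depends on which route arrived first, while grouping by neighbor AS gives the same answer for every arrival order.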
4. Deployment Considerations 4. Deployment Considerations
It has been discussed that accepting MEDs from other autonomous It has been discussed that accepting MEDs from other autonomous
systems have the potential to cause traffic flow churns in the systems has the potential to cause traffic flow churns in the
network. Some implementations only ratchet down the MED and never network. Some implementations only ratchet down the MED and never
move it back up to prevent excessive churn. move it back up to prevent excessive churn.
However, if a session is reset, the MEDs being advertised have the However, if a session is reset, the MEDs being advertised have the
potential of changing. If an network is relying on received MEDs to potential of changing. If a network is relying on received MEDs to
route traffic properly, the traffic patterns have the potential for route traffic properly, the traffic patterns have the potential for
changing dramatically, potentially resulting in congestion on the changing dramatically, potentially resulting in congestion on the
network. Essentially, accepting and routing traffic based on MEDs network. Essentially, accepting and routing traffic based on MEDs
allows other people to traffic engineer your network. This may or may allows other people to traffic engineer your network. This may or
not be acceptable to you. may not be acceptable to you.
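The ratchet behavior and the effect of a session reset can be sketched as follows. The text does not state where the ratchet is applied or how it is implemented, so the class `RatchetedMed` and its methods are purely hypothetical; the sketch only tracks a single MED value.

```python
class RatchetedMed:
    """Track a MED that may move down but never back up (assumed model)."""

    def __init__(self):
        self.value = None

    def update(self, received):
        # Accept the new value only if it is lower than the current one.
        if self.value is None or received < self.value:
            self.value = received
        return self.value

    def reset(self):
        # A session reset discards the ratcheted state.
        self.value = None


med = RatchetedMed()
before_reset = [med.update(v) for v in (50, 30, 80, 40)]
med.reset()
after_reset = med.update(80)
print(before_reset, after_reset)  # [50, 30, 30, 30] 80
```

In this sketch the value stays at 30 while the session is up, but the first value seen after the reset (80) is used as is, which is the kind of change that can shift traffic patterns.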
As previously discussed, many network operators choose to reset MED As previously discussed, many network operators choose to reset MED
values on ingress. In addition, many operators explicitly do not values on ingress. In addition, many operators explicitly do not
employ MED values of 0 or 2^32-1 in order to avoid inconsistencies employ MED values of 0 or 2^32-1 in order to avoid inconsistencies
with implementations and various revisions of the BGP specification. with implementations and various revisions of the BGP specification.
4.1. Comparing MEDs Between Different Autonomous Systems 4.1. Comparing MEDs between Different Autonomous Systems
Although the MED was meant to only be used when comparing paths Although the MED was meant to be used only when comparing paths
received from different external peers in the same AS, many received from different external peers in the same AS, many
implementations provide the capability to compare MEDs between implementations provide the capability to compare MEDs between
different autonomous systems as well. AS operators often use different autonomous systems as well. AS operators often use
LOCAL_PREF to select the external preferences (primary, secondary LOCAL_PREF to select the external preferences (primary, secondary
upstreams, peers, customers, etc.), using MED instead of LOCAL_PREF upstreams, peers, customers, etc.), using MED instead of LOCAL_PREF
would possibility lead to an inconsistent distribution of best routes would possibly lead to an inconsistent distribution of best routes,
as MED is compared only after the AS_PATH length. as MED is compared only after the AS_PATH length.
Though this may seem a fine idea for some configurations, care must Though this may seem like a fine idea for some configurations, care
be taken when comparing MEDs between different autonomous systems. must be taken when comparing MEDs between different autonomous
BGP speakers often derive MED values by obtaining the IGP metric systems. BGP speakers often derive MED values by obtaining the IGP
associated with reaching a given BGP NEXT_HOP within the local AS. metric associated with reaching a given BGP NEXT_HOP within the local
This allows MEDs to reasonably reflect IGP topologies when AS. This allows MEDs to reasonably reflect IGP topologies when
advertising routes to peers. While this is fine when comparing MEDs advertising routes to peers. While this is fine when comparing MEDs
between multiple paths learned from a single AS, it can result in between multiple paths learned from a single AS, it can result in
potentially "weighted" decisions when comparing MEDs between potentially "weighted" decisions when comparing MEDs between
different autonomous systems. This is most typically the case when different autonomous systems. This is most typically the case when
the autonomous systems use different mechanisms to derive IGP the autonomous systems use different mechanisms to derive IGP metrics
metrics, BGP MEDs, or perhaps even use different IGP protocols with or BGP MEDs, or when they perhaps even use different IGP protocols
vastly contrasting metric spaces (e.g., OSPF v. traditional metric with vastly contrasting metric spaces (e.g., OSPF vs. traditional
space in IS-IS). metric space in IS-IS).
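A small numeric sketch shows how different metric spaces can "weight" a comparison of MEDs between autonomous systems. The per-link values are assumptions chosen for illustration: traditional IS-IS link metrics are limited to 6 bits (at most 63 per link), whereas OSPF interface costs may range up to 65535, so the two ASes below describe the same three-hop distance with very different numbers.

```python
# Hypothetical: both exits are three internal hops from the destination.
med_from_as_a = 3 * 10    # AS A: IS-IS, assumed metric of 10 per link
med_from_as_b = 3 * 100   # AS B: OSPF, assumed cost of 100 per link


def pick_lowest_med(meds):
    """Return the AS whose advertised MED is numerically lowest."""
    return min(meds, key=meds.get)


choice = pick_lowest_med({"AS A": med_from_as_a, "AS B": med_from_as_b})
print(choice)  # AS A
```

Here AS A is preferred only because its IGP uses smaller numbers, not because its exit is topologically closer.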
4.2. Effects of Aggregation on MEDs` 4.2. Effects of Aggregation on MEDs
Another MED deployment consideration involves the impact that Another MED deployment consideration involves the impact that
aggregation of BGP routing information has on MEDs. Aggregates are aggregation of BGP routing information has on MEDs. Aggregates are
often generated from multiple locations in an AS in order to often generated from multiple locations in an AS in order to
accommodate stability, redundancy and other network design goals. accommodate stability, redundancy, and other network design goals.
When MEDs are derived from IGP metrics associated with said When MEDs are derived from IGP metrics associated with said
aggregates the MED value advertised to peers can result in very aggregates, the MED value advertised to peers can result in very
suboptimal routing. suboptimal routing.
5. IANA Considerations 5. Security Considerations
This document introduces no new IANA considerations.
6. Security Considerations
The MED was purposely designed to be a "weak" metric that would only The MED was purposely designed to be a "weak" metric that would only
be used late in the best-path decision process. The BGP working be used late in the best-path decision process. The BGP working
group was concerned that any metric specified by a remote operator group was concerned that any metric specified by a remote operator
would only affect routing in a local AS IF no other preference was would only affect routing in a local AS if no other preference was
specified. A paramount goal of the design of the MED was to ensure specified. A paramount goal of the design of the MED was to ensure
that peers could not "shed" or "absorb" traffic for networks that that peers could not "shed" or "absorb" traffic for networks that
they advertise. As such, accepting MEDs from peers may in some sense they advertise. As such, accepting MEDs from peers may in some sense
increase a network's susceptibility to exploitation by peers. increase a network's susceptibility to exploitation by peers.
7. Acknowledgments 6. Acknowledgements
Thanks to John Scudder for applying his usual keen eye and Thanks to John Scudder for applying his usual keen eye and
constructive insight. Also, thanks to Curtis Villamizar, JR Mitchell constructive insight. Also, thanks to Curtis Villamizar, JR
and Pekka Savola for their valuable feedback. Mitchell, and Pekka Savola for their valuable feedback.
8. References
8.1. Normative References 7. References
[RFC 1519] Fuller, V., Li. T., Yu J., and K. Varadhan, "Classless 7.1. Normative References
Inter-Domain Routing (CIDR): an Address Assignment and
Aggregation Strategy", RFC 1519, September 1993.
[RFC 1771] Rekhter, Y., and T. Li, "A Border Gateway Protocol 4 [RFC1771] Rekhter, Y. and T. Li, "A Border Gateway Protocol 4 (BGP-
(BGP-4)", RFC 1771, March 1995. 4)", RFC 1771, March 1995.
[RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC 2796] Bates, T., Chandra, R., Chen, E., "BGP Route Reflection [RFC2796] Bates, T., Chandra, R., and E. Chen, "BGP Route Reflection
- An Alternative to Full Mesh IBGP", RFC 2796, April - An Alternative to Full Mesh IBGP", RFC 2796, April 2000.
2000.
[RFC 3065] Traina, P., McPherson, D., Scudder, J.. "Autonomous System [RFC3065] Traina, P., McPherson, D., and J. Scudder, "Autonomous
Confederations for BGP", RFC 3065, February 2001. System Confederations for BGP", RFC 3065, February 2001.
[BGP4] Rekhter, Y., T. Li., and Hares. S, Editors, "A Border [BGP4] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
Gateway Protocol 4 (BGP-4)", BGP Draft, Work in Progress. Protocol 4 (BGP-4)", RFC 4271, January 2006.
8.2. Informative References 7.2. Informative References
[RFC 2439] Villamizar, C. and Chandra, R., "BGP Route Flap Damping", [RFC2439] Villamizar, C., Chandra, R., and R. Govindan, "BGP Route
RFC 2439, November 1998. Flap Damping", RFC 2439, November 1998.
[RFC 3345] McPherson, D., Gill, V., Walton, D., and Retana, A, "BGP [RFC3345] McPherson, D., Gill, V., Walton, D., and A. Retana,
Persistent Route Oscillation Condition", RFC 3345, "Border Gateway Protocol (BGP) Persistent Route
August 2002. Oscillation Condition", RFC 3345, August 2002.
9. Authors' Addresses Authors' Addresses
Danny McPherson Danny McPherson
Arbor Networks Arbor Networks
Email: danny@arbor.net
EMail: danny@arbor.net
Vijay Gill Vijay Gill
AOL AOL
Email: VijayGill9@aol.com
Intellectual Property Statement EMail: VijayGill9@aol.com
Full Copyright Statement
Copyright (C) The Internet Society (2006).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79. found in BCP 78 and BCP 79.
skipping to change at page 17, line 7 skipping to change at page 13, line 45
such proprietary rights by implementers or users of this such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr. http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at this standard. Please address the information to the IETF at
ietf-ipr@ietf.org. ietf-ipr@ietf.org.
Disclaimer of Validity Acknowledgement
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Copyright Statement
Copyright (C) The Internet Society (2005). This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.
Acknowledgment
Funding for the RFC Editor function is currently provided by the Funding for the RFC Editor function is provided by the IETF
Internet Society. Administrative Support Activity (IASA).
 End of changes. 84 change blocks. 
268 lines changed or deleted 248 lines changed or added
