draft-ietf-idr-bgp-analysis-03.txt   draft-ietf-idr-bgp-analysis-04.txt 
INTERNET-DRAFT D. Meyer
INTERNET-DRAFT David Meyer draft-ietf-idr-bgp-analysis-04.txt K. Patel
draft-ietf-idr-bgp-analysis-03.txt Keyur Patel
Category Informational Category Informational
Expires: November 2003 May 2003 Expires: March 2004 September 2003
BGP-4 Protocol Analysis BGP-4 Protocol Analysis
<draft-ietf-idr-bgp-analysis-03.txt> <draft-ietf-idr-bgp-analysis-04.txt>
Status of this Document Status of this Document
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Drafts.
skipping to change at page 1, line 32 skipping to change at page 1, line 31
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This document is a product of an individual. Comments are solicited This document is a product of the IDR Working Group. Comments should
and should be addressed to the author(s). be addressed to the authors, or the mailing list at idr@ietf.org.
Copyright Notice Copyright Notice
Copyright (C) The Internet Society (2003). All Rights Reserved. Copyright (C) The Internet Society (2003). All Rights Reserved.
Abstract Abstract
The purpose of this report is to document how the requirements for The purpose of this report is to document how the requirements for
advancing a routing protocol from Draft Standard to full Standard advancing a routing protocol from Draft Standard to full Standard
have been satisfied by Border Gateway Protocol version 4 (BGP-4). have been satisfied by Border Gateway Protocol version 4 (BGP-4).
skipping to change at page 3, line 12 skipping to change at page 3, line 12
summarizes the key features of BGP protocol, and analyzes the summarizes the key features of BGP protocol, and analyzes the
protocol with respect to scaling and performance. protocol with respect to scaling and performance.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
2. Key Features and algorithms of the BGP protocol. . . . . . . . 4 2. Key Features and algorithms of the BGP protocol. . . . . . . . 4
2.1. Key Features. . . . . . . . . . . . . . . . . . . . . . . . 4 2.1. Key Features. . . . . . . . . . . . . . . . . . . . . . . . 4
2.2. BGP Algorithms. . . . . . . . . . . . . . . . . . . . . . . 5 2.2. BGP Algorithms. . . . . . . . . . . . . . . . . . . . . . . 5
2.3. BGP Finite State Machine (FSM). . . . . . . . . . . . . . . 6 2.3. BGP Finite State Machine (FSM). . . . . . . . . . . . . . . 6
3. BGP Capabilities . . . . . . . . . . . . . . . . . . . . . . . 7 3. BGP Capabilities . . . . . . . . . . . . . . . . . . . . . . . 8
4. BGP Persistent Peer Oscillations . . . . . . . . . . . . . . . 8 4. BGP Persistent Peer Oscillations . . . . . . . . . . . . . . . 8
5. BGP Performance characteristics and Scalability. . . . . . . . 8 5. Implementation Guidelines. . . . . . . . . . . . . . . . . . . 8
5.1. Link bandwidth and CPU utilization. . . . . . . . . . . . . 8 6. BGP Performance characteristics and Scalability. . . . . . . . 9
5.1.1. CPU utilization. . . . . . . . . . . . . . . . . . . . . 9 6.1. Link bandwidth and CPU utilization. . . . . . . . . . . . . 9
5.1.2. Memory requirements. . . . . . . . . . . . . . . . . . . 11 6.1.1. CPU utilization. . . . . . . . . . . . . . . . . . . . . 10
6. BGP Policy Expressiveness and its Implications . . . . . . . . 12 6.1.2. Memory requirements. . . . . . . . . . . . . . . . . . . 11
6.1. Existence of Unique Stable Routings . . . . . . . . . . . . 13 7. BGP Policy Expressiveness and its Implications . . . . . . . . 12
6.2. Existence of Stable Routings. . . . . . . . . . . . . . . . 14 7.1. Existence of Unique Stable Routings . . . . . . . . . . . . 13
7. Applicability. . . . . . . . . . . . . . . . . . . . . . . . . 15 7.2. Existence of Stable Routings. . . . . . . . . . . . . . . . 15
8. Intellectual Property. . . . . . . . . . . . . . . . . . . . . 16 8. Applicability. . . . . . . . . . . . . . . . . . . . . . . . . 16
9. Acknowledgments. . . . . . . . . . . . . . . . . . . . . . . . 16 9. Intellectual Property. . . . . . . . . . . . . . . . . . . . . 17
10. Security Considerations . . . . . . . . . . . . . . . . . . . 17 10. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 17
11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17 11. Security Considerations . . . . . . . . . . . . . . . . . . . 18
12. References. . . . . . . . . . . . . . . . . . . . . . . . . . 17 12. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18
12.1. Informative References . . . . . . . . . . . . . . . . . . 17 13. References. . . . . . . . . . . . . . . . . . . . . . . . . . 18
13. Author's Addresses. . . . . . . . . . . . . . . . . . . . . . 18 13.1. Informative References . . . . . . . . . . . . . . . . . . 18
14. Full Copyright Statement. . . . . . . . . . . . . . . . . . . 19 14. Author's Addresses. . . . . . . . . . . . . . . . . . . . . . 19
15. Full Copyright Statement. . . . . . . . . . . . . . . . . . . 19
1. Introduction 1. Introduction
BGP-4 is an inter-autonomous system routing protocol designed for BGP-4 is an inter-autonomous system routing protocol designed for
TCP/IP internets. Version 1 of the BGP protocol was published in RFC TCP/IP internets. Version 1 of the BGP protocol was published in RFC
1105 [RFC1105]. Since then BGP versions 2, 3, and 4 have been 1105 [RFC1105]. Since then BGP versions 2, 3, and 4 have been
developed. Version 2 was documented in RFC 1163 [RFC1163]. Version 3 developed. Version 2 was documented in RFC 1163 [RFC1163]. Version
is documented in RFC 1267 [RFC1267]. Version 4 is documented in the 3 is documented in RFC 1267 [RFC1267]. Version 4 is documented in
[BGP4] (version 4 of BGP will hereafter be referred to as BGP). The the [BGP4] (version 4 of BGP will hereafter be referred to as BGP).
changes between versions are explained in Appendix A of [BGP4]. The changes between versions are explained in Appendix A of [BGP4].
Possible applications of BGP in the Internet are documented in RFC Possible applications of BGP in the Internet are documented in RFC
1772 [RFC1772]. 1772 [RFC1772].
BGP introduced support for Classless InterDomain Routing [CIDR].
Since earlier versions of BGP lacked the support for CIDR, they are
considered obsolete and unusable in today's Internet.
2. Key Features and algorithms of the BGP protocol 2. Key Features and algorithms of the BGP protocol
This section summarizes the key features and algorithms of the BGP This section summarizes the key features and algorithms of the BGP
protocol. BGP is an inter-autonomous system routing protocol; it is protocol. BGP is an inter-autonomous system routing protocol; it is
designed to be used between multiple autonomous systems. BGP assumes designed to be used between multiple autonomous systems. BGP assumes
that routing within an autonomous system is done by an intra- that routing within an autonomous system is done by an intra-
autonomous system routing protocol. BGP does not make any assumptions autonomous system routing protocol. BGP also assumes that data
about intra-autonomous system routing protocols deployed within the packets are routed from source towards destination independent of the
various autonomous systems. Specifically, BGP does not require all source. BGP does not make any assumptions about intra-autonomous
autonomous systems to run the same intra-autonomous system routing system routing protocols deployed within the various autonomous
protocol (i.e., interior gateway protocol or IGP). systems. Specifically, BGP does not require all autonomous systems
to run the same intra-autonomous system routing protocol (i.e.,
interior gateway protocol or IGP).
Finally, note that BGP is a real inter-autonomous system routing Finally, note that BGP is a real inter-autonomous system routing
protocol, and as such it imposes no constraints on the underlying protocol, and as such it imposes no constraints on the underlying
Internet topology. The information exchanged via BGP is sufficient to Internet topology. The information exchanged via BGP is sufficient
construct a graph of autonomous systems connectivity from which to construct a graph of autonomous systems connectivity from which
routing loops may be pruned and many routing policy decisions at the routing loops may be pruned and many routing policy decisions at the
autonomous system level may be enforced. autonomous system level may be enforced.
2.1. Key Features 2.1. Key Features
The key features of the protocol are the notion of path attributes The key features of the protocol are the notion of path attributes
and aggregation of network layer reachability information (NLRI). and aggregation of network layer reachability information (NLRI).
Path attributes provide BGP with flexibility and extensibility. Path Path attributes provide BGP with flexibility and extensibility. Path
attributes are partitioned into well-known and optional. The attributes are partitioned into well-known and optional. The
provision for optional attributes allows experimentation that may provision for optional attributes allows experimentation that may
involve a group of BGP routers without affecting the rest of the involve a group of BGP routers without affecting the rest of the
Internet. New optional attributes can be added to the protocol in Internet. New optional attributes can be added to the protocol in
much the same way that new options are added to, say, the Telnet much the same way that new options are added to, say, the Telnet
protocol [RFC854]. protocol [RFC854].
One of the most important path attributes is the Autonomous System One of the most important path attributes is the Autonomous System
Path, or AS_PATH. AS reachability information traverses the Internet, Path, or AS_PATH. AS reachability information traverses the
this information is augmented by the list of autonomous systems that Internet, this information is augmented by the list of autonomous
have been traversed thus far, forming the AS_PATH. The AS_PATH allows systems that have been traversed thus far, forming the AS_PATH. The
straightforward suppression of the looping of routing information. In AS_PATH allows straightforward suppression of the looping of routing
addition, the AS_PATH serves as a powerful and versatile mechanism information. In addition, the AS_PATH serves as a powerful and
for policy-based routing. versatile mechanism for policy-based routing.
BGP enhances the AS_PATH attribute to include sets of autonomous BGP enhances the AS_PATH attribute to include sets of autonomous
systems as well as lists via the AS_SET attribute. This extended systems as well as lists via the AS_SET attribute. This extended
format allows generated aggregate routes to carry path information format allows generated aggregate routes to carry path information
from the more specific routes used to generate the aggregate. It from the more specific routes used to generate the aggregate. It
should be noted however, that as of this writing, AS_SETs are rarely should be noted however, that as of this writing, AS_SETs are rarely
used in the Internet [ROUTEVIEWS]. used in the Internet [ROUTEVIEWS].
2.2. BGP Algorithms 2.2. BGP Algorithms
BGP uses an algorithm that is neither a pure distance vector BGP uses an algorithm that is neither a pure distance vector
algorithm or a pure link state algorithm. It is instead a modified algorithm or a pure link state algorithm. It is instead a modified
distance vector algorithm that uses path information to avoid distance vector algorithm referred to as a "Path Vector" algorithm
traditional distance vector problems. Each route within BGP pairs that uses path information to avoid traditional distance vector
destination with path information to that destination. Path problems. Each route within BGP pairs destination with path
information (also known as AS_PATH information) is stored within the information to that destination. Path information (also known as
AS_PATH attribute in BGP. This allows BGP to reconstruct large AS_PATH information) is stored within the AS_PATH attribute in BGP.
portions of overall topology whenever required. This allows BGP to reconstruct large portions of overall topology
whenever required.
BGP uses an incremental update strategy in order to conserve BGP uses an incremental update strategy in order to conserve
bandwidth and processing power. That is, after initial exchange of bandwidth and processing power. That is, after initial exchange of
complete routing information, a pair of BGP routers exchanges only complete routing information, a pair of BGP routers exchanges only
changes to that information. Such an incremental update design changes to that information. Such an incremental update design
requires reliable transport between a pair of BGP routers to function requires reliable transport between a pair of BGP routers to function
correctly. BGP solves this problem by using TCP for reliable correctly. BGP solves this problem by using TCP for reliable
transport. transport.
In addition to incremental updates, BGP has added the concept of In addition to incremental updates, BGP has added the concept of
skipping to change at page 7, line 41 skipping to change at page 8, line 8
Automatic stop: Local system automatically stops the Automatic stop: Local system automatically stops the
BGP connection. BGP connection.
Both, Manual Start and Manual Stop are mandatory BGP events. All Both, Manual Start and Manual Stop are mandatory BGP events. All
other events are optional. other events are optional.
3. BGP Capabilities 3. BGP Capabilities
The BGP Capability mechanism [RFC2842] provides an easy and flexible The BGP Capability mechanism [RFC2842] provides an easy and flexible
way to introduce new features within the protocol. In particular, the way to introduce new features within the protocol. In particular,
BGP capability mechanism allows peers to negotiate various optional the BGP capability mechanism allows peers to negotiate various
features during startup. This allows the base BGP protocol to contain optional features during startup. This allows the base BGP protocol
only essential functionality, while at the same time providing a to contain only essential functionality, while at the same time
flexible mechanism for signaling protocol extensions. providing a flexible mechanism for signaling protocol extensions.
4. BGP Persistent Peer Oscillations 4. BGP Persistent Peer Oscillations
Ideally, whenever a BGP speaker detects an error in any peer Ideally, whenever a BGP speaker detects an error in any peer
connection, it shuts down the peer and changes its FSM state to IDLE. connection, it shuts down the peer and changes its FSM state to IDLE.
BGP speaker requires a Start event to re-initiate its idle peer BGP speaker requires a Start event to re-initiate its idle peer
connection. If the error remains persistent and BGP speaker generates connection. If the error remains persistent and BGP speaker
Start event automatically then it may result in persistent peer generates Start event automatically then it may result in persistent
flapping. However, although peer oscillation is found to be wide- peer flapping. However, although peer oscillation is found to be
spread in BGP implementations, methods for preventing persistent peer wide-spread in BGP implementations, methods for preventing persistent
oscillations are outside the scope of base BGP protocol peer oscillations are outside the scope of base BGP protocol
specification. specification.
5. BGP Performance characteristics and Scalability 5. Implementation Guidelines
A robust BGP implementation is work conserving. This means that if
the number of prefixes is bound, arbitrarily high levels of route
change can be tolerated with bounded impact on route convergence for
occasionally changes in generally stable routes.
A BGP implementation under high load conditions should empty as much
inbound routing updates from its input streams, processing only the
most recent route if the route for a given NLRI changes multiple
times. TCP also provides blocking on the writes on the sender side.
A BGP implementation under load should expect blocks on write calls
and send only the most recent routes when sockets unblock rather than
sending entire history.
A robust implementation of BGP should have the following
characteristics:
1. It is able to operate in almost arbitrarily high levels
of route flap without loosing peerings (failing to send
keepalives) or loosing other protocol adjacencies as a
result of BGP load.
2. Instability of a subset of routes should not affect the
route advertisements or forwarding associated with the set
of stable routes.
3. High levels of instability and peers of different CPU speed
or load resulting in faster or slower processing of routes
should not cause instability and should have a bounded
impact on the convergence time for generally stable routes.
Numerous robust BGP implementations exist. Producing a robust
implementation is not a trivial matter but clearly achievable.
6. BGP Performance characteristics and Scalability
In this section, we provide "order of magnitude" answers to the In this section, we provide "order of magnitude" answers to the
questions of how much link bandwidth, router memory and router CPU questions of how much link bandwidth, router memory and router CPU
cycles the BGP protocol will consume under normal conditions. In cycles the BGP protocol will consume under normal conditions. In
particular, we will address the scalability of BGP and its particular, we will address the scalability of BGP and its
limitations. limitations.
It is important to note that BGP does not require all the routers 6.1. Link bandwidth and CPU utilization
within an autonomous system to participate in the BGP protocol. In
particular, only the border routers that provide connectivity between
the local autonomous system and their adjacent autonomous systems
need participate in BGP. The ability to constrain the set of BGP
speakers is one way to address scaling issues.
5.1. Link bandwidth and CPU utilization
Immediately after the initial BGP connection setup, BGP peers Immediately after the initial BGP connection setup, BGP peers
exchange complete set of routing information. If we denote the total exchange complete set of routing information. If we denote the total
number of routes in the Internet by N, the mean AS distance of the number of routes in the Internet by N, the mean AS distance of the
Internet by M (distance at the level of an autonomous system, Internet by M (distance at the level of an autonomous system,
expressed in terms of the number of autonomous systems), the total expressed in terms of the number of autonomous systems), the total
number of autonomous systems in the Internet by A, and assume that number of unique AS paths by A, and assume that the networks are
the networks are uniformly distributed among the autonomous systems, uniformly distributed among the autonomous systems, then the worst
then the worst case amount of bandwidth consumed during the initial case amount of bandwidth consumed during the initial exchange between
exchange between a pair of BGP speakers is a pair of BGP speakers is
BW = O(N + (M * A))
MR = O(N + M * A)
The following table illustrates the typical amount of bandwidth The following table illustrates the typical amount of bandwidth
consumed during the initial exchange between a pair of BGP speakers consumed during the initial exchange between a pair of BGP speakers
based on the above assumptions (ignoring bandwidth consumed by the based on the above assumptions (ignoring bandwidth consumed by the
BGP Header). For purposes of the estimates here, we will calculate MR BGP Header). For purposes of the estimates here, we will calculate
= 4 * (N + (M * A)). BW = 4 * (N + (M * A)).
# NLRI Mean AS Distance # AS's Bandwidth (MR) # NLRI Mean AS Distance # AS's Bandwidth (MR)
---------- ---------------- ------ ---------------- ---------- ---------------- ------ ----------------
40,000 15 400 184,000 bytes 40,000 15 400 184,000 bytes
100,000 10 10,000 800,000 bytes 100,000 10 10,000 800,000 bytes
120,000 10 15,000 1,080,000 bytes 120,000 10 15,000 1,080,000 bytes
140,000 15 20,000 1,760,000 bytes 140,000 15 20,000 1,760,000 bytes
[note that most of this bandwidth is consumed by the NLRI exchange] [note that most of this bandwidth is consumed by the NLRI exchange]
BGP was created specifically to reduce the size of the set of NLRI BGP was created specifically to reduce the size of the set of NLRI
entries which have to be carried and exchanged by border routers. The entries which have to be carried and exchanged by border routers.
aggregation scheme, defined in RFC 1519 [RFC1519], describes the The aggregation scheme, defined in RFC 1519 [RFC1519], describes the
provider-based aggregation scheme in use in today's Internet. provider-based aggregation scheme in use in today's Internet.
Due to the advantages of advertising a few large aggregate blocks Due to the advantages of advertising a few large aggregate blocks
instead of many smaller class-based individual networks, it is instead of many smaller class-based individual networks, it is
difficult to estimate the actual reduction in bandwidth and difficult to estimate the actual reduction in bandwidth and
processing that BGP has provided over BGP-3. If we simply enumerate processing that BGP has provided over BGP-3. If we simply enumerate
all aggregate blocks into their individual class-based networks, we all aggregate blocks into their individual class-based networks, we
would not take into account "dead" space that has been reserved for would not take into account "dead" space that has been reserved for
future expansion. The best metric for determining the success of future expansion. The best metric for determining the success of
BGP's aggregation is to sample the number NLRI entries in the BGP's aggregation is to sample the number NLRI entries in the
globally connected Internet today and compare it to projected growth globally connected Internet today and compare it to projected growth
rates before BGP was deployed. rates before BGP was deployed.
At the time of this writing, the full set of exterior routes carried At the time of this writing, the full set of exterior routes carried
by BGP is approximately 120,000 network entries [ROUTEVIEWS]. by BGP is approximately 120,000 network entries [ROUTEVIEWS].
5.1.1. CPU utilization 6.1.1. CPU utilization
An important and fundamental feature of BGP is that BGP's CPU An important and fundamental feature of BGP is that BGP's CPU
utilization depends only on the stability of the Internet. If the utilization depends only on the stability of the Internet. If the
Internet is stable, then the only link bandwidth and router CPU Internet is stable, then the only link bandwidth and router CPU
cycles consumed by BGP are due to the exchange of the BGP KEEPALIVE cycles consumed by BGP are due to the exchange of the BGP KEEPALIVE
messages. The KEEPALIVE messages are exchanged only between peers. messages. The KEEPALIVE messages are exchanged only between peers.
The suggested frequency of the exchange is 30 seconds. The KEEPALIVE The suggested frequency of the exchange is 30 seconds. The KEEPALIVE
messages are quite short (19 octets), and require virtually no messages are quite short (19 octets), and require virtually no
processing. As a result, the bandwidth consumed by the KEEPALIVE processing. As a result, the bandwidth consumed by the KEEPALIVE
messages is about 5 bits/sec. Operational experience confirms that messages is about 5 bits/sec. Operational experience confirms that
the overhead (in terms of bandwidth and CPU) associated with the the overhead (in terms of bandwidth and CPU) associated with the
KEEPALIVE messages should be viewed as negligible. KEEPALIVE messages should be viewed as negligible.
During periods of Internet instability, changes to the reachability During periods of Internet instability, changes to the reachability
information are passed between routers in UPDATE messages. If we information are passed between routers in UPDATE messages. The
denote the number of routing changes per second by C, then in the greatest overhead per UPDATE message occurs when each UPDATE message
worst case the amount of bandwidth consumed by the BGP can be contains only a single network. It should be pointed out that in
expressed as O(C * M). The greatest overhead per UPDATE message practice routing changes exhibit strong locality with respect to the
occurs when each UPDATE message contains only a single network. It AS path. That is, routes that change are likely to have common AS
should be pointed out that in practice routing changes exhibit strong path. In this case, multiple networks can be grouped into a single
locality with respect to the AS path. That is, routes that change are UPDATE message, thus significantly reducing the amount of bandwidth
likely to have common AS path. In this case, multiple networks can be required (see also Appendix F.1 of [BGP4]).
grouped into a single UPDATE message, thus significantly reducing the
amount of bandwidth required (see also Appendix F.1 of [BGP4]).
Since in the steady state the link bandwidth and router CPU cycles Since in the steady state the link bandwidth and router CPU cycles
consumed by the BGP protocol are dependent only on the stability of consumed by the BGP protocol are dependent only on the stability of
the Internet, it follows that BGP should have no scaling problems in the Internet, it follows that BGP should have no scaling problems in
the areas of link bandwidth and router CPU utilization. This assumes the areas of link bandwidth and router CPU utilization. This assumes
that as the Internet grows, the overall stability of the inter-AS that as the Internet grows, the overall stability of the inter-AS
connectivity of the Internet can be controlled. In particular, while connectivity of the Internet can be controlled. In particular, while
the size of the IPv4 Internet routing table is bounded by O(2^32 * the size of the IPv4 Internet routing table is bounded by O(232 * M),
M), (where M is a slow-moving function describing the AS (where M is a slow-moving function describing the AS
interconnectivity of the network), no such bound can be formulated interconnectivity of the network), no such bound can be formulated
for the dynamic properties (i.e., stability) of BGP. Finally, since for the dynamic properties (i.e., stability) of BGP. Although, the
the dynamic properties of the network cannot be quantitatively dynamic properties of the network cannot be quantitatively bounded,
bounded, stability must be addressed via heuristics such as BGP they can be controlled within BGP. Beyond certain changes in the
Route Flap Damping [RFC2439]. Due to the nature of BGP, such damping network, BGP can start to suppress such changes using BGP Route Flap
should be viewed as a matter local to an autonomous system matter Damping [RFC2439], pacing of its route updates, or BGP would be
(see also Appendix F.2 of [BGP4]). unable to keep up with the changes and force suppression of multiple
changes over very short periods by causing the BGP peer socket to
It may also be instructive to compare bandwidth and CPU requirements block on the sender.
of BGP with the Exterior Gateway Protocol (EGP). While with BGP the
complete information is exchanged only at the connection
establishment time, with EGP the complete information is exchanged
periodically (usually every 3 minutes). Note that both for BGP and
for EGP the amount of information exchanged is roughly on the order
of the number of networks reachable via a peer that sends the
information. Therefore, even if one assumes extreme instabilities of
BGP, its worst case behavior will be the same as the steady state
behavior of its predecessor, EGP.
Operational experience with BGP showed that the incremental update
approach employed by BGP provides qualitative improvement in both
bandwidth and CPU utilization when compared with complete periodic
updates used by EGP (see also presentation by Dennis Ferguson at the
Twentieth IETF, March 11-15, 1991, St. Louis).
5.1.2. Memory requirements 6.1.2. Memory requirements
To quantify the worst case memory requirements for BGP, we denote the To quantify the worst case memory requirements for BGP, we denote the
total number of networks in the Internet by N, the mean AS distance total number of networks in the Internet by N, the mean AS distance
of the Internet by M (distance at the level of an autonomous system, of the Internet by M (distance at the level of an autonomous system,
expressed in terms of the number of autonomous systems), the total expressed in terms of the number of autonomous systems), the total
number of autonomous systems in the Internet by A, and the total number of unique AS paths as A. Then the worst case memory
number of BGP speakers that a system is peering with by K (note that
K will usually be dominated by the total number of the BGP speakers
within a single autonomous system). Then the worst case memory
requirements (MR) can be expressed as requirements (MR) can be expressed as
MR = O((N + M * A) * K) MR = O(N + (M * A))
It is interesting to note that prior to the introduction of BGP in
the NSFNET Backbone, memory requirements on the NSFNET Backbone
routers running EGP were on the order of O(N *K).
Since a mean AS distance M is a slow moving function of the Since a mean AS distance M is a slow moving function of the
interconnectivity ("meshiness") of the Internet, for all practical interconnectivity ("meshiness") of the Internet, for all practical
purposes the worst case router memory requirements are on the order purposes the worst case router memory requirements are on the order
of the total number of networks in the Internet times the number of of the total number of networks in the Internet times the number of
peers the local system is peering with. We expect that the total peers the local system is peering with. We expect that the total
number of networks in the Internet will grow much faster than the number of networks in the Internet will grow much faster than the
average number of peers per router. As a result, BGP's memory scaling average number of peers per router. As a result, BGP's memory
properties are linearly related to the total number of networks in scaling properties are linearly related to the total number of
the Internet. networks in the Internet.
The following table illustrates typical memory requirements of a The following table illustrates typical memory requirements of a
router running BGP. It is assumed that each network is encoded as router running BGP. We denote average number of routes advertised by
four bytes, each AS is encoded as two bytes, and each networks is each peer as N, the total number of unique AS paths as A, the mean AS
reachable via some fraction of all of the peers (# BGP peers/per distance of the Internet as M (distance at the level of an autonomous
net). For purposes of the estimates here, we will calculate MR = system, expressed in terms of the number of autonomous systems),
((N*4) + (M*A)*2) * K. number of bytes required to store a route as R, and number of bytes
required to store one AS in an AS path as P. It is assumed that each
network is encoded as four bytes, each AS is encoded as two bytes,
and each networks is reachable via some fraction of all of the peers
(# BGP peers/per net). For purposes of the estimates here, we will
calculate MR = ((N * R) + (M * A) * P)
# Networks Mean AS Distance # AS's # BGP peers/per net Memory Req (MR) # Networks Mean AS Distance # AS's # BGP peers/per net Memory Req (MR)
---------- ---------------- ------ ------------------- -------------- ---------- ---------------- ------ ------------------- --------------
100,000 20 3,000 20 1,040,000 100,000 20 3,000 20 1,040,000
100,000 20 15,000 20 1,040,000 100,000 20 15,000 20 1,040,000
120,000 10 15,000 100 75,000,000 120,000 10 15,000 100 75,000,000
140,000 15 20,000 100 116,000,000 140,000 15 20,000 100 116,000,000
In analyzing BGP's memory requirements, we focus on the size of the In analyzing BGP's memory requirements, we focus on the size of the
forwarding table (and ignoring implementation details). In forwarding table (and ignoring implementation details). In
particular, we derive upper bounds for the size of the forwarding particular, we derive upper bounds for the size of the forwarding
table. For example, at the time of this writing, the forwarding table. For example, at the time of this writing, the forwarding
tables of a typical backbone router carry on the order of 120,000 tables of a typical backbone router carry on the order of 120,000
entries. Given this number, one might ask whether it would be entries. Given this number, one might ask whether it would be
possible to have a functional router with a table that will have possible to have a functional router with a table that will have
1,000,000 entries. Clearly the answer to this question is independent 1,000,000 entries. Clearly the answer to this question is more
of BGP. Interestingly, in his review of the BGP protocol for the BGP related to how BGP is implemented. A robust BGP implementation with a
review committee in March of 1990, Paul Tsuchiya noted that "BGP does reasonable CPU and memory should not have issues scaling to such
not scale well. This is not really the fault of BGP. It is the fault limits.
of the flat IP address space. Given the flat IP address space, any
routing protocol must carry network numbers in its updates." The
introduction of the provider based aggregation schemes (e.g., RFC
1519 [RFC1519]) have sought to address this issue, to the extent
possible, within the context of current addressing architectures.
6. BGP Policy Expressiveness and its Implications 7. BGP Policy Expressiveness and its Implications
BGP is unique among deployed IP routing protocols in that routing is BGP is unique among deployed IP routing protocols in that routing is
determined using semantically rich routing policies. Although routing determined using semantically rich routing policies. Although
policies are usually the first thing that comes to a network routing policies are usually the first thing that comes to a network
operator's mind concerning BGP, it is important to note that the operator's mind concerning BGP, it is important to note that the
languages and techniques for specifying BGP routing policies are not languages and techniques for specifying BGP routing policies are not
actually a part of the protocol specification (see RFC 2622 [RFC2622] actually a part of the protocol specification (see RFC 2622 [RFC2622]
for an example of such a policy language). In addition, the BGP for an example of such a policy language). In addition, the BGP
specification contains few restrictions, either explicitly or specification contains few restrictions, either explicitly or
implicitly, on routing policy languages. These languages have implicitly, on routing policy languages. These languages have
typically been developed by vendors and have evolved through typically been developed by vendors and have evolved through
interactions with network engineers in an environment lacking vendor- interactions with network engineers in an environment lacking vendor-
independent standards. independent standards.
The complexity of typical BGP configurations, at least in provider The complexity of typical BGP configurations, at least in provider
networks, has been steadily increasing. Router vendors typically networks, has been steadily increasing. Router vendors typically
provided hundreds of special commands for use in the configuration of provided hundreds of special commands for use in the configuration of
BGP, and this command set is continually expanding. For example, BGP BGP, and this command set is continually expanding. For example, BGP
communities [RFC1997] allow policy writers to selectively attach tags communities [RFC1997] allow policy writers to selectively attach tags
to routes and use these to signal policy information to other BGP- to routes and use these to signal policy information to other BGP-
speaking routers. Many providers allow customers, and sometimes speaking routers. Many providers allow customers, and sometimes
peers, to send communities that determine the scope and preference of peers, to send communities that determine the scope and preference of
their routes. These developments have more and more given the task of their routes. These developments have more and more given the task
writing BGP configurations aspects associated with open-ended of writing BGP configurations aspects associated with open-ended
programming. This has allowed network operators to encode complex programming. This has allowed network operators to encode complex
policies in order to address many unforeseen situations, and has policies in order to address many unforeseen situations, and has
opened the door for a great deal of creativity and experimentation in opened the door for a great deal of creativity and experimentation in
routing policies. This policy flexibility is one of the main reasons routing policies. This policy flexibility is one of the main reasons
why BGP is so well suited to the commercial environment of the why BGP is so well suited to the commercial environment of the
current Internet. current Internet.
However, this rich policy expressiveness has come with a cost that is However, this rich policy expressiveness has come with a cost that is
often not recognized. In particular, it is possible to construct often not recognized. In particular, it is possible to construct
locally defined routing policies that can lead to unexpected global locally defined routing policies that can lead to unexpected global
routing anomalies such as (unintended) nondeterminism and to protocol routing anomalies such as (unintended) nondeterminism and to protocol
divergence. If the interacting policies causing such anomalies are divergence. If the interacting policies causing such anomalies are
defined in different autonomous systems, then these problems can be defined in different autonomous systems, then these problems can be
very difficult to debug and correct. In the following sections, we very difficult to debug and correct. In the following sections, we
describe two such cases relating to the existence (or lack thereof) describe two such cases relating to the existence (or lack thereof)
of stable routings. of stable routings.
6.1. Existence of Unique Stable Routings 7.1. Existence of Unique Stable Routings
One can easily construct sets of policies for which BGP can not One can easily construct sets of policies for which BGP can not
guarantee that stable routings are unique. This can be illustrated by guarantee that stable routings are unique. This can be illustrated
the following simple example. Consider the example of four Autonomous by the following simple example. Consider the example of four
Systems, AS1, AS2, AS3, and AS4. AS1 and AS2 are peers, and they Autonomous Systems, AS1, AS2, AS3, and AS4. AS1 and AS2 are peers,
provide transit for AS3 and AS4 respectively, Suppose further that and they provide transit for AS3 and AS4 respectively, Suppose
AS3 provides transit for AS4 (in this case AS3 is a customer of AS1, further that AS3 provides transit for AS4 (in this case AS3 is a
and AS4 is a multihomed customer of both AS3 and AS4). AS4 may want customer of AS1, and AS4 is a multihomed customer of both AS3 and
to use the link to AS3 as a "backup" link, and sends AS3 a community AS2). AS4 may want to use the link to AS3 as a "backup" link, and
value that AS3 has configured to lower the preference of AS4's routes sends AS3 a community value that AS3 has configured to lower the
to a level below that of its upstream provider, AS1. The intended preference of AS4's routes to a level below that of its upstream
"backup routing" to AS4 is illustrated here: provider, AS1. The intended "backup routing" to AS4 is illustrated
here:
AS1 ------> AS2 AS1 ------> AS2
/|\ | /|\ |
| | | |
| | | |
| \|/ | \|/
AS3 ------- AS4 AS3 ------- AS4
That is, the AS3-AS4 link is intended to be used only when the That is, the AS3-AS4 link is intended to be used only when the
AS2-AS4 link is down (for outbound traffic, AS4 simply gives routes AS2-AS4 link is down (for outbound traffic, AS4 simply gives routes
from AS2 a higher local preference). This is a common scenario in from AS2 a higher local preference). This is a common scenario in
today's Internet. But note that this configuration has another stable today's Internet. But note that this configuration has another
solution: stable solution:
AS1 ------- AS2 AS1 ------- AS2
| | | |
| | | |
| | | |
\|/ \|/ \|/ \|/
AS3 ------> AS4 AS3 ------> AS4
In this case, AS3 does not translate the "depref my route" community In this case, AS3 does not translate the "depref my route" community
received from AS4 into a "depref my route" community for AS1, and so received from AS4 into a "depref my route" community for AS1, and so
skipping to change at page 14, line 32 skipping to change at page 15, line 32
bringing down the AS2-AS4 BGP session, and then bringing it back up. bringing down the AS2-AS4 BGP session, and then bringing it back up.
In general, BGP has no way to prefer the "intended" solution over the In general, BGP has no way to prefer the "intended" solution over the
anomalous one, and which is picked will depend on the unpredictable anomalous one, and which is picked will depend on the unpredictable
order of BGP messages. order of BGP messages.
While this example is relatively simple, many operators may fail to While this example is relatively simple, many operators may fail to
recognize that the true source of the problem is that the BGP recognize that the true source of the problem is that the BGP
policies of ASes can interact in unexpected ways, and that these policies of ASes can interact in unexpected ways, and that these
interactions can result in multiple stable routings. One can imagine interactions can result in multiple stable routings. One can imagine
that the interactions could be much more complex in the real that the interactions could be much more complex in the real
Internet. We suspect that such anomalies will only become more common Internet. We suspect that such anomalies will only become more
as BGP continues to evolve with richer policy expressiveness. For common as BGP continues to evolve with richer policy expressiveness.
example, extended communities provide an even more flexible means of For example, extended communities provide an even more flexible means
signaling information within and between autonomous systems than is of signaling information within and between autonomous systems than
possible with RFC 1997 communities. At the same time, applications of is possible with RFC 1997 communities. At the same time,
communities by network operators are evolving to address complex applications of communities by network operators are evolving to
issues of inter-domain traffic engineering. address complex issues of inter-domain traffic engineering.
6.2. Existence of Stable Routings 7.2. Existence of Stable Routings
One can also construct a set of policies for which BGP can not One can also construct a set of policies for which BGP can not
guarantee that a stable routing exists (or worse, that a stable guarantee that a stable routing exists (or worse, that a stable
routing will ever be found). For example, RFC 3345 [RFC3345] routing will ever be found). For example, RFC 3345 [RFC3345]
documents several scenarios that lead to route oscillations documents several scenarios that lead to route oscillations
associated with the use of the Multi-Exit Discriminator or MED, associated with the use of the Multi-Exit Discriminator or MED,
attribute. Route oscillation will happen in BGP when a set of attribute. Route oscillation will happen in BGP when a set of
policies has no solution. That is, when there is no stable routing policies has no solution. That is, when there is no stable routing
that satisfies the constraints imposed by policy, then BGP has no that satisfies the constraints imposed by policy, then BGP has no
choice by to keep trying. In addition, BGP configurations can have a choice by to keep trying. In addition, BGP configurations can have a
stable routing, yet the protocol may not be able to find it; BGP can stable routing, yet the protocol may not be able to find it; BGP can
"get trapped" down a blind alley that has no solution. "get trapped" down a blind alley that has no solution.
Protocol divergence is not, however, a problem associated solely with Protocol divergence is not, however, a problem associated solely with
use of the MED attribute. This potential exists in BGP even without use of the MED attribute. This potential exists in BGP even without
the use of the MED attribute. Hence, like the unintended the use of the MED attribute. Hence, like the unintended
nondeterminism described in the previous section, this type of nondeterminism described in the previous section, this type of
protocol divergence is an unintended consequence of the unconstrained protocol divergence is an unintended consequence of the unconstrained
nature of BGP policy languages. nature of BGP policy languages.
7. Applicability 8. Applicability
In this section we answer the question of which environments is BGP In this section we answer the question of which environments is BGP
well suited, and for which environments it is not suitable. This well suited, and for which environments it is not suitable. This
question is partially answered in Section 2 of RFC 1771 [RFC1771], question is partially answered in Section 2 of BGP [BGP4], which
which states: states:
"To characterize the set of policy decisions that can be enforced "To characterize the set of policy decisions that can be enforced
using BGP, one must focus on the rule that an AS advertises to its using BGP, one must focus on the rule that an AS advertises to its
neighbor ASs only those routes that it itself uses. This rule neighbor ASs only those routes that it itself uses. This rule
reflects the "hop-by-hop" routing paradigm generally used reflects the "hop-by-hop" routing paradigm generally used
throughout the current Internet. Note that some policies cannot throughout the current Internet. Note that some policies cannot
be supported by the "hop-by-hop" routing paradigm and thus require be supported by the "hop-by-hop" routing paradigm and thus require
techniques such as source routing to enforce. For example, BGP techniques such as source routing to enforce. For example, BGP
does not enable one AS to send traffic to a neighbor AS intending does not enable one AS to send traffic to a neighbor AS intending
that the traffic take a different route from that taken by traffic that the traffic take a different route from that taken by traffic
skipping to change at page 16, line 8 skipping to change at page 17, line 8
only the functionality that is essential, while at the same time only the functionality that is essential, while at the same time
providing a flexible mechanism within the protocol that allow us to providing a flexible mechanism within the protocol that allow us to
extend its functionality. For example, BGP capabilities provide an extend its functionality. For example, BGP capabilities provide an
easy and flexible way to introduce new features within the protocol. easy and flexible way to introduce new features within the protocol.
Finally, since BGP was designed with flexibility and extensibility in Finally, since BGP was designed with flexibility and extensibility in
mind, new and/or evolving requirements can be addressed via existing mind, new and/or evolving requirements can be addressed via existing
mechanisms. mechanisms.
To summarize, BGP is well suitable as an inter-autonomous system To summarize, BGP is well suitable as an inter-autonomous system
routing protocol for the IPv4 Internet that is based on IP [RFC791] routing protocol for the IPv4 Internet that is based on IP [RFC791]
as the Internet Protocol and "hop-by-hop" routing paradigm. Finally, as the Internet Protocol and "hop-by-hop" routing paradigm.
BGP is equally applicable to IPv6 [RFC2460] internets.
8. Intellectual Property 9. Intellectual Property
The IETF takes no position regarding the validity or scope of any The IETF takes no position regarding the validity or scope of any
intellectual property or other rights that might be claimed to intellectual property or other rights that might be claimed to
pertain to the implementation or use of the technology described in pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights this document or the extent to which any license under such rights
might or might not be available; neither does it represent that it might or might not be available; neither does it represent that it
has made any effort to identify any such rights. Information on the has made any effort to identify any such rights. Information on the
IETF's procedures with respect to rights in standards-track and IETF's procedures with respect to rights in standards-track and
standards-related documentation can be found in BCP-11. Copies of standards-related documentation can be found in BCP-11. Copies of
claims of rights made available for publication and any assurances of claims of rights made available for publication and any assurances of
skipping to change at page 16, line 33 skipping to change at page 17, line 32
obtain a general license or permission for the use of such obtain a general license or permission for the use of such
proprietary rights by implementors or users of this specification can proprietary rights by implementors or users of this specification can
be obtained from the IETF Secretariat. be obtained from the IETF Secretariat.
The IETF invites any interested party to bring to its attention any The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary copyrights, patents or patent applications, or other proprietary
rights which may cover technology that may be required to practice rights which may cover technology that may be required to practice
this standard. Please address the information to the IETF Executive this standard. Please address the information to the IETF Executive
Director. Director.
9. Acknowledgments 10. Acknowledgments
We would like to thank Paul Traina for authoring previous versions of We would like to thank Paul Traina for authoring previous versions of
this document. Tim Griffin and Randy Presuhn also provided many this document. Tim Griffin, Randy Presuhn, Curtis Villamizar and
insightful comments on earlier versions of this document. Atanu Ghosh also provided many insightful comments on earlier
versions of this document.
10. Security Considerations 11. Security Considerations
This document presents an analysis of the BGP protocol and as such This document presents an analysis of the BGP protocol and as such
presents no new security implications for BGP. presents no new security implications for BGP.
11. IANA Considerations 12. IANA Considerations
This document presents an analysis of the BGP protocol and hence This document presents an analysis of the BGP protocol and hence
presents no new IANA considerations. presents no new IANA considerations.
12. References 13. References
12.1. Informative References 13.1. Informative References
[BGP4] Rekhter, Y., T. Li., and S. Hares, Editors, "A [BGP4] Rekhter, Y., T. Li., and S. Hares, Editors, "A
Border Gateway Protocol 4 (BGP-4)", Border Gateway Protocol 4 (BGP-4)",
draft-ietf-idr-bgp4-20.txt. Work in progress. draft-ietf-idr-bgp4-20.txt. Work in progress.
[RFC791] "INTERNET PROTOCOL", DARPA INTERNET PROGRAM [RFC791] "INTERNET PROTOCOL", DARPA INTERNET PROGRAM
PROTOCOL SPECIFICATION, RFC 791, September, PROTOCOL SPECIFICATION, RFC 791, September,
1981. 1981.
[RFC854] Postel, J. and J. Reynolds, "TELNET PROTOCOL [RFC854] Postel, J. and J. Reynolds, "TELNET PROTOCOL
skipping to change at page 18, line 19 skipping to change at page 19, line 19
of the Border Gateway Protocol in the Internet", of the Border Gateway Protocol in the Internet",
RFC 1772, March 1995. RFC 1772, March 1995.
[RFC1997] Chandra. R, and T. Li, "BGP Communities [RFC1997] Chandra. R, and T. Li, "BGP Communities
Attribute", RFC 1997, August, 1996. Attribute", RFC 1997, August, 1996.
[RFC2439] Villamizar, C., Chandra, R., and R. Govindan, [RFC2439] Villamizar, C., Chandra, R., and R. Govindan,
"BGP Route Flap Damping", RFC 2439, November "BGP Route Flap Damping", RFC 2439, November
1998. 1998.
[RFC2460] Deering, S, and R. Hinden, "Internet Protocol, [RFC2622] Alaettinoglu, C., et. al., "Routing Policy
Version 6 (IPv6) Specification", RFC 2460,
December, 1998.
[RFC2622] C. Alaettinoglu, et al., "Routing Policy
Specification Language (RPSL)" RFC 2622, May, Specification Language (RPSL)" RFC 2622, May,
1998. 1998.
[RFC2842] Chandra, R. and J. Scudder, "Capabilities [RFC2842] Chandra, R. and J. Scudder, "Capabilities
Advertisement with BGP-4", RFC 2842, May 2000. Advertisement with BGP-4", RFC 2842, May 2000.
[RFC3345] McPherson, D., Gill, V., Walton, D., and [RFC3345] McPherson, D., Gill, V., Walton, D., and
A. Retana, "Border Gateway Protocol (BGP) Persistent A. Retana, "Border Gateway Protocol (BGP) Persistent
Route Oscillation Condition", RFC 3345, Route Oscillation Condition", RFC 3345,
August, 2002. August, 2002.
[ROUTEVIEWS] Meyer, D., "The Route Views Project", [ROUTEVIEWS] Meyer, D., "The Route Views Project",
http://www.routeviews.org http://www.routeviews.org
13. Author's Addresses 14. Author's Addresses
David Meyer David Meyer
Email: dmm@maoz.com Email: dmm@1-4-5.net
Keyur Patel Keyur Patel
Cisco Systems Cisco Systems
Email: keyupate@cisco.com Email: keyupate@cisco.com
14. Full Copyright Statement 15. Full Copyright Statement
Copyright (C) The Internet Society (2003). All Rights Reserved. Copyright (C) The Internet Society (2003). All Rights Reserved.
This document and translations of it may be copied and furnished to This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing document itself may not be modified in any way, such as by removing
 End of changes. 

This html diff was produced by rfcdiff 1.23, available from http://www.levkowetz.com/ietf/tools/rfcdiff/