--- 1/draft-ietf-idr-restart-11.txt 2006-06-12 22:12:26.000000000 +0200
+++ 2/draft-ietf-idr-restart-12.txt 2006-06-12 22:12:26.000000000 +0200
@@ -1,20 +1,20 @@
 
 Network Working Group                                  Srihari R. Sangli
 Internet Draft                                             Yakov Rekhter
-Expiration Date: November 2006                             Rex Fernando
+Expiration Date: December 2006                             Rex Fernando
                                                          John G. Scudder
                                                                Enke Chen
 
                   Graceful Restart Mechanism for BGP
 
-                    draft-ietf-idr-restart-11.txt
+                    draft-ietf-idr-restart-12.txt
 
 Status of this Memo
 
    Internet-Drafts are working documents of the Internet Engineering
    Task Force (IETF), its areas, and its working groups. Note that
    other groups may also distribute working documents as Internet-
    Drafts.
 
    Internet-Drafts are draft documents valid for a maximum of six months
    and may be updated, replaced, or obsoleted by other documents at any
@@ -29,21 +29,21 @@
 
 IPR Disclosure Acknowledgement
 
    By submitting this Internet-Draft, each author represents that any
    applicable patent or other IPR claims of which he or she is aware
    have been or will be disclosed, and any of which he or she becomes
    aware will be disclosed, in accordance with Section 6 of BCP 79.
 
 Abstract
 
-   This document proposes a mechanism for BGP that would help minimize
+   This document describes a mechanism for BGP that would help minimize
    the negative effects on routing caused by BGP restart. An End-of-RIB
    marker is specified and can be used to convey routing convergence
    information. A new BGP capability, termed "Graceful Restart
    Capability", is defined which would allow a BGP speaker to express
    its ability to preserve forwarding state during BGP restart. Finally,
    procedures are outlined for temporarily retaining routing information
    across a TCP transport reset.
 
    The mechanisms described in this document are applicable to all
    routers, both those with the ability to preserve forwarding state
@@ -62,21 +62,21 @@
 
    Usually when BGP on a router restarts, all the BGP peers detect that
    the session went down, and then came up.
    This "down/up" transition results in a "routing flap" and causes BGP
    route re-computation, generation of BGP routing updates and flap the
    forwarding tables. It could spread across multiple routing domains.
    Such routing flaps may create transient forwarding blackholes and/or
    transient forwarding loops. They also consume resources on the
    control plane of the routers affected by the flap. As such they are
    detrimental to the overall network performance.
 
-   This document proposes a mechanism for BGP that would help minimize
+   This document describes a mechanism for BGP that would help minimize
    the negative effects on routing caused by BGP restart. An End-of-RIB
    marker is specified and can be used to convey routing convergence
    information. A new BGP capability, termed "Graceful Restart
    Capability", is defined which would allow a BGP speaker to express
    its ability to preserve forwarding state during BGP restart. Finally,
    procedures are outlined for temporarily retaining routing information
    across a TCP transport reset.
 
 3. Marker for End-of-RIB
@@ -194,21 +194,21 @@
       the <AFI, SAFI> has indeed been preserved during the previous BGP
       restart. When set (value 1), the bit indicates that the
       forwarding state has been preserved. The remaining bits are
       reserved, and SHOULD be set to zero by the sender and ignored by
       the receiver.
 
    When a sender of this capability doesn't include any <AFI, SAFI> in
    the capability, it means that the sender is not capable of preserving
    its forwarding state during BGP restart, but supports procedures for
-   the Receiving Speaker (as defined in Section 6.2 of this document).
+   the Receiving Speaker (as defined in Section 5.2 of this document).
    In that case the value of the "Restart Time" field advertised by the
    sender is irrelevant.
 
    A BGP speaker SHOULD NOT include more than one instance of the
    Graceful Restart Capability in the capability advertisement [BGP-
    CAP].
    If more than one instance of the Graceful Restart Capability is
    carried in the capability advertisement, the receiver of the
    advertisement SHOULD ignore all but the last instance of the Graceful
    Restart Capability.
 
@@ -235,40 +235,40 @@
 
    The End-of-RIB marker SHOULD be sent by a BGP speaker to its peer
    once it completes the initial routing update (including the case when
    there is no update to send) for an address family after the BGP
    session is established.
 
    It is noted that the normal BGP procedures MUST be followed when the
    TCP session terminates due to the sending or receiving of a BGP
    NOTIFICATION message.
 
-   In general the Restart Time SHOULD NOT be greater than the HOLDTIME
-   carried in the OPEN.
+   A suggested default for the Restart Time is a value less than or
+   equal to the HOLDTIME carried in the OPEN.
 
    In the following sections, "Restarting Speaker" refers to a router
    whose BGP has restarted, and "Receiving Speaker" refers to a router
    that peers with the restarting speaker.
 
    Consider that the Graceful Restart Capability for an address family
    is advertised by the Restarting Speaker, and is understood by the
    Receiving Speaker, and a BGP session between them is established.
    The following sections detail the procedures that SHALL be followed
    by the Restarting Speaker as well as the Receiving Speaker once the
    Restarting Speaker restarts.
 
 5.1. Procedures for the Restarting Speaker
 
-   When the Restarting Speaker restarts, possible it SHOULD retain, if
-   possible, the forwarding state for the BGP routes in the Loc-RIB, and
-   SHALL mark them as stale. It SHOULD NOT differentiate between stale
-   and other information during forwarding.
+   When the Restarting Speaker restarts, it SHOULD retain, if possible,
+   the forwarding state for the BGP routes in the Loc-RIB, and SHALL
+   mark them as stale. It SHOULD NOT differentiate between stale and
+   other information during forwarding.
 
    To re-establish the session with its peer, the Restarting Speaker
    MUST set the "Restart State" bit in the Graceful Restart Capability of
    the OPEN message. Unless allowed via configuration, the "Forwarding
    State" bit for an address family in the capability can be set only if
    the forwarding state has indeed been preserved for that address
    family during the restart.
 
    Once the session between the Restarting Speaker and the Receiving
    Speaker is re-established, the Restarting Speaker will receive and
@@ -338,20 +338,28 @@
    Graceful Restart Capability of the OPEN message sent by the Receiving
    Speaker SHALL NOT be set unless the Receiving Speaker has restarted.
    The presence and the setting of the "Forwarding State" bit for an
    address family depends upon the actual forwarding state and
    configuration.
 
    If the session does not get re-established within the "Restart Time"
    that the peer advertised previously, the Receiving Speaker SHALL
    delete all the stale routes from the peer that it is retaining.
 
+   A BGP speaker could have some way of determining whether its peer's
+   forwarding state is still viable, for example through [BFD] or
+   through monitoring layer two information. Specifics of such
+   mechanisms are beyond the scope of this document. In the event that
+   it determines that its peer's forwarding state is not viable prior to
+   the re-establishment of the session, the speaker MAY delete all the
+   stale routes from the peer that it is retaining.
+
    Once the session is re-established, if the "Forwarding State" bit for
    a specific address family is not set in the newly received Graceful
    Restart Capability, or if a specific address family is not included
    in the newly received Graceful Restart Capability, or if the Graceful
    Restart Capability isn't received in the re-established session at
    all, then Receiving Speaker SHALL immediately remove all the stale
    routes from the peer that it is retaining for that address family.
 
    The Receiving Speaker SHALL send the End-of-RIB marker once it
    completes the initial update for an address family (including the
@@ -482,27 +490,29 @@
         - drops the TCP connection,
 
         - increments the ConnectRetryCounter by 1,
 
         - changes its state to Idle.
 
 7. Deployment Considerations
 
    While the procedures described in this document would help minimize
    the effect of routing flaps, it is noted, however, that when a BGP
-   Graceful Restart capable router restarts, there is a potential for
-   transient routing loops or blackholes in the network if routing
-   information changes before the involved routers complete routing
-   updates and convergence. Also, depending on the network topology, if
-   not all IBGP speakers are Graceful Restart capable, there could be an
-   increased exposure to transient routing loops or blackholes when the
-   Graceful Restart procedures are exercised.
+   Graceful Restart capable router restarts, or if it restarts without
+   preserving its forwarding state (for example due to a power failure)
+   there is a potential for transient routing loops or blackholes in the
+   network if routing information changes before the involved routers
+   complete routing updates and convergence. Also, depending on the
+   network topology, if not all IBGP speakers are Graceful Restart
+   capable, there could be an increased exposure to transient routing
+   loops or blackholes when the Graceful Restart procedures are
+   exercised.
 
    The Restart Time, the upper bound for retaining routes and the upper
    bound for deferring route selection may need to be tuned as more
    deployment experience is gained.
 
    Finally, it is noted that the benefits of deploying BGP Graceful
    Restart in an AS whose IGPs and BGP are tightly coupled (i.e., BGP and
    IGPs would both restart) and IGPs have no similar Graceful Restart
    capability are reduced relative to the scenario where IGPs do have
    similar Graceful Restart capability.
@@ -588,21 +598,26 @@
    [BGP-AUTH] Heffernan A., "Protection of BGP Sessions via the TCP MD5
    Signature Option", RFC 2385, August 1998.
 
    [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
    Requirement Levels", BCP 14, RFC 2119, March 1997.
 
    [IANA-AFI] http://www.iana.org/assignments/address-family-numbers.
 
    [IANA-SAFI] http://www.iana.org/assignments/safi-namespace.
 
-14. Author Information
+14. Non-normative References
+
+   [BFD] Katz, D., Ward, D., "Bidirectional Forwarding Detection",
+   draft-ietf-bfd-base-03.txt, work in progress
+
+15. Author Information
 
    Srihari R. Sangli
    Cisco Systems, Inc.
    EMail: rsrihari@cisco.com
 
    Yakov Rekhter
    Juniper Networks, Inc.
    EMail: yakov@juniper.net
 
    Rex Fernando