Network Working Group                                    Pierre Francois
Internet-Draft                                  Institute IMDEA Networks
Intended status: Informational                            Bruno Decraene
Expires: June 10, 2012                                    France Telecom
                                                         Cristel Pelsser
                                               Internet Initiative Japan
                                                             Keyur Patel
                                                       Clarence Filsfils
                                                           Cisco Systems
                                                        December 8, 2011

                     Graceful BGP session shutdown
                      draft-ietf-grow-bgp-gshut-03

Abstract

   This draft describes operational procedures aimed at reducing the
   amount of traffic lost during planned maintenance of routers or
   links, involving the shutdown of BGP peering sessions.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 10, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Packet loss upon manual eBGP session shutdown
   4.  Practices to avoid packet losses
     4.1.  Improving availability of alternate paths
     4.2.  Make before break convergence: g-shut
       4.2.1.  eBGP g-shut
       4.2.2.  iBGP g-shut
       4.2.3.  Router g-shut
   5.  Forwarding modes and transient forwarding loops during
       convergence
   6.  Link Up cases
     6.1.  Unreachability local to the ASBR
     6.2.  iBGP convergence
   7.  IANA assigned g-shut BGP community
   8.  Security Considerations
   9.  Acknowledgments
   10. References
   Appendix A.  Alternative techniques with limited applicability
     A.1.  Multi Exit Discriminator tweaking
     A.2.  IGP distance Poisoning
   Authors' Addresses

1.  Introduction

   Routing changes in BGP can be caused by planned, manual, maintenance
   operations.  This document discusses operational procedures to be
   applied in order to reduce or eliminate losses of packets during the
   maintenance.  These losses come from the transient lack of
   reachability during the BGP convergence following the shutdown of an
   eBGP peering session between two Autonomous System Border Routers
   (ASBR).

   This document presents procedures for the cases where the forwarding
   plane is impacted by the maintenance, hence when the use of Graceful
   Restart does not apply.

   The procedures described in this document can be applied to reduce or
   avoid packet loss for outbound and inbound traffic flows initially
   forwarded along the peering link to be shut down.  These procedures
   trigger, in both involved ASes, rerouting to the alternate path,
   while allowing routers to keep using old paths until alternate ones
   are installed in the RIB and in the FIB.  This ensures that routers
   always have a valid route available during the convergence process.

   The goal of the document is to meet the requirements described in
   [REQS] as closely as possible, without changing the BGP protocol.

   Still, it explains why reserving a community value for the purpose of
   BGP session graceful shutdown would reduce the management overhead
   bound with the solution.  It would also allow vendors to provide an
   automatic graceful shutdown mechanism that does not require any
   router reconfiguration at maintenance time.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Terminology

   g-shut initiator: a router on which the session shutdown is performed
   for the maintenance.

   g-shut neighbor: a router that peers with the g-shut initiator via
   (one of) the session(s) to be shut down.

   Note that for the link-up case, we will refer to these nodes as the
   g-no-shut initiator and the g-no-shut neighbor.

   Initiator AS: the Autonomous System of the g-shut initiator.

   Neighbor AS: the Autonomous System of the g-shut neighbor.

   Affected path / Nominal / pre-convergence path: a BGP path via the
   peering link(s) undergoing the maintenance.  This path will no longer
   exist after the shutdown.

   Affected prefix: a prefix initially reached via an affected path.

   Affected router: a router having an affected prefix.

   Backup / alternate / post-convergence path: a path towards an
   affected prefix that will be selected as the best path by an affected
   router, when the link is shut down and the BGP convergence is
   completed.

   Transient alternate path: a path towards an affected prefix that may
   be transiently selected as best by an affected router during the
   convergence process but that is not a post-convergence path.

   Loss of Connectivity (LoC): the state when a router has no path
   towards an affected prefix.

3.  Packet loss upon manual eBGP session shutdown

   Packets can be lost during a manual shutdown of an eBGP session for
   two reasons.

   First, routers involved in the convergence process can transiently
   lack paths towards an affected prefix, and drop traffic destined to
   this prefix.  This is because alternate paths can be hidden by nodes
   of an AS.  This happens when the paths are not selected as best by
   the ASBRs that receive them on an eBGP session, or by Route
   Reflectors that do not propagate them further in the iBGP topology
   because they do not select them as best.

   Second, within the AS, the FIB of routers can be transiently
   inconsistent during the BGP convergence and packets towards affected
   prefixes can loop and be dropped.  Note that these loops only happen
   when ASBR-to-ASBR encapsulation is not used within the AS.

   This document only addresses the first reason.

4.  Practices to avoid packet losses

   This section describes means for an ISP to reduce the transient loss
   of packets upon a manual shutdown of a BGP session.

4.1.  Improving availability of alternate paths

   All solutions that increase the availability of alternate BGP paths
   at routers performing packet lookups in BGP tables, such as
   [BestExternal] and [AddPath], help in reducing the LoC associated
   with the manual shutdown of eBGP sessions.

   A solution that increases path diversity to the point where, at every
   step of the convergence process following the eBGP session shutdown,
   no BGP router receives a message withdrawing the only path it
   currently knows for a given NLRI, allows for a simplified g-shut
   procedure.

   Increasing diversity with [AddPath] may satisfy this property,
   depending on the path propagation decision process that add-path
   compliant routers use.

   Using advertise-best-external [BestExternal] on ASBRs and RRs helps
   in avoiding a lack of alternate paths in route reflectors during
   convergence.  Hence it reduces the LoC duration for the outbound
   traffic of the ISP upon an eBGP session shutdown by reducing the iBGP
   path hunting.

   Note that the LoC for the inbound traffic of the maintained router,
   induced by a lack of alternate path propagation within the iBGP
   topology of a neighboring AS, is not under the control of the
   operator performing the maintenance.  The part of the procedure aimed
   at avoiding LoC for incoming paths can thus be applied even if no LoC
   is expected for the outgoing paths.

4.2.  Make before break convergence: g-shut

   This section describes the configurations and actions needed to
   perform a graceful shutdown procedure for eBGP peering links.

   The goal of this procedure is to keep the paths being shut down
   visible, but with a lower local preference, while alternate paths
   spread through the iBGP topology.  Instead of withdrawing the path,
   routers of an AS will keep on using it until they become aware of
   alternate paths.

4.2.1.  eBGP g-shut

4.2.1.1.  Pre-configuration

   On each ASBR supporting the g-shut procedure, an outbound BGP route
   policy is applied on all iBGP sessions of the ASBR, that:
   o    matches the g-shut community
   o    sets the local-pref of the paths tagged with the g-shut
        community to a low value
   o    removes the g-shut community from the paths.
   o    optionally, adds an AS specific g-shut community on these paths
        to indicate that these are to be withdrawn soon.  If some
        ingress ASBRs reset the local preference attribute, this AS
        specific g-shut community will be used to override other local
        preference changes.

   Note that in the case where an AS is aggregating multiple routes
   under a covering prefix, it is recommended to filter out the g-shut
   community from the resulting aggregate BGP route.  By doing so, the
   setting of the g-shut community on one of the aggregated routes will
   not let the entire aggregate inherit the community.  Not doing so
   would let the entire aggregate undergo the g-shut behavior.

4.2.1.2.  Operations at maintenance time

   On the g-shut initiator, at maintenance time, it is required to:
   o  apply an outbound BGP route policy on the maintained eBGP session
      to tag the paths propagated over the session with the g-shut
      community.  This will trigger the BGP implementation to
      re-advertise all active routes previously advertised, and tag them
      with the g-shut community.
   o  apply an inbound BGP route policy on the maintained eBGP session
      to tag the paths received over the session with the g-shut
      community.
   o  wait for convergence to happen.
   o  perform a BGP session shutdown.
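
   The effect of this ordering on a router inside the AS can be
   illustrated with a small simulation.  It is a sketch under strong
   assumptions: only the local-pref step of the decision process is
   modeled, and the peer names ("initiator", "asbr2"), the event format,
   and LOW_LOCAL_PREF are hypothetical.  The router initially knows only
   the affected path, because the alternate path is hidden at asbr2.

```python
LOW_LOCAL_PREF = 0   # assumption: lower than any value used in normal operation


def best(rib):
    """Return the peer offering the highest local-pref, or None on LoC."""
    if not rib:
        return None
    return max(rib, key=lambda peer: rib[peer])


def run(events):
    """Replay BGP events on a router that initially only knows the affected path."""
    rib = {"initiator": 100}          # peer -> local-pref of the path it sent
    history = [best(rib)]
    for kind, peer, local_pref in events:
        if kind == "update":
            rib[peer] = local_pref
        else:                          # "withdraw"
            rib.pop(peer, None)
        history.append(best(rib))
    return history


# Abrupt shutdown: the affected path is withdrawn before the alternate arrives.
abrupt = run([
    ("withdraw", "initiator", None),
    ("update", "asbr2", 100),
])

# Graceful shutdown: the affected path is first re-advertised with a low
# local-pref, the alternate path then appears, and only then is the
# session shut down.
graceful = run([
    ("update", "initiator", LOW_LOCAL_PREF),
    ("update", "asbr2", 100),
    ("withdraw", "initiator", None),
])
```

   In the abrupt sequence the router passes through a state with no
   path at all (None in the history), which is the LoC.  In the graceful
   sequence it has a usable path after every event.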

4.2.1.3.  BGP implementation support for G-Shut

   A BGP router implementation MAY provide features aimed at automating
   the application of the graceful shutdown procedures described above.

   Upon a session shutdown specified as graceful by the operator, a BGP
   implementation supporting a g-shut feature SHOULD:

   1.   Update all the paths propagated over the corresponding eBGP
        session, tagging them with the GSHUT community.  Any subsequent
        update sent to the session being gracefully shut down would be
        tagged with the GSHUT community.
   2.   Lower the local preference value of the paths received over the
        eBGP session being shut down, upon their propagation over iBGP
        sessions.  Optionally, also tag these paths with an AS specific
        g-shut community.  Note that alternatively, the local preference
        of the paths received over the eBGP session can be lowered on
        the g-shut initiator itself, instead of only when propagating
        over its iBGP sessions.
   3.   Optionally shut down the session after a configured time.
   4.   Prevent the GSHUT community from being inherited by a path that
        would aggregate some paths tagged with the GSHUT community.
        This behavior prevents the GSHUT procedure from being applied to
        the aggregate upon the graceful shutdown of one of its covered
        prefixes.

   A BGP implementation supporting a g-shut feature SHOULD also
   automatically install the BGP policies that are supposed to be
   configured, as described in Section 4.2.1.1, for sessions over which
   g-shut is to be supported.

4.2.2.  iBGP g-shut

   If the iBGP topology is viable after the maintenance of the session,
   i.e., if all BGP speakers of the AS have an iBGP signaling path for
   all prefixes advertised on this g-shut iBGP session, then the
   shutdown of an iBGP session does not lead to transient
   unreachability.

4.2.3.  Router g-shut

   In the case of a shutdown of a router, a reconfiguration of the
   outbound BGP route policies of the g-shut initiator MAY be performed
   to set a low local-pref value for the paths originated by the g-shut
   initiator (e.g., BGP aggregates redistributed from other protocols,
   including static routes).

   This behavior is equivalent to the recommended behavior for paths
   "redistributed" from eBGP sessions to iBGP sessions in the case of
   the shutdown of an ASBR.

5.  Forwarding modes and transient forwarding loops during convergence

   The g-shut procedure and the solutions improving the availability of
   alternate paths do not change the fact that BGP convergence and the
   subsequent FIB updates are run independently on each router of the
   ASes.  If the AS applying the solution does not rely on encapsulation
   to forward packets from the Ingress Border Router to the Egress
   Border Router, then transient forwarding loops and consequent packet
   losses can occur during the convergence process.  If zero LoC is
   required, encapsulation is required between ASBRs of the AS.

6.  Link Up cases

   We identify two potential causes for transient packet losses upon an
   eBGP link up event.  The first one is local to the g-no-shut
   initiator, the second one is due to the BGP convergence following the
   injection of new best paths within the iBGP topology.

6.1.  Unreachability local to the ASBR

   An ASBR that selects as best a path received over a newly brought up
   eBGP session may transiently drop traffic.  This can typically happen
   when the nexthop attribute differs from the IP address of the eBGP
   peer, and the receiving ASBR has not yet resolved the MAC address
   associated with the IP address of that "third party" nexthop.

   A BGP speaker implementation could avoid such losses by ensuring that
   "third party" nexthops are resolved before installing paths using
   these in the RIB.

   If the link up event corresponds to an eBGP session that is being
   manually brought up, over an already up multi-access link, then the
   operator can ping third party nexthops that are expected to be used
   before actually bringing the session up, or ping the directed
   broadcast address of the link's subnet.  By proceeding like this, the
   MAC addresses associated with these third party nexthops will be
   resolved by the g-no-shut initiator.

6.2.  iBGP convergence

   Corner cases leading to LoC can occur during an eBGP link up event.

   A typical example for such transient unreachability for a given
   prefix is the following:

   Let's consider 3 route reflectors RR1, RR2, RR3.  There is a full
   mesh of iBGP sessions between them.

        1.  RR1 is initially advertising the current best path to the
        members of its iBGP RR full-mesh.  It propagated that path
        within its RR full-mesh.  RR2 knows only that path towards the
        prefix.
        2.   Lower the local preference value of the paths received over the
        eBGP session being shut down, upon their propagation over iBGP
        sessions.  Optionally, also tag these paths with an AS specific
        g-shut community.  Note that alternatively, the local preference
        of the paths received over the eBGP session can be lowered on
        the g-shut initiator itself, instead of only when propagating
        over its iBGP sessions.  This simplified behavior can lead to
        some LoC, as described in Appendix A.1, if not used in
        conjunction with [BestExternal].
   3.   Optionally shut down the session after a configured time.
   4.   Prevent the GSHUT community from being inherited by a path that
        would aggregate some paths tagged with the GSHUT community.
        This behavior avoids the GSHUT procedure to be applied to the
        aggregate upon the graceful shutdown of one of its covered
        prefixes.

4.3.  Graceful shutdown procedures for iBGP sessions

   If the iBGP topology is viable after the maintenance of the session,
   i.e, if all BGP speakers of the AS have an iBGP signaling path for
   all prefixes advertised on this g-shut iBGP session, then the
   shutdown of an iBGP session does not lead to transient
   unreachability.

   However, in the case of a shutdown of a router, a reconfiguration of
   the out-filters of the g-shut initiator MAY be performed to set a low
   local-pref value for the paths originated by the g-shut initiator
   (e.g, BGP aggregates redistributed from other protocols, including
   static routes).

   This behavior is equivalent to the recommended behavior for paths
   "redistributed" from eBGP sessions to iBGP sessions in the case of
   the shutdown of an ASBR.

5.  Forwarding modes and forwarding loops

   If the AS applying the solution does not rely on encapsulation to
   forward packets from the Ingress Border Router to the Egress Border
   Router, then transient forwarding loops and consequent packet losses
   can occur during the convergence process, even if the procedure
   described above is applied.  Hence if zero LoC is required,
   encapsulation is required between ASBRs of the AS.

   Using the out-filter reconfiguration avoids the forwarding loops
   between the g-shut initiator and its directly connected upstream
   neighboring routers.  Indeed, when this reconfiguration is applied,
   the g-shut initiator keeps using its own external path and lets the
   upstream routers converge to the alternate ones.  During this phase,
   no forwarding loops can occur between the g-shut initiator and its
   upstream neighbors as the g-shut initiator keeps using the affected
   paths via its eBGP peering links.  When all the upstream routers have
   switched to alternate paths, the transition performed by the g-shut
   initiator when the session is actually shut down, will be loopfree.
   Transient forwarding loops between other routers will not be avoided
   with this procedure.

6.  Dealing with Internet policies

   A side gain of the maintenance solution is that it can also reduce
   the churn implied by a shutdown of an eBGP session.

   For this, it is recommended to apply the filters modifying the local-
   pref value of the paths to values strictly lower but as close as
   possible to the local-pref values of the post-convergence paths.

   For example, if an eBGP link is shut down between a provider and one
   of its customers, and another link with this customer remains active,
   then the value of the local-pref of the old paths SHOULD be decreased
   to the smallest possible value of the 'customer' local_pref range,
   minus 1.  Thus, routers will not transiently switch to paths received
   from shared-cost peers or providers, which could lead to the
   propagation of withdraw messages over eBGP sessions with shared-cost
   peers and providers.

   Proceeding like this reduces both BGP churn and traffic shifting as
   routers will less likely switch to transient paths.

   In the above example, it also prevents transient unreachabilities in
   the neighboring AS that are due to the sending of "abrupt" withdraw
   messages to shared-cost peers and providers.

7.  Link Up cases

   We identify two potential causes for transient packet losses upon an
   eBGP link up event.  The first one is local to the g-no-shut
   initiator, the second one is due to the BGP convergence following the
   injection of new best paths within the iBGP topology.

7.1.  Unreachability local to the ASBR

   An ASBR that selects as best a path received over a newly brought up
   eBGP session may transiently drop traffic.  This can typically happen
   when the nexthop attribute differs from the IP address of the eBGP
   peer, and the receiving ASBR has not yet resolved the MAC address
   associated with the IP address of that "third party" nexthop.

   A BGP speaker implementation could avoid such losses by ensuring that
   "third party" nexthops are resolved before installing paths using
   these in the RIB.

   If the link up event corresponds to an eBGP session that is being
   manually brought up, over an already up multi-access link, then the
   operator can ping third party nexthops that are expected to be used
   before actually bringing the session up, or ping directed broadcast
   the subnet IP address of the link.  By proceeding like this, the MAC
   addresses associated with these third party nexthops will be resolved
   by the g-no-shut initiator.

7.2.  iBGP convergence

   Corner cases similar to those described in Appendix A.1 for the link
   down case can occur during an eBGP link up event.

   A typical example of such transient unreachability for a given
   prefix is the following:

        1.  A Route Reflector, RR1, is initially advertising the current
        best path to the members of its iBGP RR full-mesh.  Another
        route reflector of the full-mesh, RR2, knows only that path
        towards the prefix.
        2.  A third Route Reflector of the RR full-mesh, RR3, receives a
        new best path originated by the "g-no-shut" initiator, which is
        one of its RR clients.  RR3 selects it as best, and propagates
        an UPDATE within its RR full-mesh, i.e., to RR1 and RR2.
        3.  RR1 receives that path, reruns its decision process, and
        picks this new path as best.  As a result, RR1 withdraws its
        previously announced best-path on the iBGP sessions of its RR
        full-mesh.
        4.  If, for any reason, RR2 processes the withdraw generated in
        step 3 before processing the update generated in step 2, RR2
        transiently suffers from unreachability for the affected prefix.

   The use of [BestExternal] among the RRs of the iBGP full-mesh can
   solve these corner cases by ensuring that, within an AS, the
   advertisement of a new route does not translate into the withdraw of
   a former route.

   Indeed, "best-external" ensures that an ASBR does not withdraw a
   previously advertised (eBGP) path when it receives an additional,
   preferred path over an iBGP session.  Also, "best-intra-cluster"
   ensures that an RR does not withdraw a previously advertised (iBGP)
   path to its non-clients (e.g., other RRs in a mesh of RRs) when it
   receives a new, preferred path over an iBGP session.
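
   The message reordering described in this section can be replayed
   with a small model.  The sketch below is only an illustration: it
   tracks the set of paths known by a route reflector that initially
   knows only RR1's path, and the event and path names are
   hypothetical.

```python
def replay(events, initial_paths):
    """Apply BGP events in order to one router's set of known paths.

    Returns one boolean per event, telling whether the router still
    knows at least one path after processing that event.
    """
    known = set(initial_paths)
    reachable = []
    for kind, path in events:
        if kind == "update":
            known.add(path)
        elif kind == "withdraw":
            known.discard(path)
        reachable.append(bool(known))
    return reachable


update_from_rr3 = ("update", "path_via_rr3")        # step 2
withdraw_from_rr1 = ("withdraw", "path_via_rr1")    # step 3

# Update processed first: the prefix stays reachable throughout.
in_order = replay([update_from_rr3, withdraw_from_rr1], {"path_via_rr1"})

# Withdraw processed first: the prefix is transiently unreachable.
reordered = replay([withdraw_from_rr1, update_from_rr3], {"path_via_rr1"})

# If RR1 does not withdraw its former path, only the update is seen.
no_withdraw = replay([update_from_rr3], {"path_via_rr1"})
```

   In this model, the transient loss only appears in the reordered
   case, which is the case that is removed when the advertisement of
   the new path does not trigger a withdraw of the former one.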

8.  IANA considerations

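   Applying the g-shut procedure is rendered much easier with the use of
   a single g-shut community value which could be used on all eBGP
   sessions, for both inbound and outbound signaling.  The community
   value 0xFFFF0000 has been assigned by IANA, from the FCFS community
   pool, for this purpose.

   For Internet routes, a non-transitive extended community will be
   reserved from the pool defined in [EXT_POOL].  Using such a community
   type prevents graceful shutdown signaling from leaking out of the AS
   boundaries, without the need to explicitly configure filters to strip
   the community off upon path propagation.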

9.  Security Considerations

   By providing the g-shut service to a neighboring AS, an ISP gives
   this neighbor a means to lower the local-pref value assigned to the
   paths received from this neighbor.

   The neighbor could abuse the technique and do inbound traffic
   engineering by declaring some prefixes as undergoing a maintenance so
   as to switch traffic to another peering link.

   If this behavior is not tolerated by the ISP, it SHOULD monitor the
   use of the g-shut community by this neighbor.

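   ASes which support the g-shut procedure SHOULD remove the community
   value(s) that they use for g-shut from the paths received from
   neighboring ASes that do not support the procedure or to whom the
   service is not provided.  Doing so prevents malicious remote ASes
   from using the community through intermediate ASes that do not
   support the feature, in order to perform inbound traffic
   engineering.  ASes that signal g-shut with a non-transitive extended
   community do not need to do this, as such a community is not
   propagated beyond the AS boundary and hence cannot be used by remote
   ASes.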
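
   The filtering recommended above can be sketched as an inbound
   policy.  This is a minimal illustration and not a router
   configuration: the function name and the set of neighbors to which
   the service is provided are assumptions, and the AS numbers are
   taken from the private use range.

```python
GSHUT_COMMUNITY = 0xFFFF0000  # community value quoted in this document


def filter_inbound_communities(communities, neighbor_as, gshut_neighbors):
    """Return the communities kept on a path received from neighbor_as.

    The g-shut community is kept only when the g-shut service is
    provided to that neighbor (hypothetical gshut_neighbors set).
    """
    if neighbor_as in gshut_neighbors:
        return set(communities)
    return {c for c in communities if c != GSHUT_COMMUNITY}


gshut_neighbors = {64512}
received = {GSHUT_COMMUNITY, 0xFDE80001}

kept_from_served = filter_inbound_communities(received, 64512, gshut_neighbors)
kept_from_other = filter_inbound_communities(received, 64513, gshut_neighbors)
```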

10.  Acknowledgments

   The authors wish to thank Olivier Bonaventure and Pradosh Mohapatra
   for their useful comments on this work.

11.  References

   [AddPath]  D. Walton, A. Retana, and E. Chen, "Advertisement of
              Multiple Paths in BGP", draft-walton-bgp-add-paths-06.txt
              (work in progress).

   [BestExternal]
              Marques, P., Fernando, R., Chen, E., and P. Mohapatra,
              "Advertisement of the best-external route to IBGP",
               draft-ietf-idr-best-external-00.txt, May 2009.

   [REQS]     Decraene, B., Francois, P., Pelsser, C., Ahmad, Z.,
              Armengol, A., and T. Takeda, "Requirements for the
              graceful shutdown of BGP sessions",
              draft-ietf-grow-bgp-graceful-shutdown-requirements-06.txt,
              October 2010.

   [RFC4360]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
              Communities Attribute", RFC 4360, February 2006.

   [Clarification4360]
              Decraene, B., Vanbever, L., and P. Francois, "RFC 4360
              Clarification Request",
               draft-decraene-idr-rfc4360-clarification-00,
              October 2009.

   [EXT_POOL]
              Decraene, B. and P. Francois, "Assigned BGP extended
              communities",
              draft-ietf-idr-reserved-extended-communities-01,
              May 2011.

   [BGPWKC]   "http://www.iana.org/assignments/
              bgp-well-known-communities".

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

Appendix A.  Alternative techniques with limited applicability

   A few alternative techniques have been considered to provide g-shut
   capabilities but have been rejected due to their limited
   applicability.  This section describes them for possible reference.

A.1.  In-filter reconfiguration

   An In-filter reconfiguration on the eBGP session undergoing the
   maintenance could be performed instead of out-filter
   reconfigurations on the iBGP sessions of the g-shut initiator.

   Upon the application of the maintenance procedure, if the g-shut
   initiator has an alternate path in its Adj-Rib-In, it will switch to
   it directly.

   If this new path was advertised by an eBGP neighbor of the g-shut
   initiator, the g-shut initiator will send a BGP Path Update message
   advertising the new path over its iBGP and eBGP sessions.

   If this new path was received over an iBGP session, the g-shut
   initiator will select that path and withdraw the previously
   advertised path over its non-client iBGP sessions.  There can be iBGP
   topologies where the iBGP peers of the g-shut initiator do not know
   an alternate path, and hence may drop traffic.

   Also, applying an In-filter reconfiguration on the eBGP session
   undergoing the maintenance may lead to transient LoC, in full-mesh
   iBGP topologies if

        a.  An ASBR of the initiator AS, ASBR1, did not initially select
        its own external path as best, and

        b.  An ASBR of the initiator AS, ASBR2, advertises a new path
        along its iBGP sessions upon the reception of ASBR1's update
        following the in-filter reconfiguration on the g-shut initiator,
        and

        c.  ASBR1 receives the update message, runs its Decision Process
        and hence withdraws its external path after having selected
        ASBR2's path as best, and

        d.  An impacted router of the AS processes the withdraw of ASBR1
        before processing the update from ASBR2.

   Applying a reconfiguration of the out-filters prevents such transient
   unreachabilities.

   Indeed, when the g-shut initiator propagates an update of the old
   path first, the withdraw from ASBR2 does not trigger unreachability
   in other nodes, as the old path is still available.  Indeed, even
   though it receives alternate paths, the g-shut initiator keeps its
   old path as best as the in-filter of the maintained eBGP session has
   not been modified yet.

   Applying the out-filter reconfiguration also prevents packet loops
   between the g-shut initiator and its direct neighbors when
   encapsulation is not used between the ASBRs of the AS.

   Note that applying this simplified procedure in conjunction with
   [BestExternal] does not lead to LoC.

A.2.  Multi Exit Discriminator tweaking

   The MED attribute of the paths to be avoided can be increased so as
   to force the routers in the neighboring AS to select other paths.

   The solution only works if the alternate paths are as good as the
   initial ones with respect to the Local-Pref value and the AS Path
   Length value.  In the other cases, increasing the MED value will not
   have an impact on the decision process of the routers in the
   neighboring AS.

A.3.  IGP distance Poisoning

   The distance to the BGP nexthop corresponding to the maintained
   session can be increased in the IGP so that the old paths will be
   less preferred during the application of the IGP distance tie-break
   rule.  However, this solution only works for the paths whose
   alternates are as good as the old paths with respect to their Local-
   Pref value, their AS Path length, and their MED value.

   Also, this poisoning cannot be applied when nexthop self is used as
   there is no nexthop specific to the maintained session to poison in
   the IGP.
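
   The limited applicability of MED tweaking and IGP distance poisoning
   follows from the order of the tie-breaking rules.  The sketch below
   illustrates this with a deliberately simplified ranking (Local-Pref,
   then AS Path length, then MED, then IGP distance).  It is an
   assumption made for illustration: it ignores the other steps of the
   BGP decision process, and it compares MED between all paths.  The
   path names and attribute values are hypothetical.

```python
from typing import NamedTuple


class Path(NamedTuple):
    name: str
    local_pref: int
    as_path_len: int
    med: int
    igp_distance: int


def best(paths):
    """Simplified ranking: higher Local-Pref, then shorter AS Path,
    then lower MED, then lower IGP distance to the nexthop."""
    return min(
        paths,
        key=lambda p: (-p.local_pref, p.as_path_len, p.med, p.igp_distance),
    ).name


# MED tweaking works when Local-Pref and AS Path length are equal.
med_effective = best([
    Path("old", 100, 2, med=1000, igp_distance=10),
    Path("alternate", 100, 2, med=0, igp_distance=10),
])

# It has no effect when the old path has a shorter AS Path.
med_ineffective = best([
    Path("old", 100, 2, med=1000, igp_distance=10),
    Path("alternate", 100, 3, med=0, igp_distance=10),
])

# IGP poisoning works when all earlier attributes are equal.
igp_effective = best([
    Path("old", 100, 2, med=0, igp_distance=10000),
    Path("alternate", 100, 2, med=0, igp_distance=10),
])

# It has no effect when the old path already wins on MED.
igp_ineffective = best([
    Path("old", 100, 2, med=0, igp_distance=10000),
    Path("alternate", 100, 2, med=5, igp_distance=10),
])
```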

Authors' Addresses

   Pierre Francois
   Institute IMDEA Networks
   Avda. del Mar Mediterraneo, 22
   Leganese  28918
   ES

   Email: pierre.francois@imdea.org

   Bruno Decraene
   France Telecom
   38-40 rue du General Leclerc
   92794 Issi Moulineaux cedex 9
   FR

   Email: bruno.decraene@orange.com

   Cristel Pelsser
   Internet Initiative Japan
   Jinbocho Mitsui Bldg.
   1-105 Kanda Jinbo-cho
   Tokyo  101-0051
   JP

   Email: pelsser.cristel@iij.ad.jp

   Keyur Patel
   Cisco Systems
   170 West Tasman Dr
   San Jose, CA  95134
   US

   Email: keyupate@cisco.com

   Clarence Filsfils
   Cisco Systems
   De kleetlaan 6a
   Diegem  1831
   BE

   Email: cfilsfil@cisco.com