Network
GROW Working Group                                         P. Francis                                             R. Raszuk
Internet-Draft                                                   MPI-SWS                                                     A. Lo
Intended status: Informational                                     X. Xu
Expires: September 3, 2011                                        Huawei
                                                              H. Ballani
                                                              Cornell U.
                                                               R. Raszuk                                     Cisco
Expires: January 5, 2012                                        L. Zhang
                                                                    UCLA
                                                           March 2,
                                                                   X. Xu
                                                                  Huawei
                                                            July 4, 2011

                   Simple Virtual Aggregation (S-VA)
                    draft-ietf-grow-simple-va-02.txt
                    draft-ietf-grow-simple-va-03.txt

Abstract

   The continued growth in the Default Free Routing Table (DFRT)
   stresses the global routing system in a number of ways.  One of the
   most costly stresses is FIB size: ISPs often must upgrade router
   hardware simply because the FIB has run out of space, and router
   vendors must design routers that have adequate FIB.

   FIB suppression is an approach to relieving stress on the FIB by NOT
   loading selected RIB entries into the FIB.  Simple Virtual
   Aggregation (S-VA) is a simple form of Virtual Aggregation (VA) that
   allows any and all edge routers to shrink their RIB and FIB
   requirements substantially and therefore increase their useful
   lifetime.

   S-VA does not change increase FIB requirements for core routers.  S-VA is
   extremely easy to
   configure---considerably configure considerably more so than the various
   tricks done today to extend the life of edge routers.  S-VA can be
   deployed autonomously by an ISP (cooperation between ISPs is not
   required), and can co-exist with legacy routers in the ISP.  There are no
   changes from the 01 version to this version.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."
   This Internet-Draft will expire on September 3, 2011. January 5, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . . . 4
     1.1.  Scope of this Document  . . . . . . . . . . . . . . . . . . 5
     1.2.  Requirements notation . . . . . . . . . . . . . . . . . . . 5
     1.3.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . 5
   2.  Operation of S-VA . . . . . . . . . . . . . . . . . . . . . . . 6
     2.1.  Tunnels
   3.  Deployment considerations . . . . . . . . . . . . . . . . . . . 7
   4.  IANA Considerations . . . . . .  7
     2.2.  Legacy Routers . . . . . . . . . . . . . . . . 8
   5.  Security Considerations . . . . . .  8
   3.  IANA Considerations . . . . . . . . . . . . . . 8
   6.  Acknowledgements  . . . . . . .  8
   4.  Security Considerations . . . . . . . . . . . . . . . . 8
   7.  References  . . .  9
   5.  Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 9
   6.
     7.1.  Normative References  . . . . . . . . . . . . . . . . . . . 9
     7.2.  Informative References  . . . . . . . . . . . . . . . . . . 9
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . . . 10 9

1.  Introduction

   ISPs today manage constant DFRT growth in a number of ways.  One way,
   of course, is for ISPs to upgrade their router hardware before DFRT
   growth outstrips the size of the FIB.  This is too expensive for many
   ISPs.  They would prefer to extend the lifetime of routers whose FIBs
   can no longer hold the full DFRT.

   A common approach taken by lower-tier ISPs is to default route to
   their providers.  Routes to customers and peer ISPs are maintained,
   but everything else defaults to the provider.  This approach has
   several disadvantages.  First, packets to Internet destinations may
   take longer-than-necessary AS paths.

   This problem can be mitigated through careful configuration of
   partial defaults, but this can require substantial configuration
   overhead.  A second problem with defaulting to providers is that the
   ISP is no longer able to provide the full DFRT to its customers.
   Finally, provider defaults prevents the ISP from being able to detect
   martian packets.  As a result, the ISP transmits packets that could
   otherwise have been dropped over its expensive provider links.  Simple Virtual Aggregation (S-VA) solves
   these problems because the full DFRT is used by core routers.

   An alternative is for the ISP to maintain full routes in its core
   routers, but to filter routes from edge routers that do not require a
   full DFRT.  These edge routers can then default route to the core or
   exit routers.  This is often possible with edge routers that
   interface to customer networks.  The problem with this approach is
   that it cannot be used for all edge routers.  For instance, it cannot
   be used for routers that connect to transits.  It should also not be
   used for routers that connect to customers which wish to receive the
   full DFRT.

   This draft describes a very simple technique, called Simple Virtual
   Aggregation (S-VA), that allows any and all edge routers to have
   substantially reduced FIB requirements even while still advertising
   and receiving the full DFRT over BGP.  The basic idea is as follows.
   Core routers in the ISP maintain the full DFRT in the FIB and RIB.
   Edge routers maintain the full DFRT in the BGP protocol RIB, but
   suppress certain routes from the FIB. being installed in RIB and FIB tables.
   Edge routers install a default route to core
   routers.  Label Switched Paths (LSP) are used routers, to transmit packets
   from a core router, through ABRs which
   are installed on the edge router, POP to the Next Hop remote
   Autonomous System Border Router (ASBR).  ASBRs strip the tunnel
   header (MPLS core boundary or IP) before forwarding tunneled packets to the remote ASBR (in much the same way MPLS Penultimate Hop Popping (PHP) strips
   the LSP header before forwarding packets to the tunnel target). routers.

   S-VA requires no changes to BGP and no changes to MPLS any choice of
   forwarding mechanisms in routers.  Configuration is extremely simple:
   S-VA must be enabled, enabled on the edge router which needs to save it's RIB
   and routers FIB space.  In the same time operator must told whether they are FIB-suppressing inject into his intra-
   domain routing a new prefix further called virtual aggregate (VA-
   prefix) which will be used as the aggregate forwarding reference by
   the edge routers or not. performing S-VA.  Everything else is automatic.
   ISPs can deploy FIB suppression autonomously and with no coordination
   with neighbor ASes.

1.1.  Scope of this Document

   The scope of this document is limited to Intra-domain S-VA operation.
   In other words, the case where a single ISP autonomously operates
   S-VA internally without any coordination with neighboring ISPs.

   Note that this document assumes that the S-VA "domain" (i.e. the unit
   of autonomy) is the AS (that is, different ASes run S-VA
   independently and without coordination).  For the remainder of this
   document, the terms ISP, AS, and domain are used interchangeably.

   This document applies equally to IPv4 and IPv6. IPv6 both unicast and
   multicast address families.

   S-VA may operate with a mix of upgraded routers and legacy routers.
   There are no topological restrictions placed on the mix of routers.
   In order
   S-VA functionality is local to avoid loops between upgraded the router on which it is enabled and legacy routers, however,
   legacy routers must be able to terminate tunnels.
   routing correctness is guaranteed.

   Note that S-VA is a greatly simplified variant of "full VA"
   [I-D.ietf-grow-va].  With full VA, all routers (core or otherwise)
   can have reduced FIBs.  However, full VA requires substantial new
   configuration and operational complexity compared to S-VA.  Full VA
   also requires the use of MPLS LSPs between all routers.  Note that
   S-VA was formerly specified in [I-D.ietf-grow-va].  It has been moved
   to this separate draft to simplify its understanding.

1.2.  Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

1.3.  Terminology

   FIB-Installing

   RIB/FIB-Installing Router (FIR):  An S-VA router that does not suppress
      any routes, and advertises itself as a default route for 0/0.
      Typically a core router, POP to core boundary router or route reflector an ASBR
      would be configured as an FIR.
   FIB-Suppressing
   RIB/FIB-Suppressing Router (FSR):  An S-VA router that installs a
      route to 0/0, and may suppress other routes.  Typically an edge
      router would be configured as an FSR.

   Install and Suppress:  The terms "install" and "suppress" are used to
      describe whether a protocol local RIB entry has been loaded or not
      loaded into the global RIB and FIB.  In other words, the phrase
      "install a route" means "install a route into the global RIB and
      FIB", and the phrase "suppress a route" means "do not install a
      route from BGP into the global RIB and FIB".
   Legacy Router:  A router that does not run S-VA, and has no knowledge
      of S-VA.
   Global Routing Information Base (RIB):  The term global RIB is used rather sloppily
      in this document to refer either
      to indicate the loc-RIB (as router's main routing information base.  That RIB
      is normally used in
      [RFC4271]), or to populate FIB tables of the combined Adj-RIBs-In, the Loc-RIB, and router.  It needs
      to be highlighted that unless FIB compression is used global RIB
      and FIB tables are in sync.
   Local/Protocol Routing Information Base (loc-RIB):  The term local
      RIB is used to indicate the protocol's table where product of SPF
      or BGP best path selection is kept before being installed in
      global RIB.  In some protocol's implementations for example BGP
      loc-RIB can be further divided into Adj-RIBs-In, the Loc-RIB, and
      the Adj-RIBs-Out.

2.  Operation of S-VA

   There are three types of routers in S-VA, FIB-Installing routers
   (FIR), FIB-Suppressing routers (FSR), and optionally legacy routers.
   While any router can be an FIR or an FSR (there are no topology
   constraints), the simplist most simple form of deployment is for AS border or
   POP border routers to be configured as edge routers, FIRs, and for non-border customer facing
   edge routers (for
   instance respectively in the routers used as route reflectors) AS or in the POP to be configured as
   core routers.  S-VA, however, does not mandate this deployment per
   se.
   FSRs.

   FIRs must originate a default BGP route to NLRI 0/0 [RFC4271].  The
   ORIGIN is set to INCOMPLETE (value 2), the AS number of the FIR's AS is used in
   the AS_PATH, 2) and the BGP NEXT_HOP is set to
   match the router's own address. other BGP routes which are also advertised by said FIR.
   The ATOMIC_AGGREGATE and AGGREGATOR attributes are not included.  The
   FIR MUST attach a NO_EXPORT Communities Community Attribute [RFC1997] to the
   default route.

   FIRs must should not FIB-suppress any routes.  They may, however, still
   use some form of local FIB compression algorithm if deemed necessary.

   FSRs must FIB-install a route to 0/0.  When transmitting a packet to
   a FIR (i.e. based on a detect the VA prefix 0/0 FIB lookup), and install it both in loc-RIB,
   RIB and FIB.  Following that FSR may suppres any more specific routes
   which carry the packet must same next hop as the VA prefix.  To guarantee
   semantical correctness FSR by default should also be tunneled.
   This is able to prevent loops that would otherwise occur when a packet
   transits multiple FSRs on detect
   installation of not matching next hop route and reinstall all the way
   more specifics which were previously eligible for suppression to the core, some of
   maintain semantical forwarding correctness.

   Generally, any more specific route which have
   FIB-installed carries the route for same next hop as
   the destination, VA-prefix 0/0 is eligible for suppression.  However, provided
   that there was at least one less specific prefix (e.g., 1.0.0.0/8)
   and others the next-hop of such prefix was different from that of the VA
   0/0, those more specific prefixes (e.g., 1.1.1.0/24) which have
   not.  FSRs may FIB-install any other routes.  They should install any
   routes are
   otherwise subject to suppression would not be eligible for which their eBGP neighbor
   suppression anymore.

   Similarly when IBGP multipath is the NEXT_HOP.  There enabled and when multiple VA
   prefixes are a
   couple reasons for this, detected which can be illustrated in are multipath candidates under given
   network condition only those more specific prefixes are subject to
   suppresion which have the figure
   below.  This figure shows an identical set of next hops as multipath set
   of VA prefixes.

   Let's illustrate the expected behaviour on the figure below.  This
   figure shows an autonomous system with a FIR FIR1 and an FSR FSR1.
   FSR1 is an ASBR and is connected to two remote ASBRs, EP1 and EP2.

        +------------------------------------------+
        |      Autonomous System                   |   +----+
        |                                          |   |EP1 |
        |                                      /---+---|    |
        |   To   ----\ +----+          +----+ /    |   +----+
        | Other       \|FIR1|----------|FSR1|/     |
        |Routers      /|    |          |    |\     |
        |        ----/ +----+          +----+ \    |   +----+
        |                                      \---+---|EP2 |
        |                                          |   |    |
        |                                          |   +----+
        +------------------------------------------+

   Suppose that FSR1 does not FIB-install has been enabled to perform S-VA.  Originally it
   receives all routes for which from FIR1 (doing next hop self) as well as
   directly connected EBGP peers EP1 and EP2
   are next hops.  In this case, when EP2 sends EP2.  FIR1 now will advertise a packet
   VA prefix 0/0 with next hop set to himself.  That will trigger
   detection of such prefix on FSR1 for and suppression all routes which
   have the same next hop is EP1, FSR1 will first tunnel the packet to FIR1, as VA prefix and which will tunnel otherwise would be
   installed in RIB and FIB.  However it right back needs to FSR1.  This trombone routing is
   avoided if local ASBRs FIB-install routes where their neighbor remote
   ASBRs are the BGP NEXT_HOP.

   In addition, FSR1 cannot filter source addresses using strict unicast
   Reverse Path Forwarding (uRPF) unless it FIB-installs the routes
   learned from the remote ASBR.  Note, however, that FSRs cannot do
   loose uRPF.  Rather, this must be done by FIRs.

   The above observations lead to the following rules: FSRs that are
   ASBRs should FIB-install all routes for which the neighbor is the BGP
   NEXT_HOP.  FSRs observed that are ASBRs must FIB-install FSR1
   will not suppres any EBGP routes that are
   used for uRPF.

2.1.  Tunnels

   S-VA works with both MPLS received from his peers EP1 and IP-in-IP tunnels.  There are
   potentially up to two tunnels required for a packet EP2
   due to traverse an AS
   with S-VA.  The first tunnel is that next hop being different from an FSR to a FIR (for the
   0/0 default).  This is called the default tunnel.  The second tunnel
   targets the remote ASBR which is the BGP NEXT_HOP, although the
   tunnel header is stripped by the local ASBR before transmitting one assinged to
   the remote ASBR.  This is the exit tunnel.  The start of the exit
   tunnel is an ingress local ASBR in the case where the ingress local
   ASBR has FIB-installed the associated route.  Otherwise, the start of
   the exit tunnel is a FIR. VA-prefix.

3.  Deployment considerations

   The target address simplest deployment model of the default tunnel is always the FIR.  If MPLS S-VA is used, the FIRs must initiate LSPs to themselves using either it's use within the
   Label Distribution Protocol (LDP) [RFC5036].  RSVP-TE [RFC3209] may
   also be used.

   If IP-in-IP tunnels are used, then POP.  In
   such model the BGP Encapsulation Extended
   Community (BGPencap-Attribute) ([RFC5512]) is used POP to convey core boundary routers (usually RRs in the
   ability data
   path) would act as FIRs and would inject VA-prefix 0/0 to accept tunnels at the target address (the BGP NEXT_HOP).

   For all of it's
   clients within the exit tunnels, again either MPLS or IP-in-IP can be used. POP.  In
   the case such model of IP-in-IP, the inner label defined in [RFC4023] and
   signaled in BGP with [RFC3107] is used by the local ASBR to identify
   the remote ASBR which is the BGP NEXT_HOP for the packet.
   Specifically, when a local ASBR, which operation an observation
   can be either an FSR or a FIR,
   advertises an eBGP-received route into iBGP, it sets the BGP NEXT_HOP
   as itself.  It assigns a label made that such ABRs do have full routing knowledge and client
   to the route.  This label ABR distance is used negligable as compared with client to intra-domain
   exit distance.

   Therefor under the inner label above intra POP S-VA deployment model clients can
   be configured that even in packets tunneled to the local ASBR, and event of lack of ABR to ABR
   advertisement symmetry there is used still no need to
   identify the remote ASBR from which the monitor if more
   specific unsuppressed route was received.  When
   receiving a packet with would cover suppressed one.  Thus in this label, the local ASBR strips off the
   label, and forwards the native packet
   particular deployment model there is no need to detect and reinstall
   the remote ASBR indicated by
   the label.

   In the case of MPLS, the inner label may or may not previously suppressed ones.

   Another deploymet consideration should be used.  If it
   is used, then an LSP is established given to networks which may
   utilize route reflection.  In the IP address event of the local
   ASBR as described above for FIRs.  The BGP NEXT_HOP is set to enabling IBGP multipath a
   special care must be
   itself (the same address taken that serves both outbound prefixes as the FEC in the LSP).  The
   inner label is established well as described in the previous paragraph for
   IP-in-IP tunnels, but with VA-
   prefixes would pass via said route reflectors to their clients.

   In order to addess the encapsulation defined in [RFC3032].

   If above aspects the inner label is following solutions could be
   considered:

   - Use of intra-POP S-VA
   - Full mesh  Small or medium side networks where S-VA can be deployed
      are normally fully meshed and do not used, then the local ASBR must initiate a
   Downstream Unsolicited LSP for each remote ASBR.  The FEC for the LSP
   is the remote ASBR address use route reflection.  It
      also needs to pointed out that is used in the some large networks are also fully
      meshed today.
   - Use of add-paths  Use of add-paths new BGP NEXT_HOP field.
   When a packet is received on encoding will allow to
      distribute more then one overall best path from RR to each client.
   - Alternate advertisement of these LSPs, the local ASBR strips
   the MPLS header, and forwards the packet VA-prefix   S-VA prefix does not need to
      be advertised in BGP.  The BGP suppression will happen as long as
      we configure the remote ASBR indicated
   by the label.

2.2.  Legacy Routers S-VA may be operated with a mix of legacy next hop(s) and S-VA-upgraded routers.
   The legacy routers, however, must be able to forward tunneled
   packets.  In the case of MPLS tunnels, this means implementation verifies
      that they must
   fully participate in MPLS signaling.  If a legacy router such VA-prefix is an ASBR,
   then it must also initiate tunnels to itself and be able to detunnel
   packets (without installed in the inner label).

3. RIB and FIB.

4.  IANA Considerations

   There are no IANA considerations.

4.

5.  Security Considerations

   The authors are not aware of any new security considerations due to
   S-VA.

5.

6.  Acknowledgements

   The concept for S-VA Virtual Aggregation comes from Robert Raszuk.

6. Paul Francis.  In this
   document authors only simplified some aspects of it's behaviour to
   allow simpler adoption by some operators.

   Authors would like to thank Clarence Filsfils for his valuable input.

7.  References

7.1.  Normative References

   [I-D.ietf-grow-va]
              Francis, P., Xu, X., Ballani, H., Jen, D., Raszuk, R., and
              L. Zhang, "FIB Suppression with Virtual Aggregation",
              draft-ietf-grow-va-00 (work in progress), May 2009.

   [RFC1997]  Chandrasekeran, R., Traina, P., and T. Li, "BGP
              Communities Attribute", RFC 1997, August 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3032]  Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
              Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
              Encoding", RFC 3032, January 2001.

   [RFC3107]  Rekhter, Y. and E. Rosen, "Carrying Label Information in
              BGP-4", RFC 3107, May 2001.

   [RFC3209]  Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
              and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
              Tunnels", RFC 3209, December 2001.

   [RFC4023]  Worster, T., Rekhter, Y., and E. Rosen, "Encapsulating
              MPLS in IP or Generic Routing Encapsulation (GRE)",
              RFC 4023, March 2005.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC5036]  Andersson, L., Minei, I., and B. Thomas, "LDP
              Specification", RFC 5036, October 2007.

   [RFC5512]  Mohapatra, P. and E. Rosen, "BGP Encapsulation SAFI

7.2.  Informative References

   [I-D.ietf-grow-va]
              Francis, P., Xu, X., Ballani, H., Jen, D., Raszuk, R., and
              BGP Tunnel Encapsulation Attribute", RFC 5512, April 2009.
              L. Zhang, "FIB Suppression with Virtual Aggregation",
              draft-ietf-grow-va-05 (work in progress), June 2011.

Authors' Addresses

   Paul Francis
   Max Planck Institute for Software Systems
   Gottlieb-Daimler-Strasse
   Kaiserslautern  67633
   Germany

   Phone: +49 631 930 39600
   Email: francis@mpi-sws.org

   Xiaohu Xu
   Huawei Technologies
   No.3 Xinxi Rd., Shang-Di Information Industry Base, Hai-Dian District
   Beijing, Beijing  100085
   P.R.China

   Phone: +86 10 82836073
   Email: xuxh@huawei.com

   Hitesh Ballani
   Cornell University
   4130 Upson Hall
   Ithaca, NY  14853
   US

   Phone: +1 607 279 6780
   Email: hitesh@cs.cornell.edu

   Robert Raszuk
   Cisco Systems, Inc.
   170 West Tasman Drive
   San Jose, CA  95134
   USA

   Phone:
   Email: raszuk@cisco.com

   Alton Lo
   Cisco Systems, Inc.
   170 West Tasman Drive
   San Jose, CA  95134
   USA

   Phone:
   Email: altonlo@cisco.com
   Lixia Zhang
   UCLA
   3713 Boelter Hall
   Los Angeles, CA  90095
   US

   Phone:
   Email: lixia@cs.ucla.edu

   Xiaohu Xu
   Huawei Technologies
   No.3 Xinxi Rd., Shang-Di Information Industry Base, Hai-Dian District
   Beijing, Beijing  100085
   P.R.China

   Phone: +86 10 82836073
   Email: xuxh@huawei.com