L2VPN Workgroup                                     A. Sajassi (Editor)
INTERNET-DRAFT                                                    Cisco
Intended Status: Standards Track
                                                      J. Drake (Editor)
                                                                Juniper

                                                            Nabil Bitar
                                                                Verizon

                                                           Aldrin Isaac
                                                                Juniper

                                                           James Uttaro
                                                                   AT&T

                                                          W. Henderickx
                                                         Alcatel-Lucent

Expires: April 19, 2016                                October 19, 2015
         A Network Virtualization Overlay Solution using EVPN
                   draft-ietf-bess-evpn-overlay-02
Abstract
This document describes how Ethernet VPN (EVPN) [RFC7432] can be used
as a Network Virtualization Overlay (NVO) solution and explores the
various tunnel encapsulation options over IP and their impact on the
EVPN control-plane and procedures. In particular, the following
encapsulation options are analyzed: VXLAN, NVGRE, and MPLS over GRE.
Status of this Memo
skipping to change at page 2, line 49
   7.2 Impact on EVPN Procedures for VXLAN/NVGRE Encapsulation . . 15
   8 NVE Residing in ToR Switch . . . . . . . . . . . . . . . . . . 15
     8.1 EVPN Multi-Homing Features . . . . . . . . . . . . . . . . 16
       8.1.1 Multi-homed Ethernet Segment Auto-Discovery . . . . . . 16
       8.1.2 Fast Convergence and Mass Withdraw . . . . . . . . . . 16
       8.1.3 Split-Horizon . . . . . . . . . . . . . . . . . . . . . 16
       8.1.4 Aliasing and Backup-Path . . . . . . . . . . . . . . . 17
       8.1.5 DF Election . . . . . . . . . . . . . . . . . . . . . . 17
     8.2 Impact on EVPN BGP Routes & Attributes . . . . . . . . . . 18
     8.3 Impact on EVPN Procedures . . . . . . . . . . . . . . . . . 18
       8.3.1 Split Horizon . . . . . . . . . . . . . . . . . . . . . 19
       8.3.2 Aliasing and Backup-Path . . . . . . . . . . . . . . . 19
   9 Support for Multicast . . . . . . . . . . . . . . . . . . . . . 19
   10 Data Center Interconnections - DCI . . . . . . . . . . . . . . 20
     10.1 DCI using GWs . . . . . . . . . . . . . . . . . . . . . . 20
     10.2 DCI using ASBRs . . . . . . . . . . . . . . . . . . . . . 21
       10.2.1 ASBR Functionality with NVEs in Hypervisors . . . . . 22
       10.2.2 ASBR Functionality with NVEs in TORs . . . . . . . . . 22
   11 Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 24
   12 Security Considerations . . . . . . . . . . . . . . . . . . . 24
   13 IANA Considerations . . . . . . . . . . . . . . . . . . . . . 25
   14 References . . . . . . . . . . . . . . . . . . . . . . . . . . 25
     14.1 Normative References . . . . . . . . . . . . . . . . . . . 25
     14.2 Informative References . . . . . . . . . . . . . . . . . . 25
   Contributors . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 26
1 Introduction
In the context of this document, a Network Virtualization Overlay
(NVO) is a solution to address the requirements of a multi-tenant
data center, especially one with virtualized hosts, e.g., Virtual
Machines (VMs). The key requirements of such a solution, as described
in [Problem-Statement], are:

- Isolation of network traffic per tenant
skipping to change at page 4, line 30
between different data centers

- Allowing a given VM to move between different physical points of
attachment within a given L2 segment

The underlay network for NVO solutions is assumed to provide IP
connectivity between NVO endpoints (NVEs).

This document describes how Ethernet VPN (EVPN) can be used as an NVO
solution and explores the applicability of EVPN functions and
procedures. In particular, it describes the various tunnel
encapsulation options for EVPN over IP, and their impact on the EVPN
control-plane and procedures for two main scenarios:

a) when the NVE resides in the hypervisor, and

b) when the NVE resides in a Top of Rack (ToR) device

Note that the use of EVPN as an NVO solution does not necessarily
mandate that the BGP control-plane be running on the NVE. For such
scenarios, it is still possible to leverage the EVPN solution by
using XMPP, or alternative mechanisms, to extend the control-plane to
skipping to change at page 5, line 20
document are to be interpreted as described in [RFC2119].
3 Terminology

NVO: Network Virtualization Overlay

NVE: Network Virtualization Endpoint

VNI: Virtual Network Identifier (for VXLAN)

VSID: Virtual Subnet Identifier (for NVGRE)

EVPN: Ethernet VPN

EVI: An EVPN instance spanning the Provider Edge (PE) devices
participating in that EVPN.

MAC-VRF: A Virtual Routing and Forwarding table for Media Access
Control (MAC) addresses on a PE.

Ethernet Segment (ES): When a customer site (device or network) is
skipping to change at page 11, line 16
DCN and WAN BGP sessions, it is important that when RT values are
auto-derived for VNIs (or VSIDs), there is no conflict in RT spaces
between DCN and WAN networks, assuming that both are operating within
the same AS. Also, there can be scenarios where both VXLAN and NVGRE
encapsulations may be needed within the same DCN, and their
corresponding VNIs and VSIDs are administered independently, which
means the VNI and VSID spaces can overlap. In order to ensure that no
such conflict in RT spaces arises, RT values for DCNs are auto-
derived as follows:
0                   1                   2                   3                   4
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              AS #             |A|TYPE | D-ID  |              Service Instance ID              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- The 2-byte global admin field of the RT is set to the AS number.

- The three least significant bytes of the local admin field of the
RT are set to the VNI, VSID, I-SID, or VID. The most significant bit
of the local admin field of the RT is set as follows:

0: auto-derived
1: manually-derived

- The next 3 bits of the most significant byte of the local admin
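The auto-derivation above can be sketched as follows. This is a hypothetical helper, not normative encoding: the placement of the TYPE and D-ID sub-fields follows the diagram, and their value semantics (elided in this excerpt) are treated as opaque inputs.

```python
def auto_derive_rt(as_number: int, rt_type: int, domain_id: int,
                   service_id: int, manual: bool = False) -> bytes:
    """Pack a 6-byte EVPN Route Target value per the layout above:
    a 2-byte AS number (global admin), then a 4-byte local admin field
    holding the A bit (0 = auto-derived), 3-bit TYPE, 4-bit D-ID, and
    the 24-bit service instance ID (VNI, VSID, I-SID, or VID)."""
    assert as_number < (1 << 16) and rt_type < (1 << 3)
    assert domain_id < (1 << 4) and service_id < (1 << 24)
    a_bit = 1 if manual else 0
    local_admin = (a_bit << 31) | (rt_type << 28) | (domain_id << 24) | service_id
    return as_number.to_bytes(2, "big") + local_admin.to_bytes(4, "big")

# AS 65001, TYPE 1, domain 0, VNI 5000 (values chosen for illustration):
assert auto_derive_rt(65001, 1, 0, 5000).hex() == "fde910001388"
```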
skipping to change at page 12, line 19
the difference being that an MPLS label has local significance while
a VNI or VSID typically has global significance. Accordingly, and
specifically to support the option of locally assigned VNIs, the MPLS
label field in the MAC Advertisement, Ethernet AD per EVI, and
Inclusive Multicast Ethernet Tag routes is used to carry the VNI or
VSID. For the balance of this memo, the MPLS label field will be
referred to as the VNI/VSID field. The VNI/VSID field is used for
both local and global VNIs/VSIDs, and in either case the entire 24-
bit field is used to encode the VNI/VSID value.
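The distinction above can be sketched as follows. The VNI/VSID branch follows the text directly; the MPLS branch, shown only for contrast, assumes the usual convention of placing a 20-bit label in the high-order bits of the 3-octet field.

```python
def encode_label_field(value: int, is_vni: bool) -> bytes:
    """Encode the 3-octet label field of an EVPN route.

    For VXLAN/NVGRE the entire 24-bit field carries the VNI or VSID.
    For MPLS (contrast case, an assumption about the common on-the-wire
    convention) a 20-bit label sits in the high-order 20 bits."""
    if is_vni:
        assert value < (1 << 24)
        return value.to_bytes(3, "big")
    assert value < (1 << 20)
    return (value << 4).to_bytes(3, "big")

# The same numeric value lands in different bit positions:
assert encode_label_field(5000, is_vni=True).hex() == "001388"
assert encode_label_field(5000, is_vni=False).hex() == "013880"
```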
For the VLAN-based service (a single VNI per MAC-VRF), the Ethernet
Tag field in the MAC/IP Advertisement, Ethernet AD per EVI, and
Inclusive Multicast route MUST be set to zero, just as in the VLAN-
based service in [RFC7432].
For the VLAN-aware bundle service (multiple VNIs per MAC-VRF with
each VNI associated with its own bridge table), the Ethernet Tag
field in the MAC Advertisement, Ethernet AD per EVI, and Inclusive
Multicast route MUST identify a bridge table within a MAC-VRF, and
the set of Ethernet Tags for that EVI needs to be configured
consistently on all PEs within that EVI. For local VNIs, the value
advertised in the Ethernet Tag field MUST be set to a VID, just as in
the VLAN-aware bundle service in [RFC7432]. Such a setting must be
done consistently on all PE devices participating in that EVI within
a given domain. For global VNIs, the value advertised in the Ethernet
Tag field SHOULD be set to a VNI, as long as it matches the existing
semantics of the Ethernet Tag, i.e., it identifies a bridge table
within a MAC-VRF and the set of VNIs are configured consistently on
each PE in that EVI.
In order to indicate which type of data-plane encapsulation (i.e.,
VXLAN, NVGRE, MPLS, or MPLS in GRE) is to be used, the BGP
Encapsulation extended community defined in [RFC5512] is included
with all EVPN routes (i.e., MAC Advertisement, Ethernet AD per EVI,
Ethernet AD per ESI, Inclusive Multicast Ethernet Tag, and Ethernet
Segment) advertised by an egress PE. Five new values have been
assigned by IANA to extend the list of encapsulation types defined in
[RFC5512]:
skipping to change at page 15, line 12
attributes associated with multi-homing are no longer required. This
reduces the required routes and attributes to the following subset of
four out of the set of eight:

- MAC Advertisement Route

- Inclusive Multicast Ethernet Tag Route

- MAC Mobility Extended Community

- Default Gateway Extended Community

However, as noted in section 8.6 of [RFC7432], in order to enable a
single-homing ingress PE to take advantage of fast convergence,
aliasing, and backup-path when interacting with multi-homed egress
PEs attached to a given Ethernet segment, a single-homing ingress PE
SHOULD be able to receive and process Ethernet AD per ES and Ethernet
AD per EVI routes.
7.2 Impact on EVPN Procedures for VXLAN/NVGRE Encapsulation
When the NVEs reside on the hypervisors, the EVPN procedures
associated with multi-homing are no longer required. This limits the
procedures on the NVE to the following subset of the EVPN procedures:

1. Local learning of MAC addresses received from the VMs per section
skipping to change at page 15, line 40
3. Performing remote learning using BGP per Section 10.2 of
[RFC7432].

4. Discovering other NVEs and constructing the multicast tunnels
using the Inclusive Multicast Ethernet Tag routes.

5. Handling MAC address mobility events per the procedures of Section
16 in [RFC7432].

However, as noted in section 8.6 of [RFC7432], in order to enable a
single-homing ingress PE to take advantage of fast convergence,
aliasing, and backup-path when interacting with multi-homed egress
PEs attached to a given Ethernet segment, a single-homing ingress PE
SHOULD implement the ingress node processing of Ethernet AD per ES
and Ethernet AD per EVI routes as defined in sections 8.2 (Fast
Convergence) and 8.4 (Aliasing and Backup-Path) of [RFC7432].
8 NVE Residing in ToR Switch
In this section, we discuss the scenario where the NVEs reside in the
Top of Rack (ToR) switches AND the servers (where VMs are residing)
are multi-homed to these ToR switches. The multi-homing may operate
in All-Active or Single-Active redundancy mode. If the servers are
single-homed to the ToR switches, then the scenario becomes similar
to that where the NVE resides in the hypervisor, as discussed in
Section 5, as far as the required EVPN functionality is concerned.

[RFC7432] defines a set of BGP routes, attributes and procedures to
support multi-homing. We first describe these functions and
skipping to change at page 18, line 11
If a CE is multi-homed to two or more NVEs on an Ethernet segment
operating in all-active redundancy mode, then for a given EVI only
one of these NVEs, termed the Designated Forwarder (DF), is
responsible for sending it broadcast, multicast, and, if configured
for that EVI, unknown unicast frames.

This is required in order to prevent duplicate delivery of multi-
destination frames to a multi-homed host or VM, in the case of all-
active redundancy.
In NVEs where .1Q tagged frames are received from hosts, the DF
election is performed on host VLAN IDs (VIDs). It is assumed that for
a given Ethernet Segment, VIDs are unique and consistent (e.g., no
duplicate VIDs exist).
In GWs where VXLAN encapsulated frames are received, the DF election
is performed on VNIs. Again, it is assumed that for a given Ethernet
Segment, VNIs are unique and consistent (e.g., no duplicate VNIs
exist).
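The per-VID (or per-VNI) carving described above can be sketched with the default DF election of [RFC7432] section 8.5: order the candidate NVE/GW IP addresses numerically and pick the one at index (v mod N). The addresses and values below are examples only.

```python
import ipaddress

def designated_forwarder(nve_ips, v):
    """Default DF election (service carving) applied as described above:
    sort the NVE/GW IP addresses in increasing numeric order, then pick
    index (v mod N), where v is the host VID - or the VNI, when frames
    arrive VXLAN-encapsulated at a GW."""
    ordered = sorted(nve_ips, key=lambda ip: int(ipaddress.ip_address(ip)))
    return ordered[v % len(ordered)]

# Two NVEs on the same Ethernet Segment; the DF alternates per VID:
assert designated_forwarder(["192.0.2.2", "192.0.2.1"], 10) == "192.0.2.1"
assert designated_forwarder(["192.0.2.2", "192.0.2.1"], 11) == "192.0.2.2"
```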
8.2 Impact on EVPN BGP Routes & Attributes
Since multi-homing is supported in this scenario, the entire set of
BGP routes and attributes defined in [RFC7432] is used. As discussed
in Section 3.1.3, the VSID or VNI is carried in the VNI/VSID field in
the MAC Advertisement, Ethernet AD per EVI, and Inclusive Multicast
Ethernet Tag routes.
8.3 Impact on EVPN Procedures
skipping to change at page 20, line 35
In the scenario where the multicast tunnel is a tree, both the
Inclusive and the Aggregate Inclusive variants may be used. In the
former case, a multicast tree is dedicated to a VNI or VSID; in the
latter, a multicast tree is shared among multiple VNIs or VSIDs. This
is done by having the NVEs advertise multiple Inclusive Multicast
routes, each with a different VNI or VSID encoded in the Ethernet Tag
field, but with the same tunnel identifier encoded in the PMSI Tunnel
attribute.
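The Aggregate Inclusive advertisement can be sketched as follows. The route fields are simplified placeholders chosen for illustration, not wire encodings.

```python
def aggregate_inclusive_routes(vnis, tunnel_id, originator_ip):
    """Sketch of the Aggregate Inclusive variant described above: the NVE
    originates one Inclusive Multicast Ethernet Tag route per VNI, each
    carrying that VNI in the Ethernet Tag field but the *same* tunnel
    identifier in the PMSI Tunnel attribute, so one multicast tree is
    shared by all the VNIs."""
    return [{"ethernet_tag": vni,
             "originator": originator_ip,
             "pmsi_tunnel_id": tunnel_id} for vni in vnis]

routes = aggregate_inclusive_routes([5001, 5002, 5003], "tree-1", "192.0.2.1")
# One route per VNI, all pointing at the same shared tree:
assert [r["ethernet_tag"] for r in routes] == [5001, 5002, 5003]
assert {r["pmsi_tunnel_id"] for r in routes} == {"tree-1"}
```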
10 Data Center Interconnections - DCI
For DCI, the following two main scenarios are considered when
connecting data centers running evpn-overlay (as described in this
document) over an MPLS/IP core network:

- Scenario 1: DCI using GWs

- Scenario 2: DCI using ASBRs

The following two subsections describe the operations for each of
these scenarios.
10.1 DCI using GWs
This is the typical scenario for interconnecting data centers over
the WAN. In this scenario, EVPN routes are terminated and processed
in each GW, and MAC/IP routes are always re-advertised from DC to
WAN; from WAN to DC, however, they are not re-advertised if the
unknown MAC route (and default IP route) are utilized in the NVEs. In
this scenario, each GW maintains a MAC-VRF (and/or IP-VRF) for each
EVI. The main advantage of this approach is that NVEs do not need to
maintain MAC and IP addresses from any remote data centers when the
default IP route and unknown MAC route are used - i.e., they only
need to maintain routes that are local to their own DC. When the
default IP route and unknown MAC route are used, any unknown IP and
MAC packets from the NVEs are forwarded to the GWs, where all the VPN
MAC and IP routes are maintained. This approach reduces the size of
the MAC-VRFs and IP-VRFs significantly at the NVEs. Furthermore, it
results in a faster convergence time upon a link or NVE failure in a
multi-homed network or device redundancy scenario, because the
failure-related BGP routes (such as the mass-withdraw message) do not
need to be propagated all the way to the remote NVEs in the remote
DCs. This approach is described in detail in section 3.4 of
[DCI-EVPN-OVERLAY].
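The NVE-side forwarding decision under the unknown-MAC-route approach can be sketched as follows; this is a hypothetical helper, not an API of any implementation.

```python
def resolve_next_hop(mac_vrf, dst_mac, gw_ip):
    """Sketch of the GW approach described above: an NVE's MAC-VRF holds
    only MACs local to its own DC; a destination MAC with no entry follows
    the 'unknown MAC' route to a gateway, which holds the full set of VPN
    MAC routes for the EVI."""
    return mac_vrf.get(dst_mac, gw_ip)

local_macs = {"00:aa:bb:cc:dd:01": "10.0.0.5"}   # local-DC entries only
# Known local MAC resolves directly:
assert resolve_next_hop(local_macs, "00:aa:bb:cc:dd:01", "198.51.100.1") == "10.0.0.5"
# Remote-DC MAC is absent from the table, so traffic goes to the GW:
assert resolve_next_hop(local_macs, "00:aa:bb:cc:dd:99", "198.51.100.1") == "198.51.100.1"
```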
10.2 DCI using ASBRs
This approach can be considered the opposite of the first approach:
it favors simplification at the DCI devices over the NVEs, such that
larger MAC-VRF (and IP-VRF) tables need to be maintained on the NVEs,
whereas the DCI devices don't need to maintain any MAC (or IP)
forwarding tables. Furthermore, the DCI devices do not need to
terminate and process routes related to multi-homing, but rather
relay these messages for the establishment of an end-to-end LSP path.
In other words, the DCI devices in this approach operate similarly to
ASBRs for inter-AS option B. This requires locally assigned VNIs to
be used, just like downstream-assigned MPLS VPN labels, where for all
practical purposes the VNIs function like 24-bit VPN labels. This
approach is equally applicable to data centers (or access networks)
with MPLS encapsulation.
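Locally assigned, downstream-style VNI allocation can be sketched as follows; this is a hypothetical allocator, and the start value and per-MAC-VRF granularity are assumptions made for illustration.

```python
class LocalVniAllocator:
    """Sketch of per-NVE downstream-assigned VNI allocation, mirroring how
    MPLS VPN labels are allocated locally: each NVE hands out its own
    24-bit VNI per MAC-VRF and advertises it in the VNI/VSID field."""
    def __init__(self, start=0x1000):
        self._next = start
        self._by_vrf = {}
    def vni_for(self, mac_vrf: str) -> int:
        # Allocate once per MAC-VRF; the value has meaning only to this NVE.
        if mac_vrf not in self._by_vrf:
            assert self._next < (1 << 24), "24-bit VNI space exhausted"
            self._by_vrf[mac_vrf] = self._next
            self._next += 1
        return self._by_vrf[mac_vrf]

alloc = LocalVniAllocator()
assert alloc.vni_for("EVI-1") == alloc.vni_for("EVI-1")  # stable per MAC-VRF
assert alloc.vni_for("EVI-1") != alloc.vni_for("EVI-2")  # locally unique
```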
In inter-AS option B, when an ASBR receives an EVPN route from its DC
over iBGP and re-advertises it to other ASBRs, it re-advertises the
EVPN route with the BGP next hop rewritten to itself, thus losing the
identity of the PE that originated the advertisement. This rewrite of
the BGP next hop adversely impacts the EVPN Mass Withdraw route
(Ethernet A-D per ES) and its procedures. In EVPN, the route used for
aliasing (the Ethernet A-D per EVI route) has the same RD as the
MAC/IP routes associated with that EVI. Therefore, the receiving PE
can associate the received MAC/IP routes with the corresponding
aliasing route using their RDs, even if their next hops are rewritten
to the same ASBR router's address. However, in EVPN, the mass-
withdraw route uses a different RD than that of its associated MAC/IP
routes; thus, the only way to associate them is via their next-hop
router's address. When the BGP next-hop address representing the
originating PE gets rewritten by the re-advertising ASBR, it creates
an ambiguity in the receiving PE that cannot be resolved. Therefore,
the functionality needed at the ASBRs depends on whether the EVPN
Ethernet A-D routes (per ES and/or per EVI) are originated and
whether there is a need to handle route-resolution ambiguity for the
Ethernet A-D per ES route.
The following two subsections describe the functionality needed by
the ASBRs depending on whether the NVEs reside in hypervisors or in
TORs.
10.2.1 ASBR Functionality with NVEs in Hypervisors
When NVEs reside in hypervisors, as described in section 7.1, there
is no multi-homing and thus there is no need for the originating NVE
to send Ethernet A-D per ES or Ethernet A-D per EVI routes.
Furthermore, the processing of these routes by a receiving NVE in a
hypervisor is optional per [RFC7432], as described in section 7.
Therefore, the ambiguity issue discussed above doesn't exist for this
scenario, and the functionality of the ASBRs is that of existing
L2VPN (or L3VPN) ASBRs, which assist in setting up end-to-end LSPs
among the NVEs' MAC-VRFs. As noted previously, for all practical
purposes, the 24-bit locally assigned VNIs used in this scenario
function as 24-bit labels in setting up the end-to-end LSPs.
10.2.2 ASBR Functionality with NVEs in TORs
When NVEs reside in TORs and operate in multi-homing redundancy mode,
then as described in section 8, there is a need for the originating
NVE to send Ethernet A-D per ES route(s) (used for mass withdraw) and
Ethernet A-D per EVI routes (used for aliasing). As described above,
the rewrite of the BGP next hop by the ASBRs creates ambiguities when
Ethernet A-D per ES routes are received by a remote PE in a different
AS, because the receiving PE cannot associate those routes with the
MAC/IP routes from the same Ethernet Segment advertised by the same
originating PE. This ambiguity inhibits the mass-withdraw-per-ES
function at the receiving PE in a different AS.
As an example, consider a scenario where a CE is multi-homed to PE1
and PE2, and these PEs are connected via ASBR1 and then ASBR2 to the
remote PE3. Furthermore, consider that PE1 learns MAC address M1 from
the CE but PE2 does not. Therefore, PE1 advertises Eth A-D per ES1,
Eth A-D per EVI1, and M1; whereas PE2 only advertises Eth A-D per ES1
and Eth A-D per EVI1. ASBR1 receives all five of these advertisements
and passes them to ASBR2 (with itself as the BGP next hop). ASBR2, in
turn, passes them to the remote PE3 with itself as the BGP next hop.
PE3 receives these five routes, all of which have the same BGP next
hop (i.e., ASBR2). Furthermore, the two Eth A-D per ES routes
received by PE3 carry the same information - i.e., the same ESI and
the same BGP next hop. Although both of these routes are maintained
by the BGP process in PE3, information from only one of them is used
in the L2 routing table (L2 RIB).
PE1
/ \
CE ASBR1---ASBR2---PE3
\ /
PE2
Figure 1: Inter-AS Option B
Now, when the AC between PE2 and the CE fails, PE2 sends an NLRI
withdrawal for its Ether A-D per ES route. When this withdrawal is
propagated to and received by PE3, the BGP process in PE3 removes the
corresponding BGP route; however, it does not remove the associated
information (namely the ESI and BGP next hop) from the L2 routing
table (L2 RIB), because it still holds the other Ether A-D per ES
route (originated by PE1) with the same information. This is why the
mass-withdraw mechanism does not work when doing DCI with inter-AS
option B. However, as described next, the Aliasing function works,
and so does mass-withdraw per EVI (which corresponds to withdrawing
the EVPN route associated with Aliasing - i.e., the Ether A-D per EVI
route).
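The failure mode above can be illustrated with a minimal sketch (not
part of this specification; the class and its fields are purely
illustrative): if the receiving PE keys its per-ES mass-withdraw
state on the (ESI, BGP next hop) pair, then with option B both per-ES
routes collapse onto the same key, and one withdrawal cannot
invalidate only the failed PE's paths.

```python
# Illustrative sketch only: why per-ES mass-withdraw fails when ASBRs
# rewrite the BGP next hop. PE3's L2 RIB below keys per-ES state on
# (ESI, next hop); data structures and names are hypothetical.
from collections import Counter

class L2Rib:
    """Toy L2 RIB tracking Ethernet A-D per ES route state."""
    def __init__(self):
        # Count of per-ES routes held for each (esi, next_hop) key.
        self.per_es = Counter()

    def add_per_es_route(self, esi, next_hop):
        self.per_es[(esi, next_hop)] += 1

    def withdraw_per_es_route(self, esi, next_hop):
        # Mass-withdraw can only trigger when no route remains for the key.
        self.per_es[(esi, next_hop)] -= 1
        return self.per_es[(esi, next_hop)] == 0

rib = L2Rib()
# PE1 and PE2 both advertise Ether A-D per ES for ES1; after next-hop
# rewrite by the ASBRs, PE3 sees both routes with next hop ASBR2.
rib.add_per_es_route("ES1", "ASBR2")   # originated by PE1
rib.add_per_es_route("ES1", "ASBR2")   # originated by PE2

# PE2's withdrawal arrives, but the (ESI, next hop) state survives
# because PE1's indistinguishable route is still present.
print(rib.withdraw_per_es_route("ES1", "ASBR2"))  # prints False
```

Because the two routes are indistinguishable at PE3, the withdrawal
decrements shared state instead of removing PE2's paths, which is the
ambiguity described above.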
In the above example, PE3 receives two Aliasing routes with the same
BGP next hop (ASBR2) but different RDs. One of the Aliasing routes
has the same RD as the advertised MAC route (M1). Upon receiving the
two Aliasing routes, PE3 follows the route resolution procedure
specified in [RFC7432]. PE3 should also resolve the alias path
properly even though both the primary and backup paths have the same
BGP next hop: they carry different RDs, and the alias route whose RD
differs from that of the MAC route is considered the backup path.
Therefore, PE3 installs both the primary and backup paths (and their
associated ESI/EVI MPLS labels or local VNIs) for the MAC route M1.
This creates two end-to-end LSPs from PE3 for M1 (one toward PE1 and
one toward PE2), such that when PE3 wants to forward traffic destined
to M1, it can load balance between the two paths. Although route
resolution for Aliasing routes with the same BGP next hop is not
described at this level of detail in [RFC7432], it is expected to
operate as such and is thus clarified here.
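The RD-based resolution just described can be sketched as follows.
This is an illustrative fragment, not normative procedure from
[RFC7432]; the route representation and field names are assumptions
made for the example.

```python
# Hypothetical sketch of alias-path resolution at PE3 for a MAC route
# whose two Aliasing (Ether A-D per EVI) routes share the same BGP
# next hop but carry distinct RDs. The alias route whose RD matches
# the MAC route's RD yields the primary path; a differing RD yields a
# backup path. All names/fields here are illustrative only.

def resolve_paths(mac_route, alias_routes):
    """Return (primary, backups) for mac_route per RD matching."""
    primary, backups = None, []
    for alias in alias_routes:
        if alias["esi"] != mac_route["esi"]:
            continue                      # different Ethernet Segment
        if alias["rd"] == mac_route["rd"]:
            primary = alias               # same RD as the MAC route
        else:
            backups.append(alias)         # different RD -> backup path
    return primary, backups

mac_m1 = {"mac": "M1", "rd": "RD-PE1", "esi": "ES1"}
aliases = [
    {"rd": "RD-PE1", "esi": "ES1", "next_hop": "ASBR2", "label": 100},
    {"rd": "RD-PE2", "esi": "ES1", "next_hop": "ASBR2", "label": 200},
]
primary, backups = resolve_paths(mac_m1, aliases)
# PE3 installs both paths (with their labels/VNIs) and can
# load-balance traffic destined to M1 across them.
```

The key design point is that the RD, not the BGP next hop, is what
disambiguates the two paths once the ASBRs have rewritten the next
hop.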
When the AC between PE2 and the CE fails, PE2 sends NLRI withdrawals
for its Ether A-D per EVI routes. When these withdrawals are
propagated to and received by PE3, PE3 removes the Aliasing route and
updates all the corresponding MAC routes for that EVI to remove the
backup path. This causes the mass-withdraw functionality to operate
at the per-EVI level (instead of per-ES). Mass-withdraw at the
per-EVI level requires more messages than at the per-ES level, so its
convergence time is not as good as that of per-ES mass-withdraw;
however, it is still much better than withdrawing individual MAC
routes.
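The per-EVI fallback can be sketched as below. This is an
illustrative example only (the table layout and helper are
assumptions, not text from this document): a single Aliasing-route
withdrawal prunes the backup path from every MAC route in the EVI,
which is why one message per EVI suffices instead of one per MAC.

```python
# Hypothetical sketch: per-EVI mass-withdraw at PE3. Withdrawing
# PE2's Ether A-D per EVI (Aliasing) route removes the corresponding
# backup path from every MAC route in that EVI in one operation.
# Structures and names are illustrative only.

def withdraw_per_evi(mac_table, withdrawn_rd, evi):
    """Prune paths resolved via the withdrawn aliasing route."""
    for entry in mac_table.values():
        if entry["evi"] != evi:
            continue
        entry["paths"] = [p for p in entry["paths"]
                          if p["rd"] != withdrawn_rd]

mac_table = {
    "M1": {"evi": 10, "paths": [{"rd": "RD-PE1"}, {"rd": "RD-PE2"}]},
    "M2": {"evi": 10, "paths": [{"rd": "RD-PE1"}, {"rd": "RD-PE2"}]},
}
# PE2 withdraws its Aliasing route for EVI 10: one withdrawal updates
# all MAC routes in that EVI, unlike one withdraw per MAC route.
withdraw_per_evi(mac_table, "RD-PE2", 10)
```

This is coarser than per-ES mass-withdraw (one message per EVI rather
than one per Ethernet Segment) but far fewer messages than individual
MAC withdrawals.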
In summary, it can be seen that the aliasing and backup-path
functionality should work as is for inter-AS option B. Furthermore,
in the case of inter-AS option B, the mass-withdraw functionality
falls back from per-ES to per-EVI. If per-ES mass-withdraw
functionality is needed along with backward compatibility, then it is
recommended to use GWs (per section 10.1) instead of ASBRs for DCI.
11 Acknowledgement

The authors would like to thank David Smith, John Mullooly, and
Thomas Nadeau for their valuable comments and feedback. The authors
would also like to thank Jakob Heitz for his contribution on section
10.
12 Security Considerations

This document uses IP-based tunnel technologies to support data plane
transport. Consequently, the security considerations of those tunnel
technologies apply. This document defines support for VXLAN and NVGRE
encapsulations. The security considerations from those documents as
well as [RFC4301] apply to the data plane aspects of this document.
Network Virtualization", draft-ietf-nvo3-overlay-problem-statement-
01, September 2012.

[L3VPN-ENDSYSTEMS] Marques et al., "BGP-signaled End-system IP/VPNs",
draft-ietf-l3vpn-end-system, work in progress, October 2012.

[NOV3-FRWK] Lasserre et al., "Framework for DC Network
Virtualization", draft-ietf-nvo3-framework-01.txt, work in progress,
October 2012.
Contributors
S. Salam K. Patel D. Rao S. Thoria D. Cai Cisco
Y. Rekhter R. Shekhar Wen Lin Nischal Sheth Juniper
L. Yong Huawei
Authors' Addresses

Ali Sajassi
Cisco
Email: sajassi@cisco.com

John Drake
Juniper Networks
Email: jdrake@juniper.net

Nabil Bitar
Verizon Communications
Email: nabil.n.bitar@verizon.com

Aldrin Isaac
Juniper
Email: aisaac@juniper.net

James Uttaro
AT&T
Email: uttaro@att.com

Wim Henderickx
Alcatel-Lucent
Email: wim.henderickx@alcatel-lucent.com
Ravi Shekhar
Juniper Networks
Email: rshekhar@juniper.net

Samer Salam
Cisco
Email: ssalam@cisco.com

Keyur Patel
Cisco
Email: keyupate@cisco.com

Dhananjaya Rao
Cisco
Email: dhrao@cisco.com

Samir Thoria
Cisco
Email: sthoria@cisco.com