draft-ietf-bess-evpn-overlay-00.txt   draft-ietf-bess-evpn-overlay-01.txt 
skipping to change at page 1, line 29 skipping to change at page 1, line 29
Huawei Alcatel-Lucent Huawei Alcatel-Lucent
D. Cai D. Cai
S. Sinha S. Sinha
Cisco Cisco
Wen Lin Wen Lin
Nischal Sheth Nischal Sheth
Juniper Juniper
Expires: May 10, 2015 November 10, 2014 Expires: August 24, 2015 February 24, 2015
A Network Virtualization Overlay Solution using EVPN A Network Virtualization Overlay Solution using EVPN
draft-ietf-bess-evpn-overlay-00 draft-ietf-bess-evpn-overlay-01
Abstract Abstract
This document describes how EVPN can be used as an NVO solution and This document describes how Ethernet VPN (EVPN) [RFC7432] can be used
explores the various tunnel encapsulation options over IP and their as a Network Virtualization Overlay (NVO) solution and explores the
impact on the EVPN control-plane and procedures. In particular, the various tunnel encapsulation options over IP and their impact on the
following encapsulation options are analyzed: MPLS over GRE, VXLAN, EVPN control-plane and procedures. In particular, the following
and NVGRE. encapsulation options are analyzed: VXLAN, NVGRE, and MPLS over GRE.
Status of this Memo Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as other groups may also distribute working documents as
Internet-Drafts. Internet-Drafts.
skipping to change at page 2, line 47 skipping to change at page 2, line 47
3 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . 5 3 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . 5
4 EVPN Features . . . . . . . . . . . . . . . . . . . . . . . . . 6 4 EVPN Features . . . . . . . . . . . . . . . . . . . . . . . . . 6
5 Encapsulation Options for EVPN Overlays . . . . . . . . . . . . 7 5 Encapsulation Options for EVPN Overlays . . . . . . . . . . . . 7
5.1 VXLAN/NVGRE Encapsulation . . . . . . . . . . . . . . . . . 7 5.1 VXLAN/NVGRE Encapsulation . . . . . . . . . . . . . . . . . 7
5.1.1 Virtual Identifiers Scope . . . . . . . . . . . . . . . 8 5.1.1 Virtual Identifiers Scope . . . . . . . . . . . . . . . 8
5.1.1.1 Data Center Interconnect with Gateway . . . . . . . 8 5.1.1.1 Data Center Interconnect with Gateway . . . . . . . 8
5.1.1.2 Data Center Interconnect without Gateway . . . . . . 9 5.1.1.2 Data Center Interconnect without Gateway . . . . . . 9
5.1.2 Virtual Identifiers to EVI Mapping . . . . . . . . . . . 9 5.1.2 Virtual Identifiers to EVI Mapping . . . . . . . . . . . 9
5.1.2.1 Auto Derivation of RT . . . . . . . . . . . . . . . 10 5.1.2.1 Auto Derivation of RT . . . . . . . . . . . . . . . 10
5.1.3 Constructing EVPN BGP Routes . . . . . . . . . . . . . 11 5.1.3 Constructing EVPN BGP Routes . . . . . . . . . . . . . 11
5.2 MPLS over GRE . . . . . . . . . . . . . . . . . . . . . . . 12 5.2 MPLS over GRE . . . . . . . . . . . . . . . . . . . . . . . 13
6 EVPN with Multiple Data Plane Encapsulations . . . . . . . . . 13 6 EVPN with Multiple Data Plane Encapsulations . . . . . . . . . 13
7 NVE Residing in Hypervisor . . . . . . . . . . . . . . . . . . 13 7 NVE Residing in Hypervisor . . . . . . . . . . . . . . . . . . 14
7.1 Impact on EVPN BGP Routes & Attributes for VXLAN/NVGRE 7.1 Impact on EVPN BGP Routes & Attributes for VXLAN/NVGRE
Encapsulation . . . . . . . . . . . . . . . . . . . . . . . 14 Encapsulation . . . . . . . . . . . . . . . . . . . . . . . 14
7.2 Impact on EVPN Procedures for VXLAN/NVGRE Encapsulation . . 14 7.2 Impact on EVPN Procedures for VXLAN/NVGRE Encapsulation . . 15
8 NVE Residing in ToR Switch . . . . . . . . . . . . . . . . . . 15 8 NVE Residing in ToR Switch . . . . . . . . . . . . . . . . . . 15
8.1 EVPN Multi-Homing Features . . . . . . . . . . . . . . . . 15 8.1 EVPN Multi-Homing Features . . . . . . . . . . . . . . . . 16
8.1.1 Multi-homed Ethernet Segment Auto-Discovery . . . . . . 16 8.1.1 Multi-homed Ethernet Segment Auto-Discovery . . . . . . 16
8.1.2 Fast Convergence and Mass Withdraw . . . . . . . . . . . 16 8.1.2 Fast Convergence and Mass Withdraw . . . . . . . . . . . 16
8.1.3 Split-Horizon . . . . . . . . . . . . . . . . . . . . . 16 8.1.3 Split-Horizon . . . . . . . . . . . . . . . . . . . . . 16
8.1.4 Aliasing and Backup-Path . . . . . . . . . . . . . . . . 16 8.1.4 Aliasing and Backup-Path . . . . . . . . . . . . . . . . 17
8.1.5 DF Election . . . . . . . . . . . . . . . . . . . . . . 17 8.1.5 DF Election . . . . . . . . . . . . . . . . . . . . . . 17
8.2 Impact on EVPN BGP Routes & Attributes . . . . . . . . . . . 17 8.2 Impact on EVPN BGP Routes & Attributes . . . . . . . . . . . 18
8.3 Impact on EVPN Procedures . . . . . . . . . . . . . . . . . 17 8.3 Impact on EVPN Procedures . . . . . . . . . . . . . . . . . 18
8.3.1 Split Horizon . . . . . . . . . . . . . . . . . . . . . 18 8.3.1 Split Horizon . . . . . . . . . . . . . . . . . . . . . 18
8.3.2 Aliasing and Backup-Path . . . . . . . . . . . . . . . . 19 8.3.2 Aliasing and Backup-Path . . . . . . . . . . . . . . . . 19
9 Support for Multicast . . . . . . . . . . . . . . . . . . . . . 19 9 Support for Multicast . . . . . . . . . . . . . . . . . . . . . 19
10 Inter-AS . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 10 Inter-AS . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
11 Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 21 11 Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 21
12 Security Considerations . . . . . . . . . . . . . . . . . . . 21 12 Security Considerations . . . . . . . . . . . . . . . . . . . 21
13 IANA Considerations . . . . . . . . . . . . . . . . . . . . . 22 13 IANA Considerations . . . . . . . . . . . . . . . . . . . . . 22
14 References . . . . . . . . . . . . . . . . . . . . . . . . . . 22 14 References . . . . . . . . . . . . . . . . . . . . . . . . . . 22
14.1 Normative References . . . . . . . . . . . . . . . . . . . 22 14.1 Normative References . . . . . . . . . . . . . . . . . . . 22
14.2 Informative References . . . . . . . . . . . . . . . . . . 22 14.2 Informative References . . . . . . . . . . . . . . . . . . 22
skipping to change at page 4, line 35 skipping to change at page 4, line 35
The underlay network for NVO solutions is assumed to provide IP The underlay network for NVO solutions is assumed to provide IP
connectivity between NVO endpoints (NVEs). connectivity between NVO endpoints (NVEs).
This document describes how Ethernet VPN (EVPN) can be used as an NVO This document describes how Ethernet VPN (EVPN) can be used as an NVO
solution and explores applicability of EVPN functions and procedures. solution and explores applicability of EVPN functions and procedures.
In particular, it describes the various tunnel encapsulation options In particular, it describes the various tunnel encapsulation options
for EVPN over IP, and their impact on the EVPN control-plane and for EVPN over IP, and their impact on the EVPN control-plane and
procedures for two main scenarios: procedures for two main scenarios:
a) when the NVE resides in the hypervisor, and a) when the NVE resides in the hypervisor, and
b) when the NVE resides in a ToR device b) when the NVE resides in a Top of Rack (ToR) device
Note that the use of EVPN as an NVO solution does not necessarily Note that the use of EVPN as an NVO solution does not necessarily
mandate that the BGP control-plane be running on the NVE. For such mandate that the BGP control-plane be running on the NVE. For such
scenarios, it is still possible to leverage the EVPN solution by scenarios, it is still possible to leverage the EVPN solution by
using XMPP, or alternative mechanisms, to extend the control-plane to using XMPP, or alternative mechanisms, to extend the control-plane to
the NVE as discussed in [L3VPN-ENDSYSTEMS]. the NVE as discussed in [L3VPN-ENDSYSTEMS].
The possible encapsulation options for EVPN overlays that are The possible encapsulation options for EVPN overlays that are
analyzed in this document are: analyzed in this document are:
skipping to change at page 5, line 18 skipping to change at page 5, line 18
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. document are to be interpreted as described in [RFC2119].
3 Terminology 3 Terminology
NVO: Network Virtualization Overlay NVO: Network Virtualization Overlay
NVE: Network Virtualization Endpoint NVE: Network Virtualization Endpoint
VNI: Virtual Network Identifier (for VxLAN) VNI: Virtual Network Identifier (for VXLAN)
VSID: Virtual Subnet Identifier (for NVGRE) VSID: Virtual Subnet Identifier (for NVGRE)
EVPN: Ethernet VPN EVPN: Ethernet VPN
EVI: An EVPN instance spanning across the PEs participating in that EVI: An EVPN instance spanning the Provider Edge (PE) devices
EVPN participating in that EVPN.
MAC-VRF: A Virtual Routing and Forwarding table for MAC addresses on MAC-VRF: A Virtual Routing and Forwarding table for Media Access
a PE for an EVI Control (MAC) addresses on a PE.
Ethernet Segment Identifier (ESI): If a CE is multi-homed to two or Ethernet Segment (ES): When a customer site (device or network) is
more PEs, the set of Ethernet links that attaches the CE to the PEs connected to one or more PEs via a set of Ethernet links, then that
is an 'Ethernet segment'. Ethernet segments MUST have a unique non- set of links is referred to as an 'Ethernet segment'.
zero identifier, the 'Ethernet Segment Identifier'.
Ethernet Tag: An Ethernet Tag identifies a particular broadcast Ethernet Segment Identifier (ESI): A unique non-zero identifier that
identifies an Ethernet segment is called an 'Ethernet Segment
Identifier'.
Ethernet Tag: An Ethernet tag identifies a particular broadcast
domain, e.g., a VLAN. An EVPN instance consists of one or more domain, e.g., a VLAN. An EVPN instance consists of one or more
broadcast domains. Ethernet tag(s) are assigned to the broadcast broadcast domains.
domains of a given EVPN instance by the provider of that EVPN, and
each PE in that EVPN instance performs a mapping between broadcast
domain identifier(s) understood by each of its attached CEs and the
corresponding Ethernet tag.
Single-Active Multihoming: When a device or a network is multihomed PE: Provider Edge device.
to a group of two or more PEs and when only a single PE in such a
redundancy group can forward traffic to/from the multihomed device or
network for a given VLAN, such multihoming is referred to as "Single-
Active"
All-Active Multihoming: When a device is multihomed to a group of two Single-Active Redundancy Mode: When only a single PE, among all the
or more PEs and when all PEs in such redundancy group can forward PEs attached to an Ethernet segment, is allowed to forward traffic
traffic to/from the multihomed device or network for a given VLAN, to/from that Ethernet segment for a given VLAN, then the Ethernet
such multihoming is referred to as "All-Active". segment is defined to be operating in Single-Active redundancy mode.
All-Active Redundancy Mode: When all PEs attached to an Ethernet
segment are allowed to forward known unicast traffic to/from that
Ethernet segment for a given VLAN, then the Ethernet segment is
defined to be operating in All-Active redundancy mode.
4 EVPN Features 4 EVPN Features
EVPN was originally designed to support the requirements detailed in EVPN was originally designed to support the requirements detailed in
[EVPN-REQ] and therefore has the following attributes which directly [RFC7209] and therefore has the following attributes which directly
address control plane scaling and ease of deployment issues. address control plane scaling and ease of deployment issues.
1) Control plane traffic is distributed with BGP and Broadcast and 1) Control plane traffic is distributed with BGP and Broadcast and
Multicast traffic is sent using a shared multicast tree or with Multicast traffic is sent using a shared multicast tree or with
ingress replication. ingress replication.
2) Control plane learning is used for MAC (and IP) addresses instead 2) Control plane learning is used for MAC (and IP) addresses instead
of data plane learning. The latter requires the flooding of unknown of data plane learning. The latter requires the flooding of unknown
unicast and ARP frames; whereas, the former does not require any unicast and ARP frames; whereas, the former does not require any
flooding. flooding.
3) Route Reflector is used to reduce a full mesh of BGP sessions 3) Route Reflector is used to reduce a full mesh of BGP sessions
among PE devices to a single BGP session between a PE and the RR. among PE devices to a single BGP session between a PE and the RR.
Furthermore, RR hierarchy can be leveraged to scale the number BGP Furthermore, RR hierarchy can be leveraged to scale the number of BGP
routes on the RR. routes on the RR.
4) Auto-discovery via BGP is used to discover PE devices 4) Auto-discovery via BGP is used to discover PE devices
participating in a given VPN, PE devices participating in a given participating in a given VPN, PE devices participating in a given
redundancy group, tunnel encapsulation types, multicast tunnel type, redundancy group, tunnel encapsulation types, multicast tunnel type,
multicast members, etc. multicast members, etc.
5) All-Active multihoming is used. This allows a given customer 5) All-Active multihoming is used. This allows a given customer
device (CE) to have multiple links to multiple PEs, and traffic device (CE) to have multiple links to multiple PEs, and traffic
to/from that CE fully utilizes all of these links. This set of links to/from that CE fully utilizes all of these links. This set of links
skipping to change at page 6, line 49 skipping to change at page 6, line 49
notified of the failure via the withdrawal of a single EVPN route. notified of the failure via the withdrawal of a single EVPN route.
This allows those PEs to remove the withdrawing PE as a next hop for This allows those PEs to remove the withdrawing PE as a next hop for
every MAC address associated with the failed link. This is termed every MAC address associated with the failed link. This is termed
'mass withdrawal'. 'mass withdrawal'.
7) BGP route filtering and constrained route distribution are 7) BGP route filtering and constrained route distribution are
leveraged to ensure that the control plane traffic for a given EVI is leveraged to ensure that the control plane traffic for a given EVI is
only distributed to the PEs in that EVI. only distributed to the PEs in that EVI.
8) When an 802.1Q interface is used between a CE and a PE, each of the 8) When an 802.1Q interface is used between a CE and a PE, each of the
VLAN ID (VID) on that interface can be mapped onto a bridge domain VLAN ID (VID) on that interface can be mapped onto a bridge table
(for up to 4094 such bridge domains). All these bridge domains can (for up to 4094 such bridge tables). All these bridge tables may be
also be mapped onto a single EVI (in case of VLAN-aware bundle mapped onto a single MAC-VRF (in case of VLAN-aware bundle service).
service).
9) VM Mobility mechanisms ensure that all PEs in a given EVI know 9) VM Mobility mechanisms ensure that all PEs in a given EVI know
the ES with which a given VM, as identified by its MAC and IP the ES with which a given VM, as identified by its MAC and IP
addresses, is currently associated. addresses, is currently associated.
10) Route Targets are used to allow the operator (or customer) to 10) Route Targets are used to allow the operator (or customer) to
define a spectrum of logical network topologies including mesh, hub & define a spectrum of logical network topologies including mesh, hub &
spoke, and extranets (e.g., a VPN whose sites are owned by different spoke, and extranets (e.g., a VPN whose sites are owned by different
enterprises), without the need for proprietary software or the aid of enterprises), without the need for proprietary software or the aid of
other virtual or physical devices. other virtual or physical devices.
skipping to change at page 7, line 28 skipping to change at page 7, line 28
plane for NVO are extremely important. EVPN and the extensions plane for NVO are extremely important. EVPN and the extensions
described herein, are designed with this level of scalability in described herein, are designed with this level of scalability in
mind. mind.
5 Encapsulation Options for EVPN Overlays 5 Encapsulation Options for EVPN Overlays
5.1 VXLAN/NVGRE Encapsulation 5.1 VXLAN/NVGRE Encapsulation
Both VXLAN and NVGRE are examples of technologies that provide a data Both VXLAN and NVGRE are examples of technologies that provide a data
plane encapsulation which is used to transport a packet over the plane encapsulation which is used to transport a packet over the
common physical IP infrastructure between NVEs, VXLAN Tunnel End common physical IP infrastructure between VXLAN Tunnel End Points
Point (VTEPs) in VXLAN and Network Virtualization Endpoint (NVEs) in (VTEPs) in VXLAN network and Network Virtualization Endpoints (NVEs)
NVGRE. Both of these technologies include the identifier of the in NVGRE network. Both of these technologies include the identifier
specific NVO instance, Virtual Network Identifier (VNI) in VXLAN and of the specific NVO instance, Virtual Network Identifier (VNI) in
Virtual Subnet Identifier (VSID), NVGRE, in each packet. VXLAN and Virtual Subnet Identifier (VSID) in NVGRE, in each packet.
Note that a Provider Edge (PE) is equivalent to a VTEP/NVE. Note that a Provider Edge (PE) is equivalent to a VTEP/NVE.
[VXLAN] encapsulation is based on UDP, with an 8-byte header VXLAN encapsulation is based on UDP, with an 8-byte header following
following the UDP header. VXLAN provides a 24-bit VNI, which the UDP header. VXLAN provides a 24-bit VNI, which typically provides
typically provides a one-to-one mapping to the tenant VLAN ID, as a one-to-one mapping to the tenant VLAN ID, as described in
described in [VXLAN]. In this scenario, the VTEP does not include an [RFC7348]. In this scenario, the ingress VTEP does not include an
inner VLAN tag on frame encapsulation, and discards decapsulated inner VLAN tag on the encapsulated frame, and the egress VTEP
frames with an inner VLAN tag. This mode of operation in [VXLAN] maps discards the frames with an inner VLAN tag. This mode of operation in
to VLAN Based Service in [EVPN], where a tenant VLAN ID gets mapped [RFC7348] maps to VLAN Based Service in [RFC7432], where a tenant
to an EVPN instance (EVI). VLAN ID gets mapped to an EVPN instance (EVI).
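As an illustration of the 8-byte header mentioned above, the following Python sketch packs and unpacks a VXLAN header using the layout given in [RFC7348]: 8 flag bits with the I (VNI valid) bit set, 24 reserved bits, the 24-bit VNI, and 8 reserved bits. The function names are hypothetical and do not appear in either draft revision.

```python
import struct

# "I" flag from RFC 7348: set when the VNI field is valid.
VXLAN_FLAG_VNI_VALID = 0x08


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header that follows the UDP header."""
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    word0 = VXLAN_FLAG_VNI_VALID << 24  # flags (8 bits) + reserved (24 bits)
    word1 = vni << 8                    # VNI (24 bits) + reserved (8 bits)
    return struct.pack("!II", word0, word1)


def unpack_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN header."""
    word0, word1 = struct.unpack("!II", header[:8])
    if not (word0 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return word1 >> 8
```

In the VLAN Based mode described above, the frame that follows this header would carry no inner VLAN tag, so the VNI alone selects the EVI at the egress VTEP.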
[VXLAN] also provides an option of including an inner VLAN tag in the VXLAN also provides an option of including an inner VLAN tag in the
encapsulated frame, if explicitly configured at the VTEP. This mode encapsulated frame, if explicitly configured at the VTEP. This mode
of operation can either map to VLAN Based Service or VLAN Bundle of operation can map to VLAN Bundle Service in [RFC7432] because all
Service in [EVPN] because inner VLAN tag is not used for lookup by the tenant's tagged frames map to a single bridge table / MAC-VRF,
the disposition PE when performing VXLAN decapsulation as described and the inner VLAN tag is not used for lookup by the disposition PE
in section 6 of [VXLAN]. when performing VXLAN decapsulation as described in section 6 of
[RFC7348].
[NVGRE] encapsulation is based on [GRE] and it mandates the inclusion [NVGRE] encapsulation is based on [GRE] and it mandates the inclusion
of the optional GRE Key field which carries the VSID. There is a one- of the optional GRE Key field which carries the VSID. There is a one-
to-one mapping between the VSID and the tenant VLAN ID, as described to-one mapping between the VSID and the tenant VLAN ID, as described
in [NVGRE] and the inclusion of an inner VLAN tag is prohibited. This in [NVGRE] and the inclusion of an inner VLAN tag is prohibited. This
mode of operation in [NVGRE] maps to VLAN Based Service in [EVPN]. mode of operation in [NVGRE] maps to VLAN Based Service in
[RFC7432].
As described in the next section there is no change to the encoding As described in the next section there is no change to the encoding
of EVPN routes to support VXLAN or NVGRE encapsulation except for the of EVPN routes to support VXLAN or NVGRE encapsulation except for the
use of BGP Encapsulation extended community. However, there is use of BGP Encapsulation extended community. However, there is
potential impact to the EVPN procedures depending on where the NVE is potential impact to the EVPN procedures depending on where the NVE is
located (i.e., in hypervisor or TOR) and whether multi-homing located (i.e., in hypervisor or TOR) and whether multi-homing
capabilities are required. capabilities are required.
5.1.1 Virtual Identifiers Scope 5.1.1 Virtual Identifiers Scope
skipping to change at page 9, line 41 skipping to change at page 10, line 8
5.1.2 Virtual Identifiers to EVI Mapping 5.1.2 Virtual Identifiers to EVI Mapping
When the EVPN control plane is used in conjunction with VXLAN or When the EVPN control plane is used in conjunction with VXLAN or
NVGRE, two options for mapping the VXLAN VNI or NVGRE VSID to an EVI NVGRE, two options for mapping the VXLAN VNI or NVGRE VSID to an EVI
are possible: are possible:
1. Option 1: Single Subnet per EVI 1. Option 1: Single Subnet per EVI
In this option, a single subnet represented by a VNI or VSID is In this option, a single subnet represented by a VNI or VSID is
mapped to a unique EVI. As such, a BGP RD and RT is needed per VNI / mapped to a unique EVI. This corresponds to the VLAN Based service in
VSID on every VTEP. The advantage of this model is that it allows the [RFC7432], where a tenant VLAN ID gets mapped to an EVPN instance
BGP RT constraint mechanisms to be used in order to limit the (EVI). As such, a BGP RD and RT is needed per VNI / VSID on every
propagation and import of routes to only the VTEPs that are VTEP. The advantage of this model is that it allows the BGP RT
interested in a given VNI or VSID. The disadvantage of this model may constraint mechanisms to be used in order to limit the propagation
be the provisioning overhead if RD and RT are not derived and import of routes to only the VTEPs that are interested in a given
automatically from VNI or VSID. VNI or VSID. The disadvantage of this model may be the provisioning
overhead if RD and RT are not derived automatically from VNI or
VSID.
In this option, the MAC-VRF table is identified by the RT in the In this option, the MAC-VRF table is identified by the RT in the
control plane and by the VNI or VSID for the data-plane. In this control plane and by the VNI or VSID in the data-plane. In this
option, the specific MAC-VRF table corresponds to only a single option, the specific MAC-VRF table corresponds to only a single
bridge domain (e.g., a single subnet). bridge table.
2. Option 2: Multiple Subnets per EVI 2. Option 2: Multiple Subnets per EVI
In this option, multiple subnets each represented by a unique VNI or In this option, multiple subnets each represented by a unique VNI or
VSID are mapped to a unique EVI. For example, if a tenant has VSID are mapped to a single EVI. For example, if a tenant has
multiple segments/subnets each represented by a VNI or VSID, then all multiple segments/subnets each represented by a VNI or VSID, then all
the VNIs (or VSIDs) for that tenant are mapped to a single EVI - the VNIs (or VSIDs) for that tenant are mapped to a single EVI -
e.g., the EVI in this case represents the tenant and not a subnet. e.g., the EVI in this case represents the tenant and not a subnet.
The advantage of this model is that it doesn't require the This corresponds to the VLAN-Aware Bundle service in [RFC7432]. The
provisioning of RD/RT per VNI or VSID. However, this is a moot point advantage of this model is that it doesn't require the provisioning
if option 1 with if auto-derivation is used. The disadvantage of this of RD/RT per VNI or VSID. However, this is a moot point if option 1
model is that routes would be imported by VTEPs that may not be with auto-derivation is used. The disadvantage of this model is that
interested in a given VNI or VSID. routes would be imported by VTEPs that may not be interested in a
given VNI or VSID.
In this option the MAC-VRF table is identified by the RT in the In this option the MAC-VRF table is identified by the RT in the
control plane and a specific bridge domain for that MAC-VRF is control plane and a specific bridge table for that MAC-VRF is
identified by the <RT, Ethernet Tag ID> in the control plane. In this identified by the <RT, Ethernet Tag ID> in the control plane. In this
option, the VNI/VSID in the data-plane is sufficient to identify a option, the VNI/VSID in the data-plane is sufficient to identify a
specific bridge domain - e.g., no need to do a lookup based on specific bridge table - e.g., no need to do a lookup based on
VNI/VSID field and Ethernet Tag ID fields to identify a bridge VNI/VSID and Ethernet Tag ID fields to identify a bridge table.
domain.
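The two mapping options can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the class names, the RT strings, and the VNI values are hypothetical. It shows that the control plane keys a bridge table by <RT, Ethernet Tag ID>, while the data plane needs only the VNI/VSID.

```python
from dataclasses import dataclass, field


@dataclass
class BridgeTable:
    vni: int            # VNI (or VSID) carried in the data plane
    ethernet_tag: int   # Ethernet Tag ID advertised in the control plane
    macs: dict = field(default_factory=dict)


@dataclass
class MacVrf:
    route_target: str   # hypothetical RT string, e.g. "65000:1"
    bridge_tables: list


class Nve:
    """Hypothetical NVE holding both lookup indexes."""

    def __init__(self):
        self.by_vni = {}     # data-plane index: VNI -> bridge table
        self.by_rt_tag = {}  # control-plane index: (RT, tag) -> bridge table

    def add_mac_vrf(self, vrf: MacVrf) -> None:
        for table in vrf.bridge_tables:
            self.by_vni[table.vni] = table
            self.by_rt_tag[(vrf.route_target, table.ethernet_tag)] = table

    def lookup_data_plane(self, vni: int) -> BridgeTable:
        # The VNI alone is sufficient; no Ethernet Tag lookup is needed.
        return self.by_vni[vni]

    def lookup_control_plane(self, rt: str, tag: int) -> BridgeTable:
        return self.by_rt_tag[(rt, tag)]


nve = Nve()
# Option 1: one subnet per EVI, so the MAC-VRF has a single bridge table.
nve.add_mac_vrf(MacVrf("65000:10010", [BridgeTable(10010, 0)]))
# Option 2: one EVI per tenant, with one bridge table per subnet.
nve.add_mac_vrf(MacVrf("65000:1", [BridgeTable(20010, 20010),
                                   BridgeTable(20020, 20020)]))
```

With option 2, every VTEP that imports "65000:1" receives routes for both subnets, which is the import overhead noted above.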
5.1.2.1 Auto Derivation of RT 5.1.2.1 Auto Derivation of RT
When the option of a single VNI or VSID per EVI is used, it is When the option of a single VNI or VSID per EVI is used, it is
important to auto-derive RT for EVPN BGP routes in order to simplify important to auto-derive RT for EVPN BGP routes in order to simplify
configuration for data center operations. RD can be derived easily as configuration for data center operations. RD can be derived easily as
described in [EVPN] and RT can be auto-derived as described next. described in [RFC7432] and RT can be auto-derived as described next.
Since a gateway PE as depicted in figure-1 participates in both the Since a gateway PE as depicted in figure-1 participates in both the
DCN and WAN BGP sessions, it is important that when RT values are DCN and WAN BGP sessions, it is important that when RT values are
auto-derived for VNIs (or VSIDs), there is no conflict in RT spaces auto-derived for VNIs (or VSIDs), there is no conflict in RT spaces
between DCN and WAN networks assuming that both are operating within between DCN and WAN networks assuming that both are operating within
the same AS. Also, there can be scenarios where both VXLAN and NVGRE the same AS. Also, there can be scenarios where both VXLAN and NVGRE
encapsulations may be needed within the same DCN and their encapsulations may be needed within the same DCN and their
corresponding VNIs and VSIDs are administered independently which corresponding VNIs and VSIDs are administered independently which
means VNI and VSID spaces can overlap. In order to ensure that no means VNI and VSID spaces can overlap. In order to ensure that no
such conflict in RT spaces arises, RT values for DCNs are auto- such conflict in RT spaces arises, RT values for DCNs are auto-
skipping to change at page 11, line 46 skipping to change at page 12, line 16
the ingress PE. This label is used upon receipt of that packet by the the ingress PE. This label is used upon receipt of that packet by the
egress PE for disposition of that packet. This is very similar to the egress PE for disposition of that packet. This is very similar to the
use of the VNI or VSID by the egress VTEP or NVE, respectively, with use of the VNI or VSID by the egress VTEP or NVE, respectively, with
the difference being that an MPLS label has local significance while the difference being that an MPLS label has local significance while
a VNI or VSID typically has global significance. Accordingly, and a VNI or VSID typically has global significance. Accordingly, and
specifically to support the option of locally assigned VNIs, the MPLS specifically to support the option of locally assigned VNIs, the MPLS
label field in the MAC Advertisement, Ethernet AD per EVI, and label field in the MAC Advertisement, Ethernet AD per EVI, and
Inclusive Multicast Ethernet Tag routes is used to carry the VNI or Inclusive Multicast Ethernet Tag routes is used to carry the VNI or
VSID. For the balance of this memo, the MPLS label field will be VSID. For the balance of this memo, the MPLS label field will be
referred to as the VNI/VSID field. The VNI/VSID field is used for referred to as the VNI/VSID field. The VNI/VSID field is used for
both locally and globally assigned VNIs or VSIDs. both local and global VNIs/VSIDs, and for either case the entire 24-
bit field is used to encode the VNI/VSID value.
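A minimal Python sketch of the encoding described in the -01 text, where the whole 3-octet field carries the 24-bit VNI/VSID. The helper names are hypothetical.

```python
def encode_vni_field(vni: int) -> bytes:
    """Encode a VNI/VSID into the 3-octet field, using all 24 bits."""
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI/VSID must fit in 24 bits")
    return vni.to_bytes(3, "big")


def decode_vni_field(raw: bytes) -> int:
    """Decode the 3-octet VNI/VSID field back into an integer."""
    if len(raw) != 3:
        raise ValueError("VNI/VSID field must be exactly 3 octets")
    return int.from_bytes(raw, "big")
```

The same helpers would apply to locally and globally assigned values, since the text makes no encoding distinction between them.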
For the VNI based mode (a single VNI per EVI), the Ethernet Tag field For the VLAN based mode (a single VNI per MAC-VRF), the Ethernet Tag
in the MAC Advertisement, Ethernet AD per EVI, and Inclusive field in the MAC/IP Advertisement, Ethernet AD per EVI, and Inclusive
Multicast route MUST be set to zero just as in the VLAN Based service Multicast route MUST be set to zero just as in the VLAN Based service
in [EVPN]. For the VNI bundle mode (multiple VNIs per EVI with a in [RFC7432].
single bridge domain), the Ethernet Tag field in the MAC
Advertisement, Ethernet AD per EVI, and Inclusive Multicast Ethernet
Tag routes MUST be set to zero just as in the VLAN Bundle service in
[EVPN].
For the VNI-aware bundle mode (multiple VNIs per EVI each with its For the VNI-aware bundle mode (multiple VNIs per MAC-VRF with each
own bridge domain), the Ethernet Tag field in the MAC Advertisement, VNI associated with its own bridge table), the Ethernet Tag field in
Ethernet AD per EVI, and Inclusive Multicast route MUST identify a the MAC Advertisement, Ethernet AD per EVI, and Inclusive Multicast
bridge domain within an EVI and the set of Ethernet Tags for that EVI route MUST identify a bridge table within a MAC-VRF and the set of
needs to be configured consistently on all PEs within that EVI. The Ethernet Tags for that EVI needs to be configured consistently on all
value advertised in the Ethernet Tag field MAY be a VNI as long as it PEs within that EVI. For local VNIs, the value advertised in the
matches the existing semantics of the Ethernet Tag, i.e., it Ethernet Tag field MUST be set to a VID just as in the VLAN-aware
identifies a bridge domain within an EVI and the set of VNIs are bundle service in [RFC7432]. Such setting must be done consistently
configured consistently on each PE in that EVI. on all PE devices participating in that EVI within a given domain.
For global VNIs, the value advertised in the Ethernet Tag field
SHOULD be set to a VNI as long as it matches the existing semantics
of the Ethernet Tag, i.e., it identifies a bridge table within a MAC-
VRF and the set of VNIs are configured consistently on each PE in
that EVI. It should be noted that if within a single domain, a mix of
local and global VNIs are used for the same VLAN-aware bundle
service, then the Ethernet Tag field in the EVPN BGP route
advertisements SHALL be set to a VID.
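The Ethernet Tag rules in the right-hand (-01) column can be summarized in a short Python sketch. The enum, the function, and its parameter names are hypothetical, and the sketch follows the SHOULD for global VNIs as if it were always honored.

```python
from enum import Enum


class ServiceMode(Enum):
    VLAN_BASED = "vlan-based"              # a single VNI per MAC-VRF
    VNI_AWARE_BUNDLE = "vni-aware-bundle"  # multiple VNIs per MAC-VRF


def ethernet_tag(mode: ServiceMode, vid: int, vni: int,
                 vni_is_global: bool,
                 mixed_scope_in_domain: bool = False) -> int:
    """Pick the Ethernet Tag value to advertise in an EVPN route."""
    if mode is ServiceMode.VLAN_BASED:
        return 0        # MUST be zero, as in the VLAN Based service
    if mixed_scope_in_domain or not vni_is_global:
        return vid      # local VNIs, or a mix of local and global: use the VID
    return vni          # global VNIs only: the VNI SHOULD be used
```

Whichever value is chosen, it has to be configured consistently on all PEs in the EVI, which this sketch does not check.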
In order to indicate which type of data plane encapsulation In order to indicate which type of data plane encapsulation
(i.e., VXLAN, NVGRE, MPLS, or MPLS in GRE) is to be used, the BGP (i.e., VXLAN, NVGRE, MPLS, or MPLS in GRE) is to be used, the BGP
Encapsulation extended community defined in [RFC5512] is included Encapsulation extended community defined in [RFC5512] is included
with all EVPN routes (i.e. MAC Advertisement, Ethernet AD per EVI, with all EVPN routes (i.e. MAC Advertisement, Ethernet AD per EVI,
Ethernet AD per ESI, Inclusive Multicast Ethernet Tag, and Ethernet Ethernet AD per ESI, Inclusive Multicast Ethernet Tag, and Ethernet
Segment) advertised by an egress PE. Four new values will be defined Segment) advertised by an egress PE. Five new values have been
to extend the list of encapsulation types defined in [RFC5512]: assigned by IANA to extend the list of encapsulation types defined in
[RFC5512]:
+ TBD (IANA assigned) - VXLAN Encapsulation + 8 - VXLAN Encapsulation
+ TBD (IANA assigned) - NVGRE Encapsulation + 9 - NVGRE Encapsulation
+ TBD (IANA assigned) - MPLS Encapsulation + 10 - MPLS Encapsulation
+ TBD (IANA assigned) - MPLS in GRE Encapsulation + 11 - MPLS in GRE Encapsulation
+ 12 - VXLAN GPE Encapsulation
If the BGP Encapsulation extended community is not present, then the If the BGP Encapsulation extended community is not present, then the
default MPLS encapsulation or a statically configured encapsulation default MPLS encapsulation or a statically configured encapsulation
is assumed. is assumed.
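As a minimal sketch of how a receiver might interpret these values, the following maps the tunnel type code points listed in the -01 revision to names and applies the default when no BGP Encapsulation extended community is attached. The names TunnelType and advertised_encapsulations, and the representation of a route's communities as a plain list of integers, are assumptions made for illustration.

```python
from enum import IntEnum


class TunnelType(IntEnum):
    # Code points as listed in the -01 revision of the draft.
    VXLAN = 8
    NVGRE = 9
    MPLS = 10
    MPLS_IN_GRE = 11
    VXLAN_GPE = 12


def advertised_encapsulations(encap_values, configured_default=TunnelType.MPLS):
    """Return the set of encapsulations a route is taken to advertise.

    encap_values: tunnel type integers taken from the BGP Encapsulation
    extended communities on the route (empty if none are present).
    configured_default: MPLS unless another encapsulation is statically
    configured.
    """
    if not encap_values:
        return {configured_default}
    return {TunnelType(v) for v in encap_values}
```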
The Ethernet Segment and Ethernet AD per ESI routes MAY be advertised
with multiple encapsulation types as long as they use the same EVPN
multi-homing procedures - e.g., the mix of VXLAN and NVGRE
encapsulation types is a valid one but not the mix of VXLAN and MPLS
encapsulation types.
The Next Hop field of the MP_REACH_NLRI attribute of the route MUST The Next Hop field of the MP_REACH_NLRI attribute of the route MUST
be set to the IPv4 or IPv6 address of the NVE. The remaining fields be set to the IPv4 or IPv6 address of the NVE. The remaining fields
in each route are set as per [EVPN]. in each route are set as per [RFC7432].
5.2 MPLS over GRE 5.2 MPLS over GRE
The EVPN data-plane is modeled as an EVPN MPLS client layer sitting The EVPN data-plane is modeled as an EVPN MPLS client layer sitting
over an MPLS PSN tunnel. Some of the EVPN functions (split-horizon, over an MPLS PSN tunnel. Some of the EVPN functions (split-horizon,
aliasing and repair-path) are tied to the MPLS client layer. If MPLS aliasing, and backup-path) are tied to the MPLS client layer. If MPLS
over GRE encapsulation is used, then the EVPN MPLS client layer can over GRE encapsulation is used, then the EVPN MPLS client layer can
be carried over an IP PSN tunnel transparently. Therefore, there is be carried over an IP PSN tunnel transparently. Therefore, there is
no impact to the EVPN procedures and associated data-plane no impact to the EVPN procedures and associated data-plane
operation. operation.
The existing standards for MPLS over GRE encapsulation as defined by The existing standards for MPLS over GRE encapsulation as defined by
[RFC4023] can be used for this purpose; however, when it is used in [RFC4023] can be used for this purpose; however, when it is used in
conjunction with EVPN the key field SHOULD be present, and SHOULD be conjunction with EVPN the key field SHOULD be present, and SHOULD be
used to provide a 32-bit entropy field. The Checksum and Sequence used to provide a 32-bit entropy field. The Checksum and Sequence
Number fields are not needed and their corresponding C and S bits Number fields are not needed and their corresponding C and S bits
MUST be set to zero. MUST be set to zero.
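The GRE header that results from these rules can be sketched as follows. The bit positions follow the GRE header layout with the Key extension (C, K and S flags in the first 16 bits, then the protocol type, then the 32-bit key). Deriving the entropy value from a CRC32 over flow-identifying bytes is an assumption made for illustration, as is the function name; the draft does not specify how the entropy is computed.

```python
import struct
import zlib

GRE_FLAG_CHECKSUM = 0x8000  # C bit, MUST be zero here
GRE_FLAG_KEY = 0x2000       # K bit, set because the key field is present
GRE_FLAG_SEQUENCE = 0x1000  # S bit, MUST be zero here
ETHERTYPE_MPLS_UNICAST = 0x8847


def gre_header_for_mpls(flow_bytes: bytes) -> bytes:
    """Build an 8-byte GRE header carrying MPLS with a 32-bit entropy key.

    flow_bytes: bytes identifying the flow (hypothetical input, for
    example selected fields of the inner frame).
    """
    entropy = zlib.crc32(flow_bytes) & 0xFFFFFFFF
    # Only the K bit is set; C and S stay zero, so no Checksum or
    # Sequence Number field follows the protocol type.
    return struct.pack("!HHI", GRE_FLAG_KEY, ETHERTYPE_MPLS_UNICAST, entropy)
```

Packets of the same flow produce the same key, so transit routers that hash on the key keep a flow on one path while spreading different flows across paths.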
6 EVPN with Multiple Data Plane Encapsulations 6 EVPN with Multiple Data Plane Encapsulations
The use of the BGP Encapsulation extended community allows each PE in The use of the BGP Encapsulation extended community allows each PE in
a given EVI to know each of the encapsulations supported by each of a given EVI to know each of the encapsulations supported by each of
the other PEs in that EVI. I.e., each of the PEs in a given EVI may the other PEs in that EVI. I.e., each of the PEs in a given EVI may
support multiple data plane encapsulations. An ingress PE can send a support multiple data plane encapsulations. An ingress PE can send a
frame to an egress PE only if the set of encapsulations advertised by frame to an egress PE only if the set of encapsulations advertised by
the egress PE in the subject MAC Advertisement or Per EVI Ethernet AD the egress PE in the subject MAC/IP Advertisement or per EVI Ethernet
route, forms a non-empty intersection with the set of encapsulations AD route, forms a non-empty intersection with the set of
supported by the ingress PE, and it is at the discretion of the
ingress PE which encapsulation to choose from this intersection.
(As noted in section 5.1.3, if the BGP Encapsulation extended
community is not present, then the default MPLS encapsulation or a
statically configured encapsulation is assumed.)
If BGP Encapsulation extended community is not present, then the
default MPLS encapsulation (or statically configured encapsulation)
is used. However, if this attribute is present, then an ingress PE
can send a frame to an egress PE only if the set of encapsulations
advertised by the egress PE in the subject MAC Advertisement or Per
EVI Ethernet AD route, forms a non-empty intersection with the set of
encapsulations supported by the ingress PE, and it is at the encapsulations supported by the ingress PE, and it is at the
discretion of the ingress PE which encapsulation to choose from this discretion of the ingress PE which encapsulation to choose from this
intersection. intersection. (As noted in section 5.1.3, if the BGP Encapsulation
extended community is not present, then the default MPLS
encapsulation or a statically configured encapsulation is assumed.)
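The selection rule can be illustrated with a small sketch. The function name and the idea of expressing the ingress PE's discretion as an ordered preference list are assumptions; the draft leaves the choice within the intersection entirely to the ingress PE.

```python
MPLS = 10  # tunnel type value for MPLS encapsulation


def choose_encapsulation(egress_advertised, ingress_preference, default=MPLS):
    """Return the encapsulation an ingress PE would use, or None.

    egress_advertised: tunnel types advertised by the egress PE on the
    route in question; empty means no Encapsulation extended community.
    ingress_preference: tunnel types the ingress PE supports, ordered
    from most to least preferred (a local policy assumption).
    """
    egress = set(egress_advertised) if egress_advertised else {default}
    for encap in ingress_preference:
        if encap in egress:
            return encap
    # Empty intersection: the ingress PE cannot send to this egress PE.
    return None
```

A None result corresponds to the misconfiguration case discussed later in this section, where no common encapsulation exists.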
An ingress node that uses shared multicast trees for sending An ingress node that uses shared multicast trees for sending
broadcast or multicast frames MUST maintain distinct trees for each broadcast or multicast frames MUST maintain distinct trees for each
different encapsulation type. different encapsulation type.
It is the responsibility of the operator of a given EVI to ensure It is the responsibility of the operator of a given EVI to ensure
that all of the PEs in that EVI support at least one common that all of the PEs in that EVI support at least one common
encapsulation. If this condition is violated, it could result in encapsulation. If this condition is violated, it could result in
service disruption or failure. The use of the BGP Encapsulation service disruption or failure. The use of the BGP Encapsulation
extended community provides a method to detect when this condition is extended community provides a method to detect when this condition is
skipping to change at page 14, line 19 skipping to change at page 14, line 39
hosting the VM, and need not be "visible" to any other PEs, and thus hosting the VM, and need not be "visible" to any other PEs, and thus
does not require any specific protocol mechanisms. The most common does not require any specific protocol mechanisms. The most common
case of this is when the NVE resides in the hypervisor. case of this is when the NVE resides in the hypervisor.
In the sub-sections that follow, we will discuss the impact on EVPN In the sub-sections that follow, we will discuss the impact on EVPN
procedures for the case when the NVE resides on the hypervisor and procedures for the case when the NVE resides on the hypervisor and
the VXLAN or NVGRE encapsulation is used. the VXLAN or NVGRE encapsulation is used.
7.1 Impact on EVPN BGP Routes & Attributes for VXLAN/NVGRE Encapsulation 7.1 Impact on EVPN BGP Routes & Attributes for VXLAN/NVGRE Encapsulation
When the VXLAN VNI or NVGRE VSID is assumed to be a global value, one In the scenario where all data centers are under a single
might question the need for the Route Distinguisher (RD) in the EVPN
routes. In the scenario where all data centers are under a single
administrative domain, and there is a single global VNI/VSID space, administrative domain, and there is a single global VNI/VSID space,
the RD MAY be set to zero in the EVPN routes. However, in the the RD MAY be set to zero in the EVPN routes. However, in the
scenario where different groups of data centers are under different scenario where different groups of data centers are under different
administrative domains, and these data centers are connected via one administrative domains, and these data centers are connected via one
or more backbone core providers as described in [NOV3-Framework], the or more backbone core providers as described in [NOV3-Framework], the
RD must be a unique value per EVI or per NVE as described in [EVPN]. RD must be a unique value per EVI or per NVE as described in
In other words, whenever there is more than one administrative domain [RFC7432]. In other words, whenever there is more than one
for global VNI or VSID, then a non-zero RD MUST be used, or whenever administrative domain for global VNI or VSID, then a non-zero RD MUST
the VNI or VSID value have local significance, then a non-zero RD be used, or whenever the VNI or VSID value have local significance,
MUST be used. It is recommended to use a non-zero RD at all times. then a non-zero RD MUST be used. It is recommended to use a non-zero RD
at all times.
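The conditions under which a non-zero RD is mandatory reduce to a simple predicate. The sketch below is illustrative only and its function and parameter names are hypothetical.

```python
def non_zero_rd_required(admin_domain_count: int, vni_is_global: bool) -> bool:
    """True when the text above mandates a non-zero RD.

    A zero RD is permitted only with a single administrative domain
    and a single global VNI/VSID space; even then a non-zero RD is
    the recommended choice.
    """
    return admin_domain_count > 1 or not vni_is_global
```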
When the NVEs reside on the hypervisor, the EVPN BGP routes and When the NVEs reside on the hypervisor, the EVPN BGP routes and
attributes associated with multi-homing are no longer required. This attributes associated with multi-homing are no longer required. This
reduces the required routes and attributes to the following subset of reduces the required routes and attributes to the following subset of
four out of the set of eight : four out of the set of eight :
- MAC Advertisement Route - MAC Advertisement Route
- Inclusive Multicast Ethernet Tag Route - Inclusive Multicast Ethernet Tag Route
- MAC Mobility Extended Community - MAC Mobility Extended Community
- Default Gateway Extended Community - Default Gateway Extended Community
However, as noted in section 8.6 of [EVPN] in order to enable a However, as noted in section 8.6 of [RFC7432] in order to enable a
single-homed ingress PE to take advantage of fast convergence, single-homed ingress PE to take advantage of fast convergence,
aliasing, and backup-path when interacting with multi-homed egress aliasing, and backup-path when interacting with multi-homed egress
PEs attached to a given Ethernet segment, a single-homed ingress PE PEs attached to a given Ethernet segment, a single-homed ingress PE
SHOULD be able to receive and process Ethernet AD per ES and Ethernet SHOULD be able to receive and process Ethernet AD per ES and Ethernet
AD per EVI routes. AD per EVI routes.
7.2 Impact on EVPN Procedures for VXLAN/NVGRE Encapsulation 7.2 Impact on EVPN Procedures for VXLAN/NVGRE Encapsulation
When the NVEs reside on the hypervisors, the EVPN procedures When the NVEs reside on the hypervisors, the EVPN procedures
associated with multi-homing are no longer required. This limits the associated with multi-homing are no longer required. This limits the
procedures on the NVE to the following subset of the EVPN procedures: procedures on the NVE to the following subset of the EVPN procedures:
1. Local learning of MAC addresses received from the VMs per section 1. Local learning of MAC addresses received from the VMs per section
10.1 of [EVPN]. 10.1 of [RFC7432].
2. Advertising locally learned MAC addresses in BGP using the MAC 2. Advertising locally learned MAC addresses in BGP using the MAC
Advertisement routes. Advertisement routes.
3. Performing remote learning using BGP per Section 10.2 of [EVPN]. 3. Performing remote learning using BGP per Section 10.2 of
[RFC7432].
4. Discovering other NVEs and constructing the multicast tunnels 4. Discovering other NVEs and constructing the multicast tunnels
using the Inclusive Multicast Ethernet Tag routes. using the Inclusive Multicast Ethernet Tag routes.
5. Handling MAC address mobility events per the procedures of Section 5. Handling MAC address mobility events per the procedures of Section
16 in [EVPN]. 16 in [RFC7432].
However, as noted in section 8.6 of [EVPN] in order to enable a However, as noted in section 8.6 of [RFC7432] in order to enable a
single-homed ingress PE to take advantage of fast convergence, single-homed ingress PE to take advantage of fast convergence,
aliasing, and back-up path when interacting with multi-homed egress aliasing, and back-up path when interacting with multi-homed egress
PEs attached to a given Ethernet segment, a single-homed ingress PE PEs attached to a given Ethernet segment, a single-homed ingress PE
SHOULD implement the ingress node processing of Ethernet AD per ES SHOULD implement the ingress node processing of Ethernet AD per ES
and Ethernet AD per EVI routes as defined in sections 8.2 Fast and Ethernet AD per EVI routes as defined in sections 8.2 Fast
Convergence and 8.4 Aliasing and Backup-Path of [EVPN]. Convergence and 8.4 Aliasing and Backup-Path of [RFC7432].
8 NVE Residing in ToR Switch 8 NVE Residing in ToR Switch
In this section, we discuss the scenario where the NVEs reside in the In this section, we discuss the scenario where the NVEs reside in the
Top of Rack (ToR) switches AND the servers (where VMs are residing) Top of Rack (ToR) switches AND the servers (where VMs are residing)
are multi-homed to these ToR switches. The multi-homing may operate are multi-homed to these ToR switches. The multi-homing may operate
in All-Active or Single-Active redundancy mode. If the servers are in All-Active or Single-Active redundancy mode. If the servers are
single-homed to the ToR switches, then the scenario becomes similar single-homed to the ToR switches, then the scenario becomes similar
to that where the NVE resides in the hypervisor, as discussed in to that where the NVE resides in the hypervisor, as discussed in
Section 5, as far as the required EVPN functionality. Section 5, as far as the required EVPN functionality.
[EVPN] defines a set of BGP routes, attributes and procedures to [RFC7432] defines a set of BGP routes, attributes and procedures to
support multi-homing. We first describe these functions and support multi-homing. We first describe these functions and
procedures, then discuss which of these are impacted by the procedures, then discuss which of these are impacted by the
encapsulation (such as VXLAN or NVGRE) and what modifications are encapsulation (such as VXLAN or NVGRE) and what modifications are
required. required.
8.1 EVPN Multi-Homing Features 8.1 EVPN Multi-Homing Features
In this section, we will recap the multi-homing features of EVPN to In this section, we will recap the multi-homing features of EVPN to
highlight the encapsulation dependencies. The section only describes highlight the encapsulation dependencies. The section only describes
the features and functions at a high-level. For more details, the the features and functions at a high-level. For more details, the
reader is to refer to [EVPN]. reader is to refer to [RFC7432].
8.1.1 Multi-homed Ethernet Segment Auto-Discovery 8.1.1 Multi-homed Ethernet Segment Auto-Discovery
EVPN NVEs (or PEs) connected to the same Ethernet Segment (e.g. the EVPN NVEs (or PEs) connected to the same Ethernet Segment (e.g. the
same server via LAG) can automatically discover each other with same server via LAG) can automatically discover each other with
minimal to no configuration through the exchange of BGP routes. minimal to no configuration through the exchange of BGP routes.
8.1.2 Fast Convergence and Mass Withdraw 8.1.2 Fast Convergence and Mass Withdraw
EVPN defines a mechanism to efficiently and quickly signal, to remote EVPN defines a mechanism to efficiently and quickly signal, to remote
skipping to change at page 16, line 29 skipping to change at page 16, line 49
NVE withdraws the corresponding Ethernet A-D route. This triggers all NVE withdraws the corresponding Ethernet A-D route. This triggers all
NVEs that receive the withdrawal to update their next-hop adjacencies NVEs that receive the withdrawal to update their next-hop adjacencies
for all MAC addresses associated with the Ethernet segment in for all MAC addresses associated with the Ethernet segment in
question. If no other NVE had advertised an Ethernet A-D route for question. If no other NVE had advertised an Ethernet A-D route for
the same segment, then the NVE that received the withdrawal simply the same segment, then the NVE that received the withdrawal simply
invalidates the MAC entries for that segment. Otherwise, the NVE invalidates the MAC entries for that segment. Otherwise, the NVE
updates the next-hop adjacencies to point to the backup NVE(s). updates the next-hop adjacencies to point to the backup NVE(s).
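A receiving NVE's handling of such a withdrawal might look like the following sketch. The data structures (a map from ESI to the set of NVEs that advertised an Ethernet A-D route, and a MAC table whose entries record their ESI and next hops) and all names are assumptions made for illustration.

```python
def handle_ad_withdrawal(ad_routes, mac_table, esi, withdrawing_nve):
    """Apply the withdrawal of one NVE's Ethernet A-D route for a segment.

    ad_routes: dict mapping ESI -> set of NVE addresses that advertised
    an Ethernet A-D route for that segment.
    mac_table: dict mapping MAC -> {"esi": ..., "next_hops": set(...)}.
    """
    remaining = ad_routes.get(esi, set()) - {withdrawing_nve}
    ad_routes[esi] = remaining
    # Iterate over a copy because entries may be deleted in the loop.
    for mac, entry in list(mac_table.items()):
        if entry["esi"] != esi:
            continue
        if remaining:
            # Other NVEs still serve the segment: repoint to them.
            entry["next_hops"] = set(remaining)
        else:
            # No backup NVE: invalidate the MAC entry.
            del mac_table[mac]
```

One withdrawal updates every MAC on the segment, so the work does not depend on receiving a per-MAC withdrawal for each address.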
8.1.3 Split-Horizon 8.1.3 Split-Horizon
If a CE that is multi-homed to two or more NVEs on an Ethernet If a server is multi-homed to two or more NVEs on an Ethernet segment
segment ES1 operating in all-active redundancy mode sends a ES1 operating in all-active redundancy mode sends a multicast,
multicast, broadcast or unknown unicast packet to one of these broadcast or unknown unicast packet to one of these NVEs, then it
NVEs, then that NVE will forward that packet to all of the other PEs is important to ensure the packet is not looped back to the server
in that EVI including the other NVEs attached to ES1 and those NVEs via another NVE connected to this server. The filtering mechanism on
MUST drop the packet and not forward back to the originating CE. the NVE to prevent such loop and packet duplication is called "split
This is termed 'split horizon filtering'. horizon filtering".
8.1.4 Aliasing and Backup-Path 8.1.4 Aliasing and Backup-Path
In the case where a station is multi-homed to multiple NVEs, it is In the case where a station is multi-homed to multiple NVEs, it is
possible that only a single NVE learns a set of the MAC addresses possible that only a single NVE learns a set of the MAC addresses
associated with traffic transmitted by the station. This leads to a associated with traffic transmitted by the station. This leads to a
situation where remote NVEs receive MAC advertisement routes, for situation where remote NVEs receive MAC advertisement routes, for
these addresses, from a single NVE even though multiple NVEs are these addresses, from a single NVE even though multiple NVEs are
connected to the multi-homed station. As a result, the remote NVEs connected to the multi-homed station. As a result, the remote NVEs
are not able to effectively load-balance traffic among the NVEs are not able to effectively load-balance traffic among the NVEs
skipping to change at page 17, line 41 skipping to change at page 18, line 16
responsible for sending it broadcast, multicast, and, if configured responsible for sending it broadcast, multicast, and, if configured
for that EVI, unknown unicast frames. for that EVI, unknown unicast frames.
This is required in order to prevent duplicate delivery of multi- This is required in order to prevent duplicate delivery of multi-
destination frames to a multi-homed host or VM, in case of all-active destination frames to a multi-homed host or VM, in case of all-active
redundancy. redundancy.
8.2 Impact on EVPN BGP Routes & Attributes 8.2 Impact on EVPN BGP Routes & Attributes
Since multi-homing is supported in this scenario, then the entire set Since multi-homing is supported in this scenario, then the entire set
of BGP routes and attributes defined in [EVPN] are used. As discussed of BGP routes and attributes defined in [RFC7432] are used. As
in Section 3.1.3, the VSID or VNI is carried in the VNI/VSID field in discussed in Section 3.1.3, the VSID or VNI is carried in the
the MAC Advertisement, Ethernet AD per EVI, and Inclusive Multicast VNI/VSID field in the MAC Advertisement, Ethernet AD per EVI, and
Ethernet Tag routes. Inclusive Multicast Ethernet Tag routes.
8.3 Impact on EVPN Procedures 8.3 Impact on EVPN Procedures
Two cases need to be examined here, depending on whether the NVEs are Two cases need to be examined here, depending on whether the NVEs are
operating in Active/Standby or in All-Active redundancy. operating in Active/Standby or in All-Active redundancy.
First, let's consider the case of Active/Standby redundancy, where First, lets consider the case of Active/Standby redundancy, where the
the hosts are multi-homed to a set of NVEs, however, only a single hosts are multi-homed to a set of NVEs, however, only a single NVE is
NVE is active at a given point of time for a given VNI or VSID. In active at a given point of time for a given VNI or VSID. In this
this case, the Split-Horizon and Aliasing functions are not required case, the split-horizon and the aliasing functions are not required
but other functions such as multi-homed Ethernet segment auto- but other functions such as multi-homed Ethernet segment auto-
discovery, fast convergence and mass withdraw, repair path, and DF discovery, fast convergence and mass withdraw, backup path, and DF
election are required. In this case, the impact of the use of the election are required.
VXLAN/NVGRE encapsulation on the EVPN procedures is when the Backup-
Path function is supported, as discussed next:
In EVPN, the NVEs connected to a multi-homed site using
Active/Standby redundancy optionally advertise a VPN label, in the
Ethernet A-D Route per EVI, used to send traffic to the backup NVE in
the case where the primary NVE fails. In the case where VXLAN or
NVGRE encapsulation is used, some alternative means that does not
rely on MPLS labels is required to support Backup-Path. This is
discussed in Section 4.3.2 below. If the Backup-Path function is not
used, then the VXLAN/NVGRE encapsulation would have no impact on the
EVPN procedures.
Second, let's consider the case of All-Active redundancy. In this Second, let's consider the case of All-Active redundancy. In this
case, out of the EVPN multi-homing features listed in section 4.1, case, out of the EVPN multi-homing features listed in section 8.1,
the use of the VXLAN or NVGRE encapsulation impacts the Split-Horizon the use of the VXLAN or NVGRE encapsulation impacts the split-horizon
and Aliasing features, since those two rely on the MPLS client layer. and aliasing features, since those two rely on the MPLS client layer.
Given that this MPLS client layer is absent with these types of Given that this MPLS client layer is absent with these types of
encapsulations, alternative procedures and mechanisms are needed to encapsulations, alternative procedures and mechanisms are needed to
provide the required functions. Those are discussed in detail next. provide the required functions. Those are discussed in detail next.
8.3.1 Split Horizon 8.3.1 Split Horizon
In EVPN, an MPLS label is used for split-horizon filtering to support In EVPN, an MPLS label is used for split-horizon filtering to support
active/active multi-homing where an ingress ToR switch adds a label active/active multi-homing where an ingress NVE adds a label
corresponding to the site of origin (aka ESI MPLS Label) when corresponding to the site of origin (aka ESI Label) when
encapsulating the packet. The egress ToR switch checks the ESI MPLS encapsulating the packet. The egress NVE checks the ESI label when
label when attempting to forward a multi-destination frame out an attempting to forward a multi-destination frame out an interface, and
interface, and if the label corresponds to the same site identifier if the label corresponds to the same site identifier (ESI) associated
(ESI) associated with that interface, the packet gets dropped. This with that interface, the packet gets dropped. This prevents the
prevents the occurrence of forwarding loops. occurrence of forwarding loops.
Since the VXLAN or NVGRE encapsulation does not include this ESI MPLS Since the VXLAN or NVGRE encapsulation does not include this ESI
label, other means of performing the split-horizon filtering function label, other means of performing the split-horizon filtering function
MUST be devised. The following approach is recommended for split- MUST be devised. The following approach is recommended for split-
horizon filtering when VXLAN or NVGRE encapsulation is used. horizon filtering when VXLAN or NVGRE encapsulation is used.
Every NVE tracks the IP address(es) associated with the other NVE(s) Every NVE tracks the IP address(es) associated with the other NVE(s)
with which it has shared multi-homed Ethernet Segments. When the NVE with which it has shared multi-homed Ethernet Segments. When the NVE
receives a multi-destination frame from the overlay network, it receives a multi-destination frame from the overlay network, it
examines the source IP address in the tunnel header (which examines the source IP address in the tunnel header (which
corresponds to the ingress NVE) and filters out the frame on all corresponds to the ingress NVE) and filters out the frame on all
local interfaces connected to Ethernet Segments that are shared with local interfaces connected to Ethernet Segments that are shared with
the ingress NVE. With this approach, it is required that the ingress the ingress NVE. With this approach, it is required that the ingress
NVE performs replication locally to all directly attached Ethernet NVE performs replication locally to all directly attached Ethernet
Segments (regardless of the DF Election state) for all flooded Segments (regardless of the DF Election state) for all flooded
traffic ingress from the access interfaces (i.e. from the hosts). traffic ingress from the access interfaces (i.e. from the hosts).
This approach is referred to as "Local Bias", and has the advantage This approach is referred to as "Local Bias", and has the advantage
that only a single IP address needs to be used per NVE for split- that only a single IP address needs to be used per NVE for split-
horizon filtering, as opposed to requiring an IP address per Ethernet horizon filtering, as opposed to requiring an IP address per Ethernet
Segment per NVE. Segment per NVE.
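The two halves of the Local Bias approach can be sketched as below. All names and data structures are hypothetical. Two points go beyond the paragraph above and are assumptions: the egress side also applies the DF election result to the segments that are not filtered, and the ingress side does not send the frame back out the segment it arrived on.

```python
def egress_interfaces_for_overlay_frame(source_nve_ip, local_segments,
                                        shared_with, df_segments):
    """Interfaces an egress NVE floods to for a frame from the overlay.

    local_segments: dict ESI -> local interface name.
    shared_with: dict peer NVE IP -> set of ESIs shared with that peer.
    df_segments: set of ESIs for which this NVE is the DF (assumption).
    """
    blocked = shared_with.get(source_nve_ip, set())
    return sorted(
        iface
        for esi, iface in local_segments.items()
        if esi not in blocked and esi in df_segments
    )


def ingress_local_replication(local_segments, arrival_esi):
    """Interfaces an ingress NVE floods to for a frame from a host.

    Replication covers every other directly attached segment,
    regardless of DF election state.
    """
    return sorted(
        iface for esi, iface in local_segments.items() if esi != arrival_esi
    )
```

Because the ingress NVE has already delivered the frame to every segment it shares with the egress NVE, the egress filter needs only the tunnel source IP address and no per-segment identifier in the packet.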
In order to prevent unhealthy interactions between the split horizon In order to prevent unhealthy interactions between the split horizon
procedures defined in [EVPN] and the local bias procedures described procedures defined in [RFC7432] and the local bias procedures
in this document, a mix of MPLS over GRE encapsulations on the one described in this document, a mix of MPLS over GRE encapsulations on
hand and VXLAN/NVGRE encapsulations on the other on a given Ethernet the one hand and VXLAN/NVGRE encapsulations on the other on a given
Segment is prohibited. Ethernet Segment is prohibited.
8.3.2 Aliasing and Backup-Path 8.3.2 Aliasing and Backup-Path
The Aliasing and the Backup-Path procedures for VXLAN/NVGRE The Aliasing and the Backup-Path procedures for VXLAN/NVGRE
encapsulation are very similar to the ones for MPLS. In case of MPLS, encapsulation are very similar to the ones for MPLS. In case of MPLS,
two different Ethernet AD routes are used for this purpose. The one two different Ethernet AD routes are used for this purpose. The one
used for Aliasing has a VPN scope and carries a VPN label but the one used for Aliasing has a VPN scope and carries a VPN label but the one
used for Backup-Path has Ethernet segment scope and doesn't carry any used for Backup-Path has Ethernet segment scope and doesn't carry any
VPN specific info (e.g., Ethernet Tag and MPLS label are set to VPN specific info (e.g., Ethernet Tag and MPLS label are set to
zero). The same two routes are used when VXLAN or NVGRE encapsulation zero).
is used with the difference that when Ethernet AD route is used for
Aliasing with VPN scope, the Ethernet Tag field is set to VNI or VSID
to indicate VPN scope (and MPLS field may be set to a VPN label if
needed).
9 Support for Multicast 9 Support for Multicast
The E-VPN Inclusive Multicast BGP route is used to discover the The E-VPN Inclusive Multicast BGP route is used to discover the
multicast tunnels among the endpoints associated with a given VXLAN multicast tunnels among the endpoints associated with a given VXLAN
VNI or NVGRE VSID. The Ethernet Tag field of this route is used to VNI or NVGRE VSID. The Ethernet Tag field of this route is used to
encode the VNI for VXLAN or VSID for NVGRE. The Originating router's encode the VNI for VXLAN or VSID for NVGRE. The Originating router's
IP address field is set to the NVE's IP address. This route is tagged IP address field is set to the NVE's IP address. This route is tagged
with the PMSI Tunnel attribute, which is used to encode the type of with the PMSI Tunnel attribute, which is used to encode the type of
multicast tunnel to be used as well as the multicast tunnel multicast tunnel to be used as well as the multicast tunnel
skipping to change at page 21, line 27 skipping to change at page 21, line 34
11 Acknowledgement 11 Acknowledgement
The authors would like to thank David Smith, John Mullooly, Thomas The authors would like to thank David Smith, John Mullooly, Thomas
Nadeau for their valuable comments and feedback. Nadeau for their valuable comments and feedback.
12 Security Considerations 12 Security Considerations
This document uses IP-based tunnel technologies to support data This document uses IP-based tunnel technologies to support data
plane transport. Consequently, the security considerations of those plane transport. Consequently, the security considerations of those
tunnel technologies apply. This document defines support for [VXLAN] tunnel technologies apply. This document defines support for VXLAN
and [NVGRE]. The security considerations from those documents as well and NVGRE encapsulations. The security considerations from those
as [RFC4301] apply to the data plane aspects of this document. documents as well as [RFC4301] apply to the data plane aspects of
this document.
As with [RFC5512], any modification of the information that is used As with [RFC5512], any modification of the information that is used
to form encapsulation headers, to choose a tunnel type, or to choose to form encapsulation headers, to choose a tunnel type, or to choose
a particular tunnel for a particular payload type may lead to user a particular tunnel for a particular payload type may lead to user
data packets getting misrouted, misdelivered, and/or dropped. data packets getting misrouted, misdelivered, and/or dropped.
More broadly, the security considerations for the transport of IP More broadly, the security considerations for the transport of IP
reachability information using BGP are discussed in [RFC4271] and reachability information using BGP are discussed in [RFC4271] and
[RFC4272], and are equally applicable for the extensions described [RFC4272], and are equally applicable for the extensions described
in this document. in this document.
skipping to change at page 22, line 15 skipping to change at page 22, line 23
13 IANA Considerations 13 IANA Considerations
IANA has allocated the following BGP Tunnel Encapsulation Attribute IANA has allocated the following BGP Tunnel Encapsulation Attribute
Tunnel Types: Tunnel Types:
8 VXLAN Encapsulation 8 VXLAN Encapsulation
9 NVGRE Encapsulation 9 NVGRE Encapsulation
10 MPLS Encapsulation 10 MPLS Encapsulation
11 MPLS in GRE Encapsulation 11 MPLS in GRE Encapsulation
12 VxLAN GPE Encapsulation 12 VXLAN GPE Encapsulation
14 References 14 References
14.1 Normative References 14.1 Normative References
[KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC4271] Y. Rekhter, Ed., T. Li, Ed., S. Hares, Ed., "A Border [RFC4271] Y. Rekhter, Ed., T. Li, Ed., S. Hares, Ed., "A Border
Gateway Protocol 4 (BGP-4)", January 2006. Gateway Protocol 4 (BGP-4)", January 2006.
skipping to change at page 22, line 37 skipping to change at page 22, line 45
[RFC4272] S. Murphy, "BGP Security Vulnerabilities Analysis.", [RFC4272] S. Murphy, "BGP Security Vulnerabilities Analysis.",
January 2006. January 2006.
[RFC4301] S. Kent, K. Seo., "Security Architecture for the [RFC4301] S. Kent, K. Seo., "Security Architecture for the
Internet Protocol.", December 2005. Internet Protocol.", December 2005.
[RFC5512] Mohapatra, P. and E. Rosen, "The BGP Encapsulation [RFC5512] Mohapatra, P. and E. Rosen, "The BGP Encapsulation
Subsequent Address Family Identifier (SAFI) and the BGP Subsequent Address Family Identifier (SAFI) and the BGP
Tunnel Encapsulation Attribute", RFC 5512, April 2009. Tunnel Encapsulation Attribute", RFC 5512, April 2009.
14.2 Informative References [RFC7432] Sajassi et al., "BGP MPLS Based Ethernet VPN", RFC 7432,
February 2014
[EVPN-REQ] Sajassi et al., "Requirements for Ethernet VPN (EVPN)", 14.2 Informative References
draft-ietf-l2vpn-evpn-req-01.txt, work in progress, October 21, 2012.
[NVGRE] Sridhavan, M., et al., "NVGRE: Network Virtualization using [RFC7209] Sajassi et al., "Requirements for Ethernet VPN (EVPN)", RFC
Generic Routing Encapsulation", draft-sridharan-virtualization-nvgre- 7209, May 2014
01.txt, July 8, 2012.
[VXLAN] Dutt, D., et al, "VXLAN: A Framework for Overlaying [RFC7348] Mahalingam, M., et al, "VXLAN: A Framework for Overlaying
Virtualized Layer 2 Networks over Layer 3 Networks", draft- Virtualized Layer 2 Networks over Layer 3 Networks", RFC 7348, August
mahalingam-dutt-dcops-vxlan-02.txt, August 22, 2012. 2014
[EVPN] Sajassi et al., "BGP MPLS Based Ethernet VPN", draft-ietf- [NVGRE] Garg, P., et al., "NVGRE: Network Virtualization using
l2vpn-evpn-02.txt, work in progress, February, 2012. Generic Routing Encapsulation", draft-sridharan-virtualization-nvgre-
07.txt, November 11, 2014
[Problem-Statement] Narten et al., "Problem Statement: Overlays for [Problem-Statement] Narten et al., "Problem Statement: Overlays for
Network Virtualization", draft-ietf-nvo3-overlay-problem-statement- Network Virtualization", draft-ietf-nvo3-overlay-problem-statement-
01, September 2012. 01, September 2012.
[L3VPN-ENDSYSTEMS] Marques et al., "BGP-signaled End-system IP/VPNs", [L3VPN-ENDSYSTEMS] Marques et al., "BGP-signaled End-system IP/VPNs",
draft-ietf-l3vpn-end-system, work in progress, October 2012. draft-ietf-l3vpn-end-system, work in progress, October 2012.
[NOV3-FRWK] Lasserre et al., "Framework for DC Network [NOV3-FRWK] Lasserre et al., "Framework for DC Network
Virtualization", draft-ietf-nvo3-framework-01.txt, work in progress, Virtualization", draft-ietf-nvo3-framework-01.txt, work in progress,
 End of changes. 73 change blocks. 
205 lines changed or deleted 196 lines changed or added
