draft-ietf-bess-evpn-overlay-04.txt   draft-ietf-bess-evpn-overlay-05.txt 
skipping to change at page 1, line 16 skipping to change at page 1, line 16
Juniper Juniper
N. Bitar N. Bitar
Nokia Nokia
R. Shekhar R. Shekhar
Juniper Juniper
J. Uttaro J. Uttaro
AT&T AT&T
W. Henderickx W. Henderickx
Nokia Nokia
Expires: December 10, 2016 June 10, 2016 Expires: April 18, 2017 October 18, 2016
A Network Virtualization Overlay Solution using EVPN A Network Virtualization Overlay Solution using EVPN
draft-ietf-bess-evpn-overlay-04 draft-ietf-bess-evpn-overlay-05
Abstract Abstract
This document describes how Ethernet VPN (EVPN) [RFC7432] can be used This document describes how Ethernet VPN (EVPN) [RFC7432] can be used
as a Network Virtualization Overlay (NVO) solution and explores the as a Network Virtualization Overlay (NVO) solution and explores the
various tunnel encapsulation options over IP and their impact on the various tunnel encapsulation options over IP and their impact on the
EVPN control-plane and procedures. In particular, the following EVPN control-plane and procedures. In particular, the following
encapsulation options are analyzed: VXLAN, NVGRE, and MPLS over GRE. encapsulation options are analyzed: VXLAN, NVGRE, and MPLS over GRE.
Status of this Memo Status of this Memo
skipping to change at page 2, line 23 skipping to change at page 2, line 23
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
2 Specification of Requirements . . . . . . . . . . . . . . . . . 5 2 Specification of Requirements . . . . . . . . . . . . . . . . . 4
3 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . 5 3 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . 5
4 EVPN Features . . . . . . . . . . . . . . . . . . . . . . . . . 6 4 EVPN Features . . . . . . . . . . . . . . . . . . . . . . . . . 5
5 Encapsulation Options for EVPN Overlays . . . . . . . . . . . . 7 5 Encapsulation Options for EVPN Overlays . . . . . . . . . . . . 7
5.1 VXLAN/NVGRE Encapsulation . . . . . . . . . . . . . . . . . 7 5.1 VXLAN/NVGRE Encapsulation . . . . . . . . . . . . . . . . . 7
5.1.1 Virtual Identifiers Scope . . . . . . . . . . . . . . . 8 5.1.1 Virtual Identifiers Scope . . . . . . . . . . . . . . . 8
5.1.1.1 Data Center Interconnect with Gateway . . . . . . . 8 5.1.1.1 Data Center Interconnect with Gateway . . . . . . . 8
5.1.1.2 Data Center Interconnect without Gateway . . . . . . 9 5.1.1.2 Data Center Interconnect without Gateway . . . . . . 8
5.1.2 Virtual Identifiers to EVI Mapping . . . . . . . . . . . 9 5.1.2 Virtual Identifiers to EVI Mapping . . . . . . . . . . . 9
5.1.2.1 Auto Derivation of RT . . . . . . . . . . . . . . . 10 5.1.2.1 Auto Derivation of RT . . . . . . . . . . . . . . . 10
5.1.3 Constructing EVPN BGP Routes . . . . . . . . . . . . . 11 5.1.3 Constructing EVPN BGP Routes . . . . . . . . . . . . . 11
5.2 MPLS over GRE . . . . . . . . . . . . . . . . . . . . . . . 13 5.2 MPLS over GRE . . . . . . . . . . . . . . . . . . . . . . . 13
6 EVPN with Multiple Data Plane Encapsulations . . . . . . . . . 13 6 EVPN with Multiple Data Plane Encapsulations . . . . . . . . . 13
7 NVE Residing in Hypervisor . . . . . . . . . . . . . . . . . . 14 7 NVE Residing in Hypervisor . . . . . . . . . . . . . . . . . . 14
7.1 Impact on EVPN BGP Routes & Attributes for VXLAN/NVGRE 7.1 Impact on EVPN BGP Routes & Attributes for VXLAN/NVGRE
Encapsulation . . . . . . . . . . . . . . . . . . . . . . . 14 Encapsulation . . . . . . . . . . . . . . . . . . . . . . . 14
7.2 Impact on EVPN Procedures for VXLAN/NVGRE Encapsulation . . 15 7.2 Impact on EVPN Procedures for VXLAN/NVGRE Encapsulation . . 15
8 NVE Residing in ToR Switch . . . . . . . . . . . . . . . . . . 15 8 NVE Residing in ToR Switch . . . . . . . . . . . . . . . . . . 15
8.1 EVPN Multi-Homing Features . . . . . . . . . . . . . . . . 16 8.1 EVPN Multi-Homing Features . . . . . . . . . . . . . . . . 15
8.1.1 Multi-homed Ethernet Segment Auto-Discovery . . . . . . 16 8.1.1 Multi-homed Ethernet Segment Auto-Discovery . . . . . . 16
8.1.2 Fast Convergence and Mass Withdraw . . . . . . . . . . . 16 8.1.2 Fast Convergence and Mass Withdraw . . . . . . . . . . . 16
8.1.3 Split-Horizon . . . . . . . . . . . . . . . . . . . . . 16 8.1.3 Split-Horizon . . . . . . . . . . . . . . . . . . . . . 16
8.1.4 Aliasing and Backup-Path . . . . . . . . . . . . . . . . 17 8.1.4 Aliasing and Backup-Path . . . . . . . . . . . . . . . . 16
8.1.5 DF Election . . . . . . . . . . . . . . . . . . . . . . 17 8.1.5 DF Election . . . . . . . . . . . . . . . . . . . . . . 17
8.2 Impact on EVPN BGP Routes & Attributes . . . . . . . . . . . 18 8.2 Impact on EVPN BGP Routes & Attributes . . . . . . . . . . . 18
8.3 Impact on EVPN Procedures . . . . . . . . . . . . . . . . . 18 8.3 Impact on EVPN Procedures . . . . . . . . . . . . . . . . . 18
8.3.1 Split Horizon . . . . . . . . . . . . . . . . . . . . . 19 8.3.1 Split Horizon . . . . . . . . . . . . . . . . . . . . . 18
8.3.2 Aliasing and Backup-Path . . . . . . . . . . . . . . . . 19 8.3.2 Aliasing and Backup-Path . . . . . . . . . . . . . . . . 19
9 Support for Multicast . . . . . . . . . . . . . . . . . . . . . 20 9 Support for Multicast . . . . . . . . . . . . . . . . . . . . . 19
10 Data Center Interconnections - DCI . . . . . . . . . . . . . . 20 10 Data Center Interconnections - DCI . . . . . . . . . . . . . . 20
10.1 DCI using GWs . . . . . . . . . . . . . . . . . . . . . . . 21 10.1 DCI using GWs . . . . . . . . . . . . . . . . . . . . . . . 20
10.2 DCI using ASBRs . . . . . . . . . . . . . . . . . . . . . . 21 10.2 DCI using ASBRs . . . . . . . . . . . . . . . . . . . . . . 21
10.2.1 ASBR Functionality with NVEs in Hypervisors . . . . . . 22 10.2.1 ASBR Functionality with NVEs in Hypervisors . . . . . . 22
10.2.2 ASBR Functionality with NVEs in TORs . . . . . . . . . 22 10.2.2 ASBR Functionality with NVEs in TORs . . . . . . . . . 22
11 Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 24 11 Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 25
12 Security Considerations . . . . . . . . . . . . . . . . . . . 24 12 Security Considerations . . . . . . . . . . . . . . . . . . . 25
13 IANA Considerations . . . . . . . . . . . . . . . . . . . . . 25 13 IANA Considerations . . . . . . . . . . . . . . . . . . . . . 25
14 References . . . . . . . . . . . . . . . . . . . . . . . . . . 25 14 References . . . . . . . . . . . . . . . . . . . . . . . . . . 26
14.1 Normative References . . . . . . . . . . . . . . . . . . . 25 14.1 Normative References . . . . . . . . . . . . . . . . . . . 26
14.2 Informative References . . . . . . . . . . . . . . . . . . 26 14.2 Informative References . . . . . . . . . . . . . . . . . . 26
Contributors . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 Contributors . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 27 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 27
1 Introduction 1 Introduction
In the context of this document, a Network Virtualization Overlay In the context of this document, a Network Virtualization Overlay
(NVO) is a solution to address the requirements of a multi-tenant (NVO) is a solution to address the requirements of a multi-tenant
data center, especially one with virtualized hosts, e.g., Virtual data center, especially one with virtualized hosts, e.g., Virtual
Machines (VMs). The key requirements of such a solution, as described Machines (VMs) or virtual workloads. The key requirements of such a
in [Problem-Statement], are: solution, as described in [Problem-Statement], are:
- Isolation of network traffic per tenant - Isolation of network traffic per tenant
- Support for a large number of tenants (tens or hundreds of - Support for a large number of tenants (tens or hundreds of
thousands) thousands)
- Extending L2 connectivity among different VMs belonging to a given - Extending L2 connectivity among different VMs belonging to a given
tenant segment (subnet) across different PODs within a data center or tenant segment (subnet) across different PODs within a data center or
between different data centers between different data centers
skipping to change at page 4, line 37 skipping to change at page 4, line 37
This document describes how Ethernet VPN (EVPN) can be used as an NVO This document describes how Ethernet VPN (EVPN) can be used as an NVO
solution and explores applicability of EVPN functions and procedures. solution and explores applicability of EVPN functions and procedures.
In particular, it describes the various tunnel encapsulation options In particular, it describes the various tunnel encapsulation options
for EVPN over IP, and their impact on the EVPN control-plane and for EVPN over IP, and their impact on the EVPN control-plane and
procedures for two main scenarios: procedures for two main scenarios:
a) when the NVE resides in the hypervisor, and a) when the NVE resides in the hypervisor, and
b) when the NVE resides in a Top of Rack (ToR) device b) when the NVE resides in a Top of Rack (ToR) device
Note that the use of EVPN as an NVO solution does not necessarily
mandate that the BGP control-plane be running on the NVE. For such
scenarios, it is still possible to leverage the EVPN solution by
using XMPP, or alternative mechanisms, to extend the control-plane to
the NVE as discussed in [L3VPN-ENDSYSTEMS].
The possible encapsulation options for EVPN overlays that are The possible encapsulation options for EVPN overlays that are
analyzed in this document are: analyzed in this document are:
- VXLAN and NVGRE - VXLAN and NVGRE
- MPLS over GRE - MPLS over GRE
Before getting into the description of the different encapsulation Before getting into the description of the different encapsulation
options for EVPN over IP, it is important to highlight the EVPN options for EVPN over IP, it is important to highlight the EVPN
solution's main features, how those features are currently supported, solution's main features, how those features are currently supported,
and any impact that the encapsulation has on those features. and any impact that the encapsulation has on those features.
skipping to change at page 6, line 35 skipping to change at page 6, line 29
Furthermore, RR hierarchy can be leveraged to scale the number of BGP Furthermore, RR hierarchy can be leveraged to scale the number of BGP
routes on the RR. routes on the RR.
4) Auto-discovery via BGP is used to discover PE devices 4) Auto-discovery via BGP is used to discover PE devices
participating in a given VPN, PE devices participating in a given participating in a given VPN, PE devices participating in a given
redundancy group, tunnel encapsulation types, multicast tunnel type, redundancy group, tunnel encapsulation types, multicast tunnel type,
multicast members, etc. multicast members, etc.
draft-ietf-bess-evpn-overlay-04:
5) All-Active multihoming is used. This allows a given customer
device (CE) to have multiple links to multiple PEs, and traffic
to/from that CE fully utilizes all of these links. This set of links
is termed an Ethernet Segment (ES).
draft-ietf-bess-evpn-overlay-05:
5) All-Active multihoming is used. This allows a given customer
device (CE) to have multiple links to multiple PEs, and traffic
to/from that CE fully utilizes all of these links.
6) When a link between a CE and a PE fails, the PEs for that EVI are 6) When a link between a CE and a PE fails, the PEs for that EVI are
notified of the failure via the withdrawal of a single EVPN route. notified of the failure via the withdrawal of a single EVPN route.
This allows those PEs to remove the withdrawing PE as a next hop for This allows those PEs to remove the withdrawing PE as a next hop for
every MAC address associated with the failed link. This is termed every MAC address associated with the failed link. This is termed
'mass withdrawal'. 'mass withdrawal'.
7) BGP route filtering and constrained route distribution are 7) BGP route filtering and constrained route distribution are
leveraged to ensure that the control plane traffic for a given EVI is leveraged to ensure that the control plane traffic for a given EVI is
only distributed to the PEs in that EVI. only distributed to the PEs in that EVI.
skipping to change at page 7, line 16 skipping to change at page 7, line 8
9) VM Mobility mechanisms ensure that all PEs in a given EVI know 9) VM Mobility mechanisms ensure that all PEs in a given EVI know
the ES with which a given VM, as identified by its MAC and IP the ES with which a given VM, as identified by its MAC and IP
addresses, is currently associated. addresses, is currently associated.
10) Route Targets are used to allow the operator (or customer) to 10) Route Targets are used to allow the operator (or customer) to
define a spectrum of logical network topologies including mesh, hub & define a spectrum of logical network topologies including mesh, hub &
spoke, and extranets (e.g., a VPN whose sites are owned by different spoke, and extranets (e.g., a VPN whose sites are owned by different
enterprises), without the need for proprietary software or the aid of enterprises), without the need for proprietary software or the aid of
other virtual or physical devices. other virtual or physical devices.
draft-ietf-bess-evpn-overlay-04:
11) Because the design goal for NVO is millions of instances per
common physical infrastructure, the scaling properties of the control
plane for NVO are extremely important. EVPN and the extensions
described herein, are designed with this level of scalability in
mind.
draft-ietf-bess-evpn-overlay-05:
Because the design goal for NVO is millions of instances per common
physical infrastructure, the scaling properties of the control plane
for NVO are extremely important. EVPN and the extensions described
herein, are designed with this level of scalability in mind.
5 Encapsulation Options for EVPN Overlays 5 Encapsulation Options for EVPN Overlays
5.1 VXLAN/NVGRE Encapsulation 5.1 VXLAN/NVGRE Encapsulation
Both VXLAN and NVGRE are examples of technologies that provide a data Both VXLAN and NVGRE are examples of technologies that provide a data
plane encapsulation which is used to transport a packet over the plane encapsulation which is used to transport a packet over the
common physical IP infrastructure between Network Virtualization common physical IP infrastructure between Network Virtualization
Edges (NVEs) - e.g., VXLAN Tunnel End Points (VTEPs) in VXLAN Edges (NVEs) - e.g., VXLAN Tunnel End Points (VTEPs) in VXLAN
network. Both of these technologies include the identifier of the network. Both of these technologies include the identifier of the
skipping to change at page 10, line 19 skipping to change at page 9, line 48
where a tenant VLAN ID gets mapped to an EVPN instance (EVI). As where a tenant VLAN ID gets mapped to an EVPN instance (EVI). As
such, a BGP RD and RT is needed per VNI on every NVE. The advantage such, a BGP RD and RT is needed per VNI on every NVE. The advantage
of this model is that it allows the BGP RT constraint mechanisms to of this model is that it allows the BGP RT constraint mechanisms to
be used in order to limit the propagation and import of routes to be used in order to limit the propagation and import of routes to
only the NVEs that are interested in a given VNI. The disadvantage of only the NVEs that are interested in a given VNI. The disadvantage of
this model may be the provisioning overhead if RD and RT are not this model may be the provisioning overhead if RD and RT are not
derived automatically from VNI. derived automatically from VNI.
In this option, the MAC-VRF table is identified by the RT in the In this option, the MAC-VRF table is identified by the RT in the
control plane and by the VNI in the data-plane. In this option, the control plane and by the VNI in the data-plane. In this option, the
specific the MAC-VRF table corresponds to only a single bridge table. specific MAC-VRF table corresponds to only a single bridge table.
2. Option 2: Multiple Subnets per EVI 2. Option 2: Multiple Subnets per EVI
In this option, multiple subnets each represented by a unique VNI are In this option, multiple subnets each represented by a unique VNI are
mapped to a single EVI. For example, if a tenant has multiple mapped to a single EVI. For example, if a tenant has multiple
segments/subnets each represented by a VNI, then all the VNIs for segments/subnets each represented by a VNI, then all the VNIs for
that tenant are mapped to a single EVI - e.g., the EVI in this case that tenant are mapped to a single EVI - e.g., the EVI in this case
represents the tenant and not a subnet. This corresponds to the represents the tenant and not a subnet. This corresponds to the
VLAN-aware bundle service in [RFC7432]. The advantage of this model VLAN-aware bundle service in [RFC7432]. The advantage of this model
is that it doesn't require the provisioning of RD/RT per VNI. is that it doesn't require the provisioning of RD/RT per VNI.
However, this is a moot point if option 1 with auto-derivation is However, this is a moot point if option 1 with auto-derivation is
used. The disadvantage of this model is that routes would be imported used. The disadvantage of this model is that routes would be imported
by NVEs that may not be interested in a given VNI. by NVEs that may not be interested in a given VNI.
draft-ietf-bess-evpn-overlay-04:
In this option the MAC-VRF table is identified by the RT in the
control plane and a specific bridge table for that MAC-VRF is
identified by the <RT, Ethernet Tag ID> in the control plane. In this
option, the VNI in the data-plane is sufficient to identify a
specific bridge table - e.g., no need to do a lookup based on VNI and
Ethernet Tag ID fields to identify a bridge table.
draft-ietf-bess-evpn-overlay-05:
In this option the MAC-VRF table is identified by the RT in the
control plane and a specific bridge table for that MAC-VRF is
identified by the <RT, Ethernet Tag ID> in the control plane. In this
option, the VNI in the data-plane is sufficient to identify a
specific bridge table.
5.1.2.1 Auto Derivation of RT 5.1.2.1 Auto Derivation of RT
When the option of a single VNI per EVI is used, it is important to When the option of a single VNI per EVI is used, it is important to
auto-derive RT for EVPN BGP routes in order to simplify configuration auto-derive RT for EVPN BGP routes in order to simplify configuration
for data center operations. RD can be auto generated as described in for data center operations. RD can be auto generated as described in
[RFC7432] and RT can be auto-derived as described next. [RFC7432] and RT can be auto-derived as described next.
Since a gateway PE as depicted in figure-1 participates in both the Since a gateway PE as depicted in figure-1 participates in both the
DCN and WAN BGP sessions, it is important that when RT values are DCN and WAN BGP sessions, it is important that when RT values are
skipping to change at page 11, line 47 skipping to change at page 11, line 33
- The remaining 4 bits of the most significant byte of the local - The remaining 4 bits of the most significant byte of the local
admin field of the RT identify the domain-id. The default value of admin field of the RT identify the domain-id. The default value of
domain-id is zero, indicating that only a single numbering space exists domain-id is zero, indicating that only a single numbering space exists
for a given technology. However, if more than one number for a given technology. However, if more than one number
space exists for a given technology (e.g., overlapping VXLAN spaces), space exists for a given technology (e.g., overlapping VXLAN spaces),
then each of the number spaces needs to be identified by its then each of the number spaces needs to be identified by its
corresponding domain-id starting from 1. corresponding domain-id starting from 1.
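The domain-id placement described above can be sketched in a few lines of Python. This is a minimal illustration and not text from either draft revision. It assumes the local admin field is 4 bytes wide and held as an unsigned integer, it treats every bit outside the domain-id nibble as opaque, and the function names are hypothetical.

```python
DOMAIN_ID_SHIFT = 24          # low nibble of the most significant byte (assumed 4-byte field)
DOMAIN_ID_MASK = 0xF << DOMAIN_ID_SHIFT


def set_domain_id(local_admin: int, domain_id: int) -> int:
    """Return local_admin with its domain-id nibble replaced by domain_id."""
    if not 0 <= local_admin <= 0xFFFFFFFF:
        raise ValueError("local admin field must fit in 4 bytes")
    if not 0 <= domain_id <= 0xF:
        raise ValueError("domain-id is a 4-bit value")
    # Clear the old nibble, then insert the new one; all other bits are untouched.
    return (local_admin & ~DOMAIN_ID_MASK & 0xFFFFFFFF) | (domain_id << DOMAIN_ID_SHIFT)


def get_domain_id(local_admin: int) -> int:
    """Extract the domain-id nibble; 0 means a single numbering space."""
    return (local_admin >> DOMAIN_ID_SHIFT) & 0xF
```

A domain-id of zero corresponds to the default single numbering space; overlapping spaces would use values starting from 1.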
5.1.3 Constructing EVPN BGP Routes 5.1.3 Constructing EVPN BGP Routes
draft-ietf-bess-evpn-overlay-04:
In EVPN, an MPLS label is distributed by the egress PE via the EVPN
control plane and is placed in the MPLS header of a given packet by
the ingress PE. This label is used upon receipt of that packet by the
egress PE for disposition of that packet. This is very similar to the
use of the VNI by the egress NVE, with the difference being that an
MPLS label has local significance while a VNI typically has global
significance. Accordingly, and specifically to support the option of
locally assigned VNIs, the MPLS label field in the MAC Advertisement,
Ethernet AD per EVI, and Inclusive Multicast Ethernet Tag routes is
used to carry the VNI. For the balance of this memo, the MPLS label
field will be referred to as the VNI field. The VNI field is used for
both local and global VNIs, and for either case the entire 24-bit
field is used to encode the VNI value.
draft-ietf-bess-evpn-overlay-05:
In EVPN, an MPLS label identifying forwarding table is distributed by
the egress PE via the EVPN control plane and is placed in the MPLS
header of a given packet by the ingress PE. This label is used upon
receipt of that packet by the egress PE for disposition of that
packet. This is very similar to the use of the VNI by the egress NVE,
with the difference being that an MPLS label has local significance
while a VNI typically has global significance. Accordingly, and
specifically to support the option of locally-assigned VNIs, the MPLS
Label1 field in the MAC/IP Advertisement route, the MPLS label field
in the Ethernet AD per EVI route, and the MPLS label field in the
PMSI Tunnel Attribute of the Inclusive Multicast Ethernet Tag (IMET)
route are used to carry the VNI. For the balance of this memo, the
MPLS label field will be referred to as the VNI field. The VNI field
is used for both local and global VNIs, and for either case the
entire 24-bit field is used to encode the VNI value.
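As a rough illustration of the VNI field encoding described above, the following Python sketch packs a VNI into the full 3-octet field and, for contrast, packs an MPLS label the way [RFC7432] describes it (label value in the high-order 20 bits). The function names are hypothetical, and leaving the low-order 4 bits of the MPLS case at zero is a simplification made for this sketch.

```python
def encode_vni_field(vni: int) -> bytes:
    """Encode a VNI using the entire 24-bit field (network byte order)."""
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    return vni.to_bytes(3, "big")


def decode_vni_field(field: bytes) -> int:
    """Decode a 3-octet VNI field back into an integer."""
    if len(field) != 3:
        raise ValueError("VNI field is exactly 3 octets")
    return int.from_bytes(field, "big")


def encode_mpls_label_field(label: int) -> bytes:
    """Encode a 20-bit MPLS label in the high-order 20 bits of 3 octets."""
    if not 0 <= label < (1 << 20):
        raise ValueError("MPLS label must fit in 20 bits")
    # Low-order 4 bits are left as zero in this sketch.
    return (label << 4).to_bytes(3, "big")
```

The same numeric value therefore produces different octets depending on whether the field is interpreted as a VNI or as an MPLS label, which is why the encapsulation type has to be signaled alongside the route.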
draft-ietf-bess-evpn-overlay-04:
For the VLAN-based service (a single VNI per MAC-VRF), the Ethernet
Tag field in the MAC/IP Advertisement, Ethernet AD per EVI, and
Inclusive Multicast route MUST be set to zero just as in the VLAN
Based service in [RFC7432].
draft-ietf-bess-evpn-overlay-05:
For the VLAN-based service (a single VNI per MAC-VRF), the Ethernet
Tag field in the MAC/IP Advertisement, Ethernet AD per EVI, and IMET
route MUST be set to zero just as in the VLAN Based service in
[RFC7432].
draft-ietf-bess-evpn-overlay-04:
For the VLAN-aware bundle service (multiple VNIs per MAC-VRF with
each VNI associated with its own bridge table), the Ethernet Tag
field in the MAC Advertisement, Ethernet AD per EVI, and Inclusive
Multicast route MUST identify a bridge table within a MAC-VRF and the
set of Ethernet Tags for that EVI needs to be configured consistently
on all PEs within that EVI. For local VNIs, the value advertised in
the Ethernet Tag field MUST be set to a VID just as in the VLAN-aware
bundle service in [RFC7432]. Such setting must be done consistently
on all PE devices participating in that EVI within a given domain.
For global VNIs, the value advertised in the Ethernet Tag field
SHOULD be set to a VNI as long as it matches the existing semantics
of the Ethernet Tag, i.e., it identifies a bridge table within a
MAC-VRF and the set of VNIs are configured consistently on each PE in
that EVI.
draft-ietf-bess-evpn-overlay-05:
For the VLAN-aware bundle service (multiple VNIs per MAC-VRF with
each VNI associated with its own bridge table), the Ethernet Tag
field in the MAC Advertisement, Ethernet AD per EVI, and IMET route
MUST identify a bridge table within a MAC-VRF and the set of Ethernet
Tags for that EVI needs to be configured consistently on all PEs
within that EVI. For locally-assigned VNIs, the value advertised in
the Ethernet Tag field MUST be set to a VID just as in the VLAN-aware
bundle service in [RFC7432]. Such setting must be done consistently
on all PE devices participating in that EVI within a given domain.
For global VNIs, the value advertised in the Ethernet Tag field
SHOULD be set to a VNI as long as it matches the existing semantics
of the Ethernet Tag, i.e., it identifies a bridge table within a
MAC-VRF and the set of VNIs are configured consistently on each PE in
that EVI.
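The Ethernet Tag rules for the two service types can be summarized in a small Python sketch. It is only an illustration under stated assumptions: the `Service` enum and the `ethernet_tag` function are hypothetical names, and the global-VNI branch follows the SHOULD behavior without modeling the case where an implementation chooses otherwise.

```python
from enum import Enum
from typing import Optional


class Service(Enum):
    VLAN_BASED = "vlan-based"                # single VNI per MAC-VRF
    VLAN_AWARE_BUNDLE = "vlan-aware-bundle"  # multiple VNIs per MAC-VRF


def ethernet_tag(service: Service,
                 vni_is_global: bool = False,
                 vid: Optional[int] = None,
                 vni: Optional[int] = None) -> int:
    """Pick the Ethernet Tag value to advertise for a bridge table."""
    if service is Service.VLAN_BASED:
        # VLAN-based service: the Ethernet Tag MUST be zero.
        return 0
    if vni_is_global:
        # VLAN-aware bundle, global VNI: the tag SHOULD be the VNI.
        if vni is None:
            raise ValueError("a global VNI is required")
        return vni
    # VLAN-aware bundle, locally assigned VNI: the tag MUST be a VID.
    if vid is None:
        raise ValueError("a VID is required for locally assigned VNIs")
    return vid
```

In both VLAN-aware cases the chosen values still have to be configured consistently on every PE in the EVI; the sketch does not check that.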
In order to indicate which type of data plane encapsulation In order to indicate which type of data plane encapsulation
(i.e., VXLAN, NVGRE, MPLS, or MPLS in GRE) is to be used, the BGP (i.e., VXLAN, NVGRE, MPLS, or MPLS in GRE) is to be used, the BGP
Encapsulation extended community defined in [TUNNEL-ENCAP]and Encapsulation extended community defined in [TUNNEL-ENCAP] and
[RFC5512] is included with all EVPN routes (i.e. MAC Advertisement, [RFC5512] is included with all EVPN routes (i.e. MAC Advertisement,
Ethernet AD per EVI, Ethernet AD per ESI, Inclusive Multicast Ethernet AD per EVI, Ethernet AD per ESI, Inclusive Multicast
Ethernet Tag, and Ethernet Segment) advertised by an egress PE. Five Ethernet Tag, and Ethernet Segment) advertised by an egress PE. Five
new values have been assigned by IANA to extend the list of new values have been assigned by IANA to extend the list of
encapsulation types defined in [TUNNEL-ENCAP] and they are listed in encapsulation types defined in [TUNNEL-ENCAP] and they are listed in
section 13. section 13.
The MPLS encapsulation tunnel type, listed in section 13, is needed The MPLS encapsulation tunnel type, listed in section 13, is needed
in order to distinguish between an advertising node that only in order to distinguish between an advertising node that only
supports non-MPLS encapsulations and one that supports MPLS and non- supports non-MPLS encapsulations and one that supports MPLS and non-
skipping to change at page 13, line 18 skipping to change at page 12, line 51
The Ethernet Segment and Ethernet AD per ESI routes MAY be advertised The Ethernet Segment and Ethernet AD per ESI routes MAY be advertised
with multiple encapsulation types as long as they use the same EVPN with multiple encapsulation types as long as they use the same EVPN
multi-homing procedures - e.g., the mix of VXLAN and NVGRE multi-homing procedures - e.g., the mix of VXLAN and NVGRE
encapsulation types is a valid one but not the mix of VXLAN and MPLS encapsulation types is a valid one but not the mix of VXLAN and MPLS
encapsulation types. encapsulation types.
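One way to picture the rule about mixing encapsulation types on Ethernet Segment and Ethernet AD per ESI routes is a check that all advertised types share one multi-homing procedure family. The Python sketch below is an assumption-laden illustration: the family labels ("ip-overlay", "mpls") and the table itself are hypothetical, and placing MPLS over GRE in the MPLS family is inferred from the statement that it has no impact on EVPN procedures.

```python
from typing import Iterable

# Hypothetical grouping of encapsulations by multi-homing procedure family.
MULTIHOMING_FAMILY = {
    "VXLAN": "ip-overlay",
    "NVGRE": "ip-overlay",
    "MPLS": "mpls",
    "MPLS-over-GRE": "mpls",  # assumption: same procedures as native MPLS
}


def is_valid_mix(encaps: Iterable[str]) -> bool:
    """True if every encapsulation uses the same multi-homing procedures."""
    families = set()
    for encap in encaps:
        if encap not in MULTIHOMING_FAMILY:
            raise ValueError(f"unknown encapsulation: {encap}")
        families.add(MULTIHOMING_FAMILY[encap])
    return len(families) <= 1
```

With this table, a VXLAN plus NVGRE advertisement passes the check, while VXLAN plus MPLS does not.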
The Next Hop field of the MP_REACH_NLRI attribute of the route MUST The Next Hop field of the MP_REACH_NLRI attribute of the route MUST
be set to the IPv4 or IPv6 address of the NVE. The remaining fields be set to the IPv4 or IPv6 address of the NVE. The remaining fields
in each route are set as per [RFC7432]. in each route are set as per [RFC7432].
Note that the procedure defined here to use the MPLS Label field to
carry the VNI in the presence of a Tunnel Encapsulation Extended
Community specifying the use of a VNI, is aligned with the procedures
described in section 8.2.2.2 of [tunnel-encap] ("When a Valid VNI has
not been Signaled").
5.2 MPLS over GRE 5.2 MPLS over GRE
The EVPN data-plane is modeled as an EVPN MPLS client layer sitting The EVPN data-plane is modeled as an EVPN MPLS client layer sitting
over an MPLS PSN-tunnel server layer. Some of the EVPN functions over an MPLS PSN-tunnel server layer. Some of the EVPN functions
(split-horizon, aliasing, and backup-path) are tied to the MPLS (split-horizon, aliasing, and backup-path) are tied to the MPLS
client layer. If MPLS over GRE encapsulation is used, then the EVPN client layer. If MPLS over GRE encapsulation is used, then the EVPN
MPLS client layer can be carried over an IP PSN tunnel transparently. MPLS client layer can be carried over an IP PSN tunnel transparently.
Therefore, there is no impact to the EVPN procedures and associated Therefore, there is no impact to the EVPN procedures and associated
data-plane operation. data-plane operation.
skipping to change at page 13, line 45 skipping to change at page 13, line 35
extended community indicating MPLS over GRE encapsulation, as extended community indicating MPLS over GRE encapsulation, as
described in previous section. described in previous section.
6 EVPN with Multiple Data Plane Encapsulations 6 EVPN with Multiple Data Plane Encapsulations
draft-ietf-bess-evpn-overlay-04:
The use of the BGP Encapsulation extended community per
[TUNNEL-ENCAP] and [RFC5512] allows each NVE in a given EVI to know
each of the encapsulations supported by each of the other NVEs in
that EVI. i.e., each of the NVEs in a given EVI may support multiple
data plane encapsulations. An ingress NVE can send a frame to an
egress NVE only if the set of encapsulations advertised by the egress
NVE in the subject MAC/IP Advertisement or per EVI Ethernet AD route,
forms a non-empty intersection with the set of encapsulations
supported by the ingress NVE, and it is at the discretion of the
ingress NVE which encapsulation to choose from this intersection. (As
noted in section 5.1.3, if the BGP Encapsulation extended community
is not present, then the default MPLS encapsulation or a statically
configured encapsulation is assumed.)
draft-ietf-bess-evpn-overlay-05:
The use of the BGP Encapsulation extended community per
[TUNNEL-ENCAP] and [RFC5512] allows each NVE in a given EVI to know
each of the encapsulations supported by each of the other NVEs in
that EVI. i.e., each of the NVEs in a given EVI may support multiple
data plane encapsulations. An ingress NVE can send a frame to an
egress NVE only if the set of encapsulations advertised by the egress
NVE forms a non-empty intersection with the set of encapsulations
supported by the ingress NVE, and it is at the discretion of the
ingress NVE which encapsulation to choose from this intersection. (As
noted in section 5.1.3, if the BGP Encapsulation extended community
is not present, then the default MPLS encapsulation or a locally
configured encapsulation is assumed.)
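The selection rule above amounts to a set intersection with a fallback. The following Python sketch shows one possible reading of it; the function name, the string labels for encapsulations, and the idea of expressing the ingress NVE's discretion as an ordered preference list are all assumptions made for illustration.

```python
from typing import Iterable, Optional, Sequence

# Assumed when the egress NVE advertises no Encapsulation extended community.
DEFAULT_ENCAPS = ("MPLS",)


def choose_encap(ingress_preference: Sequence[str],
                 egress_advertised: Iterable[str],
                 default_encaps: Iterable[str] = DEFAULT_ENCAPS) -> Optional[str]:
    """Return the ingress NVE's preferred encapsulation that the egress
    NVE also supports, or None if the intersection is empty."""
    # An empty advertisement falls back to the default (or locally
    # configured) encapsulation set.
    advertised = set(egress_advertised) or set(default_encaps)
    for encap in ingress_preference:
        if encap in advertised:
            return encap
    return None
```

A `None` result corresponds to the case where the ingress NVE cannot send frames to that egress NVE, which is the condition the operator is expected to prevent by provisioning at least one common encapsulation.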
An ingress node that uses shared multicast trees for sending An ingress node that uses shared multicast trees for sending
broadcast or multicast frames MUST maintain distinct trees for each broadcast or multicast frames MAY maintain distinct trees for each
different encapsulation type. different encapsulation type.
It is the responsibility of the operator of a given EVI to ensure It is the responsibility of the operator of a given EVI to ensure
that all of the NVEs in that EVI support at least one common that all of the NVEs in that EVI support at least one common
encapsulation. If this condition is violated, it could result in encapsulation. If this condition is violated, it could result in
service disruption or failure. The use of the BGP Encapsulation service disruption or failure. The use of the BGP Encapsulation
extended community provides a method to detect when this condition is extended community provides a method to detect when this condition is
violated but the actions to be taken are at the discretion of the violated but the actions to be taken are at the discretion of the
operator and are outside the scope of this document. operator and are outside the scope of this document.
skipping to change at page 14, line 38 skipping to change at page 14, line 26
on other servers, and thus does not require any specific protocol on other servers, and thus does not require any specific protocol
mechanisms. The most common case of this is when the NVE resides on mechanisms. The most common case of this is when the NVE resides on
the hypervisor. the hypervisor.
In the sub-sections that follow, we will discuss the impact on EVPN In the sub-sections that follow, we will discuss the impact on EVPN
procedures for the case when the NVE resides on the hypervisor and procedures for the case when the NVE resides on the hypervisor and
the VXLAN (or NVGRE) encapsulation is used. the VXLAN (or NVGRE) encapsulation is used.
7.1 Impact on EVPN BGP Routes & Attributes for VXLAN/NVGRE Encapsulation 7.1 Impact on EVPN BGP Routes & Attributes for VXLAN/NVGRE Encapsulation
In the scenario where all data centers are under a single In scenarios where different groups of data centers are under
administrative domain, and there is a single global VNI space, the RD different administrative domains, and these data centers are
MAY be set to zero in the EVPN routes. However, in the scenario where connected via one or more backbone core providers as described in
different groups of data centers are under different administrative [NOV3-Framework], the RD must be a unique value per EVI or per NVE as
domains, and these data centers are connected via one or more described in [RFC7432]. In other words, whenever there is more than
backbone core providers as described in [NOV3-Framework], the RD must one administrative domain for global VNI, then a unique RD MUST be
be a unique value per EVI or per NVE as described in [RFC7432]. In used, or whenever the VNI value has local significance, then a
other words, whenever there is more than one administrative domain unique RD MUST be used. Therefore, it is recommended to use a unique RD
for global VNI, then a non-zero RD MUST be used, or whenever the VNI as described in [RFC7432] at all times.
value has local significance, then a non-zero RD MUST be used. It is
recommended to use a non-zero RD at all times.
When the NVEs reside on the hypervisor, the EVPN BGP routes and When the NVEs reside on the hypervisor, the EVPN BGP routes and
attributes associated with multi-homing are no longer required. This attributes associated with multi-homing are no longer required. This
reduces the required routes and attributes to the following subset of reduces the required routes and attributes to the following subset of
four out of eight: four out of eight:
- MAC/IP Advertisement Route - MAC/IP Advertisement Route
- Inclusive Multicast Ethernet Tag Route - Inclusive Multicast Ethernet Tag Route
- MAC Mobility Extended Community - MAC Mobility Extended Community
- Default Gateway Extended Community - Default Gateway Extended Community
skipping to change at page 16, line 14 skipping to change at page 15, line 47
Top of Rack (ToR) switches AND the servers (where VMs are residing) Top of Rack (ToR) switches AND the servers (where VMs are residing)
are multi-homed to these ToR switches. The multi-homing may operate are multi-homed to these ToR switches. The multi-homing may operate
in All-Active or Single-Active redundancy mode. If the servers are in All-Active or Single-Active redundancy mode. If the servers are
single-homed to the ToR switches, then the scenario becomes similar single-homed to the ToR switches, then the scenario becomes similar
to that where the NVE resides on the hypervisor, as discussed in to that where the NVE resides on the hypervisor, as discussed in
Section 7, as far as the required EVPN functionality are concerned. Section 7, as far as the required EVPN functionality are concerned.
[RFC7432] defines a set of BGP routes, attributes and procedures to [RFC7432] defines a set of BGP routes, attributes and procedures to
support multi-homing. We first describe these functions and support multi-homing. We first describe these functions and
procedures, then discuss which of these are impacted by the VxLAN procedures, then discuss which of these are impacted by the VxLAN
(or NVGRE) encapsulation and what modifications are required. (or NVGRE) encapsulation and what modifications are required. As
will be seen later in this section, the only EVPN procedure that is
impacted by IP underlay tunnels is that of split-horizon filtering
for multi-homed Ethernet Segments described in section 8.3.1.
8.1 EVPN Multi-Homing Features 8.1 EVPN Multi-Homing Features
In this section, we will recap the multi-homing features of EVPN to In this section, we will recap the multi-homing features of EVPN to
highlight the encapsulation dependencies. The section only describes highlight the encapsulation dependencies. The section only describes
the features and functions at a high-level. For more details, the the features and functions at a high-level. For more details, the
reader is to refer to [RFC7432]. reader is to refer to [RFC7432].
8.1.1 Multi-homed Ethernet Segment Auto-Discovery 8.1.1 Multi-homed Ethernet Segment Auto-Discovery
EVPN NVEs (or PEs) connected to the same Ethernet Segment (e.g. the EVPN NVEs (or PEs) connected to the same Ethernet Segment (e.g. the
same server via LAG) can automatically discover each other with same server via LAG) can automatically discover each other with
minimal to no configuration through the exchange of BGP routes. minimal to no configuration through the exchange of BGP routes.
skipping to change at page 16, line 49 skipping to change at page 16, line 35
for all MAC addresses associated with the Ethernet segment in for all MAC addresses associated with the Ethernet segment in
question. If no other NVE had advertised an Ethernet A-D route for question. If no other NVE had advertised an Ethernet A-D route for
the same segment, then the NVE that received the withdrawal simply the same segment, then the NVE that received the withdrawal simply
invalidates the MAC entries for that segment. Otherwise, the NVE invalidates the MAC entries for that segment. Otherwise, the NVE
updates the next-hop adjacency list accordingly. updates the next-hop adjacency list accordingly.
8.1.3 Split-Horizon 8.1.3 Split-Horizon
If a server is multi-homed to two or more NVEs (represented by an If a server is multi-homed to two or more NVEs (represented by an
Ethernet segment ES1) and operating in an all-active redundancy mode, Ethernet segment ES1) and operating in an all-active redundancy mode,
sends a BUM packet (ie, Broadcast, Unknown unicast, or Multicast) sends a BUM packet (ie, Broadcast, Unknown unicast, or Multicast) to
packet to one of these NVEs, then it is important to ensure the one of these NVEs, then it is important to ensure the packet is not
packet is not looped back to the server via another NVE connected to looped back to the server via another NVE connected to this server.
this server. The filtering mechanism on the NVE to prevent such loop The filtering mechanism on the NVE to prevent such loop and packet
and packet duplication is called "split horizon filtering". duplication is called "split horizon filtering".
8.1.4 Aliasing and Backup-Path 8.1.4 Aliasing and Backup-Path
In the case where a station is multi-homed to multiple NVEs, it is In the case where a station is multi-homed to multiple NVEs, it is
possible that only a single NVE learns a set of the MAC addresses possible that only a single NVE learns a set of the MAC addresses
associated with traffic transmitted by the station. This leads to a associated with traffic transmitted by the station. This leads to a
situation where remote NVEs receive MAC advertisement routes, for situation where remote NVEs receive MAC advertisement routes, for
these addresses, from a single NVE even though multiple NVEs are these addresses, from a single NVE even though multiple NVEs are
connected to the multi-homed station. As a result, the remote NVEs connected to the multi-homed station. As a result, the remote NVEs
are not able to effectively load-balance traffic among the NVEs are not able to effectively load-balance traffic among the NVEs
skipping to change at page 18, line 14 skipping to change at page 17, line 47
responsible for sending it broadcast, multicast, and, if configured responsible for sending it broadcast, multicast, and, if configured
for that EVI, unknown unicast frames. for that EVI, unknown unicast frames.
This is required in order to prevent duplicate delivery of multi- This is required in order to prevent duplicate delivery of multi-
destination frames to a multi-homed host or VM, in case of all-active destination frames to a multi-homed host or VM, in case of all-active
redundancy. redundancy.
In NVEs where .1Q tagged frames are received from hosts, the DF In NVEs where .1Q tagged frames are received from hosts, the DF
election is performed on host VLAN IDs (VIDs). It is assumed that for election is performed on host VLAN IDs (VIDs). It is assumed that for
a given Ethernet Segment, VIDs are unique and consistent (e.g., no a given Ethernet Segment, VIDs are unique and consistent (e.g., no
duplicate VIDs exist). duplicate VIDs exist). In NVEs where QinQ tagged frames are received
from hosts, the DF election is performed on both tags if the NVEs
are configured to identify an EVI/BD based on both tags; if the
NVEs are configured to identify an EVI/BD based on a single tag, the
DF election is performed on the outer tag.
In GWs where VxLAN encapsulated frames are received, the DF election In GWs where VxLAN encapsulated frames are received, the DF election
is performed on VNIs. Again, it is assumed that for a given Ethernet is performed on VNIs. Again, it is assumed that for a given Ethernet
Segment, VNIs are unique and consistent (e.g., no duplicate VNIs Segment, VNIs are unique and consistent (e.g., no duplicate VNIs
exist). exist).
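The choice of the value on which the DF election is performed can be sketched as follows. This is a minimal sketch, not a normative procedure: it assumes the default election of [RFC7432] (candidates ordered by increasing IP address, DF chosen by the election value modulo the number of candidates). How two QinQ tags are combined into a single value is not specified in this document; the 12-bit concatenation below is purely an assumption, and all function names are hypothetical.

```python
import ipaddress
from typing import List, Optional


def election_key(outer_tag: int, inner_tag: Optional[int] = None,
                 match_both_tags: bool = False) -> int:
    """Pick the value the DF election runs on.

    .1Q frames: the VID (pass it as outer_tag).
    QinQ frames, EVI/BD identified by a single tag: the outer tag.
    QinQ frames, EVI/BD identified by both tags: both tags.
    For VxLAN encapsulated frames at a GW, pass the VNI as outer_tag.
    """
    if match_both_tags:
        if inner_tag is None:
            raise ValueError("inner tag required when both tags are used")
        # ASSUMPTION: concatenate the two 12-bit VIDs into one value.
        return (outer_tag << 12) | inner_tag
    return outer_tag


def elect_df(candidate_ips: List[str], key: int) -> str:
    """Default [RFC7432]-style election: ordinal = key mod N."""
    if not candidate_ips:
        raise ValueError("no candidate NVEs for this Ethernet Segment")
    ordered = sorted(candidate_ips,
                     key=lambda ip: int(ipaddress.ip_address(ip)))
    return ordered[key % len(ordered)]


# Two NVEs on the same Ethernet Segment, host VID 10.
df = elect_df(["192.0.2.2", "192.0.2.1"], election_key(10))
```

With two candidates, even election values select the NVE with the lower address and odd values select the other, which is why consistent VIDs (or VNIs) across the Ethernet Segment are assumed.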
8.2 Impact on EVPN BGP Routes & Attributes 8.2 Impact on EVPN BGP Routes & Attributes
Since multi-homing is supported in this scenario, then the entire set Since multi-homing is supported in this scenario, then the entire set
of BGP routes and attributes defined in [RFC7432] are used. The of BGP routes and attributes defined in [RFC7432] are used. The
setting of the Ethernet Tag field in the MAC Advertisement, Ethernet setting of the Ethernet Tag field in the MAC Advertisement, Ethernet
AD per EVI, and Inclusive Multicast routes follows that of section AD per EVI, and Inclusive Multicast routes follows that of section
5.1.3. Furthermore, the setting of the VNI field in the MAC 5.1.3. Furthermore, the setting of the VNI field in the MAC
Advertisement and Ethernet AD per EVI routes follows that of section Advertisement and Ethernet AD per EVI routes follows that of section
5.1.3. 5.1.3.
8.3 Impact on EVPN Procedures 8.3 Impact on EVPN Procedures
Two cases need to be examined here, depending on whether the NVEs are Two cases need to be examined here, depending on whether the NVEs are
operating in Active/Standby or in All-Active redundancy. operating in Single-Active or in All-Active redundancy mode.
First, let's consider the case of Active/Standby redundancy, where the First, let's consider the case of Single-Active redundancy mode, where
hosts are multi-homed to a set of NVEs, however, only a single NVE is the hosts are multi-homed to a set of NVEs, however, only a single
active at a given point of time for a given VNI. In this case, the NVE is active at a given point of time for a given VNI. In this case,
aliasing is not required and the split-horizon may not be required, the aliasing is not required and the split-horizon filtering may not
but other functions such as multi-homed Ethernet segment auto- be required, but other functions such as multi-homed Ethernet segment
discovery, fast convergence and mass withdraw, backup path, and DF auto-discovery, fast convergence and mass withdraw, backup path, and
election are required. DF election are required.
Second, let's consider the case of All-Active redundancy. In this Second, let's consider the case of All-Active redundancy mode. In
case, out of all the EVPN multi-homing features listed in section this case, out of all the EVPN multi-homing features listed in
8.1, the use of the VXLAN or NVGRE encapsulation impacts the split- section 8.1, the use of the VXLAN or NVGRE encapsulation impacts the
horizon and aliasing features, since those two rely on the MPLS split-horizon and aliasing features, since those two rely on the MPLS
client layer. Given that this MPLS client layer is absent with these client layer. Given that this MPLS client layer is absent with these
types of encapsulations, alternative procedures and mechanisms are types of encapsulations, alternative procedures and mechanisms are
needed to provide the required functions. Those are discussed in needed to provide the required functions. Those are discussed in
detail next. detail next.
8.3.1 Split Horizon 8.3.1 Split Horizon
In EVPN, an MPLS label is used for split-horizon filtering to support In EVPN, an MPLS label is used for split-horizon filtering to support
All-Active multi-homing where an ingress NVE adds a label All-Active multi-homing where an ingress NVE adds a label
corresponding to the site of origin (aka ESI Label) when corresponding to the site of origin (aka ESI Label) when
skipping to change at page 19, line 48 skipping to change at page 19, line 38
In order to prevent unhealthy interactions between the split horizon In order to prevent unhealthy interactions between the split horizon
procedures defined in [RFC7432] and the local bias procedures procedures defined in [RFC7432] and the local bias procedures
described in this document, a mix of MPLS over GRE encapsulations on described in this document, a mix of MPLS over GRE encapsulations on
the one hand and VXLAN/NVGRE encapsulations on the other on a given the one hand and VXLAN/NVGRE encapsulations on the other on a given
Ethernet Segment is prohibited. Ethernet Segment is prohibited.
8.3.2 Aliasing and Backup-Path 8.3.2 Aliasing and Backup-Path
The Aliasing and the Backup-Path procedures for VXLAN/NVGRE The Aliasing and the Backup-Path procedures for VXLAN/NVGRE
encapsulation are very similar to the ones for MPLS. In case of MPLS, encapsulation are very similar to the ones for MPLS. In case of MPLS,
two different Ethernet A-D routes are used for this purpose. The one Ethernet A-D route per EVI is used for Aliasing when the
used for Aliasing has a VPN scope (per EVI) and carries a VPN label corresponding Ethernet Segment operates in All-Active multi-homing,
but the one used for Backup-Path has Ethernet segment scope (per ES) and the same route is used for Backup-Path when the corresponding
and doesn't carry any VPN specific info (e.g., Ethernet Tag and MPLS Ethernet Segment operates in Single-Active multi-homing. In case of
label are set to zero). In case of VxLAN/NVGRE, the same two routes VxLAN/NVGRE, the same route is used for the Aliasing and the Backup-
are used for the Aliasing and the Backup-Path. In case of Aliasing, Path with the difference that the Ethernet Tag and VNI fields in
the Ethernet Tag and VNI fields in Ethernet A-D per EVI route are set Ethernet A-D per EVI route are set as described in section 5.1.3.
as described in section 5.1.3.
9 Support for Multicast 9 Support for Multicast
The E-VPN Inclusive Multicast BGP route is used to discover the The E-VPN Inclusive Multicast BGP route is used to discover the
multicast tunnels among the endpoints associated with a given EVI multicast tunnels among the endpoints associated with a given EVI
(e.g., given VNI) for VLAN-based service and a given <EVI,VLAN> for (e.g., given VNI) for VLAN-based service and a given <EVI,VLAN> for
VLAN-aware bundle service. The Ethernet Tag field of this route is VLAN-aware bundle service. The Ethernet Tag field of this route is
set as described in section 5.1.3. The Originating router's IP set as described in section 5.1.3. The Originating router's IP
address field is set to the NVE's IP address. This route is tagged address field is set to the NVE's IP address. This route is tagged
with the PMSI Tunnel attribute, which is used to encode the type of with the PMSI Tunnel attribute, which is used to encode the type of
multicast tunnel to be used as well as the multicast tunnel multicast tunnel to be used as well as the multicast tunnel
identifier. The tunnel encapsulation is encoded by adding the BGP identifier. The tunnel encapsulation is encoded by adding the BGP
Encapsulation extended community as per section 5.1.1. The following Encapsulation extended community as per section 5.1.1. For example,
tunnel types as defined in [RFC6514] can be used in the PMSI tunnel the PMSI Tunnel attribute may indicate the multicast tunnel is of
attribute for VXLAN/NVGRE: type PIM-SM; whereas, the BGP Encapsulation extended community may
indicate the encapsulation for that tunnel is of type VxLAN. The
following tunnel types as defined in [RFC6514] can be used in the
PMSI tunnel attribute for VXLAN/NVGRE:
+ 3 - PIM-SSM Tree + 3 - PIM-SSM Tree
+ 4 - PIM-SM Tree + 4 - PIM-SM Tree
+ 5 - BIDIR-PIM Tree + 5 - BIDIR-PIM Tree
+ 6 - Ingress Replication + 6 - Ingress Replication
Except for Ingress Replication, this multicast tunnel is used by the Except for Ingress Replication, this multicast tunnel is used by the
PE originating the route for sending multicast traffic to other PEs, PE originating the route for sending multicast traffic to other PEs,
and is used by PEs that receive this route for receiving the traffic and is used by PEs that receive this route for receiving the traffic
originated by hosts connected to the PE that originated the route. originated by hosts connected to the PE that originated the route.
In the scenario where the multicast tunnel is a tree, both the In the scenario where the multicast tunnel is a tree, both the
Inclusive as well as the Aggregate Inclusive variants may be used. In Inclusive as well as the Aggregate Inclusive variants may be used. In
the former case, a multicast tree is dedicated to a VNI. Whereas, in the former case, a multicast tree is dedicated to a VNI. Whereas, in
the latter, a multicast tree is shared among multiple VNIs. This is the latter, a multicast tree is shared among multiple VNIs. For VNI-
done by having the NVEs advertise multiple Inclusive Multicast routes based service, the Aggregate Inclusive mode is accomplished by having
with different VNI encoded in the Ethernet Tag field, but with the the NVEs advertise multiple IMET routes with different Route Targets
same tunnel identifier encoded in the PMSI Tunnel attribute. (one per VNI) but with the same tunnel identifier encoded in the PMSI
tunnel attribute. For VNI-aware bundle service, the Aggregate
Inclusive mode is accomplished by having the NVEs advertise multiple
IMET routes with different VNI encoded in the Ethernet Tag field, but
with the same tunnel identifier encoded in the PMSI Tunnel attribute.
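The two Aggregate Inclusive variants described above can be sketched as a small data model. This is an illustration under stated assumptions, not an encoding of the BGP route: the class and function names, the Route Target strings, and the example addresses are hypothetical. The Ethernet Tag value of zero for VNI-based service and the single shared Route Target for the bundle are assumptions of this sketch; the normative settings are those of section 5.1.3.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class PmsiTunnelType(IntEnum):
    """Tunnel types from [RFC6514] usable for VXLAN/NVGRE."""
    PIM_SSM_TREE = 3
    PIM_SM_TREE = 4
    BIDIR_PIM_TREE = 5
    INGRESS_REPLICATION = 6


@dataclass(frozen=True)
class ImetRoute:
    originator_ip: str           # the NVE's IP address
    ethernet_tag: int
    route_target: str            # hypothetical textual form
    tunnel_type: PmsiTunnelType  # carried in the PMSI Tunnel attribute
    tunnel_id: str               # shared multicast tree identifier
    encapsulation: str           # BGP Encapsulation extended community


def aggregate_imet_routes(originator_ip: str, vnis: List[int],
                          tunnel_id: str, vni_aware_bundle: bool,
                          tunnel_type: PmsiTunnelType = PmsiTunnelType.PIM_SM_TREE,
                          encapsulation: str = "VXLAN") -> List[ImetRoute]:
    """Build one IMET route per VNI, all sharing the same tunnel_id."""
    routes = []
    for vni in vnis:
        if vni_aware_bundle:
            # Bundle: routes differ in the Ethernet Tag (the VNI).
            tag, rt = vni, "target:65000:1"  # ASSUMPTION: one RT per bundle
        else:
            # VNI-based: routes differ in the Route Target (one per VNI).
            tag, rt = 0, f"target:65000:{vni}"  # ASSUMPTION: tag is zero
        routes.append(ImetRoute(originator_ip, tag, rt,
                                tunnel_type, tunnel_id, encapsulation))
    return routes


based = aggregate_imet_routes("192.0.2.1", [100, 200], "239.1.1.1", False)
bundle = aggregate_imet_routes("192.0.2.1", [100, 200], "239.1.1.1", True)
```

In both variants every route carries the same tunnel identifier, which is what makes the tree shared among the VNIs; only the field that distinguishes the routes changes.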
10 Data Center Interconnections - DCI 10 Data Center Interconnections - DCI
For DCI, the following two main scenarios are considered when For DCI, the following two main scenarios are considered when
connecting data centers running evpn-overlay (as described here) over connecting data centers running evpn-overlay (as described here) over
MPLS/IP core network: MPLS/IP core network:
- Scenario 1: DCI using GWs - Scenario 1: DCI using GWs
- Scenario 2: DCI using ASBRs - Scenario 2: DCI using ASBRs
The following two subsections describe the operations for each of The following two subsections describe the operations for each of
these scenarios. these scenarios.
10.1 DCI using GWs 10.1 DCI using GWs
This is the typical scenario for interconnecting data centers over This is the typical scenario for interconnecting data centers over
WAN. In this scenario, EVPN routes are terminated and processed in WAN. In this scenario, EVPN routes are terminated and processed in
each GW and MAC/IP routes are always re-advertised from DC to WAN but each GW and MAC/IP routes are always re-advertised from DC to WAN but
from WAN to DC, they are not re-advertised if unknown MAC address from WAN to DC, they are not re-advertised if unknown MAC address
(and default IP address) are utilized in NVEs. In this scenario, each (and default IP address) are utilized in NVEs. In this scenario, each
GW maintains a MAC-VRF (and/or IP-VRF) for each EVI. The main GW maintains a MAC-VRF (and/or IP-VRF) for each EVI. The main
advantage of this approach is that NVEs do not need to maintain MAC advantage of this approach is that NVEs do not need to maintain MAC
and IP addresses from any remote data centers when default IP route and IP addresses from any remote data centers when default IP route
and unknown MAC routes are used - i.e., they only need to maintain and unknown MAC routes are used - i.e., they only need to maintain
routes that are local to their own DC. When default IP route and routes that are local to their own DC. When default IP route and
skipping to change at page 24, line 12 skipping to change at page 24, line 6
In the above example, the PE3 receives two Aliasing routes with the In the above example, the PE3 receives two Aliasing routes with the
same BGP next hop (ASBR2) but different RDs. One of the Aliasing routes same BGP next hop (ASBR2) but different RDs. One of the Aliasing routes
has the same RD as the advertised MAC route (M1). PE3 follows the has the same RD as the advertised MAC route (M1). PE3 follows the
route resolution procedure specified in [RFC7432] upon receiving the route resolution procedure specified in [RFC7432] upon receiving the
two Aliasing routes - ie, it resolves M1 to <ES, EVI1> and two Aliasing routes - ie, it resolves M1 to <ES, EVI1> and
subsequently it resolves <ES,EVI1> to a BGP path list with two paths subsequently it resolves <ES,EVI1> to a BGP path list with two paths
along with the corresponding VNIs/MPLS labels (one associated with along with the corresponding VNIs/MPLS labels (one associated with
PE1 and the other associated with PE2). It should be noted that even PE1 and the other associated with PE2). It should be noted that even
though both paths are advertised by the same BGP next hop (ASBR2), though both paths are advertised by the same BGP next hop (ASBR2),
the receiving PE3 can handle them properly. Therefore, M1 is the receiving PE3 can handle them properly. Therefore, M1 is
reachable via two paths. This creates two end-to-end LSPs from PE3 to reachable via two paths. This creates two end-to-end LSPs, from PE3
PE1 for M1 such that when PE3 wants to forward traffic destined to to PE1 and from PE3 to PE2, for M1 such that when PE3 wants to
M1, it can load balance between the two paths. Although route forward traffic destined to M1, it can load balance between the two
resolution for Aliasing routes with the same BGP next hop is not LSPs. Although route resolution for Aliasing routes with the same BGP
explicitly mentioned in [RFC7432], the is the expected operation and next hop is not explicitly mentioned in [RFC7432], this is the
thus it is elaborated here. expected operation and thus it is elaborated here.
When the AC between the PE2 and the CE fails and PE2 sends NLRI When the AC between the PE2 and the CE fails and PE2 sends NLRI
withdrawal for Ether A-D per EVI routes and these withdrawals get withdrawal for Ether A-D per EVI routes and these withdrawals get
propagated and received by the PE3, the PE3 removes the Aliasing propagated and received by the PE3, the PE3 removes the Aliasing
route and updates the path list - ie, it removes the path route and updates the path list - ie, it removes the path
corresponding to the PE2. Therefore, all the corresponding MAC routes corresponding to the PE2. Therefore, all the corresponding MAC routes
for that <ES,EVI> that point to that path list will now have the for that <ES,EVI> that point to that path list will now have the
updated path list with a single path associated with PE1. This action updated path list with a single path associated with PE1. This action
can be considered as the mass-withdraw at the per-EVI level. The can be considered as the mass-withdraw at the per-EVI level. The
mass-withdraw at per-EVI level has longer convergence time than the mass-withdraw at per-EVI level has longer convergence time than the
mass-withdraw at per-ES level; however, it is much faster than the mass-withdraw at per-ES level; however, it is much faster than the
convergence time when the withdraw is done on a per-MAC basis. convergence time when the withdraw is done on a per-MAC basis.
If a PE becomes detached from a given ES, then in addition to
withdrawing its previously advertised Ethernet AD Per ES routes, it
MUST also withdraw its previously advertised Ethernet AD Per EVI
routes for that ES. For a remote PE that is separated from the
withdrawing PE by one or more EVPN inter-AS option B ASBRs, the
withdrawal of the Ethernet AD Per ES routes is not actionable.
However, a remote PE is able to correlate a previously advertised
Ethernet AD Per EVI route with any MAC/IP Advertisement routes also
advertised by the withdrawing PE for that <ES, EVI, BD>. Hence, when
it receives the withdrawal of an Ethernet AD Per EVI route, it SHOULD
remove the withdrawing PE as a next-hop for all MAC addresses
associated with that <ES, EVI, BD>.
In the previous example, when the AC between PE2 and the CE fails,
PE2 will withdraw its Ethernet AD Per ES and Per EVI routes. When
PE3 receives the withdrawal of an Ethernet AD Per EVI route, it
removes PE2 as a valid next-hop for all MAC addresses associated with
the corresponding <ES, EVI, BD>. Therefore, all the MAC addresses
for that <ES, EVI, BD> will now have a single next-hop, namely the LSP to
PE1.
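The route resolution and the per-EVI mass-withdraw seen by PE3 in this example can be sketched as follows. The sketch is deliberately simplified: the class and method names are hypothetical, the BD is omitted from the key, and each path is identified by the name of the originating PE rather than by RD, BGP next hop, and VNI/MPLS label.

```python
from typing import Dict, List, Tuple

EsEvi = Tuple[str, str]  # (Ethernet Segment, EVI); BD omitted for brevity


class RemotePe:
    """State kept by a remote PE (PE3 in the example)."""

    def __init__(self) -> None:
        self.mac_to_es_evi: Dict[str, EsEvi] = {}
        self.path_lists: Dict[EsEvi, List[str]] = {}

    def learn_mac(self, mac: str, es: str, evi: str) -> None:
        """MAC/IP Advertisement route: the MAC resolves to <ES, EVI>."""
        self.mac_to_es_evi[mac] = (es, evi)

    def add_ad_per_evi(self, es: str, evi: str, pe: str) -> None:
        """Ethernet A-D per EVI route: add the PE to the path list."""
        paths = self.path_lists.setdefault((es, evi), [])
        if pe not in paths:
            paths.append(pe)

    def withdraw_ad_per_evi(self, es: str, evi: str, pe: str) -> None:
        """Withdrawal: one update fixes every MAC using this path list."""
        paths = self.path_lists.get((es, evi), [])
        if pe in paths:
            paths.remove(pe)

    def next_hops(self, mac: str) -> List[str]:
        """Resolve MAC -> <ES, EVI> -> path list."""
        return list(self.path_lists.get(self.mac_to_es_evi[mac], []))


pe3 = RemotePe()
pe3.learn_mac("M1", "ES", "EVI1")
pe3.add_ad_per_evi("ES", "EVI1", "PE1")
pe3.add_ad_per_evi("ES", "EVI1", "PE2")
before = pe3.next_hops("M1")                 # two paths to load balance over
pe3.withdraw_ad_per_evi("ES", "EVI1", "PE2")  # AC between PE2 and CE failed
after = pe3.next_hops("M1")                  # only the path via PE1 remains
```

Because MAC entries point at a shared path list instead of holding their own next hops, a single withdrawal updates all MAC addresses of that <ES, EVI> at once, which is the reason the per-EVI mass-withdraw converges faster than per-MAC withdrawal.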
In summary, it can be seen that aliasing (and backup path) In summary, it can be seen that aliasing (and backup path)
functionality should work as is for inter-AS option B without functionality should work as is for inter-AS option B without
requiring any additional functionality in ASBRs or PEs. However, the requiring any additional functionality in ASBRs or PEs. However, the
mass-withdraw functionality falls back from per-ES mode to per-EVI mass-withdraw functionality falls back from per-ES mode to per-EVI
mode for inter-AS option B - i.e., PEs receiving mass-withdraw route mode for inter-AS option B - i.e., PEs receiving mass-withdraw route
from the same AS use Ether A-D per ES route; whereas, PEs receiving from the same AS take action on Ether A-D per ES route; whereas, PEs
mass-withdraw route from different AS use Ether A-D per EVI route. receiving mass-withdraw route from different AS take action on Ether
A-D per EVI route.
11 Acknowledgement 11 Acknowledgement
The authors would like to thank Aldrin Isaac, David Smith, John The authors would like to thank Aldrin Isaac, David Smith, John
Mullooly, Thomas Nadeau for their valuable comments and feedback. The Mullooly, Thomas Nadeau for their valuable comments and feedback. The
authors would also like to thank Jakob Heitz for his contribution on authors would also like to thank Jakob Heitz for his contribution on
section 10.2. section 10.2.
12 Security Considerations 12 Security Considerations
skipping to change at page 26, line 36 skipping to change at page 27, line 5
2014 2014
[NVGRE] Garg, P., et al., "NVGRE: Network Virtualization using [NVGRE] Garg, P., et al., "NVGRE: Network Virtualization using
Generic Routing Encapsulation", draft-sridharan-virtualization-nvgre- Generic Routing Encapsulation", draft-sridharan-virtualization-nvgre-
07.txt, November 11, 2014 07.txt, November 11, 2014
[Problem-Statement] Narten et al., "Problem Statement: Overlays for [Problem-Statement] Narten et al., "Problem Statement: Overlays for
Network Virtualization", draft-ietf-nvo3-overlay-problem-statement- Network Virtualization", draft-ietf-nvo3-overlay-problem-statement-
01, September 2012. 01, September 2012.
[L3VPN-ENDSYSTEMS] Marques et al., "BGP-signaled End-system IP/VPNs",
draft-ietf-l3vpn-end-system, work in progress, October 2012.
[NOV3-FRWK] Lasserre et al., "Framework for DC Network [NOV3-FRWK] Lasserre et al., "Framework for DC Network
Virtualization", draft-ietf-nvo3-framework-01.txt, work in progress, Virtualization", draft-ietf-nvo3-framework-01.txt, work in progress,
October 2012. October 2012.
[DCI-EVPN-OVERLAY] Rabadan et al., "Interconnect Solution for EVPN [DCI-EVPN-OVERLAY] Rabadan et al., "Interconnect Solution for EVPN
Overlay networks", draft-ietf-bess-dci-evpn-overlay-02, work in Overlay networks", draft-ietf-bess-dci-evpn-overlay-02, work in
progress, February 29, 2016. progress, February 29, 2016.
[TUNNEL-ENCAP] Rosen et al., "The BGP Tunnel Encapsulation [TUNNEL-ENCAP] Rosen et al., "The BGP Tunnel Encapsulation
Attribute", draft-ietf-idr-tunnel-encaps-02, work in progress, May Attribute", draft-ietf-idr-tunnel-encaps-02, work in progress, May
 End of changes. 43 change blocks. 
117 lines changed or deleted 145 lines changed or added
