draft-ietf-bess-evpn-prefix-advertisement-09.txt   draft-ietf-bess-evpn-prefix-advertisement-10.txt 
skipping to change at page 1, line 14 skipping to change at page 1, line 14
Internet Draft W. Henderickx Internet Draft W. Henderickx
Intended status: Standards Track Nokia Intended status: Standards Track Nokia
J. Drake J. Drake
W. Lin W. Lin
Juniper Juniper
A. Sajassi A. Sajassi
Cisco Cisco
Expires: May 25, 2018 November 21, 2017 Expires: August 31, 2018 February 27, 2018
IP Prefix Advertisement in EVPN IP Prefix Advertisement in EVPN
draft-ietf-bess-evpn-prefix-advertisement-09 draft-ietf-bess-evpn-prefix-advertisement-10
Abstract Abstract
EVPN provides a flexible control plane that allows intra-subnet EVPN provides a flexible control plane that allows intra-subnet
connectivity in an MPLS and/or NVO-based network. In some networks, connectivity in an MPLS and/or NVO-based network. In some networks,
there is also a need for a dynamic and efficient inter-subnet there is also a need for a dynamic and efficient inter-subnet
connectivity across Tenant Systems and End Devices that can be connectivity across Tenant Systems and End Devices that can be
physical or virtual and do not necessarily participate in dynamic physical or virtual and do not necessarily participate in dynamic
routing protocols. This document defines a new EVPN route type for routing protocols. This document defines a new EVPN route type for
the advertisement of IP Prefixes and explains some use-case examples the advertisement of IP Prefixes and explains some use-case examples
skipping to change at page 2, line 7 skipping to change at page 2, line 7
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html http://www.ietf.org/shadow.html
This Internet-Draft will expire on May 25, 2018. This Internet-Draft will expire on August 31, 2018.
Copyright Notice Copyright Notice
Copyright (c) 2017 IETF Trust and the persons identified as the Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Introduction and Problem Statement . . . . . . . . . . . . . . 4 2. Introduction and Problem Statement . . . . . . . . . . . . . . 4
2.1 Inter-Subnet Connectivity Requirements in Data Centers . . . 4 2.1 Inter-Subnet Connectivity Requirements in Data Centers . . . 5
2.2 The Requirement for a New EVPN Route Type . . . . . . . . . 7 2.2 The Requirement for a New EVPN Route Type . . . . . . . . . 7
3. The BGP EVPN IP Prefix Route . . . . . . . . . . . . . . . . . 8 3. The BGP EVPN IP Prefix Route . . . . . . . . . . . . . . . . . 8
3.1 IP Prefix Route Encoding . . . . . . . . . . . . . . . . . . 9 3.1 IP Prefix Route Encoding . . . . . . . . . . . . . . . . . . 9
3.2 Overlay Indexes and Recursive Lookup Resolution . . . . . . 10 3.2 Overlay Indexes and Recursive Lookup Resolution . . . . . . 11
4. Overlay Index Use-Cases . . . . . . . . . . . . . . . . . . . . 13 4. Overlay Index Use-Cases . . . . . . . . . . . . . . . . . . . . 14
4.1 TS IP Address Overlay Index Use-Case . . . . . . . . . . . . 13 4.1 TS IP Address Overlay Index Use-Case . . . . . . . . . . . . 14
4.2 Floating IP Overlay Index Use-Case . . . . . . . . . . . . . 15 4.2 Floating IP Overlay Index Use-Case . . . . . . . . . . . . . 16
4.3 Bump-in-the-Wire Use-Case . . . . . . . . . . . . . . . . . 17 4.3 Bump-in-the-Wire Use-Case . . . . . . . . . . . . . . . . . 18
4.4 IP-VRF-to-IP-VRF Model . . . . . . . . . . . . . . . . . . . 20 4.4 IP-VRF-to-IP-VRF Model . . . . . . . . . . . . . . . . . . . 21
4.4.1 Interface-less IP-VRF-to-IP-VRF Model . . . . . . . . . 21 4.4.1 Interface-less IP-VRF-to-IP-VRF Model . . . . . . . . . 22
4.4.2 Interface-ful IP-VRF-to-IP-VRF with SBD IRB . . . . . . 24 4.4.2 Interface-ful IP-VRF-to-IP-VRF with SBD IRB . . . . . . 25
4.4.3 Interface-ful IP-VRF-to-IP-VRF with Unnumbered SBD IRB . 27 4.4.3 Interface-ful IP-VRF-to-IP-VRF with Unnumbered SBD IRB . 28
5. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . 30 5. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . 31
6. Conventions used in this document . . . . . . . . . . . . . . . 31 6. Security Considerations . . . . . . . . . . . . . . . . . . . . 32
7. Security Considerations . . . . . . . . . . . . . . . . . . . . 31 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 32
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 31 8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 32
9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 31 8.1 Normative References . . . . . . . . . . . . . . . . . . . . 32
9.1 Normative References . . . . . . . . . . . . . . . . . . . . 31 8.2 Informative References . . . . . . . . . . . . . . . . . . . 33
9.2 Informative References . . . . . . . . . . . . . . . . . . . 32 9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . 33
10. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 32 10. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 34
11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 32 11. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 34
12. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 32
1. Terminology 1. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
GW IP: Gateway IP Address. GW IP: Gateway IP Address.
IPL: IP address length. IPL: IP address length.
ML: MAC address length. ML: MAC address length.
NVE: Network Virtualization Edge. NVE: Network Virtualization Edge.
TS: Tenant System. TS: Tenant System.
skipping to change at page 3, line 48 skipping to change at page 4, line 5
MAC-VRF: A Virtual Routing and Forwarding table for Media Access MAC-VRF: A Virtual Routing and Forwarding table for Media Access
Control (MAC) addresses on an NVE/PE, as per [RFC7432]. Control (MAC) addresses on an NVE/PE, as per [RFC7432].
BD: Broadcast Domain. As per [RFC7432], an EVI consists of a single BD: Broadcast Domain. As per [RFC7432], an EVI consists of a single
or multiple BDs. In case of VLAN-bundle and VLAN-based service or multiple BDs. In case of VLAN-bundle and VLAN-based service
models (see [RFC7432]), a BD is equivalent to an EVI. In case of models (see [RFC7432]), a BD is equivalent to an EVI. In case of
VLAN-aware bundle service model, an EVI contains multiple BDs. VLAN-aware bundle service model, an EVI contains multiple BDs.
Also, in this document, BD and subnet are equivalent terms. Also, in this document, BD and subnet are equivalent terms.
BD route-target: refers to the Broadcast Domain assigned route-
target. In case of VLAN-aware bundle service model, all the BD
instances in the MAC-VRF share the same route-target.
BT: Bridge Table. The instantiation of a BD in a MAC-VRF. BT: Bridge Table. The instantiation of a BD in a MAC-VRF.
IP-VRF: A VPN Routing and Forwarding table for IP routes on an IP-VRF: A VPN Routing and Forwarding table for IP routes on an
NVE/PE. The IP routes could be populated by EVPN and IP-VPN NVE/PE. The IP routes could be populated by EVPN and IP-VPN
address families. address families.
IRB: Integrated Routing and Bridging interface. It connects an IP-VRF IRB: Integrated Routing and Bridging interface. It connects an IP-VRF
to a BD (or subnet). to a BD (or subnet).
SBD: Supplementary Broadcast Domain. A BD that does not have any ACs, SBD: Supplementary Broadcast Domain. A BD that does not have any ACs,
only IRB interfaces, and it is used to provide connectivity among only IRB interfaces, and it is used to provide connectivity among
all the IP-VRFs of the tenant. The SBD is only required in IP-VRF- all the IP-VRFs of the tenant. The SBD is only required in IP-VRF-
to-IP-VRF use-cases (see section 4.4). to-IP-VRF use-cases (see section 4.4).
VNI: Virtual Network Identifier.
SN: Subnet.
DGW: Data Center Gateway.
GARP: Gratuitous Address Resolution Protocol.
2. Introduction and Problem Statement 2. Introduction and Problem Statement
Inter-subnet connectivity is used for certain tenants within the Data Inter-subnet connectivity is used for certain tenants within the Data
Center. [EVPN-INTERSUBNET] defines some fairly common inter-subnet Center. [EVPN-INTERSUBNET] defines some fairly common inter-subnet
forwarding scenarios where TSes can exchange packets with TSes forwarding scenarios where TSes can exchange packets with TSes
located in remote subnets. In order to achieve this, located in remote subnets. In order to achieve this,
[EVPN-INTERSUBNET] describes how MAC/IPs encoded in TS RT-2 routes [EVPN-INTERSUBNET] describes how MAC/IPs encoded in TS RT-2 routes
are not only used to populate MAC-VRF and overlay ARP tables, but are not only used to populate MAC-VRF and overlay ARP tables, but
also IP-VRF tables with the encoded TS host routes (/32 or /128). In also IP-VRF tables with the encoded TS host routes (/32 or /128). In
some cases, EVPN may advertise IP Prefixes and therefore provide some cases, EVPN may advertise IP Prefixes and therefore provide
skipping to change at page 4, line 42 skipping to change at page 5, line 12
route type is justified, sections 3, 4 and 5 will describe this route route type is justified, sections 3, 4 and 5 will describe this route
type and how it is used in some specific use cases. type and how it is used in some specific use cases.
2.1 Inter-Subnet Connectivity Requirements in Data Centers 2.1 Inter-Subnet Connectivity Requirements in Data Centers
[RFC7432] is used as the control plane for a Network Virtualization [RFC7432] is used as the control plane for a Network Virtualization
Overlay (NVO3) solution in Data Centers (DC), where Network Overlay (NVO3) solution in Data Centers (DC), where Network
Virtualization Edge (NVE) devices can be located in Hypervisors or Virtualization Edge (NVE) devices can be located in Hypervisors or
TORs, as described in [EVPN-OVERLAY]. TORs, as described in [EVPN-OVERLAY].
If we use the term Tenant System (TS) to designate a physical or If the term Tenant System (TS) is used to designate a physical or
virtual system identified by MAC and maybe IP addresses, and virtual system identified by MAC and maybe IP addresses, and
connected to a BD by an Attachment Circuit, the following connected to a BD by an Attachment Circuit, the following
considerations apply: considerations apply:
o The Tenant Systems may be Virtual Machines (VMs) that generate o The Tenant Systems may be Virtual Machines (VMs) that generate
traffic from their own MAC and IP. traffic from their own MAC and IP.
o The Tenant Systems may be Virtual Appliance entities (VAs) that o The Tenant Systems may be Virtual Appliance entities (VAs) that
forward traffic to/from IP addresses of different End Devices forward traffic to/from IP addresses of different End Devices
sitting behind them. sitting behind them.
skipping to change at page 7, line 9 skipping to change at page 7, line 9
from/to the subnets and hosts sitting behind them (SN1, SN2, SN3, from/to the subnets and hosts sitting behind them (SN1, SN2, SN3,
IP4 and IP5). Their IP addresses (IP2 and IP3) belong to the BD-10 IP4 and IP5). Their IP addresses (IP2 and IP3) belong to the BD-10
subnet and they can also generate/receive traffic. When these VAs subnet and they can also generate/receive traffic. When these VAs
receive packets destined to their own MAC addresses (M2 and M3) receive packets destined to their own MAC addresses (M2 and M3)
they will route the packets to the proper subnet or host. These VAs they will route the packets to the proper subnet or host. These VAs
do not support routing protocols to advertise the subnets connected do not support routing protocols to advertise the subnets connected
to them and can move to a different server and NVE when the Cloud to them and can move to a different server and NVE when the Cloud
Management System decides to do so. These VAs may also support Management System decides to do so. These VAs may also support
redundancy mechanisms for some subnets, similar to VRRP, where a redundancy mechanisms for some subnets, similar to VRRP, where a
floating IP is owned by the master VA and only the master VA floating IP is owned by the master VA and only the master VA
forwards traffic to a given subnet. E.g.: vIP23 in figure 1 is a forwards traffic to a given subnet. E.g.: vIP23 in Figure 1 is a
floating IP that can be owned by TS2 or TS3 depending on who the floating IP that can be owned by TS2 or TS3 depending on who the
master is. Only the master will forward traffic to SN1. master is. Only the master will forward traffic to SN1.
o Integrated Routing and Bridging interfaces IRB1, IRB2 and IRB3 have o Integrated Routing and Bridging interfaces IRB1, IRB2 and IRB3 have
their own IP addresses that belong to the BD-10 subnet too. These their own IP addresses that belong to the BD-10 subnet too. These
IRB interfaces connect the BD-10 subnet to Virtual Routing and IRB interfaces connect the BD-10 subnet to Virtual Routing and
Forwarding (IP-VRF) instances that can route the traffic to other Forwarding (IP-VRF) instances that can route the traffic to other
subnets for the same tenant (within the DC or at the other end of subnets for the same tenant (within the DC or at the other end of
the WAN). the WAN).
o TS4 is a layer-2 VA that provides connectivity to subnets SN5, SN6 o TS4 is a layer-2 VA that provides connectivity to subnets SN5, SN6
and SN7, but does not have an IP address itself in the BD-10. TS4 and SN7, but does not have an IP address itself in the BD-10. TS4
is connected to a physical port on NVE5 assigned to Ethernet is connected to a physical port on NVE5 assigned to Ethernet
Segment Identifier 4. Segment Identifier 4.
All the above DC use cases require inter-subnet forwarding and For a BD that an ingress NVE is attached to, "Overlay Index" is
therefore the individual host routes and subnets: defined as an identifier that the ingress EVPN NVE requires in order
to forward packets to a subnet or host in a remote subnet. As an
example, vIP23 (Figure 1) is an Overlay Index that any NVE attached
to BD-10 needs to know in order to forward packets to SN1. IRB3 IP
address is an Overlay Index required to get to SN4, and ESI4 is an
Overlay Index needed to forward traffic to SN5. In other words, the
Overlay Index is a next-hop in the overlay address space that can be
an IP address, a MAC address or an ESI. When advertised along with an
IP Prefix, the Overlay Index requires a recursive resolution to find
out to what egress NVE the EVPN packets need to be sent.
All the DC use cases in Figure 1 require inter-subnet forwarding and
therefore, the individual host routes and subnets:
a) MUST be advertised from the NVEs (since VAs and VMs do not a) MUST be advertised from the NVEs (since VAs and VMs do not
participate in dynamic routing protocols) and participate in dynamic routing protocols) and
b) MAY be associated to an Overlay Index that can be a VA IP address, b) MAY be associated to an Overlay Index that can be a VA IP address,
a floating IP address, a MAC address or an ESI. An Overlay Index a floating IP address, a MAC address or an ESI. The Overlay Index
is a next-hop that requires a recursive resolution and it is is further discussed in section 3.2.
described in section 3.2.
2.2 The Requirement for a New EVPN Route Type 2.2 The Requirement for a New EVPN Route Type
[RFC7432] defines a MAC/IP route (also referred to as RT-2) where a MAC [RFC7432] defines a MAC/IP route (also referred to as RT-2) where a MAC
address can be advertised together with an IP address length (IPL) address can be advertised together with an IP address length (IPL)
and IP address (IP). While a variable IPL might have been used to and IP address (IP). While a variable IPL might have been used to
indicate the presence of an IP prefix in a route type 2, there are indicate the presence of an IP prefix in a route type 2, there are
several specific use cases in which using this route type to deliver several specific use cases in which using this route type to deliver
IP Prefixes is not suitable. IP Prefixes is not suitable.
One example of such use cases is the "floating IP" example described One example of such use cases is the "floating IP" example described
in section 2.1. In this example we need to decouple the advertisement in section 2.1. In this example it is needed to decouple the
of the prefixes from the advertisement of MAC address of either M2 or advertisement of the prefixes from the advertisement of MAC address
M3", otherwise the solution gets highly inefficient and does not of either M2 or M3, otherwise the solution gets highly inefficient
scale. and does not scale.
E.g.: if we are advertising 1k prefixes from M2 (using RT-2) and the E.g.: if 1k prefixes are advertised from M2 (using RT-2) and the
floating IP owner changes from M2 to M3, we would need to withdraw 1k floating IP owner changes from M2 to M3, 1k routes would be withdrawn
routes from M2 and re-advertise 1k routes from M3. However if we use from M2 and 1k routes re-advertised from M3. However, if a separate
a separate route type, we can advertise the 1k routes associated to route type is used, 1k routes can be advertised as associated to the
the floating IP address (vIP23) and only one RT-2 for advertising the floating IP address (vIP23) and only one RT-2 for advertising the
ownership of the floating IP, i.e. vIP23 and M2 in the route type 2. ownership of the floating IP, i.e. vIP23 and M2 in the route type 2.
When the floating IP owner changes from M2 to M3, a single RT-2 When the floating IP owner changes from M2 to M3, a single RT-2
withdraw/update is required to indicate the change. The remote DGW withdraw/update is required to indicate the change. The remote DGW
will not change any of the 1k prefixes associated to vIP23, but will will not change any of the 1k prefixes associated to vIP23, but will
only update the ARP resolution entry for vIP23 (now pointing at M3). only update the ARP resolution entry for vIP23 (now pointing at M3).
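The floating IP argument above can be illustrated with a small model of the remote DGW state. This is only a sketch under stated assumptions: the RemoteGateway class, its method names, its counters and the 10.x.y.0/24 prefixes are hypothetical and not part of either draft version. Only vIP23, M2, M3 and the 1k figure come from the text.

```python
class RemoteGateway:
    """Toy model of the remote DGW state (hypothetical, for illustration)."""

    def __init__(self):
        self.prefix_to_gw_ip = {}   # learned from RT-5s: prefix -> Overlay Index
        self.arp = {}               # learned from RT-2s: IP -> MAC
        self.prefix_updates = 0
        self.arp_updates = 0

    def receive_rt5(self, prefix, gw_ip):
        self.prefix_to_gw_ip[prefix] = gw_ip
        self.prefix_updates += 1

    def receive_rt2(self, ip, mac):
        self.arp[ip] = mac
        self.arp_updates += 1

    def next_hop_mac(self, prefix):
        # Recursive lookup: prefix -> floating IP -> current owner's MAC.
        return self.arp[self.prefix_to_gw_ip[prefix]]


dgw = RemoteGateway()
# 1k made-up prefixes, all associated to the floating IP vIP23.
prefixes = [f"10.{i // 256}.{i % 256}.0/24" for i in range(1000)]
for prefix in prefixes:
    dgw.receive_rt5(prefix, "vIP23")
dgw.receive_rt2("vIP23", "M2")

# Ownership of the floating IP moves from M2 to M3: a single RT-2 update.
before = (dgw.prefix_updates, dgw.arp_updates)
dgw.receive_rt2("vIP23", "M3")
after = (dgw.prefix_updates, dgw.arp_updates)
```

In this model the owner change touches one ARP entry and none of the prefix entries, whereas carrying each prefix in an RT-2 keyed by the owner's MAC would mean 1k withdrawals plus 1k re-advertisements.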
Other reasons to decouple the IP Prefix advertisement from the MAC/IP Other reasons to decouple the IP Prefix advertisement from the MAC/IP
route are listed below: route are listed below:
o Clean identification, operation and troubleshooting of IP Prefixes, o Clean identification, operation and troubleshooting of IP Prefixes,
independent of and not subject to the interpretation of the IPL and independent of and not subject to the interpretation of the IPL and
the IP value. E.g.: a default IP route 0.0.0.0/0 must always be the IP value. E.g.: a default IP route 0.0.0.0/0 must always be
easily and clearly distinguished from the absence of IP easily and clearly distinguished from the absence of IP
information. information.
o In MAC/IP routes, the MAC information is part of the NLRI, so if IP o In MAC/IP routes, the MAC information is part of the NLRI, so if IP
Prefixes were to be advertised using MAC/IP routes, the MAC Prefixes were to be advertised using MAC/IP routes, the MAC
information would always be present and part of the route key. information would always be present and part of the route key.
The following sections describe how EVPN is extended with a new route The following sections describe how EVPN is extended with a route
type for the advertisement of IP prefixes and how this route is used type for the advertisement of IP prefixes and how this route is used
to address the current and future inter-subnet connectivity to address the inter-subnet connectivity requirements existing in the
requirements existing in the Data Center. Data Center.
3. The BGP EVPN IP Prefix Route 3. The BGP EVPN IP Prefix Route
The current BGP EVPN NLRI as defined in [RFC7432] is shown below: The current BGP EVPN NLRI as defined in [RFC7432] is shown below:
+-----------------------------------+ +-----------------------------------+
| Route Type (1 octet) | | Route Type (1 octet) |
+-----------------------------------+ +-----------------------------------+
| Length (1 octet) | | Length (1 octet) |
+-----------------------------------+ +-----------------------------------+
| Route Type specific (variable) | | Route Type specific (variable) |
+-----------------------------------+ +-----------------------------------+
Figure 2 BGP EVPN NLRI
Where the route type field can contain one of the following specific Where the route type field can contain one of the following specific
values (refer to the IANA "EVPN Route Types registry): values (refer to the IANA "EVPN Route Types" registry):
+ 1 - Ethernet Auto-Discovery (A-D) route + 1 - Ethernet Auto-Discovery (A-D) route
+ 2 - MAC/IP advertisement route + 2 - MAC/IP advertisement route
+ 3 - Inclusive Multicast Route + 3 - Inclusive Multicast Route
+ 4 - Ethernet Segment Route + 4 - Ethernet Segment Route
This document defines an additional route type that IANA has added to This document defines an additional route type that IANA has added to
the registry, and will be used for the advertisement of IP Prefixes: the registry, and will be used for the advertisement of IP Prefixes:
+ 5 - IP Prefix Route + 5 - IP Prefix Route
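As a rough illustration of the NLRI layout above (1-octet Route Type, 1-octet Length, variable Route Type specific part), the following sketch splits a byte string into individual EVPN NLRIs. The function name split_evpn_nlris and the ROUTE_TYPE_NAMES table are assumptions made for this example, and the input is assumed to be the already-extracted sequence of EVPN NLRIs.

```python
ROUTE_TYPE_NAMES = {
    1: "Ethernet Auto-Discovery (A-D) route",
    2: "MAC/IP advertisement route",
    3: "Inclusive Multicast Route",
    4: "Ethernet Segment Route",
    5: "IP Prefix Route",
}


def split_evpn_nlris(data: bytes):
    """Yield (route_type, body) for each EVPN NLRI found in data."""
    offset = 0
    while offset < len(data):
        if len(data) - offset < 2:
            raise ValueError("truncated NLRI header")
        route_type = data[offset]       # Route Type (1 octet)
        length = data[offset + 1]       # Length (1 octet)
        body = data[offset + 2:offset + 2 + length]
        if len(body) != length:
            raise ValueError("truncated NLRI body")
        yield route_type, body
        offset += 2 + length
```

Because every NLRI carries its own Length, a receiver can step over a Route Type it does not recognize (one missing from ROUTE_TYPE_NAMES here) and continue with the next NLRI.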
The support for this new route type is OPTIONAL. According to Section 5.4 in [RFC7606], a node that doesn't recognize
the Route Type 5 (RT-5) will ignore it. Therefore an NVE following
Since this new route type is OPTIONAL, an implementation not this document can still be attached to a BD where an NVE ignoring RT-
supporting it MUST ignore the route, based on the unknown route type 5s is attached to. Regular [RFC7432] procedures would apply in that
value, as specified by Section 5.4 in [RFC7606]. case for both NVEs. In case two or more NVEs are attached to
different BDs of the same tenant, they MUST support RT-5 for the
proper Inter-Subnet Forwarding operation of the tenant.
The detailed encoding of this route and associated procedures are The detailed encoding of this route and associated procedures are
described in the following sections. described in the following sections.
3.1 IP Prefix Route Encoding 3.1 IP Prefix Route Encoding
An IP Prefix advertisement route NLRI consists of the following An IP Prefix Route Type consists of the following fields:
fields:
+---------------------------------------+ +---------------------------------------+
| RD (8 octets) | | RD (8 octets) |
+---------------------------------------+ +---------------------------------------+
|Ethernet Segment Identifier (10 octets)| |Ethernet Segment Identifier (10 octets)|
+---------------------------------------+ +---------------------------------------+
| Ethernet Tag ID (4 octets) | | Ethernet Tag ID (4 octets) |
+---------------------------------------+ +---------------------------------------+
| IP Prefix Length (1 octet) | | IP Prefix Length (1 octet) |
+---------------------------------------+ +---------------------------------------+
| IP Prefix (4 or 16 octets) | | IP Prefix (4 or 16 octets) |
+---------------------------------------+ +---------------------------------------+
| GW IP Address (4 or 16 octets) | | GW IP Address (4 or 16 octets) |
+---------------------------------------+ +---------------------------------------+
| MPLS Label (3 octets) | | MPLS Label (3 octets) |
+---------------------------------------+ +---------------------------------------+
Figure 3 EVPN IP Prefix route NLRI
Where: Where:
o RD, Ethernet Tag ID and MPLS Label fields will be used as defined o RD and Ethernet Tag ID MUST be used as defined in [RFC7432] and
in [RFC7432] and [EVPN-OVERLAY]. [EVPN-OVERLAY]. The MPLS Label field is set to either an MPLS label
or a VNI, as described in [EVPN-OVERLAY] for other EVPN route
types.
o The Ethernet Segment Identifier will be a non-zero 10-byte o The Ethernet Segment Identifier MUST be a non-zero 10-byte
identifier if the ESI is used as an Overlay Index (see the identifier if the ESI is used as an Overlay Index (see the
definition of Overlay Index in section 3.2). It will be zero definition of Overlay Index in section 3.2). It MUST be zero
otherwise. otherwise. The ESI format is described in [RFC7432].
o The IP Prefix Length can be set to a value between 0 and 32 (bits) o The IP Prefix Length can be set to a value between 0 and 32 (bits)
for ipv4 and between 0 and 128 for ipv6, and specifies the number for IPv4 and between 0 and 128 for IPv6, and specifies the number
of bits in the Prefix. of bits in the Prefix. The value MUST NOT be greater than 128.
o The IP Prefix will be a 32 or 128-bit field (ipv4 or ipv6). The o The IP Prefix is a 4 or 16-octet field (IPv4 or IPv6). The size of
size of this field does not depend on the value of the IP Prefix this field MUST NOT be 4 octets if the IP Prefix Length value is
Length field. greater than 32 bits.
o The GW IP (Gateway IP Address) will be a 32 or 128-bit field (ipv4 o The GW (Gateway) IP Address field is a 4 or 16-octet field (IPv4 or
or ipv6), and will encode an IP address as an overlay index for the IPv6), and will encode a valid IP address as an Overlay Index for
IP Prefixes. The GW IP field SHOULD be zero if it is not used as an the IP Prefixes. The GW IP field MUST be zero if it is not used as
Overlay Index. Refer to section 3.2 for the definition and use of an Overlay Index. Refer to section 3.2 for the definition and use
the Overlay Index. of the Overlay Index.
o The MPLS Label field is encoded as 3 octets, where the high-order o The MPLS Label field is encoded as 3 octets, where the high-order
20 bits contain the label value. When sending, the label value 20 bits contain the label value. When sending, the label value
SHOULD be zero if recursive resolution based on overlay index is SHOULD be zero if recursive resolution based on overlay index is
used. If the received MPLS Label value is zero, the route MUST used. If the received MPLS Label value is zero, the route MUST
contain an Overlay Index and the ingress NVE/PE MUST do recursive contain an Overlay Index and the ingress NVE/PE MUST do recursive
resolution to find the egress NVE/PE. If the received Label value resolution to find the egress NVE/PE. If the received Label is zero
is non-zero, the route will not be used for recursive resolution and the route does not contain an Overlay Index, it MUST be treat-
unless a local policy says so. as-withdraw [RFC7606]. If the received Label value is non-zero, the
route will not be used for recursive resolution unless a local
policy says so.
o The total route length will indicate the type of prefix (ipv4 or o The total route length will indicate the type of prefix (IPv4 or
ipv6) and the type of GW IP address (ipv4 or ipv6). Note that the IPv6) and the type of GW IP address (IPv4 or IPv6). Note that the
IP Prefix + the GW IP should have a length of either 64 or 256 IP Prefix + the GW IP should have a length of either 64 or 256
bits, but never 160 bits (ipv4 and ipv6 mixed values are not bits, but never 160 bits (IPv4 and IPv6 mixed values are not
allowed). allowed).
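The field list above can be turned into a small decoder. This is a minimal sketch, not a reference implementation: the names IpPrefixRoute and parse_rt5 are hypothetical. The two accepted body lengths (34 and 58 octets) are not stated in the text. They are derived here by summing the field sizes in the figure: 8 + 10 + 4 + 1 + 3 octets plus two 4-octet or two 16-octet addresses.

```python
import ipaddress
import struct
from dataclasses import dataclass
from typing import Union

IpAddress = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]

# Fixed part: RD (8) + ESI (10) + Ethernet Tag ID (4) + IP Prefix Length (1).
FIXED_LEN = 23
LABEL_LEN = 3


@dataclass(frozen=True)
class IpPrefixRoute:
    rd: bytes
    esi: bytes
    ethernet_tag: int
    prefix_length: int
    prefix: IpAddress
    gw_ip: IpAddress
    label_field: int  # raw 24-bit field; may carry an MPLS label or a VNI

    @property
    def mpls_label(self) -> int:
        # The high-order 20 bits of the 3-octet field hold the label value.
        return self.label_field >> 4


def parse_rt5(body: bytes) -> IpPrefixRoute:
    # The total length selects IPv4 or IPv6 for both addresses at once,
    # so a mixed IPv4/IPv6 body (160 bits of addresses) is rejected here.
    addr_len = {34: 4, 58: 16}.get(len(body))
    if addr_len is None:
        raise ValueError(f"unexpected RT-5 length {len(body)}")
    addr_cls = ipaddress.IPv4Address if addr_len == 4 else ipaddress.IPv6Address
    ethernet_tag, prefix_length = struct.unpack_from("!IB", body, 18)
    if prefix_length > addr_len * 8:
        raise ValueError("IP Prefix Length exceeds the address size")
    prefix_end = FIXED_LEN + addr_len
    gw_end = prefix_end + addr_len
    return IpPrefixRoute(
        rd=body[0:8],
        esi=body[8:18],
        ethernet_tag=ethernet_tag,
        prefix_length=prefix_length,
        prefix=addr_cls(body[FIXED_LEN:prefix_end]),
        gw_ip=addr_cls(body[prefix_end:gw_end]),
        label_field=int.from_bytes(body[gw_end:gw_end + LABEL_LEN], "big"),
    )
```

The raw 24-bit field is kept next to the decoded 20-bit label because the -10 text allows the field to carry either an MPLS label or a VNI.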
The RD, Eth-Tag ID, IP Prefix Length and IP Prefix will be part of The RD, Ethernet Tag ID, IP Prefix Length and IP Prefix are part of
the route key used by BGP to compare routes. The rest of the fields the route key used by BGP to compare routes. The rest of the fields
will not be part of the route key. are not part of the route key.
An IP Prefix Route MAY be sent along with a Router's MAC Extended An IP Prefix Route MAY be sent along with a Router's MAC Extended
Community (defined in [EVPN-INTERSUBNET]) to carry the MAC address Community (defined in [EVPN-INTERSUBNET]) to carry the MAC address
that is used as the overlay index. Note that the MAC address may be that is used as the overlay index. Note that the MAC address may be
that of a TS. that of a TS.
3.2 Overlay Indexes and Recursive Lookup Resolution 3.2 Overlay Indexes and Recursive Lookup Resolution
RT-5 routes support recursive lookup resolution through the use of RT-5 routes support recursive lookup resolution through the use of
Overlay Indexes as follows: Overlay Indexes as follows:
skipping to change at page 11, line 44 skipping to change at page 12, line 27
the IP address field of its NLRI. the IP address field of its NLRI.
. If the RT-5 specifies a MAC address as the Overlay Index, . If the RT-5 specifies a MAC address as the Overlay Index,
recursive resolution can only be done if the NVE has received and recursive resolution can only be done if the NVE has received and
installed an RT-2 (MAC/IP route) specifying that MAC address in installed an RT-2 (MAC/IP route) specifying that MAC address in
the MAC address field of its NLRI. the MAC address field of its NLRI.
Note that the RT-1 or RT-2 routes needed for the recursive Note that the RT-1 or RT-2 routes needed for the recursive
resolution may arrive before or after the given RT-5 route. resolution may arrive before or after the given RT-5 route.
o Irrespective of the recursive resolution, if there is no IGP or BGP o Irrespective of the recursive resolution, if there is no IGP or BGP
route to the BGP next-hop of an RT-5, BGP SHOULD fail to install route to the BGP next-hop of an RT-5, BGP MUST fail to install the
the RT-5 even if the Overlay Index can be resolved. RT-5 even if the Overlay Index can be resolved.
o The ESI and GW IP fields MAY both be zero, however they MUST NOT o The ESI and GW IP fields may both be zero, however they MUST NOT
both be non-zero at the same time. A route containing a non-zero GW both be non-zero at the same time. A route containing a non-zero GW
IP and a non-zero ESI (at the same time) will be treated as- IP and a non-zero ESI (at the same time) SHOULD be treat-as-
withdraw. withdraw [RFC7606].
o If either the ESI or GW IP are non-zero, then one of them is the
Overlay Index, regardless of whether the Router's MAC Extended
Community is present or the value of the Label.
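The resolution rules above can be sketched in a few lines of Python. This is an illustration only, not part of the specification: the names Rt5, Tables and rt5_state are hypothetical, the next-hop check is modelled as a simple set lookup, and the ESI case assumes resolution against an installed RT-1 carrying the same ESI. Because the state is recomputed from whatever is installed at the time, the order in which the RT-5 and its RT-1/RT-2 arrive does not matter.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass(frozen=True)
class Rt5:
    """Hypothetical, simplified view of a received RT-5."""
    prefix: str
    next_hop: str
    esi: int = 0                  # 0 stands for an all-zero ESI field
    gw_ip: Optional[str] = None   # None stands for an all-zero GW IP field
    mac: Optional[str] = None     # MAC used as Overlay Index, if any


@dataclass
class Tables:
    """Hypothetical state held by the receiving NVE/PE."""
    rt2_ips: Set[str] = field(default_factory=set)    # IPs from installed RT-2s
    rt2_macs: Set[str] = field(default_factory=set)   # MACs from installed RT-2s
    rt1_esis: Set[int] = field(default_factory=set)   # ESIs from installed RT-1s
    reachable_next_hops: Set[str] = field(default_factory=set)


def rt5_state(route: Rt5, t: Tables) -> str:
    # Non-zero ESI together with non-zero GW IP is not allowed.
    if route.esi != 0 and route.gw_ip is not None:
        return "treat-as-withdraw"
    # No IGP or BGP route to the BGP next-hop: do not install,
    # even if the Overlay Index could be resolved.
    if route.next_hop not in t.reachable_next_hops:
        return "not-installed"
    if route.esi != 0:
        resolved = route.esi in t.rt1_esis
    elif route.gw_ip is not None:
        resolved = route.gw_ip in t.rt2_ips
    elif route.mac is not None:
        resolved = route.mac in t.rt2_macs
    else:
        resolved = True  # no Overlay Index, nothing to resolve recursively
    return "resolved" if resolved else "unresolved"


tables = Tables(reachable_next_hops={"192.0.2.2"})
route = Rt5("SN1/24", "192.0.2.2", gw_ip="IP2")
before = rt5_state(route, tables)   # RT-2 for IP2 not yet received
tables.rt2_ips.add("IP2")           # RT-2 for IP2 arrives later
after = rt5_state(route, tables)
```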
The indirection provided by the Overlay Index and its recursive The indirection provided by the Overlay Index and its recursive
lookup resolution is required to achieve fast convergence in case of lookup resolution is required to achieve fast convergence in case of
a failure of the object represented by the Overlay Index (see the a failure of the object represented by the Overlay Index (see the
example described in section 2.2). example described in section 2.2).
Table 1 shows the different RT-5 field combinations allowed by this Table 1 shows the different RT-5 field combinations allowed by this
specification and what Overlay Index must be used by the receiving specification and what Overlay Index must be used by the receiving
NVE/PE in each case. When the Overlay Index is "None" in Table 1, the NVE/PE in each case. Those cases where there is no Overlay Index, are
indicated as "None" in Table 1. If there is no Overlay Index the
receiving NVE/PE will not perform any recursive resolution, and the receiving NVE/PE will not perform any recursive resolution, and the
actual next-hop is given by the RT-5's BGP next-hop. actual next-hop is given by the RT-5's BGP next-hop.
+----------+----------+----------+------------+----------------+ +----------+----------+----------+------------+----------------+
| ESI | GW-IP | MAC* | Label | Overlay Index | | ESI | GW IP | MAC* | Label | Overlay Index |
|--------------------------------------------------------------| |--------------------------------------------------------------|
| Non-Zero | Zero | Zero | Don't Care | ESI | | Non-Zero | Zero | Zero | Don't Care | ESI |
| Non-Zero | Zero | Non-Zero | Don't Care | ESI | | Non-Zero | Zero | Non-Zero | Don't Care | ESI |
| Zero | Non-Zero | Zero | Don't Care | GW-IP | | Zero | Non-Zero | Zero | Don't Care | GW IP |
| Zero | Zero | Non-Zero | Zero | MAC | | Zero | Zero | Non-Zero | Zero | MAC |
| Zero | Zero | Non-Zero | Non-Zero | MAC or None** | | Zero | Zero | Non-Zero | Non-Zero | MAC or None** |
| Zero | Zero | Zero | Non-Zero | None*** | | Zero | Zero | Zero | Non-Zero | None*** |
+----------+----------+----------+------------+----------------+ +----------+----------+----------+------------+----------------+
Table 1 - RT-5 fields and Indicated Overlay Index Table 1 - RT-5 fields and Indicated Overlay Index
Table NOTES: Table NOTES:
* MAC with Zero value means no Router's MAC extended community is * MAC with Zero value means no Router's MAC extended community is
present along with the RT-5. Non-Zero indicates that the extended present along with the RT-5. Non-Zero indicates that the extended
community is present and carries a valid MAC address. Examples of community is present and carries a valid MAC address. The
invalid MAC addresses are broadcast or multicast MAC addresses. encoding of a MAC address MUST be the 6-octet MAC address
The presence of the Router's MAC extended community alone is not specified by [802.1Q] and [802.1D-REV]. Examples of invalid MAC
enough to indicate the use of the MAC address as the overlay addresses are broadcast or multicast MAC addresses. The route
index, since the extended community can be used for other MUST be treat-as-withdraw in case of an invalid MAC address. The
presence of the Router's MAC extended community alone is not
enough to indicate the use of the MAC address as the Overlay
Index, since the extended community can be used for other
purposes. purposes.
** In this case, the Overlay Index may be the RT-5's MAC address or ** In this case, the Overlay Index may be the RT-5's MAC address or
None, depending on the local policy of the receiving NVE/PE. None, depending on the local policy of the receiving NVE/PE. Note
that the advertising NVE/PE that sets the Overlay Index SHOULD
advertise an RT-2 for the MAC Overlay Index if there are
receiving NVE/PEs configured to use the MAC as the Overlay Index.
This case in Table 1 is used in the IP-VRF-to-IP-VRF
implementations described in 4.4.1 and 4.4.3. The support of a
MAC Overlay Index in this model is OPTIONAL.
*** The Overlay Index is None. This is a special case used for IP- *** The Overlay Index is None. This is a special case used for IP-
VRF-to-IP-VRF where the NVE/PEs are connected by IP NVO tunnels VRF-to-IP-VRF where the NVE/PEs are connected by IP NVO tunnels
as opposed to Ethernet NVO tunnels. as opposed to Ethernet NVO tunnels.
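As a reading aid, Table 1 can be expressed as a small decision function. The sketch below is hypothetical (the names OverlayIndex, select_overlay_index and prefer_mac are not defined by this document). It assumes that the receiver's local policy for the "MAC or None" row is a single boolean, and it follows the -10 wording that a non-zero ESI or GW IP is the Overlay Index regardless of the Router's MAC Extended Community or the Label.

```python
from enum import Enum


class OverlayIndex(Enum):
    ESI = "ESI"
    GW_IP = "GW IP"
    MAC = "MAC"
    NONE = "None"


def select_overlay_index(esi_nonzero: bool, gw_ip_nonzero: bool,
                         mac_present: bool, label_nonzero: bool,
                         prefer_mac: bool = False) -> OverlayIndex:
    """Return the Overlay Index indicated by Table 1.

    mac_present means a Router's MAC extended community with a valid
    MAC address accompanies the RT-5. prefer_mac models the local
    policy of the receiving NVE/PE for the "MAC or None" row.
    """
    if esi_nonzero and gw_ip_nonzero:
        raise ValueError("ESI and GW IP both non-zero: treat-as-withdraw")
    if esi_nonzero:
        return OverlayIndex.ESI
    if gw_ip_nonzero:
        return OverlayIndex.GW_IP
    if mac_present:
        if not label_nonzero:
            return OverlayIndex.MAC
        return OverlayIndex.MAC if prefer_mac else OverlayIndex.NONE
    if label_nonzero:
        return OverlayIndex.NONE
    raise ValueError("field combination not listed in Table 1")
```

Raising an exception for the two cases outside the table keeps them visibly separate from the legitimate "None" result, which means that no recursive resolution is performed and the BGP next-hop is used directly.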
Table 2 shows the different inter-subnet use-cases described in this Table 2 shows the different inter-subnet use-cases described in this
document and the corresponding coding of the Overlay Index in the document and the corresponding coding of the Overlay Index in the
route type 5 (RT-5). route type 5 (RT-5).
+---------+---------------------+----------------------------+ +---------+---------------------+----------------------------+
skipping to change at page 13, line 17 skipping to change at page 14, line 17
+-------------------------------+----------------------------+ +-------------------------------+----------------------------+
| 4.1 | TS IP address | GW IP | | 4.1 | TS IP address | GW IP |
| 4.2 | Floating IP address | GW IP | | 4.2 | Floating IP address | GW IP |
| 4.3 | "Bump in the wire" | ESI or MAC | | 4.3 | "Bump in the wire" | ESI or MAC |
| 4.4 | IP-VRF-to-IP-VRF | GW IP, MAC or None | | 4.4 | IP-VRF-to-IP-VRF | GW IP, MAC or None |
+---------+---------------------+----------------------------+ +---------+---------------------+----------------------------+
Table 2 - Use-cases and Overlay Indexes for Recursive Resolution Table 2 - Use-cases and Overlay Indexes for Recursive Resolution
The above use-cases are representative of the different Overlay The above use-cases are representative of the different Overlay
Indexes supported by RT-5 (GW IP, ESI, MAC or None). Any other use- Indexes supported by RT-5 (GW IP, ESI, MAC or None).
case using a given Overlay Index, SHOULD follow the procedures
described in this document for the same Overlay Index.
4. Overlay Index Use-Cases 4. Overlay Index Use-Cases
This section describes some use-cases for the Overlay Index types This section describes some use-cases for the Overlay Index types
used with the IP Prefix route. used with the IP Prefix route.
4.1 TS IP Address Overlay Index Use-Case 4.1 TS IP Address Overlay Index Use-Case
The following figure illustrates an example of inter-subnet Figure 4 illustrates an example of inter-subnet forwarding for
forwarding for subnets sitting behind Virtual Appliances (on TS2 and subnets sitting behind Virtual Appliances (on TS2 and TS3).
TS3).
IP4---+ NVE2 DGW1 IP4---+ NVE2 DGW1
| +-----------+ +---------+ +-------------+ | +-----------+ +---------+ +-------------+
SN2---TS2(VA)--| (BD-10) |-| |----| (BD-10) | SN2---TS2(VA)--| (BD-10) |-| |----| (BD-10) |
| IP2/M2 +-----------+ | | | IRB1\ | | IP2/M2 +-----------+ | | | IRB1\ |
-+---+ | | | (IP-VRF)|---+ -+---+ | | | (IP-VRF)|---+
| | | +-------------+ _|_ | | | +-------------+ _|_
SN1 | VXLAN/ | ( ) SN1 | VXLAN/ | ( )
| | nvGRE | DGW2 ( WAN ) | | nvGRE | DGW2 ( WAN )
-+---+ NVE3 | | +-------------+ (___) -+---+ NVE3 | | +-------------+ (___)
| IP3/M3 +-----------+ | |----| (BD-10) | | | IP3/M3 +-----------+ | |----| (BD-10) | |
SN3---TS3(VA)--| (BD-10) |-| | | IRB2\ | | SN3---TS3(VA)--| (BD-10) |-| | | IRB2\ | |
| +-----------+ +---------+ | (IP-VRF)|---+ | +-----------+ +---------+ | (IP-VRF)|---+
IP5---+ +-------------+ IP5---+ +-------------+
Figure 2 TS IP address use-case Figure 4 TS IP address use-case
An example of inter-subnet forwarding between subnet SN1/24 and a An example of inter-subnet forwarding between subnet SN1/24 and a
subnet sitting in the WAN is described below. NVE2, NVE3, DGW1 and subnet sitting in the WAN is described below. NVE2, NVE3, DGW1 and
DGW2 are running BGP EVPN. TS2 and TS3 do not participate in dynamic DGW2 are running BGP EVPN. TS2 and TS3 do not participate in dynamic
routing protocols, and they only have a static route to forward the routing protocols, and they only have a static route to forward the
traffic to the WAN. We assume SN1/24 is dual-homed to NVE2 and NVE3. traffic to the WAN. SN1/24 is dual-homed to NVE2 and NVE3.
In this case, a GW IP is used as an Overlay Index. Although a In this case, a GW IP is used as an Overlay Index. Although a
different Overlay Index type could have been used, this use-case different Overlay Index type could have been used, this use-case
assumes that the operator knows the VA's IP addresses beforehand, assumes that the operator knows the VA's IP addresses beforehand,
whereas the VA's MAC address is unknown and the VA's ESI is zero. whereas the VA's MAC address is unknown and the VA's ESI is zero.
Because of this, the GW IP is the suitable Overlay Index to be used Because of this, the GW IP is the suitable Overlay Index to be used
with the RT-5s. The NVEs know the GW IP to be used for a given Prefix with the RT-5s. The NVEs know the GW IP to be used for a given Prefix
by policy. by policy.
(1) NVE2 advertises the following BGP routes on behalf of TS2: (1) NVE2 advertises the following BGP routes on behalf of TS2:
o Route type 2 (MAC/IP route) containing: ML=48, M=M2, IPL=32, o Route type 2 (MAC/IP route) containing: ML=48 (MAC Address
Length), M=M2 (MAC Address), IPL=32 (IP Address Length),
IP=IP2 and [RFC5512] BGP Encapsulation Extended Community with IP=IP2 and [RFC5512] BGP Encapsulation Extended Community with
the corresponding Tunnel-type. The MAC and IP addresses may be the corresponding Tunnel-type. The MAC and IP addresses may be
learned via ARP-snooping (ND-snooping if IPv6). learned via ARP-snooping (ND-snooping if IPv6).
o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1, o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1,
ESI=0, GW IP address=IP2. The prefix and GW IP are learned by ESI=0, GW IP address=IP2. The prefix and GW IP are learned by
policy. policy.
(2) Similarly, NVE3 advertises the following BGP routes on behalf of (2) Similarly, NVE3 advertises the following BGP routes on behalf of
TS3: TS3:
skipping to change at page 14, line 49 skipping to change at page 15, line 46
route is imported and M2 is added to the BD-10 along with its route is imported and M2 is added to the BD-10 along with its
corresponding tunnel information. For instance, if VXLAN is corresponding tunnel information. For instance, if VXLAN is
used, the VTEP will be derived from the MAC/IP route BGP next- used, the VTEP will be derived from the MAC/IP route BGP next-
hop and VNI from the MPLS Label1 field. IP2 - M2 is added to hop and VNI from the MPLS Label1 field. IP2 - M2 is added to
the ARP table. Similarly, M3 is added to BD-10 and IP3 - M3 to the ARP table. Similarly, M3 is added to BD-10 and IP3 - M3 to
the ARP table. the ARP table.
o Based on the BD-10 route-target in DGW1 and DGW2, the IP o Based on the BD-10 route-target in DGW1 and DGW2, the IP
Prefix route is also imported and SN1/24 is added to the IP- Prefix route is also imported and SN1/24 is added to the IP-
VRF with Overlay Index IP2 pointing at the local BD-10. In VRF with Overlay Index IP2 pointing at the local BD-10. In
this example, we assume the RT-5 from NVE2 is preferred over this example, it is assumed that the RT-5 from NVE2 is
the RT-5 from NVE3. If both routes were equally preferable and preferred over the RT-5 from NVE3. If both routes were equally
ECMP enabled, SN1/24 would also be added to the routing table preferable and ECMP enabled, SN1/24 would also be added to the
with Overlay Index IP3. routing table with Overlay Index IP3.
(4) When DGW1 receives a packet from the WAN with destination IPx, (4) When DGW1 receives a packet from the WAN with destination IPx,
where IPx belongs to SN1/24: where IPx belongs to SN1/24:
o A destination IP lookup is performed on the DGW1 IP-VRF o A destination IP lookup is performed on the DGW1 IP-VRF
routing table and Overlay Index=IP2 is found. Since IP2 is an routing table and Overlay Index=IP2 is found. Since IP2 is an
Overlay Index a recursive route resolution is required for Overlay Index a recursive route resolution is required for
IP2. IP2.
o IP2 is resolved to M2 in the ARP table, and M2 is resolved to o IP2 is resolved to M2 in the ARP table, and M2 is resolved to
skipping to change at page 16, line 4 skipping to change at page 16, line 49
still points at Overlay Index IP2 in the routing table, but IP2 still points at Overlay Index IP2 in the routing table, but IP2
will be simply resolved to a different tunnel, based on the will be simply resolved to a different tunnel, based on the
outcome of the MAC mobility procedures for the MAC/IP route outcome of the MAC mobility procedures for the MAC/IP route
IP2/M2. IP2/M2.
Note that in the opposite direction, TS2 will send traffic based on Note that in the opposite direction, TS2 will send traffic based on
its static-route next-hop information (IRB1 and/or IRB2), and regular its static-route next-hop information (IRB1 and/or IRB2), and regular
EVPN procedures will be applied. EVPN procedures will be applied.
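The recursive lookup performed by DGW1 in this use-case can be sketched as three chained table lookups. The code below is an illustration under stated assumptions: SN1 is given the hypothetical value 10.0.1.0/24, the VNI value 10 and the table layouts are invented for the example, and the tunnel endpoint is represented by the NVE name rather than a VTEP address.

```python
import ipaddress

# Hypothetical DGW1 state after importing the RT-2 and RT-5 from NVE2.
ip_vrf = {ipaddress.ip_network("10.0.1.0/24"): "IP2"}   # SN1/24 -> Overlay Index
arp_table = {"IP2": "M2"}                               # Overlay Index -> MAC
bd10_fib = {"M2": {"vtep": "NVE2", "vni": 10}}          # MAC -> tunnel information


def resolve(dst, ip_vrf, arp_table, bd_fib):
    """Longest-prefix match in the IP-VRF, then ARP, then the BD FIB."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in ip_vrf if addr in net]
    if not matches:
        return None
    overlay_index = ip_vrf[max(matches, key=lambda net: net.prefixlen)]
    mac = arp_table.get(overlay_index)
    tunnel = bd_fib.get(mac)
    if tunnel is None:
        return None   # Overlay Index cannot be resolved yet
    return {"dst_inner_mac": mac, **tunnel}


first = resolve("10.0.1.7", ip_vrf, arp_table, bd10_fib)

# If TS2 moves behind another NVE, MAC mobility only updates the BD FIB.
# The IP-VRF entry for SN1/24 still points at Overlay Index IP2.
bd10_fib["M2"] = {"vtep": "NVE3", "vni": 10}
moved = resolve("10.0.1.7", ip_vrf, arp_table, bd10_fib)
```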
4.2 Floating IP Overlay Index Use-Case 4.2 Floating IP Overlay Index Use-Case
Sometimes Tenant Systems (TS) work in active/standby mode where an Sometimes Tenant Systems (TS) work in active/standby mode where an
upstream floating IP - owned by the active TS - is used as the upstream floating IP - owned by the active TS - is used as the
Overlay Index to get to some subnets behind. This redundancy mode, Overlay Index to get to some subnets behind. This redundancy mode,
already introduced in section 2.1 and 2.2, is illustrated in Figure already introduced in section 2.1 and 2.2, is illustrated in Figure
3. 5.
NVE2 DGW1 NVE2 DGW1
+-----------+ +---------+ +-------------+ +-----------+ +---------+ +-------------+
+---TS2(VA)--| (BD-10) |-| |----| (BD-10) | +---TS2(VA)--| (BD-10) |-| |----| (BD-10) |
| IP2/M2 +-----------+ | | | IRB1\ | | IP2/M2 +-----------+ | | | IRB1\ |
| <-+ | | | (IP-VRF)|---+ | <-+ | | | (IP-VRF)|---+
| | | | +-------------+ _|_ | | | | +-------------+ _|_
SN1 vIP23 (floating) | VXLAN/ | ( ) SN1 vIP23 (floating) | VXLAN/ | ( )
| | | nvGRE | DGW2 ( WAN ) | | | nvGRE | DGW2 ( WAN )
| <-+ NVE3 | | +-------------+ (___) | <-+ NVE3 | | +-------------+ (___)
| IP3/M3 +-----------+ | |----| (BD-10) | | | IP3/M3 +-----------+ | |----| (BD-10) | |
+---TS3(VA)--| (BD-10) |-| | | IRB2\ | | +---TS3(VA)--| (BD-10) |-| | | IRB2\ | |
+-----------+ +---------+ | (IP-VRF)|---+ +-----------+ +---------+ | (IP-VRF)|---+
+-------------+ +-------------+
Figure 3 Floating IP Overlay Index for redundant TS Figure 5 Floating IP Overlay Index for redundant TS
In this use-case, a GW IP is used as an Overlay Index for the same In this use-case, a GW IP is used as an Overlay Index for the same
reasons as in 4.1. However, this GW IP is a floating IP that belongs reasons as in 4.1. However, this GW IP is a floating IP that belongs
to the active TS. Assuming TS2 is the active TS and owns IP23: to the active TS. Assuming TS2 is the active TS and owns vIP23:
(1) NVE2 advertises the following BGP routes for TS2: (1) NVE2 advertises the following BGP routes for TS2:
o Route type 2 (MAC/IP route) containing: ML=48, M=M2, IPL=32, o Route type 2 (MAC/IP route) containing: ML=48, M=M2, IPL=32,
IP=IP23 (and BGP Encapsulation Extended Community). The MAC IP=vIP23 (and BGP Encapsulation Extended Community). The MAC
and IP addresses may be learned via ARP-snooping. and IP addresses may be learned via ARP-snooping.
o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1, o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1,
ESI=0, GW IP address=IP23. The prefix and GW IP are learned by ESI=0, GW IP address=vIP23. The prefix and GW IP are learned
policy. by policy.
(2) NVE3 advertises the following BGP route for TS3 (it does not (2) NVE3 advertises the following BGP route for TS3 (it does not
advertise an RT-2 for IP23/M3): advertise an RT-2 for vIP23/M3):
o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1, o Route type 5 (IP Prefix route) containing: IPL=24, IP=SN1,
ESI=0, GW IP address=IP23. The prefix and GW IP are learned by ESI=0, GW IP address=vIP23. The prefix and GW IP are learned
policy. by policy.
(3) DGW1 and DGW2 import both received routes based on the route- (3) DGW1 and DGW2 import both received routes based on the route-
target: target:
o M2 is added to the BD-10 FIB along with its corresponding o M2 is added to the BD-10 FIB along with its corresponding
tunnel information. For the VXLAN use case, the VTEP will be tunnel information. For the VXLAN use case, the VTEP will be
derived from the MAC/IP route BGP next-hop and VNI from the derived from the MAC/IP route BGP next-hop and VNI from the
VNI/VSID field. IP23 - M2 is added to the ARP table. VNI/VSID field. vIP23 - M2 is added to the ARP table.
o SN1/24 is added to the IP-VRF in DGW1 and DGW2 with Overlay o SN1/24 is added to the IP-VRF in DGW1 and DGW2 with Overlay
index IP23 pointing at M2 in the local BD-10. index vIP23 pointing at M2 in the local BD-10.
(4) When DGW1 receives a packet from the WAN with destination IPx, (4) When DGW1 receives a packet from the WAN with destination IPx,
where IPx belongs to SN1/24: where IPx belongs to SN1/24:
o A destination IP lookup is performed on the DGW1 IP-VRF o A destination IP lookup is performed on the DGW1 IP-VRF
routing table and Overlay Index=IP23 is found. Since IP23 is routing table and Overlay Index=vIP23 is found. Since vIP23 is
an Overlay Index, a recursive route resolution for IP23 is an Overlay Index, a recursive route resolution for vIP23 is
required. required.
o IP23 is resolved to M2 in the ARP table, and M2 is resolved to o vIP23 is resolved to M2 in the ARP table, and M2 is resolved
the tunnel information given by the BD (remote VTEP and VNI to the tunnel information given by the BD (remote VTEP and VNI
for the VXLAN case). for the VXLAN case).
o The IP packet destined to IPx is encapsulated with: o The IP packet destined to IPx is encapsulated with:
. Source inner MAC = IRB1 MAC. . Source inner MAC = IRB1 MAC.
. Destination inner MAC = M2. . Destination inner MAC = M2.
. Tunnel information provided by the BD FIB (VNI, VTEP IPs . Tunnel information provided by the BD FIB (VNI, VTEP IPs
and MACs for the VXLAN case). and MACs for the VXLAN case).
skipping to change at page 17, line 42 skipping to change at page 18, line 40
o Based on the tunnel information (VNI for the VXLAN case), the o Based on the tunnel information (VNI for the VXLAN case), the
BD-10 context is identified for a MAC lookup. BD-10 context is identified for a MAC lookup.
o Encapsulation is stripped-off and based on a MAC lookup o Encapsulation is stripped-off and based on a MAC lookup
(assuming MAC forwarding on the egress NVE), the packet is (assuming MAC forwarding on the egress NVE), the packet is
forwarded to TS2, where it will be properly routed. forwarded to TS2, where it will be properly routed.
(6) When the redundancy protocol running between TS2 and TS3 appoints (6) When the redundancy protocol running between TS2 and TS3 appoints
TS3 as the new active TS for SN1, TS3 will now own the floating TS3 as the new active TS for SN1, TS3 will now own the floating
IP23 and will signal this new ownership (GARP message or vIP23 and will signal this new ownership (GARP message or
similar). Upon receiving the new owner's notification, NVE3 will similar). Upon receiving the new owner's notification, NVE3 will
issue a route type 2 for M3-IP23 and NVE2 will withdraw the RT-2 issue a route type 2 for M3-vIP23 and NVE2 will withdraw the RT-2
for M2-IP23. DGW1 and DGW2 will update their ARP tables with the for M2-vIP23. DGW1 and DGW2 will update their ARP tables with the
new MAC resolving the floating IP. No changes are made in the IP- new MAC resolving the floating IP. No changes are made in the IP-
VRF routing table. VRF routing table.
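The switchover in step (6) can be illustrated with a toy model of the DGW control plane. The class and method names below are hypothetical, and each table is reduced to a dictionary. The point of the sketch is that only RT-2 events are processed during the failover, so the IP-VRF entry for SN1/24 is left untouched.

```python
class Dgw:
    """Hypothetical, minimal model of the tables kept by DGW1 or DGW2."""

    def __init__(self):
        self.ip_vrf = {}   # prefix -> Overlay Index (GW IP)
        self.arp = {}      # IP -> MAC
        self.bd_fib = {}   # MAC -> advertising NVE (tunnel endpoint)

    def on_rt5(self, prefix, gw_ip):
        self.ip_vrf[prefix] = gw_ip

    def on_rt2(self, mac, ip, nve):
        self.bd_fib[mac] = nve
        self.arp[ip] = mac

    def on_rt2_withdraw(self, mac, ip):
        self.bd_fib.pop(mac, None)
        # Only remove the ARP entry if it still points at the withdrawn MAC.
        if self.arp.get(ip) == mac:
            del self.arp[ip]

    def next_hop(self, prefix):
        mac = self.arp.get(self.ip_vrf.get(prefix))
        return self.bd_fib.get(mac)


dgw1 = Dgw()
dgw1.on_rt5("SN1/24", "vIP23")          # RT-5 from NVE2 and NVE3, same GW IP
dgw1.on_rt2("M2", "vIP23", "NVE2")      # TS2 is active
active_before = dgw1.next_hop("SN1/24")

dgw1.on_rt2("M3", "vIP23", "NVE3")      # TS3 takes over vIP23
dgw1.on_rt2_withdraw("M2", "vIP23")     # NVE2 withdraws its RT-2
active_after = dgw1.next_hop("SN1/24")
```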
4.3 Bump-in-the-Wire Use-Case 4.3 Bump-in-the-Wire Use-Case
Figure 5 illustrates an example of inter-subnet forwarding for an IP
Figure 6 illustrates an example of inter-subnet forwarding for an IP
Prefix route that carries a subnet SN1. In this use-case, TS2 and TS3 Prefix route that carries a subnet SN1. In this use-case, TS2 and TS3
are layer-2 VA devices without any IP address that can be included as are layer-2 VA devices without any IP address that can be included as
an Overlay Index in the GW IP field of the IP Prefix route. Their MAC an Overlay Index in the GW IP field of the IP Prefix route. Their MAC
addresses are M2 and M3 respectively and are connected to BD-10. Note addresses are M2 and M3 respectively and are connected to BD-10. Note
that IRB1 and IRB2 (in DGW1 and DGW2 respectively) have IP addresses that IRB1 and IRB2 (in DGW1 and DGW2 respectively) have IP addresses
in a subnet different than SN1. in a subnet different than SN1.
NVE2 DGW1 NVE2 DGW1
M2 +-----------+ +---------+ +-------------+ M2 +-----------+ +---------+ +-------------+
+---TS2(VA)--| (BD-10) |-| |----| (BD-10) | +---TS2(VA)--| (BD-10) |-| |----| (BD-10) |
skipping to change at page 18, line 26 skipping to change at page 19, line 23
| + | | | (IP-VRF)|---+ | + | | | (IP-VRF)|---+
| | | | +-------------+ _|_ | | | | +-------------+ _|_
SN1 | | VXLAN/ | ( ) SN1 | | VXLAN/ | ( )
| | | nvGRE | DGW2 ( WAN ) | | | nvGRE | DGW2 ( WAN )
| + NVE3 | | +-------------+ (___) | + NVE3 | | +-------------+ (___)
| ESI23 +-----------+ | |----| (BD-10) | | | ESI23 +-----------+ | |----| (BD-10) | |
+---TS3(VA)--| (BD-10) |-| | | IRB2\ | | +---TS3(VA)--| (BD-10) |-| | | IRB2\ | |
M3 +-----------+ +---------+ | (IP-VRF)|---+ M3 +-----------+ +---------+ | (IP-VRF)|---+
+-------------+ +-------------+
Figure 5 Bump-in-the-wire use-case Figure 6 Bump-in-the-wire use-case
Since neither TS2 nor TS3 can participate in any dynamic routing Since neither TS2 nor TS3 can participate in any dynamic routing
protocol and have no IP address assigned, there are two potential protocol and have no IP address assigned, there are two potential
Overlay Index types that can be used when advertising SN1: Overlay Index types that can be used when advertising SN1:
a) an ESI, i.e. ESI23, that can be provisioned on the attachment a) an ESI, i.e. ESI23, that can be provisioned on the attachment
ports of NVE2 and NVE3, as shown in Figure 5. ports of NVE2 and NVE3, as shown in Figure 6.
b) or the VA's MAC address, that can be added to NVE2 and NVE3 by b) or the VA's MAC address, that can be added to NVE2 and NVE3 by
policy. policy.
The advantage of using an ESI as Overlay Index as opposed to the VA's The advantage of using an ESI as Overlay Index as opposed to the VA's
MAC address, is that the forwarding to the egress NVE can be done MAC address, is that the forwarding to the egress NVE can be done
purely based on the state of the AC in the ES (notified by the AD purely based on the state of the AC in the ES (notified by the AD
per-EVI route) and all the EVPN multi-homing redundancy mechanisms per-EVI route) and all the EVPN multi-homing redundancy mechanisms
can be re-used. For instance, the [RFC7432] mass-withdrawal mechanism can be re-used. For instance, the [RFC7432] mass-withdrawal mechanism
for fast failure detection and propagation can be used. This section for fast failure detection and propagation can be used. This section
assumes that an ESI Overlay Index is used in this use-case but it assumes that an ESI Overlay Index is used in this use-case but it
skipping to change at page 20, line 19 skipping to change at page 21, line 16
MAC address, as opposed to the NVE/PE's MAC address. MAC address, as opposed to the NVE/PE's MAC address.
. Tunnel information for the NVO tunnel is provided by the . Tunnel information for the NVO tunnel is provided by the
Ethernet A-D route per-EVI for ESI23 (VNI and VTEP IP for Ethernet A-D route per-EVI for ESI23 (VNI and VTEP IP for
the VXLAN case). the VXLAN case).
(5) When the packet arrives at NVE2: (5) When the packet arrives at NVE2:
o Based on the tunnel demultiplexer information (VNI for the o Based on the tunnel demultiplexer information (VNI for the
VXLAN case), the BD-10 context is identified for a MAC lookup VXLAN case), the BD-10 context is identified for a MAC lookup
(assuming MAC disposition model) or the VNI MAY directly (assuming MAC disposition model) or the VNI may directly
identify the egress interface (for a label or VNI disposition identify the egress interface (for a label or VNI disposition
model). model).
o Encapsulation is stripped-off and based on a MAC lookup o Encapsulation is stripped-off and based on a MAC lookup
(assuming MAC forwarding on the egress NVE) or a VNI lookup (assuming MAC forwarding on the egress NVE) or a VNI lookup
(in case of VNI forwarding), the packet is forwarded to TS2, (in case of VNI forwarding), the packet is forwarded to TS2,
where it will be forwarded to SN1. where it will be forwarded to SN1.
(6) If the redundancy protocol running between TS2 and TS3 follows an (6) If the redundancy protocol running between TS2 and TS3 follows an
active/standby model and there is a failure, appointing TS3 as active/standby model and there is a failure, appointing TS3 as
skipping to change at page 21, line 12 skipping to change at page 22, line 10
2. Traffic destined to IP subnets sitting behind the TS, e.g. SN1 or 2. Traffic destined to IP subnets sitting behind the TS, e.g. SN1 or
SN2. SN2.
In order to provide connectivity for (1), MAC/IP routes (RT-2) are In order to provide connectivity for (1), MAC/IP routes (RT-2) are
needed so that IRB or TS MACs and IPs can be distributed. needed so that IRB or TS MACs and IPs can be distributed.
Connectivity type (2) is accomplished by the exchange of IP Prefix Connectivity type (2) is accomplished by the exchange of IP Prefix
routes (RT-5) for IPs and subnets sitting behind certain Overlay routes (RT-5) for IPs and subnets sitting behind certain Overlay
Indexes, e.g. GW IP or ESI or TS MAC. Indexes, e.g. GW IP or ESI or TS MAC.
In some cases, IP Prefix routes may be advertised for subnets and IPs In some cases, IP Prefix routes may be advertised for subnets and IPs
sitting behind an IRB. We refer to this use-case as the "IP-VRF-to- sitting behind an IRB. This use-case is referred to as the "IP-VRF-
IP-VRF" model. to-IP-VRF" model.
[EVPN-INTERSUBNET] defines an asymmetric IRB model and a symmetric [EVPN-INTERSUBNET] defines an asymmetric IRB model and a symmetric
IRB model, based on the required lookups at the ingress and egress IRB model, based on the required lookups at the ingress and egress
NVE: the asymmetric model requires an ip-lookup and a mac-lookup at NVE: the asymmetric model requires an ip-lookup and a mac-lookup at
the ingress NVE, whereas only a mac-lookup is needed at the egress the ingress NVE, whereas only a mac-lookup is needed at the egress
NVE; the symmetric model requires ip and mac lookups at both, ingress NVE; the symmetric model requires ip and mac lookups at both, ingress
and egress NVE. From that perspective, the IP-VRF-to-IP-VRF use-case and egress NVE. From that perspective, the IP-VRF-to-IP-VRF use-case
described in this section is a symmetric IRB model. described in this section is a symmetric IRB model.
Note that, in an IP-VRF-to-IP-VRF scenario, out of the many subnets Note that, in an IP-VRF-to-IP-VRF scenario, out of the many subnets
skipping to change at page 21, line 47 skipping to change at page 22, line 45
1) Interface-less model: no SBD and no overlay indexes required. 1) Interface-less model: no SBD and no overlay indexes required.
2) Interface-ful with SBD IRB model: it requires SBD, as well as GW 2) Interface-ful with SBD IRB model: it requires SBD, as well as GW
IP addresses as overlay indexes. IP addresses as overlay indexes.
3) Interface-ful with unnumbered SBD IRB model: it requires SBD, as 3) Interface-ful with unnumbered SBD IRB model: it requires SBD, as
well as MAC addresses as overlay indexes. well as MAC addresses as overlay indexes.
Inter-subnet IP multicast is outside the scope of this document. Inter-subnet IP multicast is outside the scope of this document.
4.4.1 Interface-less IP-VRF-to-IP-VRF Model 4.4.1 Interface-less IP-VRF-to-IP-VRF Model
Figure 6 will be used for the description of this model. Figure 7 will be used for the description of this model.
NVE1(M1) NVE1(M1)
+------------+ +------------+
IP1+----| (BD-1) | DGW1(M3) IP1+----| (BD-1) | DGW1(M3)
| \ | +---------+ +--------+ | \ | +---------+ +--------+
| (IP-VRF)|----| |-|(IP-VRF)|----+ | (IP-VRF)|----| |-|(IP-VRF)|----+
| / | | | +--------+ | | / | | | +--------+ |
+---| (BD-2) | | | _+_ +---| (BD-2) | | | _+_
| +------------+ | | ( ) | +------------+ | | ( )
SN1| | VXLAN/ | ( WAN )--H1 SN1| | VXLAN/ | ( WAN )--H1
| NVE2(M2) | nvGRE/ | (___) | NVE2(M2) | nvGRE/ | (___)
| +------------+ | MPLS | + | +------------+ | MPLS | +
+---| (BD-2) | | | DGW2(M4) | +---| (BD-2) | | | DGW2(M4) |
| \ | | | +--------+ | | \ | | | +--------+ |
| (IP-VRF)|----| |-|(IP-VRF)|----+ | (IP-VRF)|----| |-|(IP-VRF)|----+
| / | +---------+ +--------+ | / | +---------+ +--------+
SN2+----| (BD-3) | SN2+----| (BD-3) |
+------------+ +------------+
Figure 6 Interface-less IP-VRF-to-IP-VRF model Figure 7 Interface-less IP-VRF-to-IP-VRF model
In this case: In this case:
a) The NVEs and DGWs must provide connectivity between hosts in SN1, a) The NVEs and DGWs must provide connectivity between hosts in SN1,
SN2, IP1 and hosts sitting at the other end of the WAN, for SN2, IP1 and hosts sitting at the other end of the WAN, for
example, H1. We assume the DGWs import/export IP and/or VPN-IP example, H1. It is assumed that the DGWs import/export IP and/or
routes from/to the WAN. VPN-IP routes from/to the WAN.
b) The IP-VRF instances in the NVE/DGWs are directly connected b) The IP-VRF instances in the NVE/DGWs are directly connected
through NVO tunnels, and no IRBs and/or BD instances are through NVO tunnels, and no IRBs and/or BD instances are
instantiated to connect the IP-VRFs. instantiated to connect the IP-VRFs.
c) The solution must provide layer-3 connectivity among the IP-VRFs c) The solution must provide layer-3 connectivity among the IP-VRFs
for Ethernet NVO tunnels, for instance, VXLAN or nvGRE. for Ethernet NVO tunnels, for instance, VXLAN or nvGRE.
d) The solution may provide layer-3 connectivity among the IP-VRFs d) The solution may provide layer-3 connectivity among the IP-VRFs
for IP NVO tunnels, for example, VXLAN GPE (with IP payload). for IP NVO tunnels, for example, VXLAN GPE (with IP payload).
In order to meet the above requirements, the EVPN route type 5 will In order to meet the above requirements, the EVPN route type 5 will
be used to advertise the IP Prefixes, along with the Router's MAC be used to advertise the IP Prefixes, along with the Router's MAC
Extended Community as defined in [EVPN-INTERSUBNET] if the Extended Community as defined in [EVPN-INTERSUBNET] if the
advertising NVE/DGW uses Ethernet NVO tunnels. Each NVE/DGW will advertising NVE/DGW uses Ethernet NVO tunnels. Each NVE/DGW will
advertise an RT-5 for each of its prefixes with the following fields: advertise an RT-5 for each of its prefixes with the following fields:
o RD as per [RFC7432]. o RD as per [RFC7432].
o Eth-Tag ID=0. o Ethernet Tag ID=0.
o IP address length and IP address, as explained in the previous o IP address length and IP address, as explained in the previous
sections. sections.
o GW IP address=0. o GW IP address=0.
o ESI=0 o ESI=0
o MPLS label or VNI corresponding to the IP-VRF. o MPLS label or VNI corresponding to the IP-VRF.
Each RT-5 will be sent with a route-target identifying the tenant Each RT-5 will be sent with a route-target identifying the tenant
(IP-VRF) and two BGP extended communities: (IP-VRF) and two BGP extended communities:
o The first one is the BGP Encapsulation Extended Community, as o The first one is the BGP Encapsulation Extended Community, as
per [RFC5512], identifying the tunnel type. per [RFC5512], identifying the tunnel type.
o The second one is the Router's MAC Extended Community as per o The second one is the Router's MAC Extended Community as per
[EVPN-INTERSUBNET] containing the MAC address associated to [EVPN-INTERSUBNET] containing the MAC address associated to
the NVE advertising the route. This MAC address identifies the the NVE advertising the route. This MAC address identifies the
NVE/DGW and MAY be re-used for all the IP-VRFs in the NVE. The NVE/DGW and MAY be re-used for all the IP-VRFs in the NVE. The
Router's MAC Extended Community MUST be sent if the route is Router's MAC Extended Community must be sent if the route is
associated to an Ethernet NVO tunnel, for instance, VXLAN. If associated to an Ethernet NVO tunnel, for instance, VXLAN. If
the route is associated to an IP NVO tunnel, for instance the route is associated to an IP NVO tunnel, for instance
VXLAN GPE with IP payload, the Router's MAC Extended Community VXLAN GPE with IP payload, the Router's MAC Extended Community
SHOULD NOT be sent. should not be sent.
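The field list above can be sketched as a small data structure. The following Python is only an illustration under stated assumptions: the names IpPrefixRoute, build_interface_less_rt5, and the tunnel-type strings are hypothetical and are not defined by the draft or by any BGP implementation. It shows the fixed zero fields of the interface-less RT-5 and the rule that the Router's MAC Extended Community accompanies Ethernet NVO tunnels but not IP NVO tunnels.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical names for the tunnel kinds mentioned in this section.
ETHERNET_NVO_TUNNELS = {"vxlan", "nvgre"}   # Ethernet payload
IP_NVO_TUNNELS = {"vxlan-gpe-ip"}           # IP payload


@dataclass(frozen=True)
class IpPrefixRoute:
    """Illustrative RT-5 for the interface-less model (not a wire format)."""
    rd: str
    ip_prefix: str
    ip_prefix_len: int
    label: int                    # MPLS label or VNI of the IP-VRF
    route_target: str             # identifies the tenant (IP-VRF)
    encapsulation: str            # BGP Encapsulation Extended Community
    router_mac: Optional[str]     # Router's MAC Extended Community, or None
    ethernet_tag_id: int = 0
    esi: int = 0
    gw_ip: int = 0


def build_interface_less_rt5(rd: str, ip_prefix: str, ip_prefix_len: int,
                             label: int, route_target: str,
                             encapsulation: str, nve_mac: str) -> IpPrefixRoute:
    """Attach the Router's MAC only when the tunnel carries Ethernet frames."""
    if encapsulation in ETHERNET_NVO_TUNNELS:
        router_mac: Optional[str] = nve_mac
    elif encapsulation in IP_NVO_TUNNELS:
        router_mac = None
    else:
        # Other tunnel types are outside the scope of this sketch.
        raise ValueError(f"tunnel type not covered by this sketch: {encapsulation}")
    return IpPrefixRoute(rd, ip_prefix, ip_prefix_len, label,
                         route_target, encapsulation, router_mac)


# Values mirror the SN1/24 example: Label=10, Router's MAC=M1.
rt5 = build_interface_less_rt5("192.0.2.1:1", "SN1", 24, 10,
                               "target:64500:1", "vxlan", "M1")
print(rt5)
```

The RD and route-target strings in the usage lines are placeholders chosen for the sketch.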
The following example illustrates the procedure to advertise and The following example illustrates the procedure to advertise and
forward packets to SN1/24 (ipv4 prefix advertised from NVE1): forward packets to SN1/24 (IPv4 prefix advertised from NVE1):
(1) NVE1 advertises the following BGP route: (1) NVE1 advertises the following BGP route:
o Route type 5 (IP Prefix route) containing: o Route type 5 (IP Prefix route) containing:
. IPL=24, IP=SN1, Label=10. . IPL=24, IP=SN1, Label=10.
. GW IP= SHOULD be set to 0. . GW IP= set to 0.
. [RFC5512] BGP Encapsulation Extended Community. . [RFC5512] BGP Encapsulation Extended Community.
. Router's MAC Extended Community that contains M1. . Router's MAC Extended Community that contains M1.
. Route-target identifying the tenant (IP-VRF). . Route-target identifying the tenant (IP-VRF).
(2) DGW1 imports the received routes from NVE1: (2) DGW1 imports the received routes from NVE1:
o DGW1 installs SN1/24 in the IP-VRF identified by the RT-5 o DGW1 installs SN1/24 in the IP-VRF identified by the RT-5
skipping to change at page 24, line 42 skipping to change at page 25, line 42
IP-VRF). IP-VRF).
o An IP lookup is performed in the routing context, where SN1 o An IP lookup is performed in the routing context, where SN1
turns out to be a local subnet associated to BD-2. A turns out to be a local subnet associated to BD-2. A
subsequent lookup in the ARP table and the BD FIB will provide subsequent lookup in the ARP table and the BD FIB will provide
the forwarding information for the packet in BD-2. the forwarding information for the packet in BD-2.
The model described above is called Interface-less model since the The model described above is called Interface-less model since the
IP-VRFs are connected directly through tunnels and they don't require IP-VRFs are connected directly through tunnels and they don't require
those tunnels to be terminated in SBDs instead, like in sections those tunnels to be terminated in SBDs instead, like in sections
4.4.2 or 4.4.3. An EVPN IP-VRF-to-IP-VRF implementation is REQUIRED 4.4.2 or 4.4.3.
to support the ingress and egress procedures described in this
section.
4.4.2 Interface-ful IP-VRF-to-IP-VRF with SBD IRB 4.4.2 Interface-ful IP-VRF-to-IP-VRF with SBD IRB
Figure 7 will be used for the description of this model. Figure 8 will be used for the description of this model.
NVE1 NVE1
+------------+ DGW1 +------------+ DGW1
IP10+---+(BD-1) | +---------------+ +------------+ IP10+---+(BD-1) | +---------------+ +------------+
| \ | | | | | | \ | | | | |
|(IP-VRF)-(SBD)| |(SBD)-(IP-VRF)|-----+ |(IP-VRF)-(SBD)| |(SBD)-(IP-VRF)|-----+
| / IRB(IP1/M1) IRB(IP3/M3) | | | / IRB(IP1/M1) IRB(IP3/M3) | |
+---+(BD-2) | | | +------------+ _+_ +---+(BD-2) | | | +------------+ _+_
| +------------+ | | ( ) | +------------+ | | ( )
SN1| | VXLAN/ | ( WAN )--H1 SN1| | VXLAN/ | ( WAN )--H1
| NVE2 | nvGRE/ | (___) | NVE2 | nvGRE/ | (___)
| +------------+ | MPLS | DGW2 + | +------------+ | MPLS | DGW2 +
+---+(BD-2) | | | +------------+ | +---+(BD-2) | | | +------------+ |
| \ | | | | | | | \ | | | | | |
|(IP-VRF)-(SBD)| |(SBD)-(IP-VRF)|-----+ |(IP-VRF)-(SBD)| |(SBD)-(IP-VRF)|-----+
| / IRB(IP2/M2) IRB(IP4/M4) | | / IRB(IP2/M2) IRB(IP4/M4) |
SN2+----+(BD-3) | +---------------+ +------------+ SN2+----+(BD-3) | +---------------+ +------------+
+------------+ +------------+
Figure 7 Interface-ful with SBD IRB model Figure 8 Interface-ful with SBD IRB model
In this model: In this model:
a) As in section 4.4.1, the NVEs and DGWs must provide connectivity a) As in section 4.4.1, the NVEs and DGWs must provide connectivity
between hosts in SN1, SN2, IP1 and hosts sitting at the other end between hosts in SN1, SN2, IP10 and hosts sitting at the other end
of the WAN. of the WAN.
b) However, the NVE/DGWs are now connected through Ethernet NVO b) However, the NVE/DGWs are now connected through Ethernet NVO
tunnels terminated in the SBD instance. The IP-VRFs use IRB tunnels terminated in the SBD instance. The IP-VRFs use IRB
interfaces for their connectivity to the SBD. interfaces for their connectivity to the SBD.
c) Each SBD IRB has an IP and a MAC address, where the IP address c) Each SBD IRB has an IP and a MAC address, where the IP address
must be reachable from other NVEs or DGWs. must be reachable from other NVEs or DGWs.
d) The SBD is attached to all the NVE/DGWs in the tenant domain BDs. d) The SBD is attached to all the NVE/DGWs in the tenant domain BDs.
skipping to change at page 25, line 50 skipping to change at page 26, line 50
e) The solution must provide layer-3 connectivity for Ethernet NVO e) The solution must provide layer-3 connectivity for Ethernet NVO
tunnels, for instance, VXLAN or nvGRE. tunnels, for instance, VXLAN or nvGRE.
EVPN type 5 routes will be used to advertise the IP Prefixes, whereas EVPN type 5 routes will be used to advertise the IP Prefixes, whereas
EVPN RT-2 routes will advertise the MAC/IP addresses of each SBD IRB EVPN RT-2 routes will advertise the MAC/IP addresses of each SBD IRB
interface. Each NVE/DGW will advertise an RT-5 for each of its interface. Each NVE/DGW will advertise an RT-5 for each of its
prefixes with the following fields: prefixes with the following fields:
o RD as per [RFC7432]. o RD as per [RFC7432].
o Eth-Tag ID=0. o Ethernet Tag ID=0.
o IP address length and IP address, as explained in the previous o IP address length and IP address, as explained in the previous
sections. sections.
o GW IP address=IRB-IP (this is the Overlay Index that will be o GW IP address=IRB-IP (this is the Overlay Index that will be
used for the recursive route resolution). used for the recursive route resolution).
o ESI=0 o ESI=0
o Label value SHOULD be zero since the RT-5 route requires a o Label value should be zero since the RT-5 route requires a
recursive lookup resolution to an RT-2 route. It is ignored on recursive lookup resolution to an RT-2 route. It is ignored on
reception, and, when forwarding packets, the MPLS label or VNI reception, and, when forwarding packets, the MPLS label or VNI
from the RT-2's MPLS Label1 field is used. from the RT-2's MPLS Label1 field is used.
Each RT-5 will be sent with a route-target identifying the tenant Each RT-5 will be sent with a route-target identifying the tenant
(IP-VRF). The Router's MAC Extended Community SHOULD NOT be sent in (IP-VRF). The Router's MAC Extended Community should not be sent in
this case. this case.
The following example illustrates the procedure to advertise and The following example illustrates the procedure to advertise and
forward packets to SN1/24 (ipv4 prefix advertised from NVE1): forward packets to SN1/24 (IPv4 prefix advertised from NVE1):
(1) NVE1 advertises the following BGP routes: (1) NVE1 advertises the following BGP routes:
o Route type 5 (IP Prefix route) containing: o Route type 5 (IP Prefix route) containing:
. IPL=24, IP=SN1, Label= SHOULD be set to 0. . IPL=24, IP=SN1, Label= SHOULD be set to 0.
. GW IP=IP1 (SBD IRB's IP) . GW IP=IP1 (SBD IRB's IP)
. Route-target identifying the tenant (IP-VRF). . Route-target identifying the tenant (IP-VRF).
o Route type 2 (MAC/IP route for the SBD IRB) containing: o Route type 2 (MAC/IP route for the SBD IRB) containing:
. ML=48, M=M1, IPL=32, IP=IP1, Label=10. . ML=48, M=M1, IPL=32, IP=IP1, Label=10.
. A [RFC5512] BGP Encapsulation Extended Community. . A [RFC5512] BGP Encapsulation Extended Community.
. Route-target identifying the SBD. This route-target MAY be . Route-target identifying the SBD. This route-target may be
the same as the one used with the RT-5. the same as the one used with the RT-5.
(2) DGW1 imports the received routes from NVE1: (2) DGW1 imports the received routes from NVE1:
o DGW1 installs SN1/24 in the IP-VRF identified by the RT-5 o DGW1 installs SN1/24 in the IP-VRF identified by the RT-5
route-target. route-target.
. Since GW IP is different from zero, the GW IP (IP1) will be . Since GW IP is different from zero, the GW IP (IP1) will be
used as the Overlay Index for the recursive route resolution used as the Overlay Index for the recursive route resolution
to the RT-2 carrying IP1. to the RT-2 carrying IP1.
skipping to change at page 27, line 32 skipping to change at page 28, line 32
o An IP lookup is performed in the routing context, where SN1 o An IP lookup is performed in the routing context, where SN1
turns out to be a local subnet associated to BD-2. A turns out to be a local subnet associated to BD-2. A
subsequent lookup in the ARP table and the BD FIB will provide subsequent lookup in the ARP table and the BD FIB will provide
the forwarding information for the packet in BD-2. the forwarding information for the packet in BD-2.
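The recursive resolution used in this example can be sketched as a two-step lookup. This is a minimal Python illustration, not the draft's procedure verbatim: the type and function names are hypothetical, the RT-2 table is modeled as a plain dictionary keyed by IP address, and taking the inner destination MAC from the resolved RT-2 is an assumption of the sketch. It reflects two points from the text: the RT-5 label is ignored on reception, and the label or VNI used for forwarding comes from the RT-2 that carries the GW IP.

```python
from typing import Dict, NamedTuple, Optional


class MacIpRoute(NamedTuple):
    """Illustrative RT-2 advertised for an SBD IRB interface."""
    mac: str
    ip: str
    label: int


class IpPrefixRoute(NamedTuple):
    """Illustrative RT-5; gw_ip=None stands for GW IP = 0."""
    prefix: str
    gw_ip: Optional[str]
    label: int            # ignored on reception in this model


class Forwarding(NamedTuple):
    inner_dst_mac: str    # assumption: taken from the resolved RT-2
    label: int            # MPLS label or VNI from the RT-2


def resolve_via_gw_ip(rt5: IpPrefixRoute,
                      rt2_by_ip: Dict[str, MacIpRoute]) -> Optional[Forwarding]:
    """Resolve an RT-5 through the RT-2 that carries its GW IP."""
    if rt5.gw_ip is None:
        raise ValueError("GW IP is zero: not the numbered SBD IRB model")
    rt2 = rt2_by_ip.get(rt5.gw_ip)
    if rt2 is None:
        return None       # Overlay Index not resolvable: prefix unusable
    return Forwarding(inner_dst_mac=rt2.mac, label=rt2.label)


# Values mirror the example: RT-5 with GW IP=IP1, RT-2 with M1/IP1/Label=10.
rt2_table = {"IP1": MacIpRoute(mac="M1", ip="IP1", label=10)}
rt5 = IpPrefixRoute(prefix="SN1/24", gw_ip="IP1", label=0)
print(resolve_via_gw_ip(rt5, rt2_table))
print(resolve_via_gw_ip(rt5, {}))
```

The second call illustrates why the RT-2 matters operationally: if the RT-2 for IP1 is filtered or withdrawn, the RT-5 for SN1/24 can no longer be resolved.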
The model described above is called 'Interface-ful with SBD IRB The model described above is called 'Interface-ful with SBD IRB
model' since the tunnels connecting the DGWs and NVEs need to be model' since the tunnels connecting the DGWs and NVEs need to be
terminated into the SBD. The SBD is connected to the IP-VRFs via SBD terminated into the SBD. The SBD is connected to the IP-VRFs via SBD
IRB interfaces, and that allows the recursive resolution of RT-5s to IRB interfaces, and that allows the recursive resolution of RT-5s to
GW IP addresses. An EVPN IP-VRF-to-IP-VRF implementation is REQUIRED GW IP addresses.
to support the ingress and egress procedures described in this
section.
4.4.3 Interface-ful IP-VRF-to-IP-VRF with Unnumbered SBD IRB 4.4.3 Interface-ful IP-VRF-to-IP-VRF with Unnumbered SBD IRB
Figure 8 will be used for the description of this model. Note that Figure 9 will be used for the description of this model. Note that
this model is similar to the one described in section 4.4.2, only this model is similar to the one described in section 4.4.2, only
without IP addresses on the SBD IRB interfaces. without IP addresses on the SBD IRB interfaces.
NVE1 NVE1
+------------+ DGW1 +------------+ DGW1
IP1+----+(BD-1) | +---------------+ +------------+ IP1+----+(BD-1) | +---------------+ +------------+
| \ | | | | | | \ | | | | |
|(IP-VRF)-(SBD)| (SBD)-(IP-VRF) |-----+ |(IP-VRF)-(SBD)| (SBD)-(IP-VRF) |-----+
| / IRB(M1)| | IRB(M3) | | | / IRB(M1)| | IRB(M3) | |
+---+(BD-2) | | | +------------+ _+_ +---+(BD-2) | | | +------------+ _+_
skipping to change at page 28, line 23 skipping to change at page 29, line 23
SN1| | VXLAN/ | ( WAN )--H1 SN1| | VXLAN/ | ( WAN )--H1
| NVE2 | nvGRE/ | (___) | NVE2 | nvGRE/ | (___)
| +------------+ | MPLS | DGW2 + | +------------+ | MPLS | DGW2 +
+---+(BD-2) | | | +------------+ | +---+(BD-2) | | | +------------+ |
| \ | | | | | | | \ | | | | | |
|(IP-VRF)-(SBD)| (SBD)-(IP-VRF) |-----+ |(IP-VRF)-(SBD)| (SBD)-(IP-VRF) |-----+
| / IRB(M2)| | IRB(M4) | | / IRB(M2)| | IRB(M4) |
SN2+----+(BD-3) | +---------------+ +------------+ SN2+----+(BD-3) | +---------------+ +------------+
+------------+ +------------+
Figure 8 Interface-ful with unnumbered SBD IRB model Figure 9 Interface-ful with unnumbered SBD IRB model
In this model: In this model:
a) As in section 4.4.1 and 4.4.2, the NVEs and DGWs must provide a) As in section 4.4.1 and 4.4.2, the NVEs and DGWs must provide
connectivity between hosts in SN1, SN2, IP1 and hosts sitting at connectivity between hosts in SN1, SN2, IP1 and hosts sitting at
the other end of the WAN. the other end of the WAN.
b) As in section 4.4.2, the NVE/DGWs are connected through Ethernet b) As in section 4.4.2, the NVE/DGWs are connected through Ethernet
NVO tunnels terminated in the SBD instance. The IP-VRFs use IRB NVO tunnels terminated in the SBD instance. The IP-VRFs use IRB
interfaces for their connectivity to the SBD. interfaces for their connectivity to the SBD.
skipping to change at page 29, line 9 skipping to change at page 30, line 9
This model will also make use of the RT-5 recursive resolution. EVPN This model will also make use of the RT-5 recursive resolution. EVPN
type 5 routes will advertise the IP Prefixes along with the Router's type 5 routes will advertise the IP Prefixes along with the Router's
MAC Extended Community used for the recursive lookup, whereas EVPN MAC Extended Community used for the recursive lookup, whereas EVPN
RT-2 routes will advertise the MAC addresses of each SBD IRB RT-2 routes will advertise the MAC addresses of each SBD IRB
interface (this time without an IP). interface (this time without an IP).
Each NVE/DGW will advertise an RT-5 for each of its prefixes with the Each NVE/DGW will advertise an RT-5 for each of its prefixes with the
same fields as described in 4.4.2 except for: same fields as described in 4.4.2 except for:
o GW IP address= SHOULD be set to 0. o GW IP address= set to 0.
Each RT-5 will be sent with a route-target identifying the tenant Each RT-5 will be sent with a route-target identifying the tenant
(IP-VRF) and the Router's MAC Extended Community containing the MAC (IP-VRF) and the Router's MAC Extended Community containing the MAC
address associated to SBD IRB interface. This MAC address MAY be re- address associated to SBD IRB interface. This MAC address may be re-
used for all the IP-VRFs in the NVE. used for all the IP-VRFs in the NVE.
The example is similar to the one in section 4.4.2: The example is similar to the one in section 4.4.2:
(1) NVE1 advertises the following BGP routes: (1) NVE1 advertises the following BGP routes:
o Route type 5 (IP Prefix route) containing the same values as o Route type 5 (IP Prefix route) containing the same values as
in the example in section 4.4.2, except for: in the example in section 4.4.2, except for:
. GW IP= SHOULD be set to 0. . GW IP= SHOULD be set to 0.
skipping to change at page 30, line 20 skipping to change at page 31, line 20
o NVE1 will identify the IP-VRF for an IP-lookup based on the o NVE1 will identify the IP-VRF for an IP-lookup based on the
Label and the inner MAC DA. Label and the inner MAC DA.
o An IP lookup is performed in the routing context, where SN1 o An IP lookup is performed in the routing context, where SN1
turns out to be a local subnet associated to BD-2. A turns out to be a local subnet associated to BD-2. A
subsequent lookup in the ARP table and the BD FIB will provide subsequent lookup in the ARP table and the BD FIB will provide
the forwarding information for the packet in BD-2. the forwarding information for the packet in BD-2.
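For the unnumbered SBD IRB case, the Overlay Index is the MAC carried in the Router's MAC Extended Community rather than a GW IP. The sketch below is a hedged Python illustration with hypothetical names (MacRoute, PrefixRoute, resolve_via_router_mac); it models the RT-2 table as a dictionary keyed by MAC address, which is an assumption of the sketch and not a structure defined by the draft.

```python
from typing import Dict, NamedTuple, Optional


class MacRoute(NamedTuple):
    """Illustrative RT-2 for an unnumbered SBD IRB (MAC only, no IP)."""
    mac: str
    label: int


class PrefixRoute(NamedTuple):
    """Illustrative RT-5 with GW IP = 0 and a Router's MAC."""
    prefix: str
    router_mac: str


def resolve_via_router_mac(rt5: PrefixRoute,
                           rt2_by_mac: Dict[str, MacRoute]) -> Optional[MacRoute]:
    """Return the RT-2 whose MAC matches the RT-5 Router's MAC, if any."""
    return rt2_by_mac.get(rt5.router_mac)


# Hypothetical values following the naming of the figure: NVE1 uses M1.
rt2_table = {"M1": MacRoute(mac="M1", label=10)}
rt5 = PrefixRoute(prefix="SN1/24", router_mac="M1")
resolved = resolve_via_router_mac(rt5, rt2_table)
print(resolved)
```

The label value 10 is carried over from the section 4.4.2 example purely for illustration.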
The model described above is called Interface-ful with SBD IRB model The model described above is called Interface-ful with SBD IRB model
(as in section 4.4.2), only this time the SBD IRB does not have an IP (as in section 4.4.2), only this time the SBD IRB does not have an IP
address. This model is OPTIONAL for an EVPN IP-VRF-to-IP-VRF address.
implementation.
5. Conclusions 5. Conclusions
An EVPN route (type 5) for the advertisement of IP Prefixes is An EVPN route (type 5) for the advertisement of IP Prefixes is
described in this document. This new route type has a differentiated described in this document. This new route type has a differentiated
role from the RT-2 route and addresses the Data Center (or NVO-based role from the RT-2 route and addresses the Data Center (or NVO-based
networks in general) inter-subnet connectivity scenarios described in networks in general) inter-subnet connectivity scenarios described in
this document. Using this new RT-5, an IP Prefix may be advertised this document. Using this new RT-5, an IP Prefix may be advertised
along with an Overlay Index that can be a GW IP address, a MAC or an along with an Overlay Index that can be a GW IP address, a MAC or an
ESI, or without an Overlay Index, in which case the BGP next-hop will ESI, or without an Overlay Index, in which case the BGP next-hop will
point at the egress NVE/ASBR/ABR and the MAC in the Router's MAC point at the egress NVE/ASBR/ABR and the MAC in the Router's MAC
Extended Community will provide the inner MAC destination address to Extended Community will provide the inner MAC destination address to
be used. As discussed throughout the document, the EVPN RT-2 does not be used. As discussed throughout the document, the EVPN RT-2 does not
meet the requirements for all the DC use cases, therefore this EVPN meet the requirements for all the DC use cases, therefore this EVPN
route type 5 is required. route type 5 is required.
The EVPN route type 5 decouples the IP Prefix advertisements from the The EVPN route type 5 decouples the IP Prefix advertisements from the
MAC/IP route advertisements in EVPN, hence: MAC/IP route advertisements in EVPN, hence:
a) Allows the clean and clear advertisements of ipv4 or ipv6 prefixes a) Allows the clean and clear advertisements of IPv4 or IPv6 prefixes
in an NLRI with no MAC addresses. in an NLRI with no MAC addresses.
b) Since the route type is different from the MAC/IP Advertisement b) Since the route type is different from the MAC/IP Advertisement
route, the current [RFC7432] procedures do not need to be route, the current [RFC7432] procedures do not need to be
modified. modified.
c) Allows a flexible implementation where the prefix can be linked to c) Allows a flexible implementation where the prefix can be linked to
different types of Overlay/Underlay Indexes: overlay IP address, different types of Overlay/Underlay Indexes: overlay IP address,
overlay MAC addresses, overlay ESI, underlay BGP next-hops, etc. overlay MAC addresses, overlay ESI, underlay BGP next-hops, etc.
d) An EVPN implementation not requiring IP Prefixes can simply d) An EVPN implementation not requiring IP Prefixes can simply
discard them by looking at the route type value. discard them by looking at the route type value.
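Point d) can be illustrated with a short filter. This Python sketch assumes a hypothetical representation of received EVPN NLRIs as (route_type, payload) pairs; only the value 5 for the IP Prefix route comes from this document.

```python
from typing import Any, Iterable, List, Tuple

IP_PREFIX_ROUTE_TYPE = 5   # EVPN route type allocated for the IP Prefix route


def accept_routes(nlris: Iterable[Tuple[int, Any]],
                  supports_ip_prefix: bool) -> List[Tuple[int, Any]]:
    """Keep every route unless it is an RT-5 and RT-5 is not supported."""
    return [
        (route_type, payload)
        for route_type, payload in nlris
        if supports_ip_prefix or route_type != IP_PREFIX_ROUTE_TYPE
    ]


received = [(2, "MAC/IP route"), (5, "IP Prefix route")]
print(accept_routes(received, supports_ip_prefix=False))
```

The decision depends only on the route type value, so the RT-5 payload never needs to be parsed by an implementation that does not use it.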
6. Conventions used in this document 6. Security Considerations
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", This document provides a set of procedures to achieve Inter-Subnet
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this Forwarding across NVEs or PEs attached to a group of BDs that belong
document are to be interpreted as described in RFC-2119 [RFC2119]. to the same tenant (or VPN). The security considerations discussed in
[RFC7432] apply to the Intra-Subnet Forwarding or communication
within each of those BDs. In addition, the security considerations in
[RFC4364] should also be understood, since this document and
[RFC4364] may be used in similar applications.
7. Security Considerations Contrary to [RFC4364], this document does not describe PE/CE route
distribution techniques, but rather considers the CEs as TSes or VAs
that do not run dynamic routing protocols. This can be considered a
security advantage, since dynamic routing protocols can be blocked on
the NVE/PE ACs.
The security considerations discussed in [RFC7432] apply to this In this document, the RT-5 may use a regular BGP Next Hop for its
document. resolution or an Overlay Index that requires a recursive resolution
to a different EVPN route (an RT-2 or an RT-1). In the latter case,
it is worth noting that any action that ends up filtering or modifying
the RT-2/RT-1 routes used to convey the Overlay Indexes will modify
the resolution of the RT-5 and therefore the forwarding of packets to
the remote subnet.
8. IANA Considerations 7. IANA Considerations
As requested by this document, value 5 in the "EVPN Route Types" As requested by this document, value 5 in the "EVPN Route Types"
registry defined by [RFC7432] has been allocated: registry defined by [RFC7432] has been allocated:
Value Description Reference Value Description Reference
5 IP Prefix route [this document] 5 IP Prefix route [this document]
9. References 8. References
9.1 Normative References
[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private 8.1 Normative References
Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 2006,
<http://www.rfc-editor.org/info/rfc4364>.
[RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet
VPN", RFC 7432, DOI 10.17487/RFC7432, February 2015, <http://www.rfc- VPN", RFC 7432, DOI 10.17487/RFC7432, February 2015, <http://www.rfc-
editor.org/info/rfc7432>. editor.org/info/rfc7432>.
[RFC7606] Chen, E., Scudder, J., Mohapatra, P., and K. Patel,
"Revised Error Handling for BGP UPDATE Messages", RFC 7606, August
2015, <http://www.rfc-editor.org/info/rfc7606>.
[RFC5512] Mohapatra, P. and E. Rosen, "The BGP Encapsulation [RFC5512] Mohapatra, P. and E. Rosen, "The BGP Encapsulation
Subsequent Address Family Identifier (SAFI) and the BGP Tunnel Subsequent Address Family Identifier (SAFI) and the BGP Tunnel
Encapsulation Attribute", RFC 5512, DOI 10.17487/RFC5512, April 2009, Encapsulation Attribute", RFC 5512, DOI 10.17487/RFC5512, April 2009,
<http://www.rfc-editor.org/info/rfc5512>. <http://www.rfc-editor.org/info/rfc5512>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March
1997, <http://www.rfc-editor.org/info/rfc2119>. 1997, <http://www.rfc-editor.org/info/rfc2119>.
9.2 Informative References [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC2119
Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017,
<http://www.rfc-editor.org/info/rfc8174>.
[EVPN-OVERLAY] Sajassi-Drake et al., "A Network Virtualization
Overlay Solution using EVPN", draft-ietf-bess-evpn-overlay-12.txt,
work in progress, February, 2018.
[EVPN-INTERSUBNET] Sajassi et al., "IP Inter-Subnet Forwarding in [EVPN-INTERSUBNET] Sajassi et al., "IP Inter-Subnet Forwarding in
EVPN", draft-ietf-bess-evpn-inter-subnet-forwarding-03.txt, work in EVPN", draft-ietf-bess-evpn-inter-subnet-forwarding-03.txt, work in
progress, February, 2017 progress, February, 2017
[EVPN-OVERLAY] Sajassi-Drake et al., "A Network Virtualization 8.2 Informative References
Overlay Solution using EVPN", draft-ietf-bess-evpn-overlay-08.txt,
work in progress, March, 2017
10. Acknowledgments [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 2006,
<http://www.rfc-editor.org/info/rfc4364>.
[RFC7606] Chen, E., Scudder, J., Mohapatra, P., and K. Patel,
"Revised Error Handling for BGP UPDATE Messages", RFC 7606, August
2015, <http://www.rfc-editor.org/info/rfc7606>.
[802.1D-REV] "IEEE Standard for Local and metropolitan area networks
- Media Access Control (MAC) Bridges", IEEE Std. 802.1D, June 2004.
[802.1Q] "IEEE Standard for Local and metropolitan area networks -
Media Access Control (MAC) Bridges and Virtual Bridged Local Area
Networks", IEEE Std 802.1Q(tm), 2014 Edition, November 2014.
9. Acknowledgments
The authors would like to thank Mukul Katiyar and Jeffrey Zhang for The authors would like to thank Mukul Katiyar and Jeffrey Zhang for
their valuable feedback and contributions. The following people also their valuable feedback and contributions. The following people also
helped improve this document with their feedback: Tony Przygienda helped improve this document with their feedback: Tony Przygienda
and Thomas Morin. Special THANK YOU to Eric Rosen for his detailed and Thomas Morin. Special THANK YOU to Eric Rosen for his detailed
review, it really helped improve the readability and clarify the review, it really helped improve the readability and clarify the
concepts. concepts. Thank you to Alvaro Retana for his thorough review.
11. Contributors 10. Contributors
In addition to the authors listed on the front page, the following In addition to the authors listed on the front page, the following
co-authors have also contributed to this document: co-authors have also contributed to this document:
Senthil Sathappan Senthil Sathappan
Florin Balus Florin Balus
Aldrin Isaac Aldrin Isaac
Senad Palislamovic Senad Palislamovic
Samir Thoria
12. Authors' Addresses 11. Authors' Addresses
Jorge Rabadan (Editor) Jorge Rabadan (Editor)
Nokia Nokia
777 E. Middlefield Road 777 E. Middlefield Road
Mountain View, CA 94043 USA Mountain View, CA 94043 USA
Email: jorge.rabadan@nokia.com Email: jorge.rabadan@nokia.com
Wim Henderickx Wim Henderickx
Nokia Nokia
Email: wim.henderickx@nokia.com Email: wim.henderickx@nokia.com
 End of changes. 106 change blocks. 
181 lines changed or deleted 253 lines changed or added
