draft-ietf-bess-evpn-optimized-ir-04.txt   draft-ietf-bess-evpn-optimized-ir-05.txt 
skipping to change at page 1, line 16 skipping to change at page 1, line 16
W. Lin W. Lin
Juniper Juniper
M. Katiyar M. Katiyar
Versa Networks Versa Networks
A. Sajassi A. Sajassi
Cisco Cisco
Expires: April 3, 2019 September 30, 2018 Expires: April 21, 2019 October 18, 2018
Optimized Ingress Replication solution for EVPN Optimized Ingress Replication solution for EVPN
draft-ietf-bess-evpn-optimized-ir-04 draft-ietf-bess-evpn-optimized-ir-05
Abstract Abstract
Network Virtualization Overlay (NVO) networks using EVPN as control Network Virtualization Overlay (NVO) networks using EVPN as control
plane may use ingress replication (IR) or PIM-based trees to convey plane may use Ingress Replication (IR) or PIM (Protocol Independent
the overlay BUM traffic. PIM provides an efficient solution to avoid Multicast) based trees to convey the overlay BUM traffic. PIM
sending multiple copies of the same packet over the same physical provides an efficient solution to avoid sending multiple copies of
link, however it may not always be deployed in the NVO core network. the same packet over the same physical link, however it may not
IR avoids the dependency on PIM in the NVO network core. While IR always be deployed in the NVO core network. IR avoids the dependency
provides a simple multicast transport, some NVO networks with on PIM in the NVO network core. While IR provides a simple multicast
demanding multicast applications require a more efficient solution transport, some NVO networks with demanding multicast applications
without PIM in the core. This document describes a solution to require a more efficient solution without PIM in the core. This
optimize the efficiency of IR in NVO networks. document describes a solution to optimize the efficiency of IR in NVO
networks.
Status of this Memo Status of this Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Drafts.
skipping to change at page 2, line 13 skipping to change at page 2, line 14
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html http://www.ietf.org/shadow.html
This Internet-Draft will expire on April 3, 2019. This Internet-Draft will expire on April 21, 2019.
Copyright Notice Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Solution requirements . . . . . . . . . . . . . . . . . . . . . 4 2. Terminology and Conventions . . . . . . . . . . . . . . . . . . 4
3. EVPN BGP Attributes for optimized-IR . . . . . . . . . . . . . 5 3. Solution requirements . . . . . . . . . . . . . . . . . . . . . 5
4. Non-selective Assisted-Replication (AR) Solution Description . 8 4. EVPN BGP Attributes for optimized-IR . . . . . . . . . . . . . 6
4.1. Non-selective AR-REPLICATOR procedures . . . . . . . . . . 8 5. Non-selective Assisted-Replication (AR) Solution Description . 9
4.2. Non-selective AR-LEAF procedures . . . . . . . . . . . . . 9 5.1. Non-selective AR-REPLICATOR procedures . . . . . . . . . . 10
4.3. RNVE procedures . . . . . . . . . . . . . . . . . . . . . . 11 5.2. Non-selective AR-LEAF procedures . . . . . . . . . . . . . 11
4.4. Forwarding behavior in non-selective AR EVIs . . . . . . . 11 5.3. RNVE procedures . . . . . . . . . . . . . . . . . . . . . . 12
4.4.1. Broadcast and Multicast forwarding behavior . . . . . . 11 5.4. Forwarding behavior in non-selective AR EVIs . . . . . . . 13
4.4.1.1. Non-selective AR-REPLICATOR BM forwarding . . . . . 11 5.4.1. Broadcast and Multicast forwarding behavior . . . . . . 13
4.4.1.2. Non-selective AR-LEAF BM forwarding . . . . . . . . 12 5.4.1.1. Non-selective AR-REPLICATOR BM forwarding . . . . . 13
4.4.1.3. RNVE BM forwarding . . . . . . . . . . . . . . . . 12 5.4.1.2. Non-selective AR-LEAF BM forwarding . . . . . . . . 14
4.4.2. Unknown unicast forwarding behavior . . . . . . . . . . 13 5.4.1.3. RNVE BM forwarding . . . . . . . . . . . . . . . . 14
4.4.2.1. Non-selective AR-REPLICATOR/LEAF Unknown unicast 5.4.2. Unknown unicast forwarding behavior . . . . . . . . . . 14
forwarding . . . . . . . . . . . . . . . . . . . . 13 5.4.2.1. Non-selective AR-REPLICATOR/LEAF Unknown unicast
4.4.2.2. RNVE Unknown unicast forwarding . . . . . . . . . . 13 forwarding . . . . . . . . . . . . . . . . . . . . 15
5. Selective Assisted-Replication (AR) Solution Description . . . 13 5.4.2.2. RNVE Unknown unicast forwarding . . . . . . . . . . 15
5.1. Selective AR-REPLICATOR procedures . . . . . . . . . . . . 14
5.2. Selective AR-LEAF procedures . . . . . . . . . . . . . . . 15 6. Selective Assisted-Replication (AR) Solution Description . . . 15
5.3. Forwarding behavior in selective AR EVIs . . . . . . . . . 16 6.1. Selective AR-REPLICATOR procedures . . . . . . . . . . . . 15
5.3.1. Selective AR-REPLICATOR BM forwarding . . . . . . . . . 16 6.2. Selective AR-LEAF procedures . . . . . . . . . . . . . . . 17
5.3.2. Selective AR-LEAF BM forwarding . . . . . . . . . . . . 17 6.3. Forwarding behavior in selective AR EVIs . . . . . . . . . 18
6. Pruned-Flood-Lists (PFL) . . . . . . . . . . . . . . . . . . . 18 6.3.1. Selective AR-REPLICATOR BM forwarding . . . . . . . . . 18
6.1. A PFL example . . . . . . . . . . . . . . . . . . . . . . . 18 6.3.2. Selective AR-LEAF BM forwarding . . . . . . . . . . . . 19
7. AR Procedures for single-IP AR-REPLICATORS . . . . . . . . . . 19 7. Pruned-Flood-Lists (PFL) . . . . . . . . . . . . . . . . . . . 20
8. AR Procedures and EVPN All-Active Multi-homing Split-Horizon . 20 7.1. A PFL example . . . . . . . . . . . . . . . . . . . . . . . 20
8.1. Ethernet Segments on AR-LEAF nodes . . . . . . . . . . . . 20 8. AR Procedures for single-IP AR-REPLICATORS . . . . . . . . . . 21
8.2. Ethernet Segments on AR-REPLICATOR nodes . . . . . . . . . 21 9. AR Procedures and EVPN All-Active Multi-homing Split-Horizon . 22
9. Benefits of the optimized-IR solution . . . . . . . . . . . . . 21 9.1. Ethernet Segments on AR-LEAF nodes . . . . . . . . . . . . 22
10. Conventions used in this document . . . . . . . . . . . . . . 21 9.2. Ethernet Segments on AR-REPLICATOR nodes . . . . . . . . . 23
11. Security Considerations . . . . . . . . . . . . . . . . . . . 22 10. Benefits of the optimized-IR solution . . . . . . . . . . . . 23
12. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 22 11. Security Considerations . . . . . . . . . . . . . . . . . . . 24
13. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 22 12. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 24
14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 23 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 24
14.1 Normative References . . . . . . . . . . . . . . . . . . . 23 13.1 Normative References . . . . . . . . . . . . . . . . . . . 24
14.2 Informative References . . . . . . . . . . . . . . . . . . 24 13.2 Informative References . . . . . . . . . . . . . . . . . . 25
15.0 Contributors . . . . . . . . . . . . . . . . . . . . . . . 24 14. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 25
16. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 24 15. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 25
17. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 24 16. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 25
1. Introduction 1. Introduction
EVPN may be used as the control plane for a Network Virtualization Ethernet Virtual Private Networks (EVPN) may be used as the control
Overlay (NVO) network. Network Virtualization Edge (NVE) devices and plane for a Network Virtualization Overlay (NVO) network. Network
PEs that are part of the same EVI use Ingress Replication (IR) or Virtualization Edge (NVE) devices and Provider Edges (PEs) that are
part of the same EVPN Instance (EVI) use Ingress Replication (IR) or
PIM-based trees to transport the tenant's BUM traffic. In NVO PIM-based trees to transport the tenant's BUM traffic. In NVO
networks where PIM-based trees cannot be used, IR is the only networks where PIM-based trees cannot be used, IR is the only option.
alternative. Examples of these situations are NVO networks where the Examples of these situations are NVO networks where the core nodes
core nodes don't support PIM or the network operator does not want to don't support PIM or the network operator does not want to run PIM in
run PIM in the core. the core.
In some use-cases, the amount of replication for BUM (Broadcast, In some use-cases, the amount of replication for BUM (Broadcast,
Unknown unicast and Multicast traffic) is kept under control on the Unknown unicast and Multicast traffic) is kept under control on the
NVEs due to the following fairly common assumptions: NVEs due to the following fairly common assumptions:
a) Broadcast is greatly reduced due to the proxy-ARP and proxy-ND a) Broadcast is greatly reduced due to the proxy ARP (Address
Resolution Protocol) and proxy ND (Neighbor Discovery)
capabilities supported by EVPN on the NVEs. Some NVEs can even capabilities supported by EVPN on the NVEs. Some NVEs can even
provide DHCP-server functions for the attached Tenant Systems (TS) provide Dynamic Host Configuration Protocol (DHCP) server functions
reducing the broadcast even further. for the attached Tenant Systems (TS) reducing the broadcast even
further.
b) Unknown unicast traffic is greatly reduced in virtualized NVO b) Unknown unicast traffic is greatly reduced in virtualized NVO
networks where all the MAC and IP addresses are learnt in the networks where all the MAC and IP addresses are learnt in the
control plane. control plane.
c) Multicast applications are not used. c) Multicast applications are not used.
If the above assumptions are true for a given NVO network, then IR If the above assumptions are true for a given NVO network, then IR
provides a simple solution for multi-destination traffic. However, provides a simple solution for multi-destination traffic. However,
the statement c) above is not always true and multicast applications the statement c) above is not always true and multicast applications
are required in many use-cases. are required in many use-cases.
When the multicast sources are attached to NVEs residing in When the multicast sources are attached to NVEs residing in
hypervisors or low-performance-replication TORs, the ingress hypervisors or low-performance-replication TORs (Top Of the Rack
replication of a large amount of multicast traffic to a significant switches), the ingress replication of a large amount of multicast
number of remote NVEs/PEs can seriously degrade the performance of traffic to a significant number of remote NVEs/PEs can seriously
the NVE and impact the application. degrade the performance of the NVE and impact the application.
This document describes a solution that makes use of two IR This document describes a solution that makes use of two IR
optimizations: optimizations:
i) Assisted-Replication (AR) i) Assisted-Replication (AR)
ii) Pruned-Flood-Lists (PFL) ii) Pruned-Flood-Lists (PFL)
Both optimizations may be used together or independently so that the Both optimizations may be used together or independently so that the
performance and efficiency of the network to transport multicast can performance and efficiency of the network to transport multicast can
be improved. Both solutions require some extensions to [RFC7432] that be improved. Both solutions require some extensions to [RFC7432] that
are described in section 3. are described in section 3.
Section 2 lists the requirements of the combined optimized-IR Section 2 lists the requirements of the combined optimized-IR
solution, whereas sections 4 and 5 describe the Assisted-Replication solution, whereas sections 4 and 5 describe the Assisted-Replication
(AR) solution, and section 6 the Pruned-Flood-Lists (PFL) solution. (AR) solution, and section 6 the Pruned-Flood-Lists (PFL) solution.
2. Solution requirements 2. Terminology and Conventions
The IR optimization solution (optimized-IR hereafter) MUST meet the The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
following requirements: "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
a) The solution MUST provide an IR optimization for BM (Broadcast and The following terminology is used throughout the document:
AC: Attachment Circuit
Regular-IR: Refers to Regular Ingress Replication, where the source
NVE/PE sends a copy to each remote NVE/PE part of the EVI.
AR-IP: IP address owned by the AR-REPLICATOR and used to
differentiate the ingress traffic that must follow the AR
procedures.
IR-IP: IP address used for Ingress Replication as in [RFC7432].
AR-VNI: VNI advertised by the AR-REPLICATOR along with the
Replicator-AR route. It is used to identify the ingress
packets that must follow AR procedures ONLY in the Single-IP
AR-REPLICATOR case.
IR-VNI: VNI advertised along with the RT-3 for IR.
AR forwarding mode: for an AR-LEAF, it means sending an AC BM packet
to a single AR-REPLICATOR with tunnel destination IP AR-IP.
For an AR-REPLICATOR, it means sending a BM packet to a
selective number or all the overlay tunnels when the packet
was previously received from an overlay tunnel.
IR forwarding mode: it refers to the Ingress Replication behavior
explained in [RFC7432]. It means sending an AC BM packet copy
to each remote PE/NVE in the EVI and sending an overlay BM
packet only to the ACs and not other overlay tunnels.
PTA: PMSI Tunnel Attribute
RT-3: EVPN Route Type 3, Inclusive Multicast Ethernet Tag route
RT-11: EVPN Route Type 11, Leaf Auto-Discovery (AD) route
VXLAN: Virtual Extensible LAN
GRE: Generic Routing Encapsulation
NVGRE: Network Virtualization using Generic Routing Encapsulation
GENEVE: Generic Network Virtualization Encapsulation
NVO: Network Virtualization Overlay
NVE: Network Virtualization Edge
VNI: VXLAN Network Identifier
EVI: EVPN Instance. An EVPN instance spanning the Provider Edge (PE)
devices participating in that EVPN
3. Solution requirements
The IR optimization solution specified in this document (optimized-IR
hereafter) meets the following requirements:
a) The solution provides an IR optimization for BM (Broadcast and
Multicast) traffic, while preserving the packet order for unicast Multicast) traffic, while preserving the packet order for unicast
applications, i.e. known and unknown unicast traffic SHALL follow applications, i.e., known and unknown unicast traffic should
the same path. follow the same path.
b) The solution MUST be compatible with [RFC7432] and [RFC8365] and b) The solution is compatible with [RFC7432] and [RFC8365] and has no
have no impact on the EVPN procedures for BM traffic. In impact on the EVPN procedures for BM traffic. In particular, the
particular, the solution SHOULD support the following EVPN solution supports the following EVPN functions:
functions:
o All-active multi-homing, including the split-horizon and o All-active multi-homing, including the split-horizon and
Designated Forwarder (DF) functions. Designated Forwarder (DF) functions.
o Single-active multi-homing, including the DF function. o Single-active multi-homing, including the DF function.
o Handling of multi-destination traffic and processing of o Handling of multi-destination traffic and processing of
broadcast and multicast as per [RFC7432]. broadcast and multicast as per [RFC7432].
c) The solution MUST be backwards compatible with existing NVEs using c) The solution is backwards compatible with existing NVEs using a
a non-optimized version of IR. A given EVI can have NVEs/PEs non-optimized version of IR. A given EVI can have NVEs/PEs
supporting regular-IR and optimized-IR. supporting regular-IR and optimized-IR.
d) The solution MUST be independent of the NVO specific data plane d) The solution is independent of the NVO specific data plane
encapsulation and the virtual identifiers being used, e.g.: VXLAN encapsulation and the virtual identifiers being used, e.g.: VXLAN
VNIs, NVGRE VSIDs or MPLS labels. VNIs, NVGRE VSIDs or MPLS labels, as long as the tunnel is IP-
based.
3. EVPN BGP Attributes for optimized-IR 4. EVPN BGP Attributes for optimized-IR
This solution proposes some changes to the [RFC7432] Inclusive This solution extends the [RFC7432] Inclusive Multicast Ethernet Tag
Multicast Ethernet Tag routes and attributes so that an NVE/PE can routes and attributes so that an NVE/PE can signal its optimized-IR
signal its optimized-IR capabilities. capabilities.
The Inclusive Multicast Ethernet Tag route (RT-3) and its PMSI Tunnel The Inclusive Multicast Ethernet Tag route (RT-3) and its PMSI Tunnel
Attribute's (PTA) general format used in [RFC7432] are shown below: Attribute's (PTA) general format used in [RFC7432] are shown below:
+---------------------------------+ +---------------------------------+
| RD (8 octets) | | RD (8 octets) |
+---------------------------------+ +---------------------------------+
| Ethernet Tag ID (4 octets) | | Ethernet Tag ID (4 octets) |
+---------------------------------+ +---------------------------------+
| IP Address Length (1 octet) | | IP Address Length (1 octet) |
skipping to change at page 5, line 50 skipping to change at page 7, line 30
+---------------------------------+ +---------------------------------+
| MPLS Label (3 octets) | | MPLS Label (3 octets) |
+---------------------------------+ +---------------------------------+
| Tunnel Identifier (variable) | | Tunnel Identifier (variable) |
+---------------------------------+ +---------------------------------+
The Flags field is defined as follows: The Flags field is defined as follows:
0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+-+-+-+-+-+--+-+-+ +-+-+-+-+-+--+-+-+
|rsved| T |BM|U|L| |rsvd | T |BM|U|L|
+-+-+-+-+-+--+-+-+ +-+-+-+-+-+--+-+-+
Where a new type field (for AR) and two new flags (for PFL signaling) Where a new type field (for AR) and two new flags (for PFL signaling)
are defined: are defined:
- T is the AR Type field (2 bits) that defines the AR role of the - T is the AR Type field (2 bits) that defines the AR role of the
advertising router: advertising router:
+ 00 (decimal 0) = RNVE (non-AR support) + 00 (decimal 0) = RNVE (non-AR support)
skipping to change at page 8, line 5 skipping to change at page 9, line 32
flags. Note that these BM/U flags may be used to optimize the flags. Note that these BM/U flags may be used to optimize the
delivery of multi-destination traffic and its use SHOULD be an delivery of multi-destination traffic and its use SHOULD be an
administrative choice, and independent of the AR role. administrative choice, and independent of the AR role.
Non-optimized-IR nodes will be unaware of the new PMSI attribute flag Non-optimized-IR nodes will be unaware of the new PMSI attribute flag
definition as well as the new Tunnel Type (AR), i.e. they will ignore definition as well as the new Tunnel Type (AR), i.e. they will ignore
the information contained in the flags field for any RT-3 and will the information contained in the flags field for any RT-3 and will
ignore the RT-3 routes with an unknown Tunnel Type (type AR in this ignore the RT-3 routes with an unknown Tunnel Type (type AR in this
case). case).
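As an illustration of the Flags layout shown above, the following minimal Python sketch decodes the octet, assuming the usual IETF convention that bit 0 is the most significant bit. The names PtaFlags and parse_pta_flags are hypothetical and not part of the draft. Only the T value 00 (RNVE) is named, because the remaining T values are not visible in this excerpt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PtaFlags:
    """Decoded PTA Flags octet (hypothetical helper type, not from the draft)."""
    ar_type: int  # 2-bit T field, bits 3-4 of the octet
    bm: bool      # BM flag, bit 5
    u: bool       # U flag, bit 6
    l: bool       # L flag, bit 7 (least significant bit)

    @property
    def is_rnve(self) -> bool:
        # T = 00 is the only value spelled out in the text above.
        return self.ar_type == 0


def parse_pta_flags(octet: int) -> PtaFlags:
    """Split the Flags octet into its fields; bit 0 is assumed to be the MSB."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("Flags must fit in one octet")
    return PtaFlags(
        ar_type=(octet >> 3) & 0b11,  # bits 3-4
        bm=bool(octet & 0b100),       # bit 5
        u=bool(octet & 0b010),        # bit 6
        l=bool(octet & 0b001),        # bit 7
    )
```

The three reserved bits (bits 0-2) are ignored on receipt in this sketch; a real implementation would follow whatever the specification requires for them.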
4. Non-selective Assisted-Replication (AR) Solution Description 5. Non-selective Assisted-Replication (AR) Solution Description
The following figure illustrates an example NVO network where the The following figure illustrates an example NVO network where the
non-selective AR function is enabled. Three different roles are non-selective AR function is enabled. Three different roles are
defined for a given EVI: AR-REPLICATOR, AR-LEAF and RNVE (Regular defined for a given EVI: AR-REPLICATOR, AR-LEAF and RNVE (Regular
NVE). The solution is called "non-selective" because the chosen AR- NVE). The solution is called "non-selective" because the chosen AR-
REPLICATOR for a given flow MUST replicate the multicast traffic to REPLICATOR for a given flow MUST replicate the multicast traffic to
'all' the NVE/PEs in the EVI except for the source NVE/PE. 'all' the NVE/PEs in the EVI except for the source NVE/PE.
( ) ( )
(_ WAN _) (_ WAN _)
skipping to change at page 8, line 41 skipping to change at page 10, line 32
Hypervisor| TOR | NVE2 |Hypervisor Hypervisor| TOR | NVE2 |Hypervisor
+---------+-+ +-----+-----+ +-+---------+ +---------+-+ +-----+-----+ +-+---------+
| (EVI-1) | | (EVI-1) | | (EVI-1) | | (EVI-1) | | (EVI-1) | | (EVI-1) |
| LEAF | | RNVE | | LEAF | | LEAF | | RNVE | | LEAF |
+--+-----+--+ +--+-----+--+ +--+-----+--+ +--+-----+--+ +--+-----+--+ +--+-----+--+
| | | | | | | | | | | |
VM11 VM12 TS3 TS4 VM31 VM32 VM11 VM12 TS3 TS4 VM31 VM32
Figure 1 Optimized-IR scenario Figure 1 Optimized-IR scenario
4.1. Non-selective AR-REPLICATOR procedures 5.1. Non-selective AR-REPLICATOR procedures
An AR-REPLICATOR is defined as an NVE/PE capable of replicating An AR-REPLICATOR is defined as an NVE/PE capable of replicating
ingress BM (Broadcast and Multicast) traffic received on an overlay ingress BM (Broadcast and Multicast) traffic received on an overlay
tunnel to other overlay tunnels and local Attachment Circuits (ACs). tunnel to other overlay tunnels and local Attachment Circuits (ACs).
The AR-REPLICATOR signals its role in the control plane and The AR-REPLICATOR signals its role in the control plane and
understands where the other roles (AR-LEAF nodes, RNVEs and other AR- understands where the other roles (AR-LEAF nodes, RNVEs and other AR-
REPLICATORs) are located. A given AR-enabled EVI service may have REPLICATORs) are located. A given AR-enabled EVI service may have
zero, one or more AR-REPLICATORs. In our example in figure 1, PE1 and zero, one or more AR-REPLICATORs. In our example in figure 1, PE1 and
PE2 are defined as AR-REPLICATORs. The following considerations apply PE2 are defined as AR-REPLICATORs. The following considerations apply
to the AR-REPLICATOR role: to the AR-REPLICATOR role:
skipping to change at page 9, line 36 skipping to change at page 11, line 27
o If the destination IP is the AR-REPLICATOR AR-IP Address the o If the destination IP is the AR-REPLICATOR AR-IP Address the
node MUST replicate the packet to local ACs and overlay node MUST replicate the packet to local ACs and overlay
tunnels (excluding the overlay tunnel to the source of the tunnels (excluding the overlay tunnel to the source of the
packet). When replicating to remote AR-REPLICATORs the tunnel packet). When replicating to remote AR-REPLICATORs the tunnel
destination IP will be an IR-IP. That will be an indication destination IP will be an IR-IP. That will be an indication
for the remote AR-REPLICATOR that it MUST NOT replicate to for the remote AR-REPLICATOR that it MUST NOT replicate to
overlay tunnels. The tunnel source IP used by the AR- overlay tunnels. The tunnel source IP used by the AR-
REPLICATOR MUST be its IR-IP. REPLICATOR MUST be its IR-IP.
4.2. Non-selective AR-LEAF procedures 5.2. Non-selective AR-LEAF procedures
AR-LEAF is defined as an NVE/PE that - given its poor replication AR-LEAF is defined as an NVE/PE that - given its poor replication
performance - sends all the BM traffic to an AR-REPLICATOR that can performance - sends all the BM traffic to an AR-REPLICATOR that can
replicate the traffic further on its behalf. It MAY signal its AR- replicate the traffic further on its behalf. It MAY signal its AR-
LEAF capability in the control plane and understands where the other LEAF capability in the control plane and understands where the other
roles are located (AR-REPLICATOR and RNVEs). A given service can have roles are located (AR-REPLICATOR and RNVEs). A given service can have
zero, one or more AR-LEAF nodes. Figure 1 shows NVE1 and NVE3 (both zero, one or more AR-LEAF nodes. Figure 1 shows NVE1 and NVE3 (both
residing in hypervisors) acting as AR-LEAF. The following residing in hypervisors) acting as AR-LEAF. The following
considerations apply to the AR-LEAF role: considerations apply to the AR-LEAF role:
skipping to change at page 11, line 8 skipping to change at page 12, line 49
NOT replicate these control plane packets to other overlay NOT replicate these control plane packets to other overlay
tunnels since they will use the regular IR-IP Address. tunnels since they will use the regular IR-IP Address.
e) The use of an AR-REPLICATOR-activation-timer (in seconds) on the e) The use of an AR-REPLICATOR-activation-timer (in seconds) on the
AR-LEAF nodes is RECOMMENDED. Upon receiving a new Replicator-AR AR-LEAF nodes is RECOMMENDED. Upon receiving a new Replicator-AR
route where the AR-REPLICATOR is selected, the AR-LEAF will run a route where the AR-REPLICATOR is selected, the AR-LEAF will run a
timer before programming the new AR-REPLICATOR. This will give the timer before programming the new AR-REPLICATOR. This will give the
AR-REPLICATOR some time to program the AR-LEAF nodes before the AR-REPLICATOR some time to program the AR-LEAF nodes before the
AR-LEAF sends BM traffic. AR-LEAF sends BM traffic.
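The activation timer in e) can be sketched as a small state holder on the AR-LEAF. This is only an illustration under stated assumptions: the class name ReplicatorActivation, its methods, and the idea of passing the current time explicitly are hypothetical choices and are not defined by the draft.

```python
from typing import Optional


class ReplicatorActivation:
    """Delays programming of a newly selected AR-REPLICATOR (illustrative only)."""

    def __init__(self, delay_s: float) -> None:
        self.delay_s = delay_s              # AR-REPLICATOR-activation-timer, in seconds
        self.active: Optional[str] = None   # AR-IP currently programmed for BM traffic
        self._pending: Optional[str] = None
        self._ready_at = 0.0

    def select(self, ar_ip: str, now: float) -> None:
        """Called when a new Replicator-AR route makes ar_ip the selected replicator."""
        self._pending = ar_ip
        self._ready_at = now + self.delay_s

    def tick(self, now: float) -> Optional[str]:
        """Program the pending replicator once the timer has expired."""
        if self._pending is not None and now >= self._ready_at:
            self.active = self._pending
            self._pending = None
        return self.active
```

While tick() still returns the previous value, the AR-LEAF keeps forwarding BM traffic as it did before the new route arrived, which gives the AR-REPLICATOR time to program its side.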
4.3. RNVE procedures 5.3. RNVE procedures
RNVE (Regular Network Virtualization Edge node) is defined as an RNVE (Regular Network Virtualization Edge node) is defined as an
NVE/PE without AR-REPLICATOR or AR-LEAF capabilities that does IR as NVE/PE without AR-REPLICATOR or AR-LEAF capabilities that does IR as
described in [RFC7432]. The RNVE does not signal any AR role and is described in [RFC7432]. The RNVE does not signal any AR role and is
unaware of the AR-REPLICATOR/LEAF roles in the EVI. The RNVE will unaware of the AR-REPLICATOR/LEAF roles in the EVI. The RNVE will
ignore the Flags in the Regular-IR routes and will ignore the ignore the Flags in the Regular-IR routes and will ignore the
Replicator-AR routes (due to an unknown tunnel type in the PTA) and Replicator-AR routes (due to an unknown tunnel type in the PTA) and
the Leaf-AD routes (due to the IP-address-specific route-target). the Leaf-AD routes (due to the IP-address-specific route-target).
This role provides EVPN with the backwards compatibility required in This role provides EVPN with the backwards compatibility required in
optimized-IR EVIs. Figure 1 shows NVE2 as RNVE. optimized-IR EVIs. Figure 1 shows NVE2 as RNVE.
4.4. Forwarding behavior in non-selective AR EVIs 5.4. Forwarding behavior in non-selective AR EVIs
In AR EVIs, BM (Broadcast and Multicast) traffic between two NVEs may In AR EVIs, BM (Broadcast and Multicast) traffic between two NVEs may
follow a different path than unicast traffic. This solution proposes follow a different path than unicast traffic. This solution
the replication of BM through the AR-REPLICATOR node, whereas recommends the replication of BM through the AR-REPLICATOR node,
unknown/known unicast will be delivered directly from the source node whereas unknown/known unicast will be delivered directly from the
to the destination node without being replicated by any intermediate source node to the destination node without being replicated by any
node. Unknown unicast SHALL follow the same path as known unicast intermediate node. Unknown unicast SHALL follow the same path as
traffic in order to avoid packet reordering for unicast applications known unicast traffic in order to avoid packet reordering for unicast
and simplify the control and data plane procedures. Section 4.4.1. applications and simplify the control and data plane procedures.
describes the expected forwarding behavior for BM traffic in nodes Section 4.4.1. describes the expected forwarding behavior for BM
acting as AR-REPLICATOR, AR-LEAF and RNVE. Section 4.4.2. describes traffic in nodes acting as AR-REPLICATOR, AR-LEAF and RNVE. Section
the forwarding behavior for unknown unicast traffic. 4.4.2. describes the forwarding behavior for unknown unicast traffic.
Note that known unicast forwarding is not impacted by this solution. Note that known unicast forwarding is not impacted by this solution.
4.4.1. Broadcast and Multicast forwarding behavior 5.4.1. Broadcast and Multicast forwarding behavior
The expected behavior per role is described in this section. The expected behavior per role is described in this section.
4.4.1.1. Non-selective AR-REPLICATOR BM forwarding 5.4.1.1. Non-selective AR-REPLICATOR BM forwarding
The AR-REPLICATORs will build a flooding list composed of ACs and The AR-REPLICATORs will build a flooding list composed of ACs and
overlay tunnels to remote nodes in the EVI. Some of those overlay overlay tunnels to remote nodes in the EVI. Some of those overlay
tunnels MAY be flagged as non-BM receivers based on the BM flag tunnels MAY be flagged as non-BM receivers based on the BM flag
received from the remote nodes in the EVI. received from the remote nodes in the EVI.
o When an AR-REPLICATOR receives a BM packet on an AC, it will o When an AR-REPLICATOR receives a BM packet on an AC, it will
forward the BM packet to its flooding list (including local ACs and forward the BM packet to its flooding list (including local ACs and
remote NVE/PEs), skipping the non-BM overlay tunnels. remote NVE/PEs), skipping the non-BM overlay tunnels.
skipping to change at page 12, line 19 skipping to change at page 14, line 11
forward the BM packet to its flooding list (ACs and overlay forward the BM packet to its flooding list (ACs and overlay
tunnels) excluding the non-BM overlay tunnels. The AR-REPLICATOR tunnels) excluding the non-BM overlay tunnels. The AR-REPLICATOR
will do source squelching to ensure the traffic is not sent back will do source squelching to ensure the traffic is not sent back
to the originating AR-LEAF. to the originating AR-LEAF.
- If the destination IP matches its IR-IP, the AR-REPLICATOR will - If the destination IP matches its IR-IP, the AR-REPLICATOR will
skip all the overlay tunnels from the flooding list, i.e. it skip all the overlay tunnels from the flooding list, i.e. it
will only replicate to local ACs. This is the regular IR will only replicate to local ACs. This is the regular IR
behavior described in [RFC7432]. behavior described in [RFC7432].
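The non-selective AR-REPLICATOR rules above can be sketched as a small decision function. This is only an illustrative sketch, not the draft's normative procedure: the names `OverlayTunnel` and `replicator_bm_targets` are hypothetical, the AR-IP match is assumed to be the condition that triggers re-flooding to overlay tunnels, source squelching is assumed to compare the outer source IP against the tunnel's remote IP, and the ingress AC is assumed to be excluded as in ordinary bridging.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OverlayTunnel:
    """Hypothetical model of one overlay tunnel in the flooding list."""
    remote_ip: str
    non_bm: bool = False  # True when the remote node asked to be pruned for BM


def replicator_bm_targets(acs, tunnels, ar_ip, ir_ip, *,
                          ingress_ac=None, outer_dst_ip=None, outer_src_ip=None):
    """Return the egress targets for one BM packet on an AR-REPLICATOR.

    outer_dst_ip is None when the packet arrived on a local AC.
    """
    # Non-BM overlay tunnels are always skipped.
    bm_tunnels = [t.remote_ip for t in tunnels if not t.non_bm]

    if outer_dst_ip is None:
        # Received on an AC: flood to the other ACs and to the BM tunnels.
        # (Excluding the ingress AC is an assumption, not stated in the text.)
        return [ac for ac in acs if ac != ingress_ac] + bm_tunnels

    if outer_dst_ip == ar_ip:
        # Assumed AR case: replicate to ACs and tunnels, with source
        # squelching so nothing goes back to the originating AR-LEAF.
        return list(acs) + [ip for ip in bm_tunnels if ip != outer_src_ip]

    if outer_dst_ip == ir_ip:
        # Regular IR behavior: local ACs only.
        return list(acs)

    raise ValueError("outer destination IP is neither the AR-IP nor the IR-IP")
```

Returning a plain list of targets keeps the sketch independent of any particular encapsulation or data-plane API.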
4.4.1.2. Non-selective AR-LEAF BM forwarding 5.4.1.2. Non-selective AR-LEAF BM forwarding
The AR-LEAF nodes will build two flood-lists: The AR-LEAF nodes will build two flood-lists:
1) Flood-list #1 - composed of ACs and an AR-REPLICATOR-set of 1) Flood-list #1 - composed of ACs and an AR-REPLICATOR-set of
overlay tunnels. The AR-REPLICATOR-set is defined as one or more overlay tunnels. The AR-REPLICATOR-set is defined as one or more
overlay tunnels to the AR-IP Addresses of the remote AR- overlay tunnels to the AR-IP Addresses of the remote AR-
REPLICATOR(s) in the EVI. The selection of more than one AR- REPLICATOR(s) in the EVI. The selection of more than one AR-
REPLICATOR is described in section 4.2. and it is a local AR- REPLICATOR is described in section 4.2. and it is a local AR-
LEAF decision. LEAF decision.
skipping to change at page 12, line 47 skipping to change at page 14, line 39
to flood-list #2. to flood-list #2.
o If the AR-REPLICATOR-set is NOT empty, the AR-LEAF will send the o If the AR-REPLICATOR-set is NOT empty, the AR-LEAF will send the
packet to flood-list #1, where only one of the overlay tunnels of packet to flood-list #1, where only one of the overlay tunnels of
the AR-REPLICATOR-set is used. the AR-REPLICATOR-set is used.
When an AR-LEAF receives a BM packet on an overlay tunnel, it will When an AR-LEAF receives a BM packet on an overlay tunnel, it will
forward the BM packet to its local ACs and never to an overlay forward the BM packet to its local ACs and never to an overlay
tunnel. This is the regular IR behavior described in [RFC7432]. tunnel. This is the regular IR behavior described in [RFC7432].
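The AR-LEAF choice between the two flood-lists can be sketched as follows. This is a hedged illustration only: `leaf_bm_targets` and its parameters are hypothetical names, the content of flood-list #2 is assumed to be the ACs plus the overlay tunnels to the remote IR-IP addresses (mirroring the definition given for Selective AR-LEAFs), and the way a single AR-REPLICATOR is picked from the set is a local decision represented here by an optional `pick` callback.

```python
def leaf_bm_targets(acs, replicator_ar_ips, remote_ir_ips, *,
                    from_overlay=False, pick=None):
    """Return the egress targets for one BM packet on an AR-LEAF."""
    if from_overlay:
        # Regular IR behavior: local ACs only, never an overlay tunnel.
        return list(acs)

    if not replicator_ar_ips:
        # Empty AR-REPLICATOR-set: fall back to flood-list #2
        # (assumed here to be ACs plus tunnels to the remote IR-IPs).
        return list(acs) + list(remote_ir_ips)

    # Flood-list #1: ACs plus exactly one tunnel of the AR-REPLICATOR-set.
    # The selection policy is local; defaulting to the first entry is an
    # assumption made only for this sketch.
    chosen = pick(replicator_ar_ips) if pick else replicator_ar_ips[0]
    return list(acs) + [chosen]
```

A caller that wants to spread flows over several AR-REPLICATORs could pass, for example, `pick=lambda ips: ips[flow_hash % len(ips)]`, where `flow_hash` is whatever per-flow value the implementation already computes.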
4.4.1.3. RNVE BM forwarding 5.4.1.3. RNVE BM forwarding
The RNVE is completely unaware of the AR-REPLICATORs, AR-LEAF nodes The RNVE is completely unaware of the AR-REPLICATORs, AR-LEAF nodes
and BM/U flags (that information is ignored). Its forwarding behavior and BM/U flags (that information is ignored). Its forwarding behavior
is the regular IR behavior described in [RFC7432]. Any regular non-AR is the regular IR behavior described in [RFC7432]. Any regular non-AR
node is fully compatible with the RNVE role described in this node is fully compatible with the RNVE role described in this
document. document.
4.4.2. Unknown unicast forwarding behavior 5.4.2. Unknown unicast forwarding behavior
The expected behavior is described in this section. The expected behavior is described in this section.
4.4.2.1. Non-selective AR-REPLICATOR/LEAF Unknown unicast forwarding 5.4.2.1. Non-selective AR-REPLICATOR/LEAF Unknown unicast forwarding
While the forwarding behavior in AR-REPLICATORs and AR-LEAF nodes is While the forwarding behavior in AR-REPLICATORs and AR-LEAF nodes is
different for BM traffic, as far as Unknown unicast traffic different for BM traffic, as far as Unknown unicast traffic
forwarding is concerned, AR-LEAF nodes behave exactly in the same way forwarding is concerned, AR-LEAF nodes behave exactly in the same way
as AR-REPLICATORs do. as AR-REPLICATORs do.
The AR-REPLICATOR/LEAF nodes will build a flood-list composed of ACs The AR-REPLICATOR/LEAF nodes will build a flood-list composed of ACs
and overlay tunnels to the IR-IP Addresses of the remote nodes in the and overlay tunnels to the IR-IP Addresses of the remote nodes in the
EVI. Some of those overlay tunnels MAY be flagged as non-U (Unknown EVI. Some of those overlay tunnels MAY be flagged as non-U (Unknown
unicast) receivers based on the U flag received from the remote nodes unicast) receivers based on the U flag received from the remote nodes
skipping to change at page 13, line 33 skipping to change at page 15, line 27
o When an AR-REPLICATOR/LEAF receives an unknown packet on an AC, it o When an AR-REPLICATOR/LEAF receives an unknown packet on an AC, it
will forward the unknown packet to its flood-list, skipping the will forward the unknown packet to its flood-list, skipping the
non-U overlay tunnels. non-U overlay tunnels.
o When an AR-REPLICATOR/LEAF receives an unknown packet on an overlay o When an AR-REPLICATOR/LEAF receives an unknown packet on an overlay
tunnel, it will forward the unknown packet to its local ACs and never tunnel, it will forward the unknown packet to its local ACs and never
to an overlay tunnel. This is the regular IR behavior described in to an overlay tunnel. This is the regular IR behavior described in
[RFC7432]. [RFC7432].
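The unknown unicast rules for AR-REPLICATOR/LEAF nodes can be sketched in a few lines. The function name and the dictionary representation of the flood-list are hypothetical, and excluding the ingress AC is an assumption taken from ordinary bridging rather than from the text above.

```python
def unknown_unicast_targets(acs, tunnels, *, from_overlay=False, ingress_ac=None):
    """Return the egress targets for one unknown unicast packet.

    tunnels maps each remote node (its IR-IP in practice) to a bool that is
    True when that node was flagged as a non-U receiver.
    """
    if from_overlay:
        # Regular IR behavior: local ACs only, never an overlay tunnel.
        return list(acs)

    # Received on an AC: flood, skipping the non-U overlay tunnels.
    targets = [ac for ac in acs if ac != ingress_ac]
    targets += [remote for remote, non_u in tunnels.items() if not non_u]
    return targets
```

For instance, with `tunnels = {"NVE1": True, "NVE2": False, "PE1": False, "PE2": False}` the function yields NVE2, PE1 and PE2 for a packet received on an AC, which matches the case where NVE1 has asked to be pruned from unknown unicast flooding.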
4.4.2.2. RNVE Unknown unicast forwarding 5.4.2.2. RNVE Unknown unicast forwarding
As described for BM traffic, the RNVE is completely unaware of the As described for BM traffic, the RNVE is completely unaware of the
REPLICATORs, LEAF nodes and BM/U flags (that information is ignored). REPLICATORs, LEAF nodes and BM/U flags (that information is ignored).
Its forwarding behavior is the regular IR behavior described in Its forwarding behavior is the regular IR behavior described in
[RFC7432], also for Unknown unicast traffic. Any regular non-AR node [RFC7432], also for Unknown unicast traffic. Any regular non-AR node
is fully compatible with the RNVE role described in this document. is fully compatible with the RNVE role described in this document.
5. Selective Assisted-Replication (AR) Solution Description 6. Selective Assisted-Replication (AR) Solution Description
Figure 1 is also used to describe the selective AR solution, however Figure 1 is also used to describe the selective AR solution, however
in this section we consider NVE2 as one more AR-LEAF for EVI-1. The in this section we consider NVE2 as one more AR-LEAF for EVI-1. The
solution is called "selective" because a given AR-REPLICATOR MUST solution is called "selective" because a given AR-REPLICATOR MUST
replicate the BM traffic to only the AR-LEAF that requested the replicate the BM traffic to only the AR-LEAF that requested the
replication (as opposed to all the AR-LEAF nodes) and MAY replicate replication (as opposed to all the AR-LEAF nodes) and MAY replicate
the BM traffic to the RNVEs. The same AR roles defined in section 4 the BM traffic to the RNVEs. The same AR roles defined in section 4
are used here, however the procedures are slightly different. are used here, however the procedures are slightly different.
The following sub-sections describe the differences in the procedures The following sub-sections describe the differences in the procedures
of AR-REPLICATOR/LEAFs compared to the non-selective AR solution. of AR-REPLICATOR/LEAFs compared to the non-selective AR solution.
There is no change on the RNVEs. There is no change on the RNVEs.
5.1. Selective AR-REPLICATOR procedures 6.1. Selective AR-REPLICATOR procedures
In our example in figure 1, PE1 and PE2 are defined as Selective AR- In our example in figure 1, PE1 and PE2 are defined as Selective AR-
REPLICATORs. The following considerations apply to the Selective AR- REPLICATORs. The following considerations apply to the Selective AR-
REPLICATOR role: REPLICATOR role:
a) The Selective AR-REPLICATOR capability SHOULD be an administrative a) The Selective AR-REPLICATOR capability SHOULD be an administrative
choice in any NVE/PE that is part of an AR-enabled EVI, as the AR choice in any NVE/PE that is part of an AR-enabled EVI, as the AR
role itself. This administrative option MAY be implemented as a role itself. This administrative option MAY be implemented as a
system level option as opposed to as a per-MAC-VRF option. system level option as opposed to as a per-MAC-VRF option.
b) Each AR-REPLICATOR will build a list of AR-REPLICATOR, AR-LEAF and b) Each AR-REPLICATOR will build a list of AR-REPLICATOR, AR-LEAF and
skipping to change at page 15, line 23 skipping to change at page 17, line 16
to all the RNVEs. to all the RNVEs.
+ overlay tunnels to the remote Selective AR-REPLICATORs if + overlay tunnels to the remote Selective AR-REPLICATORs if
the tunnel source IP is an IR-IP of its own AR-LEAF-set (in the tunnel source IP is an IR-IP of its own AR-LEAF-set (in
any other case, the AR-REPLICATOR MUST NOT replicate the BM any other case, the AR-REPLICATOR MUST NOT replicate the BM
traffic to remote AR-REPLICATORs), where the tunnel traffic to remote AR-REPLICATORs), where the tunnel
destination IP is the AR-IP of the remote Selective AR- destination IP is the AR-IP of the remote Selective AR-
REPLICATOR. The tunnel destination IP AR-IP will be an REPLICATOR. The tunnel destination IP AR-IP will be an
indication for the remote Selective AR-REPLICATOR that the indication for the remote Selective AR-REPLICATOR that the
packet needs further replication to its AR-LEAFs. packet needs further replication to its AR-LEAFs.
5.2. Selective AR-LEAF procedures 6.2. Selective AR-LEAF procedures
A Selective AR-LEAF chooses a single Selective AR-REPLICATOR per EVI A Selective AR-LEAF chooses a single Selective AR-REPLICATOR per EVI
and: and:
o Sends all the EVI BM traffic to that AR-REPLICATOR and o Sends all the EVI BM traffic to that AR-REPLICATOR and
o Expects to receive the BM traffic for a given EVI from the same AR- o Expects to receive the BM traffic for a given EVI from the same AR-
REPLICATOR. REPLICATOR.
In the example of Figure 1, we consider NVE1/NVE2/NVE3 as Selective In the example of Figure 1, we consider NVE1/NVE2/NVE3 as Selective
AR-LEAFs. NVE1 selects PE1 as its Selective AR-REPLICATOR. If that is AR-LEAFs. NVE1 selects PE1 as its Selective AR-REPLICATOR. If that is
skipping to change at page 16, line 34 skipping to change at page 18, line 27
timer expires, the Selective AR-LEAF will resume its AR mode timer expires, the Selective AR-LEAF will resume its AR mode
with the new Selective AR-REPLICATOR. with the new Selective AR-REPLICATOR.
All the AR-LEAFs in an EVI are expected to be configured as either All the AR-LEAFs in an EVI are expected to be configured as either
selective or non-selective. A mix of selective and non-selective AR- selective or non-selective. A mix of selective and non-selective AR-
LEAFs SHOULD NOT coexist in the same EVI. In case there is a non- LEAFs SHOULD NOT coexist in the same EVI. In case there is a non-
selective AR-LEAF, its BM traffic sent to a selective AR-REPLICATOR selective AR-LEAF, its BM traffic sent to a selective AR-REPLICATOR
will not be replicated to other AR-LEAFs that are not in its will not be replicated to other AR-LEAFs that are not in its
Selective AR-LEAF-set. Selective AR-LEAF-set.
5.3. Forwarding behavior in selective AR EVIs 6.3. Forwarding behavior in selective AR EVIs
This section describes the differences of the selective AR forwarding This section describes the differences of the selective AR forwarding
mode compared to the non-selective mode. Compared to section 4.4, mode compared to the non-selective mode. Compared to section 4.4,
there are no changes for the forwarding behavior in RNVEs or for there are no changes for the forwarding behavior in RNVEs or for
unknown unicast traffic. unknown unicast traffic.
5.3.1. Selective AR-REPLICATOR BM forwarding 6.3.1. Selective AR-REPLICATOR BM forwarding
The Selective AR-REPLICATORs will build two flood-lists: The Selective AR-REPLICATORs will build two flood-lists:
1) Flood-list #1 - composed of ACs and overlay tunnels to the 1) Flood-list #1 - composed of ACs and overlay tunnels to the
remote nodes in the EVI, always using the IR-IPs in the tunnel remote nodes in the EVI, always using the IR-IPs in the tunnel
destination IP addresses. Some of those overlay tunnels MAY be destination IP addresses. Some of those overlay tunnels MAY be
flagged as non-BM receivers based on the BM flag received from flagged as non-BM receivers based on the BM flag received from
the remote nodes in the EVI. the remote nodes in the EVI.
2) Flood-list #2 - composed of ACs, a Selective AR-LEAF-set and a 2) Flood-list #2 - composed of ACs, a Selective AR-LEAF-set and a
skipping to change at page 17, line 48 skipping to change at page 19, line 43
flooding list, i.e. it will only replicate to local ACs. This is flooding list, i.e. it will only replicate to local ACs. This is
the regular-IR behavior described in [RFC7432]. the regular-IR behavior described in [RFC7432].
In any case, non-BM overlay tunnels are excluded from flood-lists In any case, non-BM overlay tunnels are excluded from flood-lists
and, also, source squelching is always done in order to ensure the and, also, source squelching is always done in order to ensure the
traffic is not sent back to the originating source. If the traffic is not sent back to the originating source. If the
encapsulation is MPLSoGRE (or MPLSoUDP) and the EVI label is not the encapsulation is MPLSoGRE (or MPLSoUDP) and the EVI label is not the
bottom of the stack, the AR-REPLICATOR MUST copy the rest of the bottom of the stack, the AR-REPLICATOR MUST copy the rest of the
labels when forwarding them to the egress overlay tunnels. labels when forwarding them to the egress overlay tunnels.
5.3.2. Selective AR-LEAF BM forwarding 6.3.2. Selective AR-LEAF BM forwarding
The Selective AR-LEAF nodes will build two flood-lists: The Selective AR-LEAF nodes will build two flood-lists:
1) Flood-list #1 - composed of ACs and the overlay tunnel to the 1) Flood-list #1 - composed of ACs and the overlay tunnel to the
selected AR-REPLICATOR (using the AR-IP as the tunnel selected AR-REPLICATOR (using the AR-IP as the tunnel
destination IP). destination IP).
2) Flood-list #2 - composed of ACs and overlay tunnels to the 2) Flood-list #2 - composed of ACs and overlay tunnels to the
remote IR-IP Addresses. remote IR-IP Addresses.
When an AR-LEAF receives a BM packet on an AC, it will check if there When an AR-LEAF receives a BM packet on an AC, it will check if there
is any selected AR-REPLICATOR. If there is, flood-list #1 will be is any selected AR-REPLICATOR. If there is, flood-list #1 will be
used. Otherwise, flood-list #2 will. used. Otherwise, flood-list #2 will.
When an AR-LEAF receives a BM packet on an overlay tunnel, it will When an AR-LEAF receives a BM packet on an overlay tunnel, it will
forward the BM packet to its local ACs and never to an overlay forward the BM packet to its local ACs and never to an overlay
tunnel. This is the regular IR behavior described in [RFC7432]. tunnel. This is the regular IR behavior described in [RFC7432].
6. Pruned-Flood-Lists (PFL) 7. Pruned-Flood-Lists (PFL)
In addition to AR, the second optimization supported by this solution In addition to AR, the second optimization supported by this solution
is the ability for all the EVI nodes to signal Pruned-Flood-Lists is the ability for all the EVI nodes to signal Pruned-Flood-Lists
(PFL). As described in section 3, an EVPN node can signal a given (PFL). As described in section 3, an EVPN node can signal a given
value for the BM and U PFL flags in the IR Inclusive Multicast value for the BM and U PFL flags in the IR Inclusive Multicast
Routes, where: Routes, where:
+ BM= Broadcast and Multicast (BM) flag. BM=1 means "prune-me" from + BM= Broadcast and Multicast (BM) flag. BM=1 means "prune-me" from
the BM flood-list. BM=0 means regular behavior. the BM flood-list. BM=0 means regular behavior.
skipping to change at page 18, line 43 skipping to change at page 20, line 37
The ability to signal these PFL flags is an administrative choice. The ability to signal these PFL flags is an administrative choice.
Upon receiving a non-zero PFL flag, a node MAY decide to honor the Upon receiving a non-zero PFL flag, a node MAY decide to honor the
PFL flag and remove the sender from the corresponding flood-list. A PFL flag and remove the sender from the corresponding flood-list. A
given EVI node receiving BUM traffic on an overlay tunnel MUST given EVI node receiving BUM traffic on an overlay tunnel MUST
replicate the traffic normally, regardless of the signaled PFL replicate the traffic normally, regardless of the signaled PFL
flags. flags.
This optimization MAY be used along with the AR solution. This optimization MAY be used along with the AR solution.
6.1. A PFL example 7.1. A PFL example
In order to illustrate the use of the solution described in this In order to illustrate the use of the solution described in this
document, we will assume that EVI-1 in figure 1 is optimized-IR document, we will assume that EVI-1 in figure 1 is optimized-IR
enabled and: enabled and:
o PE1 and PE2 are administratively configured as AR-REPLICATORs, due o PE1 and PE2 are administratively configured as AR-REPLICATORs, due
to their high-performance replication capabilities. PE1 and PE2 to their high-performance replication capabilities. PE1 and PE2
will send a Replicator-AR route with BM/U flags = 00. will send a Replicator-AR route with BM/U flags = 00.
o NVE1 and NVE3 are administratively configured as AR-LEAF nodes, due o NVE1 and NVE3 are administratively configured as AR-LEAF nodes, due
skipping to change at page 19, line 39 skipping to change at page 21, line 34
(3) Any Unknown unicast packet sent from VM31 will be forwarded by (3) Any Unknown unicast packet sent from VM31 will be forwarded by
NVE3 to NVE2, PE1 and PE2 but not NVE1. The solution avoids the NVE3 to NVE2, PE1 and PE2 but not NVE1. The solution avoids the
unnecessary replication to NVE1, since the destination of the unnecessary replication to NVE1, since the destination of the
unknown traffic cannot be at NVE1. unknown traffic cannot be at NVE1.
(4) Any Unknown unicast packet sent from TS1 will be forwarded by PE1 (4) Any Unknown unicast packet sent from TS1 will be forwarded by PE1
to the WAN link, PE2 and NVE2 but not to NVE1 and NVE3, since the to the WAN link, PE2 and NVE2 but not to NVE1 and NVE3, since the
target of the unknown traffic cannot be at those NVEs. target of the unknown traffic cannot be at those NVEs.
7. AR Procedures for single-IP AR-REPLICATORS 8. AR Procedures for single-IP AR-REPLICATORS
The procedures explained in sections 4 (Non-selective AR) and 5 The procedures explained in sections 4 (Non-selective AR) and 5
(Selective AR) assume that the AR-REPLICATOR can use two local (Selective AR) assume that the AR-REPLICATOR can use two local
routable IP addresses to terminate and originate NVO tunnels, i.e. routable IP addresses to terminate and originate NVO tunnels, i.e.
IR-IP and AR-IP addresses. This is usually the case for PE-based AR- IR-IP and AR-IP addresses. This is usually the case for PE-based AR-
REPLICATOR nodes. REPLICATOR nodes.
In some cases, the AR-REPLICATOR node does not support more than one In some cases, the AR-REPLICATOR node does not support more than one
IP address to terminate and originate NVO tunnels, i.e. the IR-IP and IP address to terminate and originate NVO tunnels, i.e. the IR-IP and
AR-IP are the same IP addresses. This may be the case in some AR-IP are the same IP addresses. This may be the case in some
skipping to change at page 20, line 24 skipping to change at page 22, line 20
o An AR-REPLICATOR will perform IR or AR forwarding mode for the o An AR-REPLICATOR will perform IR or AR forwarding mode for the
incoming Overlay packets based on an ingress VNI lookup, as opposed incoming Overlay packets based on an ingress VNI lookup, as opposed
to the tunnel IP DA lookup described in sections 4 and 5. Note to the tunnel IP DA lookup described in sections 4 and 5. Note
that, when replicating to remote AR-REPLICATOR nodes, the use of that, when replicating to remote AR-REPLICATOR nodes, the use of
the IR-VNI or AR-VNI advertised by the egress node will determine the IR-VNI or AR-VNI advertised by the egress node will determine
the IR or AR forwarding mode at the subsequent AR-REPLICATOR. the IR or AR forwarding mode at the subsequent AR-REPLICATOR.
The rest of the procedures will follow what is described in sections The rest of the procedures will follow what is described in sections
4 and 5. 4 and 5.
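The difference between the two-IP and single-IP cases can be sketched as one classification function. This is an illustration under stated assumptions: `forwarding_mode` and its keyword parameters are hypothetical names, and a node is treated as single-IP exactly when its AR-IP and IR-IP are equal.

```python
def forwarding_mode(outer_dst_ip, vni, *, ar_ip, ir_ip, ar_vni, ir_vni):
    """Classify an incoming overlay packet as "AR" or "IR" on an AR-REPLICATOR."""
    if ar_ip != ir_ip:
        # Two-IP case: the tunnel IP DA selects the forwarding mode.
        if outer_dst_ip == ar_ip:
            return "AR"
        if outer_dst_ip == ir_ip:
            return "IR"
        raise ValueError("tunnel IP DA is not a local AR-IP or IR-IP")

    # Single-IP case: the ingress VNI selects the forwarding mode.
    if vni == ar_vni:
        return "AR"
    if vni == ir_vni:
        return "IR"
    raise ValueError("ingress VNI is neither the AR-VNI nor the IR-VNI")
```

Keeping both lookups in one function makes it explicit that only the lookup key changes; the rest of the AR and IR procedures stay as described in the referenced sections.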
8. AR Procedures and EVPN All-Active Multi-homing Split-Horizon 9. AR Procedures and EVPN All-Active Multi-homing Split-Horizon
8.1. Ethernet Segments on AR-LEAF nodes This section extends the procedures for the cases where AR-LEAF nodes
or AR-REPLICATOR nodes are attached to the same Ethernet Segment
in the Broadcast Domain. The case where one (or more) AR-LEAF node(s)
and one (or more) AR-REPLICATOR node(s) are attached to the same
Ethernet Segment is out of scope.
9.1. Ethernet Segments on AR-LEAF nodes
If VXLAN or NVGRE are used, and if the Split-horizon is based on the If VXLAN or NVGRE are used, and if the Split-horizon is based on the
tunnel IP SA and "Local-Bias" as described in [RFC8365], the Split- tunnel IP SA and "Local-Bias" as described in [RFC8365], the Split-
horizon check will not work if there is an Ethernet-Segment shared horizon check will not work if there is an Ethernet-Segment shared
between two AR-LEAF nodes, and the AR-REPLICATOR changes the tunnel between two AR-LEAF nodes, and the AR-REPLICATOR changes the tunnel
IP SA of the packets with its own AR-IP. IP SA of the packets with its own AR-IP.
In order to be compatible with the IP SA split-horizon check, the AR- In order to be compatible with the IP SA split-horizon check, the AR-
REPLICATOR MAY keep the original received tunnel IP SA when REPLICATOR MAY keep the original received tunnel IP SA when
replicating packets to a remote AR-LEAF or RNVE. This will allow DF replicating packets to a remote AR-LEAF or RNVE. This will allow DF
skipping to change at page 21, line 7 skipping to change at page 23, line 10
Ethernet-Segments defined on AR-LEAF nodes. "Local-Bias" is Ethernet-Segments defined on AR-LEAF nodes. "Local-Bias" is
recommended in this case, as in the case of VXLAN or NVGRE explained recommended in this case, as in the case of VXLAN or NVGRE explained
above. The "Local-Bias" and tunnel IP SA preservation mechanisms above. The "Local-Bias" and tunnel IP SA preservation mechanisms
provide the required split-horizon behavior in non-selective or provide the required split-horizon behavior in non-selective or
selective AR. selective AR.
Note that if the AR-REPLICATOR implementation keeps the received Note that if the AR-REPLICATOR implementation keeps the received
tunnel IP SA, the use of uRPF (unicast Reverse Path Forwarding) tunnel IP SA, the use of uRPF (unicast Reverse Path Forwarding)
checks in the IP fabric based on the tunnel IP SA MUST be disabled. checks in the IP fabric based on the tunnel IP SA MUST be disabled.
8.2. Ethernet Segments on AR-REPLICATOR nodes 9.2. Ethernet Segments on AR-REPLICATOR nodes
Ethernet Segments associated to one or more AR-REPLICATOR nodes Ethernet Segments associated to one or more AR-REPLICATOR nodes
SHOULD follow "Local-Bias" procedures for EVPN all-active multi- SHOULD follow "Local-Bias" procedures for EVPN all-active multi-
homing, as follows: homing, as follows:
o For BUM traffic received on a local AR-REPLICATOR's AC, "Local- o For BUM traffic received on a local AR-REPLICATOR's AC, "Local-
Bias" procedures as in [RFC8365] SHOULD be followed. Bias" procedures as in [RFC8365] SHOULD be followed.
o For BUM traffic received on an AR-REPLICATOR overlay tunnel with o For BUM traffic received on an AR-REPLICATOR overlay tunnel with
AR-IP as the IP DA, "Local-Bias" SHOULD also be followed. That is, AR-IP as the IP DA, "Local-Bias" SHOULD also be followed. That is,
traffic received with AR-IP as IP DA will be treated as though it traffic received with AR-IP as IP DA will be treated as though it
had been received on a local AC that is part of the ES and will be had been received on a local AC that is part of the ES and will be
forwarded to all local ES, irrespective of their DF or NDF state. forwarded to all local ES, irrespective of their DF or NDF state.
o BUM traffic received on an AR-REPLICATOR overlay tunnel with IR-IP o BUM traffic received on an AR-REPLICATOR overlay tunnel with IR-IP
as the IP DA, will follow regular [RFC8365] "Local-Bias" rules and as the IP DA, will follow regular [RFC8365] "Local-Bias" rules and
will not be forwarded to local ESes that are shared with the AR-LEAF will not be forwarded to local ESes that are shared with the AR-LEAF
or AR-REPLICATOR originating the traffic. or AR-REPLICATOR originating the traffic.
9. Benefits of the optimized-IR solution 10. Benefits of the optimized-IR solution
A solution for the optimization of Ingress Replication in EVPN is A solution for the optimization of Ingress Replication in EVPN is
described in this document (optimized-IR). The solution brings the described in this document (optimized-IR). The solution brings the
following benefits: following benefits:
o Optimizes the multicast forwarding in low-performance NVEs, by o Optimizes the multicast forwarding in low-performance NVEs, by
relaying the replication to high-performance NVEs (AR-REPLICATORs) relaying the replication to high-performance NVEs (AR-REPLICATORs)
while preserving the packet ordering for unicast applications. while preserving the packet ordering for unicast applications.
o Reduces the flooded traffic in NVO networks where some NVEs do not o Reduces the flooded traffic in NVO networks where some NVEs do not
need broadcast/multicast and/or unknown unicast traffic. need broadcast/multicast and/or unknown unicast traffic.
o It is fully compatible with existing EVPN implementations and EVPN o It is fully compatible with existing EVPN implementations and EVPN
functions for NVO overlay tunnels. Optimized-IR NVEs and regular functions for NVO overlay tunnels. Optimized-IR NVEs and regular
NVEs can even be part of the same EVI. NVEs can even be part of the same EVI.
o It does not require any PIM-based tree in the NVO core of the o It does not require any PIM-based tree in the NVO core of the
network. network.
10. Conventions used in this document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
11. Security Considerations 11. Security Considerations
This section will be added in future versions. This section will be added in future versions.
12. IANA Considerations 12. IANA Considerations
IANA has allocated the following Border Gateway Protocol (BGP) IANA has allocated the following Border Gateway Protocol (BGP)
Parameters: Parameters:
1) Allocation in the P-Multicast Service Interface Tunnel (PMSI 1) Allocation in the P-Multicast Service Interface Tunnel (PMSI
Tunnel) Tunnel Types registry: Tunnel) Tunnel Types registry:
Value Meaning Reference Value Meaning Reference
0x0A Assisted-Replication Tunnel [This document] 0x0A Assisted-Replication Tunnel [This document]
2) Allocations in the P-Multicast Service Interface (PMSI) Tunnel 2) Allocations in the P-Multicast Service Interface (PMSI) Tunnel
Attribute Flags registry: Attribute Flags registry:
Value Name Reference Value Name Reference
3-4 Assisted-Replication Type (T) [This document] 3-4 Assisted-Replication Type (T) [This document]
5 Broadcast and Multicast (BM) [This document] 5 Broadcast and Multicast (BM) [This document]
6 Unknown (U) [This document] 6 Unknown (U) [This document]
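The flag allocations above can be decoded from the one-octet PMSI Tunnel Attribute Flags field as sketched below. The sketch assumes the usual IETF bit numbering in which bit 0 is the most significant bit of the octet; the function name and the returned dictionary keys are hypothetical.

```python
def decode_pta_flags(flags):
    """Split the PTA Flags octet into the T, BM and U fields.

    Assumes bit 0 is the most significant bit, so bits 3-4 map to mask
    0x18, bit 5 to mask 0x04 and bit 6 to mask 0x02.
    """
    if not 0 <= flags <= 0xFF:
        raise ValueError("flags must fit in one octet")
    return {
        "T": (flags >> 3) & 0b11,   # Assisted-Replication Type, bits 3-4
        "BM": bool(flags & 0x04),   # Broadcast and Multicast, bit 5
        "U": bool(flags & 0x02),    # Unknown, bit 6
    }
```

A receiver that chooses to honor the pruning flags could then skip the sender in the BM flood-list when `"BM"` is true and in the unknown unicast flood-list when `"U"` is true.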
13. Terminology 13. References
AC: Attachment Circuit
Regular-IR: Refers to Regular Ingress Replication, where the source
NVE/PE sends a copy to each remote NVE/PE part of the EVI.
AR-IP: IP address owned by the AR-REPLICATOR and used to
differentiate the ingress traffic that must follow the AR
procedures.
IR-IP: IP address used for Ingress Replication as in [RFC7432].
AR-VNI: VNI advertised by the AR-REPLICATOR along with the
Replicator-AR route. It is used to identify the ingress
packets that must follow AR procedures ONLY in the Single-IP
AR-REPLICATOR case.
IR-VNI: VNI advertised along with the RT-3 for IR.
AR forwarding mode: for an AR-LEAF, it means sending an AC BM packet
to a single AR-REPLICATOR with tunnel destination IP AR-IP.
For an AR-REPLICATOR, it means sending a BM packet to a
selective number or all the overlay tunnels when the packet
was previously received from an overlay tunnel.
IR forwarding mode: it refers to the Ingress Replication behavior
explained in [RFC7432]. It means sending an AC BM packet copy
to each remote PE/NVE in the EVI and sending an overlay BM
packet only to the ACs and not other overlay tunnels.
PTA: PMSI Tunnel Attribute
RT-3: EVPN Route Type 3, Inclusive Multicast Ethernet Tag route
RT-11: EVPN Route Type 11, Leaf Auto-Discovery (AD) route
14. References
14.1 Normative References 13.1 Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March
1997, <https://www.rfc-editor.org/info/rfc2119>. 1997, <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017,
<https://www.rfc-editor.org/info/rfc8174>. <https://www.rfc-editor.org/info/rfc8174>.
[RFC6514] Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP [RFC6514] Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
skipping to change at page 24, line 5 skipping to change at page 25, line 12
[RFC7902] Rosen, E. and T. Morin, "Registry and Extensions for P- [RFC7902] Rosen, E. and T. Morin, "Registry and Extensions for P-
Multicast Service Interface Tunnel Attribute Flags", RFC 7902, DOI Multicast Service Interface Tunnel Attribute Flags", RFC 7902, DOI
10.17487/RFC7902, June 2016, <https://www.rfc- 10.17487/RFC7902, June 2016, <https://www.rfc-
editor.org/info/rfc7902>. editor.org/info/rfc7902>.
[EVPN-BUM] Zhang et al., "Updates on EVPN BUM Procedures", draft- [EVPN-BUM] Zhang et al., "Updates on EVPN BUM Procedures", draft-
ietf-bess-evpn-bum-procedure-updates-04.txt, work in progress, June ietf-bess-evpn-bum-procedure-updates-04.txt, work in progress, June
2018. 2018.
14.2 Informative References 13.2 Informative References
[RFC8365] Sajassi et al., "A Network Virtualization Overlay Solution [RFC8365] Sajassi et al., "A Network Virtualization Overlay Solution
Using Ethernet VPN (EVPN)", RFC 8365, March, 2018. Using Ethernet VPN (EVPN)", RFC 8365, March, 2018.
15.0 Contributors 14. Contributors
In addition to the names in the front page, the following co-authors In addition to the names in the front page, the following co-authors
also contributed to this document: also contributed to this document:
Wim Henderickx Wim Henderickx
Nokia Nokia
Kiran Nagaraj Kiran Nagaraj
Nokia Nokia
skipping to change at page 24, line 33 skipping to change at page 25, line 40
Nischal Sheth Nischal Sheth
Juniper Networks Juniper Networks
Aldrin Isaac Aldrin Isaac
Juniper Juniper
Mudassir Tufail Mudassir Tufail
Citibank Citibank
16. Acknowledgments 15. Acknowledgments
The authors would like to thank Neil Hart, David Motz, Dai Truong, The authors would like to thank Neil Hart, David Motz, Dai Truong,
Thomas Morin, Jeffrey Zhang and Shankar Murthy for their valuable Thomas Morin, Jeffrey Zhang and Shankar Murthy for their valuable
feedback and contributions. feedback and contributions.
17. Authors' Addresses 16. Authors' Addresses
Jorge Rabadan (Editor) Jorge Rabadan (Editor)
Nokia Nokia
777 E. Middlefield Road 777 E. Middlefield Road
Mountain View, CA 94043 USA Mountain View, CA 94043 USA
Email: jorge.rabadan@nokia.com Email: jorge.rabadan@nokia.com
Senthil Sathappan Senthil Sathappan
Nokia Nokia
Email: senthil.sathappan@nokia.com Email: senthil.sathappan@nokia.com
 End of changes. 55 change blocks. 
171 lines changed or deleted 197 lines changed or added