draft-ietf-bess-evpn-df-election-framework-07.txt   draft-ietf-bess-evpn-df-election-framework-08.txt 
BESS Workgroup J. Rabadan, Ed. BESS Workgroup J. Rabadan, Ed.
Internet Draft Nokia Internet Draft Nokia
S. Mohanty, Ed. Updates: 7432 S. Mohanty, Ed.
Intended status: Standards Track A. Sajassi Intended status: Standards Track A. Sajassi
Cisco Cisco
J. Drake J. Drake
Juniper Juniper
K. Nagaraj K. Nagaraj
S. Sathappan S. Sathappan
Nokia Nokia
Expires: June 23, 2019 December 20, 2018 Expires: July 22, 2019 January 18, 2019
Framework for EVPN Designated Forwarder Election Extensibility Framework for EVPN Designated Forwarder Election Extensibility
draft-ietf-bess-evpn-df-election-framework-07 draft-ietf-bess-evpn-df-election-framework-08
Abstract Abstract
An alternative to the Default Designated Forwarder (DF) selection An alternative to the Default Designated Forwarder (DF) selection
algorithm in Ethernet VPN (EVPN) networks is defined. The DF is the algorithm in Ethernet VPN (EVPN) networks is defined. The DF is the
Provider Edge (PE) router responsible for sending broadcast, unknown Provider Edge (PE) router responsible for sending broadcast, unknown
unicast and multicast (BUM) traffic to multi-homed Customer Equipment unicast and multicast (BUM) traffic to multi-homed Customer Equipment
(CE) on a particular Ethernet Segment (ES) within a VLAN. In (CE) on a particular Ethernet Segment (ES) within a VLAN. In
addition, the capability to influence the DF election result for a addition, the capability to influence the DF election result for a
VLAN based on the state of the associated Attachment Circuit (AC) is VLAN based on the state of the associated Attachment Circuit (AC) is
specified. specified. This document clarifies the DF Election Finite State
Machine in EVPN, therefore it updates the EVPN specification.
Status of this Memo Status of this Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html http://www.ietf.org/shadow.html
This Internet-Draft will expire on June 23, 2019. This Internet-Draft will expire on July 22, 2019.
Copyright Notice Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the Copyright (c) 2019 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Default Designated Forwarder (DF) Election in EVPN . . . . 3 1.1. Default Designated Forwarder (DF) Election in EVPN . . . . 3
1.2. Problem Statement . . . . . . . . . . . . . . . . . . . . . 5 1.2. Problem Statement . . . . . . . . . . . . . . . . . . . . . 6
1.2.1. Unfair Load-Balancing and Service Disruption . . . . . 6 1.2.1. Unfair Load-Balancing and Service Disruption . . . . . 6
1.2.2. Traffic Black-Holing on Individual AC Failures . . . . 7 1.2.2. Traffic Black-Holing on Individual AC Failures . . . . 7
1.3. The Need for Extending the Default DF Election in EVPN . . 9 1.3. The Need for Extending the Default DF Election in EVPN . . 10
2. Conventions and Terminology . . . . . . . . . . . . . . . . . . 10 2. Conventions and Terminology . . . . . . . . . . . . . . . . . . 11
3. Designated Forwarder Election Protocol and BGP Extensions . . . 11 3. Designated Forwarder Election Protocol and BGP Extensions . . . 12
3.1. The DF Election Finite State Machine (FSM) . . . . . . . . 12 3.1. The DF Election Finite State Machine (FSM) . . . . . . . . 12
3.2. The DF Election Extended Community . . . . . . . . . . . . 14 3.2. The DF Election Extended Community . . . . . . . . . . . . 15
3.2.1. Backward Compatibility . . . . . . . . . . . . . . . . 17 3.2.1. Backward Compatibility . . . . . . . . . . . . . . . . 18
3.3. Auto-Derivation of ES-Import Route Target . . . . . . . . . 17 3.3. Auto-Derivation of ES-Import Route Target . . . . . . . . . 18
4. The Highest Random Weight DF Election Algorithm . . . . . . . . 17 4. The Highest Random Weight DF Election Algorithm . . . . . . . . 18
4.1. HRW and Consistent Hashing . . . . . . . . . . . . . . . . 18 4.1. HRW and Consistent Hashing . . . . . . . . . . . . . . . . 19
4.2. HRW Algorithm for EVPN DF Election . . . . . . . . . . . . 18 4.2. HRW Algorithm for EVPN DF Election . . . . . . . . . . . . 19
5. The Attachment Circuit Influenced DF Election Capability . . . 20 5. The Attachment Circuit Influenced DF Election Capability . . . 21
5.1. AC-Influenced DF Election Capability For VLAN-Aware 5.1. AC-Influenced DF Election Capability For VLAN-Aware
Bundle Services . . . . . . . . . . . . . . . . . . . . . . 22 Bundle Services . . . . . . . . . . . . . . . . . . . . . . 23
6. Solution Benefits . . . . . . . . . . . . . . . . . . . . . . . 23
7. Security Considerations . . . . . . . . . . . . . . . . . . . . 23 6. Solution Benefits . . . . . . . . . . . . . . . . . . . . . . . 24
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 24 7. Security Considerations . . . . . . . . . . . . . . . . . . . . 25
9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 24 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 25
9.1. Normative References . . . . . . . . . . . . . . . . . . . 25 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 26
9.2. Informative References . . . . . . . . . . . . . . . . . . 25 9.1. Normative References . . . . . . . . . . . . . . . . . . . 26
10. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 26 9.2. Informative References . . . . . . . . . . . . . . . . . . 27
11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 26 10. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 28
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 27 11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 28
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 28
1. Introduction 1. Introduction
The Designated Forwarder (DF) in EVPN networks is the Provider Edge The Designated Forwarder (DF) in EVPN networks is the Provider Edge
(PE) router responsible for sending broadcast, unknown unicast and (PE) router responsible for sending broadcast, unknown unicast and
multicast (BUM) traffic to a multi-homed Customer Equipment (CE) multicast (BUM) traffic to a multi-homed Customer Equipment (CE)
device, on a given VLAN on a particular Ethernet Segment (ES). The DF device, on a given VLAN on a particular Ethernet Segment (ES). The DF
is selected out of a list of candidate PEs that advertise the same is selected out of a list of candidate PEs that advertise the same
Ethernet Segment Identifier (ESI) to the EVPN network. By default, Ethernet Segment Identifier (ESI) to the EVPN network. By default,
EVPN uses a DF Election algorithm referred to as "Service Carving" EVPN uses a DF Election algorithm referred to as "Service Carving"
and it is based on a modulus function (V mod N) that takes the number and it is based on a modulus function (V mod N) that takes the number
of PEs in the ES (N) and the VLAN value (V) as input. This Default DF of PEs in the ES (N) and the VLAN value (V) as input. This Default DF
Election algorithm has some inefficiencies that this document Election algorithm has some inefficiencies that this document
addresses by defining a new DF Election algorithm and a capability to addresses by defining a new DF Election algorithm and a capability to
influence the DF Election result for a VLAN, depending on the state influence the DF Election result for a VLAN, depending on the state
of the associated Attachment Circuit (AC). In order to avoid any of the associated Attachment Circuit (AC). In order to avoid any
ambiguity with the identifier used in the DF Election Algorithm, this ambiguity with the identifier used in the DF Election Algorithm, this
document uses the term Ethernet Tag instead of VLAN. This document document uses the term Ethernet Tag instead of VLAN. This document
also creates a registry with IANA, for future DF Election Algorithms also creates a registry with IANA, for future DF Election Algorithms
and Capabilities. It also presents a formal definition and and Capabilities. It also presents a formal definition and
clarification of the DF Election Finite State Machine. clarification of the DF Election Finite State Machine (FSM),
therefore the document updates [RFC7432] and EVPN implementations
MUST conform to the prescribed FSM.
The procedures described in this document apply to [RFC7432] and The procedures described in this document apply to DF election in all
[RFC8214] EVPN networks. This document does not intend to update EVPN solutions including [RFC7432] and [RFC8214]. Apart from the FSM
[RFC7432] or [RFC8214] but intends to improve the behavior of the DF formal description, this document does not intend to update other
[RFC7432] procedures. It only aims to improve the behavior of the DF
Election on PEs that are upgraded to follow the described procedures. Election on PEs that are upgraded to follow the described procedures.
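As an illustration of the Default (V mod N) rule summarized above, the following minimal sketch computes the elected PE for a set of Ethernet Tags. The function name, the PE addresses, and the choice of tags are assumptions made for the example only; the candidate list is assumed to be already ordered by ascending IP address, as in [RFC7432].

```python
def default_df_index(ethernet_tag: int, num_pes: int) -> int:
    """Return the 0-based ordinal of the PE elected DF by the V mod N rule."""
    if num_pes <= 0:
        raise ValueError("at least one candidate PE is required")
    return ethernet_tag % num_pes


# Hypothetical candidate PEs, assumed sorted in ascending order of IP address.
candidates = ["192.0.2.2", "192.0.2.3", "192.0.2.4"]

# Ethernet Tags that all fall in the same residue class modulo N = 3.
tags = [3 * x + 1 for x in range(5)]

# Map each tag to the PE selected by the modulus rule.
elected = {v: candidates[default_df_index(v, len(candidates))] for v in tags}
```

With these inputs every tag maps to the same ordinal, so a single PE ends up as DF for all of them. This is the kind of skew that a tag distribution correlated with N can produce.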
1.1. Default Designated Forwarder (DF) Election in EVPN 1.1. Default Designated Forwarder (DF) Election in EVPN
[RFC7432] defines the Designated Forwarder (DF) as the EVPN PE [RFC7432] defines the Designated Forwarder (DF) as the EVPN PE
responsible for: responsible for:
o Flooding Broadcast, Unknown unicast and Multicast traffic (BUM), on o Flooding Broadcast, Unknown unicast and Multicast traffic (BUM), on
a given Ethernet Tag on a particular Ethernet Segment (ES), to the a given Ethernet Tag on a particular Ethernet Segment (ES), to the
CE. This is valid for single-active and all-active EVPN CE. This is valid for single-active and all-active EVPN
skipping to change at page 6, line 24 skipping to change at page 6, line 29
as DF for all of the VLANs. This is very sub-optimal. It defeats as DF for all of the VLANs. This is very sub-optimal. It defeats
the purpose of service carving as the DFs are not really evenly the purpose of service carving as the DFs are not really evenly
spread across. In fact, in this particular case, one of the PEs spread across. In fact, in this particular case, one of the PEs
does not get elected as DF at all, so it does not participate in does not get elected as DF at all, so it does not participate in
the DF responsibilities at all. Consider another example where, the DF responsibilities at all. Consider another example where,
referring to Figure 1, let us assume that PE2, PE3, PE4 are in referring to Figure 1, let us assume that PE2, PE3, PE4 are in
ascending order of the IP address; and each VLAN configured on ES2 ascending order of the IP address; and each VLAN configured on ES2
is associated with an Ethernet Tag of the form (3x+1), where x is is associated with an Ethernet Tag of the form (3x+1), where x is
an integer. This will result in PE3 always being selected as the DF. an integer. This will result in PE3 always being selected as the DF.
2- Even in the case when the Ethernet Tag distribution is uniform the 2- The Ethernet tag that identifies the BD can be as large as 2^24;
instance of a PE being up or down results in re-computation ((v however, it is not guaranteed that the tenant BD on the ES will
mod N-1) or (v mod N+1) as is the case); the resulting modulus conform to a uniform distribution. In fact, it is up to the
value need not be uniformly distributed because it can be subject customer what BDs they will configure on the ES. Quoting [Knuth],
to the primality of N-1 or N+1 as may be the case. "In general, we want to avoid values of M that divide r^k+a or
r^k-a, where k and a are small numbers and r is the radix of the
alphabetic character set (usually r=64, 256 or 100), since a
remainder modulo such a value of M tends to be largely a simple
superposition of key digits. Such considerations suggest that we
choose M to be a prime number such that r^k!=a(modulo)M or
r^k!=-a(modulo)M for small k & a."
In our case, N is the number of PEs in [RFC7432] which corresponds
to M above. Since N, N-1 or N+1 need not satisfy the primality
properties of the M above; as per the [RFC7432] modulo based DF
assignment, whenever a PE goes down or a new PE boots up (hosting
the same Ethernet Segment), the modulo scheme will not necessarily
map BDs to PEs uniformly.
3- The third problem is one of disruption. Consider a case when the 3- The third problem is one of disruption. Consider a case when the
same Ethernet Segment is multi-homed to a set of PEs. When the ES same Ethernet Segment is multi-homed to a set of PEs. When the ES
is down in one of the PEs, say PE1, or PE1 itself reboots, or the is down in one of the PEs, say PE1, or PE1 itself reboots, or the
BGP process goes down or the connectivity between PE1 and an RR BGP process goes down or the connectivity between PE1 and an RR
goes down, the effective number of PEs in the system now becomes goes down, the effective number of PEs in the system now becomes
N-1, and DFs are computed for all the VLANs that are configured on N-1, and DFs are computed for all the VLANs that are configured on
that Ethernet Segment. In general, if the DF for a VLAN v happens that Ethernet Segment. In general, if the DF for a VLAN v happens
not to be PE1, but some other PE, say PE2, it is likely that some not to be PE1, but some other PE, say PE2, it is likely that some
other PE (different from PE1 and PE2) will become the new DF. This other PE (different from PE1 and PE2) will become the new DF. This
skipping to change at page 9, line 38 skipping to change at page 10, line 29
procedures, it is not used to influence the DF election for the procedures, it is not used to influence the DF election for the
affected EVIs. affected EVIs.
This document adds an optional modification of the DF Election This document adds an optional modification of the DF Election
procedure so that the ACS may be taken into account as a variable in procedure so that the ACS may be taken into account as a variable in
the DF election, and therefore EVPN can provide protection against the DF election, and therefore EVPN can provide protection against
logical failures. logical failures.
1.3. The Need for Extending the Default DF Election in EVPN 1.3. The Need for Extending the Default DF Election in EVPN
Section 2.2 describes some of the issues that exist in the Default DF Section 1.2 describes some of the issues that exist in the Default DF
Election procedures. In order to address those issues, this document Election procedures. In order to address those issues, this document
introduces a new DF Election framework. This framework allows the PEs introduces a new DF Election framework. This framework allows the PEs
to agree on a common DF election algorithm, as well as the to agree on a common DF election algorithm, as well as the
capabilities to enable during the DF Election procedure. Generally, capabilities to enable during the DF Election procedure. Generally,
'DF election algorithm' refers to the algorithm by which a number of 'DF election algorithm' refers to the algorithm by which a number of
input parameters are used to determine the DF PE, while 'DF election input parameters are used to determine the DF PE, while 'DF election
capability' refers to an additional feature that can be used prior to capability' refers to an additional feature that can be used prior to
the invocation of the DF election algorithm, such as modifying the the invocation of the DF election algorithm, such as modifying the
inputs (or list of candidate PEs). inputs (or list of candidate PEs).
skipping to change at page 10, line 13 skipping to change at page 11, line 4
algorithm and a new capability that can influence the DF Election algorithm and a new capability that can influence the DF Election
result: result:
o The new DF Election algorithm is referred to as "Highest Random o The new DF Election algorithm is referred to as "Highest Random
Weight" (HRW). The HRW procedures are described in section 4. Weight" (HRW). The HRW procedures are described in section 4.
o The new DF Election capability is referred to as "AC-Influenced DF o The new DF Election capability is referred to as "AC-Influenced DF
Election" (AC-DF). The AC-DF procedures are described in section 5. Election" (AC-DF). The AC-DF procedures are described in section 5.
o HRW and AC-DF mechanisms are independent of each other. Therefore, o HRW and AC-DF mechanisms are independent of each other. Therefore,
a PE MAY support either HRW or AC-DF independently or MAY support a PE may support either HRW or AC-DF independently or may support
both of them together. A PE MAY also support AC-DF capability along both of them together. A PE may also support AC-DF capability along
with the Default DF election algorithm per [RFC7432]. with the Default DF election algorithm per [RFC7432].
In addition, this document defines a way to indicate the support of In addition, this document defines a way to indicate the support of
HRW and/or AC-DF along with the EVPN ES routes advertised for a given HRW and/or AC-DF along with the EVPN ES routes advertised for a given
ES. Refer to section 3.2 for more details. ES. Refer to section 3.2 for more details.
2. Conventions and Terminology 2. Conventions and Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
skipping to change at page 14, line 24 skipping to change at page 15, line 13
9. DF_CALC on CALCULATED: mark election result for VLAN or bundle, 9. DF_CALC on CALCULATED: mark election result for VLAN or bundle,
and transition to DF_DONE. and transition to DF_DONE.
11. DF_DONE on exiting the state: if there is a new DF election 11. DF_DONE on exiting the state: if there is a new DF election
triggered and the current DF is lost, then assume NDF for local triggered and the current DF is lost, then assume NDF for local
PE for VLAN or VLAN Bundle. PE for VLAN or VLAN Bundle.
12. DF_DONE on VLAN_CHANGE, RCVD_ES or LOST_ES: transition to 12. DF_DONE on VLAN_CHANGE, RCVD_ES or LOST_ES: transition to
DF_CALC. DF_CALC.
The above events and transitions are defined for the Default DF
Election Algorithm. As described in Section 5, the use of the AC-DF
capability introduces additional events and transitions.
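The transitions listed above can be sketched as a small lookup table. This is only an illustration covering the transitions visible in this excerpt (DF_CALC and DF_DONE); the remaining states and events of the FSM are not modelled, and the Python names are assumptions made for the example.

```python
from enum import Enum, auto
from typing import Optional


class State(Enum):
    DF_CALC = auto()
    DF_DONE = auto()


class Event(Enum):
    CALCULATED = auto()
    VLAN_CHANGE = auto()
    RCVD_ES = auto()
    LOST_ES = auto()


# Only the transitions quoted in this excerpt are encoded here.
TRANSITIONS = {
    (State.DF_CALC, Event.CALCULATED): State.DF_DONE,
    (State.DF_DONE, Event.VLAN_CHANGE): State.DF_CALC,
    (State.DF_DONE, Event.RCVD_ES): State.DF_CALC,
    (State.DF_DONE, Event.LOST_ES): State.DF_CALC,
}


def next_state(state: State, event: Event) -> Optional[State]:
    """Return the next state, or None if the pair is outside this sketch."""
    return TRANSITIONS.get((state, event))
```

Returning None for pairs that are not in the table keeps the sketch from guessing at behavior that this excerpt does not describe.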
3.2. The DF Election Extended Community 3.2. The DF Election Extended Community
For the DF election procedures to be consistent and unanimous, it is For the DF election procedures to be consistent and unanimous, it is
necessary that all the participating PEs agree on the DF Election necessary that all the participating PEs agree on the DF Election
algorithm and capabilities to be used. For instance, it is not algorithm and capabilities to be used. For instance, it is not
possible that some PEs continue to use the Default DF Election possible that some PEs continue to use the Default DF Election
algorithm and some PEs use HRW. For brown-field deployments and for algorithm and some PEs use HRW. For brown-field deployments and for
interoperability with legacy PEs, it is important that all PEs have interoperability with legacy PEs, it is important that all PEs have
the capability to fall back on the Default DF Election. A PE the capability to fall back on the Default DF Election. A PE
can indicate its willingness to support HRW and/or AC-DF by signaling can indicate its willingness to support HRW and/or AC-DF by signaling
skipping to change at page 15, line 12 skipping to change at page 16, line 5
Figure 4 DF Election Extended Community Figure 4 DF Election Extended Community
Where: Where:
o Type is 0x06 as registered with IANA for EVPN Extended Communities. o Type is 0x06 as registered with IANA for EVPN Extended Communities.
o Sub-Type is 0x06 - "DF Election Extended Community" as requested by o Sub-Type is 0x06 - "DF Election Extended Community" as requested by
this document to IANA. this document to IANA.
o RSV - Reserved bits for future use. o RSV / Reserved - Reserved bits for DF Alg specific information.
o DF Alg (5 bits) - Encodes the DF Election algorithm values (between o DF Alg (5 bits) - Encodes the DF Election algorithm values (between
0 and 31) that the advertising PE desires to use for the ES. This 0 and 31) that the advertising PE desires to use for the ES. This
document requests IANA to set up a registry called "DF Alg document requests IANA to set up a registry called "DF Alg
Registry" and solicits the following values: Registry" and solicits the following values:
- Type 0: Default DF Election algorithm, or modulus-based algorithm - Type 0: Default DF Election algorithm, or modulus-based algorithm
as in [RFC7432]. as in [RFC7432].
- Type 1: HRW algorithm (explained in this document). - Type 1: HRW algorithm (explained in this document).
skipping to change at page 16, line 23 skipping to change at page 17, line 16
used. In the Extended Community, the PE indicates the desired "DF used. In the Extended Community, the PE indicates the desired "DF
Alg" algorithm and "Bitmap" capabilities to be used for the ES. Alg" algorithm and "Bitmap" capabilities to be used for the ES.
- Only one DF Election Extended Community can be sent along with an - Only one DF Election Extended Community can be sent along with an
ES route. Note that the intent is not for the advertising PE to ES route. Note that the intent is not for the advertising PE to
indicate all the supported DF election algorithms and indicate all the supported DF election algorithms and
capabilities, but signal the preferred one. capabilities, but signal the preferred one.
- DF Algs 0 and 1 can be both used with bit AC-DF set to 0 or 1. - DF Algs 0 and 1 can be both used with bit AC-DF set to 0 or 1.
- In general, a specific DF Alg MAY determine the use of the - In general, a specific DF Alg SHOULD determine the use of the
reserved bits in the Extended Community, which may be used in a reserved bits in the Extended Community, which may be used in a
different way for a different DF Alg. different way for a different DF Alg. In particular, for DF Algs
0 and 1, the reserved bits are not set by the advertising PE and
SHOULD be ignored by the receiving PE.
o When a PE receives the ES Routes from all the other PEs for the ES o When a PE receives the ES Routes from all the other PEs for the ES
in question, it checks to see if all the advertisements have the in question, it checks to see if all the advertisements have the
extended community with the same DF Alg and Bitmap: extended community with the same DF Alg and Bitmap:
- In the case that they do, this particular PE MUST follow the - In the case that they do, this particular PE MUST follow the
procedures for the advertised DF Alg and capabilities. For procedures for the advertised DF Alg and capabilities. For
instance, if all ES routes for a given ES indicate DF Alg HRW and instance, if all ES routes for a given ES indicate DF Alg HRW and
AC-DF set to 1, the receiving PE and by induction all the other AC-DF set to 1, the receiving PE and by induction all the other
PEs in the ES will proceed to do DF Election as per the HRW PEs in the ES will proceed to do DF Election as per the HRW
Algorithm and following the AC-DF procedures. Algorithm and following the AC-DF procedures.
- Otherwise if even a single advertisement for the type-4 route is - Otherwise if even a single advertisement for the type-4 route is
not received with the locally configured DF Alg and capability, received without the locally configured DF Alg and capability,
the Default DF Election (modulus) algorithm MUST be the Default DF Election (modulus) algorithm MUST be
used as in [RFC7432]. This procedure handles the case where used as in [RFC7432]. This procedure handles the case where
participating PEs in the ES disagree about the DF algorithm and participating PEs in the ES disagree about the DF algorithm and
capability to apply. capability to apply.
- The absence of the DF Election Extended Community MUST be - The absence of the DF Election Extended Community or the presence
interpreted by a receiving PE as an indication of the Default DF of multiple DF Election Extended Communities (in the same route)
Election algorithm on the sending PE, that is, DF Alg 0 and no DF MUST be interpreted by a receiving PE as an indication of the
Election capabilities. Default DF Election algorithm on the sending PE, that is, DF Alg
0 and no DF Election capabilities.
o When all the PEs in an ES advertise DF Type 31, they will rely on o When all the PEs in an ES advertise DF Type 31, they will rely on
the local policy to decide how to proceed with the DF Election. the local policy to decide how to proceed with the DF Election.
o For any new capability defined in the future, the o For any new capability defined in the future, the
applicability/compatibility of this new capability to the existing applicability/compatibility of this new capability to the existing
DF Algs must be assessed on a case by case basis. DF Algs must be assessed on a case by case basis.
o Likewise, for any new DF Alg defined in future, its o Likewise, for any new DF Alg defined in future, its
applicability/compatibility to the existing capabilities must be applicability/compatibility to the existing capabilities must be
skipping to change at page 17, line 32 skipping to change at page 18, line 28
Extended Community will ignore it and will continue to use the Extended Community will ignore it and will continue to use the
Default DF Election algorithm. Default DF Election algorithm.
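The agreement and fallback behavior of Section 3.2 can be sketched as follows. This is a minimal illustration, not a normative procedure: the function name, the tuple representation of an advertisement, and the numeric value used for the AC-DF capability are assumptions made for the example and do not reflect the wire encoding. An advertisement is modelled as None when the ES route carries no DF Election Extended Community (for example, from a legacy PE) or carries more than one.

```python
from typing import Optional, Sequence, Tuple

# (df_alg, capability_bitmap); None models an absent or duplicated community.
Advert = Optional[Tuple[int, int]]

DEFAULT: Tuple[int, int] = (0, 0)  # DF Alg 0 and no DF Election capabilities
HRW = 1                            # DF Alg 1 as listed in Section 3.2
AC_DF = 0x1                        # illustrative flag value, not the wire format


def agreed_df_election(adverts: Sequence[Advert]) -> Tuple[int, int]:
    """Return the (DF Alg, bitmap) that all PEs in the ES must apply.

    `adverts` holds one entry per PE in the ES, including the local PE.
    """
    if not adverts or adverts[0] is None:
        return DEFAULT
    first = adverts[0]
    # Any mismatch, or any PE without a usable community, forces the Default.
    if all(a == first for a in adverts):
        return first
    return DEFAULT
```

For instance, three PEs that all advertise HRW with AC-DF set yield (HRW, AC_DF), whereas the presence of a single legacy PE yields the Default algorithm with no capabilities.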
3.3. Auto-Derivation of ES-Import Route Target 3.3. Auto-Derivation of ES-Import Route Target
Section 7.6 of [RFC7432] describes how the value of the ES-Import Section 7.6 of [RFC7432] describes how the value of the ES-Import
Route Target for ESI types 1, 2, and 3 can be auto-derived by using Route Target for ESI types 1, 2, and 3 can be auto-derived by using
the high-order six bytes of the nine byte ESI value. The same auto- the high-order six bytes of the nine byte ESI value. The same auto-
derivation procedure can be extended to ESI types 0, 4, and 5 as long derivation procedure can be extended to ESI types 0, 4, and 5 as long
as it is ensured that the auto-derived values for ES-Import RT among as it is ensured that the auto-derived values for ES-Import RT among
different ES types don't overlap. different ES types don't overlap. As in [RFC7432], the mechanism to
guarantee that the auto-derived ESI or ES-import RT values for
different ESIs do not match is out of scope of this document.
4. The Highest Random Weight DF Election Algorithm 4. The Highest Random Weight DF Election Algorithm
The procedure discussed in this section is applicable to the DF The procedure discussed in this section is applicable to the DF
Election in EVPN Services [RFC7432] and EVPN Virtual Private Wire Election in EVPN Services [RFC7432] and EVPN Virtual Private Wire
Services [RFC8214]. Services [RFC8214].
Highest Random Weight (HRW) as defined in [HRW1999] is originally Highest Random Weight (HRW) as defined in [HRW1999] is originally
proposed in the context of Internet Caching and proxy Server load proposed in the context of Internet Caching and proxy Server load
balancing. Given an object name and a set of servers, HRW maps a balancing. Given an object name and a set of servers, HRW maps a
skipping to change at page 19, line 14 skipping to change at page 20, line 13
to the 'lowest VLAN in bundle' logic of [RFC7432]. to the 'lowest VLAN in bundle' logic of [RFC7432].
1. DF(v) = Si: Weight(v, Es, Si) >= Weight(v, Es, Sj), for all j. In 1. DF(v) = Si: Weight(v, Es, Si) >= Weight(v, Es, Sj), for all j. In
case of a tie, choose the PE whose IP address is numerically the case of a tie, choose the PE whose IP address is numerically the
least. Note 0 <= i,j < Number of PEs in the redundancy group. least. Note 0 <= i,j < Number of PEs in the redundancy group.
2. BDF(v) = Sk: Weight(v, Es, Si) >= Weight(v, Es, Sk) and Weight(v, 2. BDF(v) = Sk: Weight(v, Es, Si) >= Weight(v, Es, Sk) and Weight(v,
Es, Sk) >= Weight(v, Es, Sj). In case of a tie, choose the PE whose Es, Sk) >= Weight(v, Es, Sj). In case of a tie, choose the PE whose
IP address is numerically the least. IP address is numerically the least.
Where:
DF(v) is defined to be the address Si (index i) for which Weight(v,
Es, Si) is the highest, 0 <= i <= N-1
BDF(v) is defined as that PE with address Sk for which the computed
weight is the next highest after the weight of the DF. j is the
running index from 0 to N-1, i, k are selected values.
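The DF and BDF selection rules above can be sketched as follows. The weight function used here is a stand-in built on SHA-256 and is an assumption made for the example; it is not the hash function that this document takes from [HRW1999]. The function names and the PE addresses are likewise hypothetical.

```python
import hashlib
import ipaddress
from typing import Optional, Sequence, Tuple


def weight(v: int, es: bytes, pe: str) -> int:
    """Stand-in pseudo-random weight over (v, Es, Si); NOT the draft's hash."""
    addr = ipaddress.ip_address(pe)
    digest = hashlib.sha256(v.to_bytes(4, "big") + es + addr.packed).digest()
    return int.from_bytes(digest[:4], "big")


def hrw_elect(v: int, es: bytes, pes: Sequence[str]) -> Tuple[str, Optional[str]]:
    """Return (DF, BDF) for Ethernet Tag v on Ethernet Segment es."""
    if not pes:
        raise ValueError("at least one candidate PE is required")
    # Highest weight first; on a tie, the numerically lowest IP address wins.
    ranked = sorted(
        pes,
        key=lambda pe: (-weight(v, es, pe), int(ipaddress.ip_address(pe))),
    )
    df = ranked[0]
    bdf = ranked[1] if len(ranked) > 1 else None
    return df, bdf
```

Because each PE's weight depends only on (v, Es, Si), removing a PE that is not the DF for v leaves the DF for v unchanged, and removing the DF promotes the BDF.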
Since the Weight is a pseudo-random function with domain as the Since the Weight is a pseudo-random function with domain as the
three-tuple (v, Es, S), it is an efficient and deterministic three-tuple (v, Es, S), it is an efficient and deterministic
algorithm that is independent of the Ethernet Tag v sample space algorithm that is independent of the Ethernet Tag v sample space
distribution. Choosing a good hash function for the pseudo-random distribution. Choosing a good hash function for the pseudo-random
function is an important consideration for this algorithm to perform function is an important consideration for this algorithm to perform
better than the Default algorithm. As mentioned previously, such better than the Default algorithm. As mentioned previously, such
functions are described in the HRW paper. We take as candidate hash functions are described in the HRW paper. We take as candidate hash
function the first one out of the two that are preferred in function the first one out of the two that are preferred in
[HRW1999]: [HRW1999]:
skipping to change at page 19, line 45 skipping to change at page 21, line 5
modulo significant. modulo significant.
A point to note is that the Weight function takes into consideration A point to note is that the Weight function takes into consideration
the combination of the Ethernet Tag, Ethernet Segment and the PE IP- the combination of the Ethernet Tag, Ethernet Segment and the PE IP-
address, and the actual length of the server IP address (whether IPv4 address, and the actual length of the server IP address (whether IPv4
or IPv6) is not really relevant. The Default algorithm in [RFC7432] or IPv6) is not really relevant. The Default algorithm in [RFC7432]
cannot employ both IPv4 and IPv6 PE addresses, since [RFC7432] does cannot employ both IPv4 and IPv6 PE addresses, since [RFC7432] does
not specify how to decide on the ordering (the ordinal list) when not specify how to decide on the ordering (the ordinal list) when
both IPv4 and IPv6 PEs are present. both IPv4 and IPv6 PEs are present.
HRW solves the disadvantages pointed out in Section 2.2.1 and HRW solves the disadvantages pointed out in Section 1.2.1 and
ensures: ensures:
o with very high probability that the task of DF election for the o with very high probability that the task of DF election for the
VLANs configured on an ES is more or less equally distributed among VLANs configured on an ES is more or less equally distributed among
the PEs even for the 2 PE case. the PEs even for the 2 PE case.
o If a PE that is not the DF or the BDF for that VLAN, goes down or o If a PE that is not the DF or the BDF for that VLAN, goes down or
its connection to the ES goes down, it does not result in a DF or its connection to the ES goes down, it does not result in a DF or
BDF reassignment. This saves computation, especially in the case BDF reassignment. This saves computation, especially in the case
when the connection flaps. when the connection flaps.
o More importantly it avoids the needless disruption case of Section o More importantly it avoids the needless disruption case of Section
2.2.1 (3), that is inherent in the existing Default DF Election. 1.2.1 (3), that is inherent in the existing Default DF Election.
o In addition to the DF, the algorithm also furnishes the BDF, which o In addition to the DF, the algorithm also furnishes the BDF, which
would be the DF if the current DF fails. would be the DF if the current DF fails.
5. The Attachment Circuit Influenced DF Election Capability 5. The Attachment Circuit Influenced DF Election Capability
The procedure discussed in this section is applicable to the DF The procedure discussed in this section is applicable to the DF
Election in EVPN Services [RFC7432] and EVPN Virtual Private Wire Election in EVPN Services [RFC7432] and EVPN Virtual Private Wire
Services [RFC8214]. Services [RFC8214].
The AC-DF capability MAY be used with any "DF Alg" algorithm. It MUST The AC-DF capability is expected to be of general applicability with
modify the DF Election procedures by removing from consideration any any future DF Algorithm. It modifies the DF Election procedures by
candidate PE in the ES that cannot forward traffic on the AC that removing from consideration any candidate PE in the ES that cannot
belongs to the BD. This section is applicable to VLAN-Based and VLAN forward traffic on the AC that belongs to the BD. This section is
Bundle service interfaces. Section 5.1 describes the procedures for applicable to VLAN-Based and VLAN Bundle service interfaces. Section
VLAN-Aware Bundle interfaces. 5.1 describes the procedures for VLAN-Aware Bundle interfaces.
In particular, when used with the Default DF Alg, the AC-DF In particular, when used with the Default DF Alg, the AC-DF
capability modifies the Step 3 in the DF Election procedure described capability modifies the Step 3 in the DF Election procedure described
in [RFC7432] Section 8.5, as follows: in [RFC7432] Section 8.5, as follows:
3. When the timer expires, each PE builds an ordered "candidate" list 3. When the timer expires, each PE builds an ordered "candidate" list
of the IP addresses of all the PE nodes attached to the Ethernet of the IP addresses of all the PE nodes attached to the Ethernet
Segment (including itself), in increasing numeric value. The Segment (including itself), in increasing numeric value. The
candidate list is based on the Originator Router's IP addresses of candidate list is based on the Originator Router's IP addresses of
the ES routes, but excludes any PE from whom no Ethernet A-D per the ES routes, but excludes any PE from whom no Ethernet A-D per
ES route has been received, or from whom the route has been ES route has been received, or from whom the route has been
withdrawn. Afterwards, the DF Election algorithm is applied on a withdrawn. Afterwards, the DF Election algorithm is applied on a
per <ES,VLAN> or <ES,VLAN Bundle>, however, the IP address for a per <ES, Ethernet Tag>, however, the IP address for a PE will not
PE will not be considered candidate for a given <ES,VLAN> or be considered candidate for a given <ES, Ethernet Tag> until the
<ES,VLAN Bundle> until the corresponding Ethernet A-D per EVI corresponding Ethernet A-D per EVI route has been received from
route has been received from that PE. In other words, the ACS on that PE. In other words, the ACS on the ES for a given PE must be
the ES for a given PE must be UP so that the PE is considered as UP so that the PE is considered as candidate for a given BD. If
candidate for a given BD. the Default DF Alg is used, every PE in the resulting candidate
list is then given an ordinal indicating its position in the
ordered list, starting with 0 as the ordinal for the PE with the
numerically lowest IP address. The ordinals are used to determine
which PE node will be the DF for a given Ethernet Tag on the
Ethernet Segment, using the following rule:
The above paragraph differs from [RFC7432] Section 8.5, Step 3, in Assuming a redundancy group of N PE nodes, for VLAN-based service,
two aspects: the PE with ordinal i is the DF for an <ES, Ethernet Tag V> when
(V mod N)= i. In the case of VLAN-(aware) bundle service, then the
numerically lowest VLAN value in that bundle on that ES MUST be
used in the modulo function as Ethernet Tag.
o Any DF Alg algorithm can be used, and not only the modulus-based It should be noted that using the "Originating Router's IP
one (which is the Default DF Election, or DF Alg 0 in this address" field in the Ethernet Segment route to get the PE IP
document). address needed for the ordered list allows for a CE to be
multihomed across different ASes if such a need ever arises.
The above three paragraphs differ from [RFC7432] Section 8.5, Step 3,
in two aspects:
o Any DF Alg algorithm can be used, and not only the described
modulus-based DF Alg (referred to as the Default DF Election, or DF
Alg 0 in this document).
o The candidate list is pruned based upon non-receipt of Ethernet A-D o The candidate list is pruned based upon non-receipt of Ethernet A-D
routes: a PE's IP address MUST be removed from the ES candidate routes: a PE's IP address MUST be removed from the ES candidate
list if its Ethernet A-D per ES route is withdrawn. A PE's IP list if its Ethernet A-D per ES route is withdrawn. A PE's IP
address MUST NOT be considered as candidate DF for a <ES,VLAN> or address MUST NOT be considered as candidate DF for a <ES, Ethernet
<ES,VLAN Bundle>, if its Ethernet A-D per EVI route for the Tag>, if its Ethernet A-D per EVI route for the <ES, Ethernet Tag>
<ES,VLAN> or <ES,VLAN Bundle> respectively, is withdrawn. is withdrawn.
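The pruned candidate list and the modulus rule of the modified Step 3
can be sketched as below. This is a minimal sketch under stated
assumptions: the function names and the three input collections are
hypothetical, and each collection is assumed to hold PE IP addresses
as strings, already extracted from the corresponding received routes.

```python
import ipaddress


def acdf_candidates(es_route_pes, ad_per_es_pes, ad_per_evi_pes):
    """Ordered candidate list for one <ES, Ethernet Tag> with AC-DF pruning.

    es_route_pes:   Originating Router's IP addresses taken from the ES routes
    ad_per_es_pes:  PEs with a currently valid Ethernet A-D per ES route
    ad_per_evi_pes: PEs with a valid Ethernet A-D per EVI route for this tag
    """
    kept = set(es_route_pes) & set(ad_per_es_pes) & set(ad_per_evi_pes)
    # Increasing numeric value; index in this list is the PE's ordinal.
    return sorted(kept, key=lambda ip: int(ipaddress.ip_address(ip)))


def default_df(ethernet_tag, candidates):
    """Default DF Alg: the PE with ordinal i is the DF when (V mod N) == i.

    For bundle services the caller passes the numerically lowest VLAN
    value in the bundle as ethernet_tag.
    """
    if not candidates:
        return None
    return candidates[ethernet_tag % len(candidates)]
```

For example, with three candidates and Ethernet Tag 10 the PE with
ordinal 1 is elected (10 mod 3 = 1); if that PE's Ethernet A-D per EVI
route is withdrawn, the list shrinks to two candidates and the PE with
ordinal 0 is elected (10 mod 2 = 0).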
The following example illustrates the AC-DF behavior applied to the The following example illustrates the AC-DF behavior applied to the
Default DF election algorithm, assuming the network in Figure 2: Default DF election algorithm, assuming the network in Figure 2:
a) When PE1 and PE2 discover ES12, they advertise an ES route for a) When PE1 and PE2 discover ES12, they advertise an ES route for
ES12 with the associated ES-import extended community and the DF ES12 with the associated ES-import extended community and the DF
Election Extended Community indicating AC-DF=1; they start a timer Election Extended Community indicating AC-DF=1; they start a DF
at the same time. Likewise, PE2 and PE3 advertise an ES route for Wait timer (independently). Likewise, PE2 and PE3 advertise an ES
ES23 with AC-DF=1 and start a timer. route for ES23 with AC-DF=1 and start a DF Wait timer.
b) PE1/PE2 advertise an Ethernet A-D per ES route for ES12, and b) PE1/PE2 advertise an Ethernet A-D per ES route for ES12, and
PE2/PE3 advertise an Ethernet A-D per ES route for ES23. PE2/PE3 advertise an Ethernet A-D per ES route for ES23.
c) In addition, PE1/PE2/PE3 advertise an Ethernet A-D per EVI route c) In addition, PE1/PE2/PE3 advertise an Ethernet A-D per EVI route
for AC1, AC2, AC3 and AC4 as soon as the ACs are enabled. Note for AC1, AC2, AC3 and AC4 as soon as the ACs are enabled. Note
that the AC can be associated to a single customer VID (e.g. VLAN- that the AC can be associated to a single customer VID (e.g. VLAN-
based service interfaces) or a bundle of customer VIDs (e.g. VLAN based service interfaces) or a bundle of customer VIDs (e.g. VLAN
Bundle service interfaces). Bundle service interfaces).
skipping to change at page 22, line 8 skipping to change at page 23, line 31
the candidate list, the DF Election can be applied for the the candidate list, the DF Election can be applied for the
remaining N candidates. remaining N candidates.
Note that this procedure only modifies the existing EVPN control Note that this procedure only modifies the existing EVPN control
plane by adding and processing the DF Election Extended Community, plane by adding and processing the DF Election Extended Community,
and by pruning the candidate list of PEs that take part in the DF and by pruning the candidate list of PEs that take part in the DF
election. election.
In addition to the events defined in the FSM in Section 3.1, the In addition to the events defined in the FSM in Section 3.1, the
following events SHALL modify the candidate PE list and trigger the following events SHALL modify the candidate PE list and trigger the
DF re-election in a PE for a given <ES,VLAN> or <ES,VLAN Bundle>. In DF re-election in a PE for a given <ES, Ethernet Tag>. In the FSM of
the FSM of Figure 3, the events below MUST trigger a transition from Figure 3, the events below MUST trigger a transition from DF_DONE to
DF_DONE to DF_CALC: DF_CALC:
i. Local AC going DOWN/UP. i. Local AC going DOWN/UP.
ii. Reception of a new Ethernet A-D per EVI update/withdraw for the ii. Reception of a new Ethernet A-D per EVI update/withdraw for the
<ES,VLAN> or <ES,VLAN Bundle>. <ES, Ethernet Tag>.
iii. Reception of a new Ethernet A-D per ES update/withdraw for the iii. Reception of a new Ethernet A-D per ES update/withdraw for the
ES. ES.
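The three AC-DF events listed above can be sketched as triggers for
the DF_DONE to DF_CALC transition. This is a partial illustration
only: the state names come from the text, while the State and
AcDfEvent classes and the on_event function are hypothetical, and the
other states and events of the FSM in Section 3.1 are deliberately
not modeled.

```python
from enum import Enum


class State(Enum):
    DF_DONE = "DF_DONE"
    DF_CALC = "DF_CALC"


class AcDfEvent(Enum):
    LOCAL_AC_CHANGE = "local AC going DOWN/UP"
    AD_PER_EVI_CHANGE = "Ethernet A-D per EVI update/withdraw"
    AD_PER_ES_CHANGE = "Ethernet A-D per ES update/withdraw"


def on_event(state, event):
    """Return the next state; only DF_DONE -> DF_CALC is modeled here."""
    if state is State.DF_DONE and isinstance(event, AcDfEvent):
        return State.DF_CALC  # candidate list changed, re-run the election
    return state
```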
5.1. AC-Influenced DF Election Capability For VLAN-Aware Bundle Services 5.1. AC-Influenced DF Election Capability For VLAN-Aware Bundle Services
The procedure described in section 5 works for VLAN-based and VLAN The procedure described in section 5 works for VLAN-based and VLAN
Bundle service interfaces since, for those service types, a PE Bundle service interfaces since, for those service types, a PE
advertises only one Ethernet A-D per EVI route per <ES,VLAN> or advertises only one Ethernet A-D per EVI route per <ES,VLAN> or
<ES,VLAN Bundle>. The withdrawal of such route means that the PE <ES,VLAN Bundle>. In Section 5, an Ethernet Tag represents a given
cannot forward traffic on that particular <ES,VLAN> or <ES,VLAN VLAN or VLAN Bundle for the purpose of DF Election. The withdrawal of
Bundle>, therefore the PE can be removed from consideration for DF. such route means that the PE cannot forward traffic on that
particular <ES,VLAN> or <ES,VLAN Bundle>, therefore the PE can be
removed from consideration for DF.
According to [RFC7432], in VLAN-aware Bundle services, the PE According to [RFC7432], in VLAN-aware Bundle services, the PE
advertises multiple Ethernet A-D per EVI routes per <ES,VLAN Bundle> advertises multiple Ethernet A-D per EVI routes per <ES,VLAN Bundle>
(one route per Ethernet Tag), while the DF Election is still (one route per Ethernet Tag), while the DF Election is still
performed per <ES,VLAN Bundle>. The withdrawal of an individual route performed per <ES,VLAN Bundle>. The withdrawal of an individual route
only indicates the unavailability of a specific AC but not only indicates the unavailability of a specific AC but not
necessarily all the ACs in the <ES,VLAN Bundle>. necessarily all the ACs in the <ES,VLAN Bundle>.
This document modifies the DF Election for VLAN-Aware Bundle services This document modifies the DF Election for VLAN-Aware Bundle services
in the following way: in the following way:
skipping to change at page 23, line 48 skipping to change at page 25, line 26
framework. In general, this framework allows the PEs that are part of framework. In general, this framework allows the PEs that are part of
the same Ethernet Segment to exchange additional information and the same Ethernet Segment to exchange additional information and
agree on the DF Election Type and Capabilities to be used. agree on the DF Election Type and Capabilities to be used.
Following the procedures in this document, the operator will minimize Following the procedures in this document, the operator will minimize
undesired situations such as unfair load-balancing, service undesired situations such as unfair load-balancing, service
disruption and traffic black-holing. Since those situations may have disruption and traffic black-holing. Since those situations may have
been purposely created by a malicious user with access to the been purposely created by a malicious user with access to the
configuration of one PE, this document also enhances the security of configuration of one PE, this document also enhances the security of
the network. Note that the network will not benefit from the new the network. Note that the network will not benefit from the new
procedures if the configuration of one of the PEs in the ES is procedures if the DF Election Alg is not consistently configured on
changed to the Default [RFC7432] DF Election. all the PEs in the ES (if there is no unanimity among all the PEs,
the DF Election Alg falls back to the Default [RFC7432] DF Election).
This behavior could be exploited by an attacker that manages to
modify the configuration of one PE in the Ethernet Segment so that
the DF Election Alg and capabilities in all the PEs in the Ethernet
Segment fall back to the Default DF Election. If that is the case,
the PEs will be exposed to the unfair load-balancing, service
disruption and black-holing that were mentioned earlier.
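The fallback behavior described above can be sketched as follows. The
function name and its input (one advertised DF Alg value per PE in
the Ethernet Segment) are hypothetical; the value 0 for the Default
DF Election follows the "DF Alg 0" naming used in this document, and
any non-zero value passed in is only a placeholder for some other
algorithm.

```python
DEFAULT_DF_ALG = 0  # this document refers to the Default DF Election as DF Alg 0


def agreed_df_alg(advertised_algs):
    """Return the DF Alg all PEs agree on, else fall back to the Default."""
    algs = set(advertised_algs)
    if len(algs) == 1:
        return algs.pop()  # unanimity among all the PEs in the ES
    return DEFAULT_DF_ALG  # any disagreement forces the Default DF Election
```

The sketch makes the exposure visible: changing the value advertised
by a single PE is enough to move every PE in the Ethernet Segment
back to the Default DF Election.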
In addition, the new framework is extensible and allows for future In addition, the new framework is extensible and allows for future
new security enhancements that are out of the scope of this document. new security enhancements that are out of the scope of this document.
Finally, since this document extends the procedures in [RFC7432], the Finally, since this document extends the procedures in [RFC7432], the
same Security Considerations described in [RFC7432] are valid for same Security Considerations described in [RFC7432] are valid for
this document. this document.
8. IANA Considerations 8. IANA Considerations
IANA is requested to: IANA is requested to:
skipping to change at page 26, line 18 skipping to change at page 27, line 50
[RFC2992] Hopps, C., "Analysis of an Equal-Cost Multi-Path [RFC2992] Hopps, C., "Analysis of an Equal-Cost Multi-Path
Algorithm", RFC 2992, DOI 10.17487/RFC2992, November 2000, Algorithm", RFC 2992, DOI 10.17487/RFC2992, November 2000,
<http://www.rfc-editor.org/info/rfc2992>. <http://www.rfc-editor.org/info/rfc2992>.
[HRW1999] Thaler, D. and C. Ravishankar, "Using Name-Based Mappings [HRW1999] Thaler, D. and C. Ravishankar, "Using Name-Based Mappings
to Increase Hit Rates", IEEE/ACM Transactions in networking Volume 6 to Increase Hit Rates", IEEE/ACM Transactions in networking Volume 6
Issue 1, February 1998, <https://www.microsoft.com/en-us/research/wp- Issue 1, February 1998, <https://www.microsoft.com/en-us/research/wp-
content/uploads/2017/02/HRW98.pdf>. content/uploads/2017/02/HRW98.pdf>.
[Knuth] Art of Computer Programming - Sorting and Searching,Vol 3
Pg. 516, Addison Wesley
10. Acknowledgments 10. Acknowledgments
The authors want to thank Sriram Venkateswaran, Laxmi Padakanti, The authors want to thank Sriram Venkateswaran, Laxmi Padakanti,
Ranganathan Boovaraghavan, Tamas Mondal, Sami Boutros, Jakob Heitz, Ranganathan Boovaraghavan, Tamas Mondal, Sami Boutros, Jakob Heitz,
Mrinmoy Ghosh, Leo Mermelstein, Mankamana Mishra, Anoop Ghanwani and Mrinmoy Ghosh, Leo Mermelstein, Mankamana Mishra, Anoop Ghanwani and
Samir Thoria for their review and contributions. Special thanks to Samir Thoria for their review and contributions. Special thanks to
Stephane Litkowski for his thorough review and detailed Stephane Litkowski for his thorough review and detailed
contributions. contributions.
11. Contributors 11. Contributors
 End of changes. 37 change blocks. 
82 lines changed or deleted 147 lines changed or added