draft-ietf-bess-evpn-df-election-framework-00.txt   draft-ietf-bess-evpn-df-election-framework-01.txt 
skipping to change at page 1, line 14 skipping to change at page 1, line 14
Internet Draft Nokia Internet Draft Nokia
S. Mohanty, Ed. S. Mohanty, Ed.
Intended status: Standards Track A. Sajassi Intended status: Standards Track A. Sajassi
Cisco Cisco
J. Drake J. Drake
Juniper Juniper
K. Nagaraj K. Nagaraj
S. Sathappan S. Sathappan
Nokia Nokia
Expires: September 6, 2018 March 5, 2018 Expires: October 14, 2018 April 12, 2018
Framework for EVPN Designated Forwarder Election Extensibility Framework for EVPN Designated Forwarder Election Extensibility
draft-ietf-bess-evpn-df-election-framework-00 draft-ietf-bess-evpn-df-election-framework-01
Abstract Abstract
The Designated Forwarder (DF) in EVPN networks is the PE responsible The Designated Forwarder (DF) in EVPN networks is the PE responsible
for sending broadcast, unknown unicast and multicast (BUM) traffic to for sending broadcast, unknown unicast and multicast (BUM) traffic to
a multi-homed CE, on a given VLAN on a particular Ethernet Segment a multi-homed CE, on a given VLAN on a particular Ethernet Segment
(ES). The DF is selected out of a list of candidate PEs that (ES). The DF is selected out of a list of candidate PEs that
advertise the same Ethernet Segment Identifier (ESI) to the EVPN advertise the same Ethernet Segment Identifier (ESI) to the EVPN
network. By default, EVPN uses a DF Election algorithm referred to as network. By default, EVPN uses a DF Election algorithm referred to as
"Service Carving" and it is based on a modulus function (V mod N) "Service Carving" and it is based on a modulus function (V mod N)
skipping to change at page 2, line 18 skipping to change at page 2, line 18
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html http://www.ietf.org/shadow.html
This Internet-Draft will expire on September 6, 2018. This Internet-Draft will expire on October 14, 2018.
Copyright Notice Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 4, line 26 skipping to change at page 4, line 26
o HRW - Highest Random Weight o HRW - Highest Random Weight
o VID and CE-VID - VLAN Identifier and Customer Equipment VLAN o VID and CE-VID - VLAN Identifier and Customer Equipment VLAN
Identifier. Identifier.
o Ethernet Tag - used to represent a Broadcast Domain that is o Ethernet Tag - used to represent a Broadcast Domain that is
configured on a given ES for the purpose of DF election. Note that configured on a given ES for the purpose of DF election. Note that
any of the following may be used to represent a Broadcast Domain: any of the following may be used to represent a Broadcast Domain:
VIDs (including double Q-in-Q tags), configured IDs, VNI, VIDs (including double Q-in-Q tags), configured IDs, VNI,
normalized VID, I-SIDs, etc., as along the representation of the normalized VID, I-SIDs, etc., as long as the representation of the
broadcast domains is configured consistently across the multi-homed broadcast domains is configured consistently across the multi-homed
PEs attached to that ES. PEs attached to that ES.
o DF Election Procedure and DF Algorithm - The Designated Forwarder o DF Election Procedure and DF Algorithm - The Designated Forwarder
Election Procedure or simply DF Election, refers to the process in Election Procedure or simply DF Election, refers to the process in
its entirety, including the discovery of the PEs in the ES, the its entirety, including the discovery of the PEs in the ES, the
creation and maintenance of the PE candidate list and the selection creation and maintenance of the PE candidate list and the selection
of a PE . The Designated Forwarder Algorithm is just a component of of a PE. The Designated Forwarder Algorithm is just a component of
the DF Election Procedure and strictly refers to the selection of a the DF Election Procedure and strictly refers to the selection of a
PE for a given <ES,Ethernet Tag>. PE for a given <ES,Ethernet Tag>.
This document also assumes familiarity with the terminology of This document also assumes familiarity with the terminology of
[RFC7432]. [RFC7432].
2. Introduction 2. Introduction
2.1. Default Designated Forwarder (DF) Election in EVPN 2.1. Default Designated Forwarder (DF) Election in EVPN
skipping to change at page 6, line 11 skipping to change at page 6, line 11
algorithm in which each participating PE independently and algorithm in which each participating PE independently and
unambiguously selects one of the participating PEs as the DF, and the unambiguously selects one of the participating PEs as the DF, and the
result should be unanimously in agreement. result should be unanimously in agreement.
The default procedure for DF election defined by [RFC7432] at the The default procedure for DF election defined by [RFC7432] at the
granularity of (ESI,EVI) is referred to as "service carving". In this granularity of (ESI,EVI) is referred to as "service carving". In this
document, service carving or default DF Election algorithm is used document, service carving or default DF Election algorithm is used
indistinctly. With service carving, it is possible to elect multiple indistinctly. With service carving, it is possible to elect multiple
DFs per Ethernet Segment (one per EVI) in order to perform load- DFs per Ethernet Segment (one per EVI) in order to perform load-
balancing of traffic destined to a given Segment. The objective is balancing of traffic destined to a given Segment. The objective is
that the load-balancing procedures should carve up the BDspace among that the load-balancing procedures should carve up the BD space among
the redundant PE nodes evenly, in such a way that every PE is the DF the redundant PE nodes evenly, in such a way that every PE is the DF
for a disjoint set of EVIs. for a disjoint set of EVIs.
The DF Election algorithm as described in [RFC7432] (Section 8.5) is The DF Election algorithm as described in [RFC7432] (Section 8.5) is
based on a modulus operation. The PEs to which the ES (for which DF based on a modulus operation. The PEs to which the ES (for which DF
election is to be carried out per VLAN) is multi-homed from an election is to be carried out per VLAN) is multi-homed form an
ordered (ordinal) list in ascending order of the PE IP address ordered (ordinal) list in ascending order of the PE IP address
values. For example, there are N PEs: PE0, PE1,... PEN-1 ranked as values. For example, there are N PEs: PE0, PE1,... PEN-1 ranked as
per increasing IP addresses in the ordinal list; then for each VLAN per increasing IP addresses in the ordinal list; then for each VLAN
with Ethernet Tag V, configured on the Ethernet Segment ES1, PEx is with Ethernet Tag V, configured on the Ethernet Segment ES1, PEx is
the DF for VLAN V on ES1 when x equals (V mod N). In the case of the DF for VLAN V on ES1 when x equals (V mod N). In the case of
VLAN-Bundle only the lowest VLAN is used. In the case when the VLAN-Bundle only the lowest VLAN is used. In the case when the
planned density is high (meaning there are a significant number of planned density is high (meaning there are a significant number of
VLANs and the Ethernet Tags are uniformly distributed), the thinking VLANs and the Ethernet Tags are uniformly distributed), the thinking
is that the DF Election will be spread across the PEs hosting that is that the DF Election will be spread across the PEs hosting that
Ethernet Segment and good service carving can be achieved. Ethernet Segment and good service carving can be achieved.
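The modulus-based selection described above can be sketched as follows. This is only an illustration under stated assumptions: the function name default_df and the example addresses are hypothetical and do not appear in either draft revision, and PE IP addresses are compared as unsigned integers.

```python
import ipaddress


def default_df(pe_addresses, ethernet_tag):
    """Return the DF for one Ethernet Tag using the (V mod N) rule.

    pe_addresses: IP addresses (strings) of the PEs attached to the ES.
    ethernet_tag: the Ethernet Tag V (for a VLAN-Bundle, the lowest VLAN).
    """
    # Build the ordinal list: ascending order of PE IP address values.
    ordered = sorted(pe_addresses, key=lambda a: int(ipaddress.ip_address(a)))
    # PEx is the DF when x equals (V mod N).
    return ordered[ethernet_tag % len(ordered)]
```

For example, with the three hypothetical PEs 192.0.2.1, 192.0.2.2 and 192.0.2.3 and Ethernet Tag 10, the sketch selects 192.0.2.2, because 10 mod 3 = 1.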
skipping to change at page 6, line 50 skipping to change at page 6, line 50
this document specifies that there will be multiple DFs, one for each this document specifies that there will be multiple DFs, one for each
BD configured in that EVI. BD configured in that EVI.
2.2. Problem Statement 2.2. Problem Statement
This section describes some potential issues on the default DF This section describes some potential issues on the default DF
Election algorithm. Election algorithm.
2.2.1. Unfair Load-Balancing and Service Disruption 2.2.1. Unfair Load-Balancing and Service Disruption
There are three fundamental problems with the current DF Election There are three fundamental problems with the current default DF
algorithm. Election algorithm.
1- First, the algorithm will not perform well when the Ethernet Tag 1- First, the algorithm will not perform well when the Ethernet Tag
follows a non-uniform distribution, for instance when the Ethernet follows a non-uniform distribution, for instance when the Ethernet
Tags are all even or all odd. In such a case let us assume that Tags are all even or all odd. In such a case let us assume that
the ES is multi-homed to two PEs; all the VLANs will only pick one the ES is multi-homed to two PEs; all the VLANs will only pick one
of the PEs as the DF. This is very sub-optimal. It defeats the of the PEs as the DF. This is very sub-optimal. It defeats the
purpose of service carving as the DFs are not really evenly spread purpose of service carving as the DFs are not really evenly spread
across. In this particular case, in fact one of the PEs does not across. In this particular case, in fact one of the PEs does not
get elected at all as the DF, so it does not participate in the DF get elected at all as the DF, so it does not participate in the DF
responsibilities at all. Consider another example where referring responsibilities at all. Consider another example where referring
skipping to change at page 8, line 48 skipping to change at page 8, line 48
a) A given individual Attachment Circuit (AC) defined in an ES is a) A given individual Attachment Circuit (AC) defined in an ES is
accidentally shutdown or even not provisioned yet (hence the accidentally shutdown or even not provisioned yet (hence the
Attachment Circuit Status - ACS - is DOWN), while the ES is Attachment Circuit Status - ACS - is DOWN), while the ES is
operationally active (since the ES route is active). operationally active (since the ES route is active).
b) A given MAC-VRF - with a defined ES - is shutdown or not b) A given MAC-VRF - with a defined ES - is shutdown or not
provisioned yet, while the ES is operationally active (since the provisioned yet, while the ES is operationally active (since the
ES route is active). In this case, the ACS of all the ACs defined ES route is active). In this case, the ACS of all the ACs defined
in that MAC-VRF is considered to be DOWN. in that MAC-VRF is considered to be DOWN.
Neither (a) nor (b) will trigger the DF re-election on the remote PEs Neither (a) nor (b) will trigger the DF re-election on the remote
for a given ES since the ACS is not taken into account in the DF multi-homed PEs for a given ES since the ACS is not taken into
election procedures. While the ACS is used as a DF election account in the DF election procedures. While the ACS is used as a DF
tie-breaker and trigger in VPLS multi-homing procedures [VPLS-MH], election tie-breaker and trigger in VPLS multi-homing procedures
there is no procedure defined in EVPN [RFC7432] to trigger the DF re- [VPLS-MH], there is no procedure defined in EVPN [RFC7432] to trigger
election based on the ACS change on the DF. the DF re-election based on the ACS change on the DF.
Figure 2 illustrates the described issue with an example. Figure 2 illustrates the described issue with an example.
+---+ +---+
|CE4| |CE4|
+---+ +---+
| |
PE4 | PE4 |
+-----+-----+ +-----+-----+
+---------------| +-----+ |---------------+ +---------------| +-----+ |---------------+
skipping to change at page 10, line 45 skipping to change at page 10, line 45
Election procedures. In order to address those issues, this document Election procedures. In order to address those issues, this document
describes a new DF Election algorithm and a new capability that can describes a new DF Election algorithm and a new capability that can
influence the DF Election result: influence the DF Election result:
o The new DF Election algorithm is referred to as "Highest Random o The new DF Election algorithm is referred to as "Highest Random
Weight" (HRW). The HRW procedures are described in section 4. Weight" (HRW). The HRW procedures are described in section 4.
o The new DF Election capability is referred to as "AC-Influenced DF o The new DF Election capability is referred to as "AC-Influenced DF
Election" (AC-DF). The AC-DF procedures are described in section 5. Election" (AC-DF). The AC-DF procedures are described in section 5.
o Both, HRW and AC-DF MAY be used independently or simultaneously. o HRW and AC-DF mechanisms are independent of each other. Therefore,
The AC-DF capability MAY be used with the default DF Election a PE MAY support either HRW or AC-DF independently or MAY support
algorithm too. both of them together. A PE MAY also support AC-DF capability along
with the default DF election algorithm per [RFC7432].
o In general, a DF Election Type refers to the type of DF election
algorithm that takes a number of parameters as input and determines
the DF PE. A DF Election capability refers to an additional feature
that can be executed along with the DF election algorithm, such as
modifying the inputs (or list of candidate PEs) before the DF
Election algorithm chooses the DF.
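The distinction between a DF Election Type (the algorithm) and a DF Election capability (a step that adjusts the inputs first) can be sketched as below. This is a hedged illustration, not the procedure of either draft: the names modulo_algorithm, ac_filter and elect are hypothetical, and the filter only assumes that a capability may remove candidate PEs (here, those whose Attachment Circuit is DOWN for the given Ethernet Tag) before the algorithm runs.

```python
import ipaddress


def modulo_algorithm(candidates, ethernet_tag):
    """Stand-in DF Election algorithm: the default (V mod N) rule."""
    ordered = sorted(candidates, key=lambda ip: int(ipaddress.ip_address(ip)))
    return ordered[ethernet_tag % len(ordered)]


def ac_filter(acs_up):
    """Build a hypothetical capability from a {(pe, tag): bool} status map.

    PEs with no entry in the map are assumed to have their AC up.
    """
    def capability(candidates, ethernet_tag):
        return [pe for pe in candidates if acs_up.get((pe, ethernet_tag), True)]
    return capability


def elect(candidates, ethernet_tag, algorithm, capabilities=()):
    """Apply each capability to the candidate list, then run the algorithm."""
    for capability in capabilities:
        candidates = capability(candidates, ethernet_tag)
    # No candidate left means no DF can be chosen in this sketch.
    return algorithm(candidates, ethernet_tag) if candidates else None
```

Keeping the capability as a separate step means the same filter could be combined with any algorithm passed to elect, which mirrors the statement that the two mechanisms are independent of each other.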
In addition, this document defines a way to indicate the support of In addition, this document defines a way to indicate the support of
HRW and/or AC-DF along with the EVPN ES routes advertised for a given HRW and/or AC-DF along with the EVPN ES routes advertised for a given
ES. Refer to section 3.2 for more details. ES. Refer to section 3.2 for more details.
3. Designated Forwarder Election Protocol and BGP Extensions 3. Designated Forwarder Election Protocol and BGP Extensions
This section describes the BGP extensions required to support the new This section describes the BGP extensions required to support the new
DF Election procedures. In addition, since the specification in EVPN DF Election procedures. In addition, since the specification in EVPN
[RFC7432] does leave several questions open as to the precise final [RFC7432] does leave several questions open as to the precise final
skipping to change at page 17, line 30 skipping to change at page 17, line 30
here. here.
4.2. HRW Algorithm for EVPN DF Election 4.2. HRW Algorithm for EVPN DF Election
The applicability of HRW to DF Election is described here. Let DF(v) The applicability of HRW to DF Election is described here. Let DF(v)
denote the Designated Forwarder and BDF(v) the Backup Designated denote the Designated Forwarder and BDF(v) the Backup Designated
forwarder for the Ethernet Tag v, where v is the VLAN, Si is the IP forwarder for the Ethernet Tag v, where v is the VLAN, Si is the IP
address of server i, Es denotes the Ethernet Segment Identifier and address of server i, Es denotes the Ethernet Segment Identifier and
weight is a pseudo-random function of v and Si. weight is a pseudo-random function of v and Si.
Note that while the DF election algorithm in [RFC7432] uses PE
address and Ethernet Tag as inputs, this document uses PE address,
ESI, and Ethernet Tag as inputs. This is because if the same set of
PEs are multi-homed to the same set of ESes, then the DF election
algorithm used in [RFC7432] would result in the same PE being elected
DF for the same set of broadcast domains on each ES, which can have
adverse side-effects on both load balancing and redundancy. Including
ESI in the DF election algorithm introduces additional entropy which
significantly reduces the probability of the same PE being elected DF
for the same set of broadcast domains on each ES. Therefore, the ESI
value in the Weight function below SHOULD be set to that of
corresponding ES. The ESI value MAY be set to all 0's in the Weight
function below if the operator chooses so.
In case of a VLAN-Bundle service, v denotes the lowest VLAN similar In case of a VLAN-Bundle service, v denotes the lowest VLAN similar
to the 'lowest VLAN in bundle' logic of [RFC7432]. to the 'lowest VLAN in bundle' logic of [RFC7432].
1. DF(v) = Si: Weight(v, Es, Si) >= Weight(v, Es, Sj), for all j. In 1. DF(v) = Si: Weight(v, Es, Si) >= Weight(v, Es, Sj), for all j. In
case of a tie, choose the PE whose IP address is numerically the case of a tie, choose the PE whose IP address is numerically the
least. Note 0 <= i,j < Number of PEs in the redundancy group. least. Note 0 <= i,j < Number of PEs in the redundancy group.
2. BDF(v) = Sk: Weight(v, Es, Si) >= Weight(v, Es, Sk) and Weight(v, 2. BDF(v) = Sk: Weight(v, Es, Si) >= Weight(v, Es, Sk) and Weight(v,
Es, Sk) >= Weight(v, Es, Sj). In case of tie choose the PE whose IP Es, Sk) >= Weight(v, Es, Sj). In case of tie choose the PE whose IP
address is numerically the least. address is numerically the least.
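The two selection rules above can be sketched as follows. This is a minimal illustration under stated assumptions: the Weight function is not defined in the text shown here, so the sketch substitutes a truncated SHA-256 digest over (v, Es, Si) purely as a stand-in, and the names weight and hrw_elect are hypothetical.

```python
import hashlib
import ipaddress


def weight(v, esi, pe_ip):
    """Stand-in Weight(v, Es, Si); the hash choice is an assumption."""
    data = v.to_bytes(4, "big") + esi + ipaddress.ip_address(pe_ip).packed
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def hrw_elect(v, esi, pe_ips):
    """Return (DF, BDF) for Ethernet Tag v on the ES identified by esi.

    esi: the Ethernet Segment Identifier as bytes.
    pe_ips: IP addresses (strings) of the PEs in the redundancy group.
    """
    # Highest weight first; ties go to the numerically lowest IP address.
    ranked = sorted(
        pe_ips,
        key=lambda ip: (-weight(v, esi, ip), int(ipaddress.ip_address(ip))),
    )
    df = ranked[0]
    # The BDF is the next PE in the ranking, if there is one.
    bdf = ranked[1] if len(ranked) > 1 else None
    return df, bdf
```

Because each PE is ranked independently of the others, removing a PE that is not the DF leaves the DF unchanged in this sketch, and removing the DF promotes the BDF.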
skipping to change at page 23, line 20 skipping to change at page 23, line 32
[RFC8214] Boutros, S., Sajassi, A., Salam, S., Drake, J., and J. [RFC8214] Boutros, S., Sajassi, A., Salam, S., Drake, J., and J.
Rabadan, "Virtual Private Wire Service Support in Ethernet VPN", RFC Rabadan, "Virtual Private Wire Service Support in Ethernet VPN", RFC
8214, DOI 10.17487/RFC8214, August 2017, <https://www.rfc- 8214, DOI 10.17487/RFC8214, August 2017, <https://www.rfc-
editor.org/info/rfc8214>. editor.org/info/rfc8214>.
[HRW1999] Thaler, D. and C. Ravishankar, "Using Name-Based Mappings [HRW1999] Thaler, D. and C. Ravishankar, "Using Name-Based Mappings
to Increase Hit Rates", IEEE/ACM Transactions in networking Volume 6 to Increase Hit Rates", IEEE/ACM Transactions in networking Volume 6
Issue 1, February 1998. Issue 1, February 1998.
[I-D.ietf-idr-extcomm-iana] Rosen, E. and Y. Rekhter, "IANA [RFC7153] Rosen, E. and Y. Rekhter, "IANA Registries for BGP
Registries for BGP Extended Communities", draft-ietf-idr-extcomm- Extended Communities", RFC 7153, DOI 10.17487/RFC7153, March 2014,
iana-02 (work in progress), December 2013. <https://www.rfc-editor.org/info/rfc7153>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March
1997, <http://www.rfc-editor.org/info/rfc2119>. 1997, <http://www.rfc-editor.org/info/rfc2119>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017,
<https://www.rfc-editor.org/info/rfc8174>. <https://www.rfc-editor.org/info/rfc8174>.
[RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
 End of changes. 12 change blocks. 
21 lines changed or deleted 43 lines changed or added