draft-ietf-pim-drlb-15.txt   rfc8775.txt 
Network Working Group Y. Cai Internet Engineering Task Force (IETF) Y. Cai
Internet-Draft H. Ou Request for Comments: 8775 H. Ou
Intended status: Standards Track Alibaba Group Category: Standards Track Alibaba Group
Expires: July 6, 2020 S. Vallepalli ISSN: 2070-1721 S. Vallepalli
M. Mishra M. Mishra
S. Venaas S. Venaas
Cisco Systems, Inc. Cisco Systems, Inc.
A. Green A. Green
British Telecom British Telecom
January 3, 2020 April 2020
PIM Designated Router Load Balancing PIM Designated Router Load Balancing
draft-ietf-pim-drlb-15
Abstract Abstract
On a multi-access network, one of the PIM-SM (PIM Sparse Mode) On a multi-access network, one of the PIM-SM (PIM Sparse Mode)
routers is elected as a Designated Router. One of the routers is elected as a Designated Router. One of the
responsibilities of the Designated Router is to track local multicast responsibilities of the Designated Router is to track local multicast
listeners and forward data to these listeners if the group is listeners and forward data to these listeners if the group is
operating in PIM-SM. This document specifies a modification to the operating in PIM-SM. This document specifies a modification to the
PIM-SM protocol that allows more than one of the PIM-SM routers to PIM-SM protocol that allows more than one of the PIM-SM routers to
take on this responsibility so that the forwarding load can be take on this responsibility so that the forwarding load can be
distributed among multiple routers. distributed among multiple routers.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This is an Internet Standards Track document.
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months This document is a product of the Internet Engineering Task Force
and may be updated, replaced, or obsoleted by other documents at any (IETF). It represents the consensus of the IETF community. It has
time. It is inappropriate to use Internet-Drafts as reference received public review and has been approved for publication by the
material or to cite them other than as "work in progress." Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 7841.
This Internet-Draft will expire on July 6, 2020. Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
https://www.rfc-editor.org/info/rfc8775.
Copyright Notice Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 2. Terminology
3. Applicability . . . . . . . . . . . . . . . . . . . . . . . . 5 3. Applicability
4. Functional Overview . . . . . . . . . . . . . . . . . . . . . 5 4. Functional Overview
4.1. GDR Candidates . . . . . . . . . . . . . . . . . . . . . 6 4.1. GDR Candidates
5. Protocol Specification . . . . . . . . . . . . . . . . . . . 7 5. Protocol Specification
5.1. Hash Mask and Hash Algorithm . . . . . . . . . . . . . . 7 5.1. Hash Mask and Hash Algorithm
5.2. Modulo Hash Algorithm . . . . . . . . . . . . . . . . . . 8 5.2. Modulo Hash Algorithm
5.2.1. Modulo Hash Algorithm Examples . . . . . . . . . . . 9 5.2.1. Modulo Hash Algorithm Examples
5.2.2. Limitations . . . . . . . . . . . . . . . . . . . . . 10 5.2.2. Limitations
5.3. PIM Hello Options . . . . . . . . . . . . . . . . . . . . 11 5.3. PIM Hello Options
5.3.1. PIM DR Load Balancing Capability (DRLB-Cap) Hello 5.3.1. PIM DR Load-Balancing Capability (DRLB-Cap) Hello
Option . . . . . . . . . . . . . . . . . . . . . . . 11 Option
5.3.2. PIM DR Load Balancing List (DRLB-List) Hello Option . 12 5.3.2. PIM DR Load-Balancing List (DRLB-List) Hello Option
5.4. PIM DR Operation . . . . . . . . . . . . . . . . . . . . 13 5.4. PIM DR Operation
5.5. PIM GDR Candidate Operation . . . . . . . . . . . . . . . 14 5.5. PIM GDR Candidate Operation
5.6. DRLB-List Hello Option Processing . . . . . . . . . . . . 14 5.6. DRLB-List Hello Option Processing
5.7. PIM Assert Modification . . . . . . . . . . . . . . . . . 15 5.7. PIM Assert Modification
5.8. Backward Compatibility . . . . . . . . . . . . . . . . . 16 5.8. Backward Compatibility
6. Operational Considerations . . . . . . . . . . . . . . . . . 16 6. Operational Considerations
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17 7. IANA Considerations
7.1. Initial registry . . . . . . . . . . . . . . . . . . . . 17 7.1. Initial Registry
7.2. Assignment of new Hash Algorithms . . . . . . . . . . . . 17 7.2. Assignment of New Hash Algorithms
8. Security Considerations . . . . . . . . . . . . . . . . . . . 17 8. Security Considerations
9. Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 18 9. References
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 18 9.1. Normative References
10.1. Normative References . . . . . . . . . . . . . . . . . . 18 9.2. Informative References
10.2. Informative References . . . . . . . . . . . . . . . . . 19 Acknowledgements
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 19 Authors' Addresses
1. Introduction 1. Introduction
On a multi-access LAN, such as an Ethernet, with one or more PIM-SM On a multi-access LAN (such as an Ethernet) with one or more PIM-SM
(PIM Sparse Mode) [RFC7761] routers, one of the PIM-SM routers is (PIM Sparse Mode) [RFC7761] routers, one of the PIM-SM routers is
elected as a Designated Router (DR). The PIM DR has two elected as a Designated Router (DR). The PIM DR has two
responsibilities in the PIM-SM protocol. For any active sources on a responsibilities in the PIM-SM protocol. For any active sources on a
LAN, the PIM DR is responsible for registering with the Rendezvous LAN, the PIM DR is responsible for registering with the Rendezvous
Point (RP) if the group is operating in PIM-SM. Also, the PIM DR is Point (RP) if the group is operating in PIM-SM. Also, the PIM DR is
responsible for tracking local multicast listeners and forwarding to responsible for tracking local multicast listeners and forwarding
these listeners if the group is operating in PIM-SM. data to these listeners if the group is operating in PIM-SM.
Consider the following LAN in Figure 1: Consider the following LAN in Figure 1:
(core networks) (core networks)
| | | | | |
| | | | | |
R1 R2 R3 R1 R2 R3
| | | | | |
----(LAN)---- ----(LAN)----
| |
| |
(many receivers) (many receivers)
Figure 1: LAN with receivers Figure 1: LAN with Receivers
Assume R1 is elected as the DR. According to the PIM-SM protocol, R1 Assume R1 is elected as the DR. According to the PIM-SM protocol, R1
will be responsible for forwarding traffic to that LAN on behalf of will be responsible for forwarding traffic to that LAN on behalf of
all local members. In addition to keeping track of membership all local members. In addition to keeping track of membership
reports, R1 is also responsible for initiating the creation of source reports, R1 is also responsible for initiating the creation of source
and/or shared trees towards the senders or the RPs. The membership and/or shared trees towards the senders or the RPs. The membership
reports would be IGMP or MLD messages. This applies to any versions reports would be IGMP or Multicast Listener Discovery (MLD) messages.
of the IGMP and MLD protocols. The most recent versions are IGMPv3 This applies to any versions of the IGMP and MLD protocols. The most
[RFC3376] and MLDv2 [RFC3810]. recent versions are IGMPv3 [RFC3376] and MLDv2 [RFC3810].
Having a single router acting as DR and being responsible for data Having a single router acting as DR and being responsible for data-
plane forwarding leads to several issues. One of the issues is that plane forwarding leads to several issues. One of the issues is that
the aggregated bandwidth will be limited to what R1 can handle with the aggregated bandwidth will be limited to what R1 can handle with
regards to capacity of incoming links, the interface on the LAN, and regards to capacity of incoming links, the interface on the LAN, and
total forwarding capacity. It is very common that a LAN consists of total forwarding capacity. It is very common that a LAN consists of
switches that run IGMP/MLD or PIM snooping [RFC4541]. This allows switches that run IGMP/MLD or PIM snooping [RFC4541]. This allows
the forwarding of multicast packets to be restricted only to segments the forwarding of multicast packets to be restricted only to segments
leading to receivers that have indicated their interest in multicast leading to receivers that have indicated their interest in multicast
groups using either IGMP or MLD. The emergence of the switched groups using either IGMP or MLD. The emergence of the switched
Ethernet allows the aggregated bandwidth to exceed, sometimes by a Ethernet allows the aggregated bandwidth to exceed, sometimes by a
large number, that of a single link. For example, let us modify large number, that of a single link. For example, let us modify
skipping to change at page 4, line 18 skipping to change at line 148
R1 R2 R3 R1 R2 R3
| | | | | |
+=gi1===gi2===gi3=+ +=gi1===gi2===gi3=+
+ + + +
+ switch + + switch +
+ + + +
+=gi4===gi5===gi6=+ +=gi4===gi5===gi6=+
| | | | | |
H1 H2 H3 H1 H2 H3
Figure 2: LAN with Ethernet Switch Figure 2: LAN with Ethernet Switch
Let us assume that each individual link is a Gigabit Ethernet. Each Let us assume that each individual link is a Gigabit Ethernet. Each
router, R1, R2 and R3, and the switch have enough forwarding capacity router (R1, R2, and R3) and the switch have enough forwarding
to handle hundreds of Gigabits of data. capacity to handle hundreds of gigabits of data.
Let us further assume that each of the hosts requests 500 Mbps of Let us further assume that each of the hosts requests 500 Mbps of
unique multicast data. This totals to 1.5 Gbps of data, which is unique multicast data. This totals to 1.5 Gbps of data, which is
less than what each switch or the combined uplink bandwidth across less than what each switch or the combined uplink bandwidth across
the routers can handle, even under failure of a single router. the routers can handle, even under failure of a single router.
On the other hand, the link between R1 and switch, via port gi1, can On the other hand, the link between R1 and switch, via port gi1, can
only handle a throughput of 1Gbps. And if R1 is the only DR (the PIM only handle a throughput of 1 Gbps. And if R1 is the only DR (the
DR elected using the procedure defined by [RFC7761]) at least 500 PIM DR elected using the procedure defined by [RFC7761]), at least
Mbps worth of data will be lost because the only link that can be 500 Mbps worth of data will be lost because the only link that can be
used to draw the traffic from the routers to the switch is via gi1. used to draw the traffic from the routers to the switch is via gi1.
In other words, the entire network's throughput is limited by the In other words, the entire network's throughput is limited by the
single connection between the PIM DR and the switch (or LAN as in single connection between the PIM DR and the switch (or LAN, as in
Figure 1). Figure 1).
Another important issue is related to failover. If R1 is the only Another important issue is related to failover. If R1 is the only
forwarder on a shared LAN, when R1 goes out of service, multicast forwarder on a shared LAN, when R1 goes out of service, multicast
forwarding for the entire LAN has to be rebuilt by the newly elected forwarding for the entire LAN has to be rebuilt by the newly elected
PIM DR. However, if there were a way that allowed multiple routers PIM DR. However, if there were a way that allowed multiple routers
to forward to the LAN for different groups, failure of one of the to forward to the LAN for different groups, failure of one of the
routers would only lead to disruption to a subset of the flows, routers would only lead to disruption to a subset of the flows,
therefore improving the overall resilience of the network. therefore improving the overall resilience of the network.
This document specifies a modification to the PIM-SM protocol that This document specifies a modification to the PIM-SM protocol that
allows more than one of these routers, called Group Designated allows more than one of these routers, called Group Designated
Routers (GDR) to be selected so that the forwarding load can be Routers (GDRs), to be selected so that the forwarding load can be
distributed among a number of routers. distributed among a number of routers.
2. Terminology 2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP "OPTIONAL" in this document are to be interpreted as described in
14 [RFC2119] [RFC8174] when, and only when, they appear in all BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here. capitals, as shown here.
With respect to PIM-SM, this document follows the terminology that With respect to PIM-SM, this document follows the terminology that
has been defined in [RFC7761]. has been defined in [RFC7761].
This document also introduces the following new acronyms: This document also introduces the following new acronyms:
o GDR: Group Designated Router. For each multicast flow, either a GDR: Group Designated Router. For each multicast flow, either a
(*,G) for Any-Source Multicast (ASM), or an (S,G) for Source- (*,G) for Any-Source Multicast (ASM) or an (S,G) for Source-
Specific Multicast (SSM) [RFC4607], a Hash Algorithm (described Specific Multicast (SSM) [RFC4607], a hash algorithm (described
below) is used to select one of the routers as a GDR. The GDR is below) is used to select one of the routers as a GDR. The GDR is
responsible for initiating the forwarding tree building process responsible for initiating the forwarding tree building process
for the corresponding multicast flow. for the corresponding multicast flow.
o GDR Candidate: a router that has the potential to become a GDR. GDR Candidate: a router that has the potential to become a GDR.
There might be multiple GDR Candidates on a LAN, but only one can There might be multiple GDR Candidates on a LAN, but only one can
become the GDR for a specific multicast flow. become the GDR for a specific multicast flow.
3. Applicability 3. Applicability
The extension specified in this document applies to PIM-SM routers The extension specified in this document applies to PIM-SM routers
acting as last hop routers (there are directly connected receivers). acting as last-hop routers (there are directly connected receivers).
It does not alter the behavior of a PIM DR, or any other routers, on It does not alter the behavior of a PIM DR or any other routers on
the first hop network (directly connected sources). This is because the first-hop network (directly connected sources). This is because
the source tree is built using the IP address of the sender, not the the source tree is built using the IP address of the sender, not the
IP address of the PIM DR that sends PIM registers towards the RP. IP address of the PIM DR that sends PIM registers towards the RP.
The load balancing between first hop routers can be achieved The load balancing between first-hop routers can be achieved
naturally if an IGP provides equal cost multiple paths (which it naturally if an IGP provides equal cost multiple paths (which it
usually does in practice). Also distributing the load to do source usually does in practice). Also, distributing the load to do source
registration does not justify the additional complexity required to registration does not justify the additional complexity required to
support it. support it.
4. Functional Overview 4. Functional Overview
In the PIM DR election as defined in [RFC7761], when multiple routers In the PIM DR election as defined in [RFC7761], when multiple routers
are connected to a multi-access LAN (for example, an Ethernet), one are connected to a multi-access LAN (for example, an Ethernet), one
of them is elected to act as PIM DR. The PIM DR is responsible for of them is elected to act as PIM DR. The PIM DR is responsible for
sending local Join/Prune messages towards the RP or source. In order sending local Join/Prune messages towards the RP or source. In order
to elect the PIM DR, each PIM router on the LAN examines the received to elect the PIM DR, each PIM router on the LAN examines the received
PIM Hello messages and compares its own DR priority and IP address PIM Hello messages and compares its own DR priority and IP address
with those of its neighbors. The router with the highest DR priority with those of its neighbors. The router with the highest DR priority
is the PIM DR. If there are multiple such routers, their IP is the PIM DR. If there are multiple such routers, their IP
addresses are used as the tie-breaker, as described in [RFC7761]. addresses are used as the tiebreaker, as described in [RFC7761].
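The election rule just described can be sketched as follows. This is a minimal illustration rather than the full RFC 7761 procedure: it assumes every neighbor advertises a DR priority and that the tiebreaker prefers the numerically highest IP address. The `Neighbor` type, the `elect_dr` function, and the sample priorities are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Neighbor:
    address: str      # primary IP address seen in the PIM Hello
    dr_priority: int  # value carried in the DR Priority Hello Option


def elect_dr(routers):
    """Pick the highest DR priority; break ties on the highest IP address."""
    return max(
        routers,
        key=lambda r: (r.dr_priority, int(ipaddress.ip_address(r.address))),
    )


# Illustrative neighbors only; the priorities are made up for this sketch.
routers = [
    Neighbor("203.0.113.1", 10),
    Neighbor("203.0.113.2", 10),
    Neighbor("203.0.113.3", 5),
]
dr = elect_dr(routers)
```

With these sample values, two routers share the highest priority, so the address comparison decides the outcome.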
In order to share forwarding load among last hop routers, besides the In order to share forwarding load among last-hop routers, besides the
normal PIM DR election, one or more GDRs are elected on the multi- normal PIM DR election, one or more GDRs are elected on the multi-
access LAN. There is only one PIM DR on the multi-access LAN, but access LAN. There is only one PIM DR on the multi-access LAN, but
there might be multiple GDR Candidates. there might be multiple GDR Candidates.
For each multicast flow, that is, (*,G) for ASM and (S,G) for SSM, a For each multicast flow, that is, (*,G) for ASM and (S,G) for SSM, a
Hash Algorithm [Section 5.1] is used to select one of the routers to hash algorithm (Section 5.1) is used to select one of the routers to
be the GDR. The new DR Load Balancing Capability (DRLB-Cap) PIM be the GDR. The new DR Load-Balancing Capability (DRLB-Cap) PIM
Hello Option is used to announce the Capability as well as the Hash Hello Option is used to announce the Capability, as well as the hash
Algorithm type. Routers with the new DRLB-Cap Option advertised in algorithm type. Routers with the new DRLB-Cap Option advertised in
their PIM Hello, using the same GDR election Hash Algorithm and the their PIM Hello, using the same GDR election hash algorithm and the
same DR priority as the PIM DR, are considered as GDR Candidates. same DR priority as the PIM DR, are considered as GDR Candidates.
Hash Masks are defined for Source, Group and RP separately, in order Hash masks are defined for Source, Group, and RP, separately, in
to handle PIM ASM/SSM. The masks, as well as a sorted list of GDR order to handle PIM ASM/SSM. The masks, as well as a sorted list of
Candidate Addresses, are announced by the DR in a new DR Load GDR Candidate addresses, are announced by the DR in a new DR Load-
Balancing List (DRLB-List) PIM Hello Option. Balancing List (DRLB-List) PIM Hello Option.
A Hash Algorithm based on the announced Source, Group, or RP masks A hash algorithm based on the announced Source, Group, or RP masks
allows one GDR to be assigned to a corresponding multicast state. allows one GDR to be assigned to a corresponding multicast state.
That GDR is responsible for initiating the creation of the multicast That GDR is responsible for initiating the creation of the multicast
forwarding tree for multicast traffic. forwarding tree for multicast traffic.
4.1. GDR Candidates 4.1. GDR Candidates
GDR is the new concept introduced by this specification. GDR GDR is the new concept introduced by this specification. GDR
Candidates are routers eligible for GDR election on the LAN. To Candidates are routers eligible for GDR election on the LAN. To
become a GDR Candidate, a router must have the same DR priority and become a GDR Candidate, a router must have the same DR priority and
run the same GDR election Hash Algorithm as the DR on the LAN. run the same GDR election hash algorithm as the DR on the LAN.
For example, assume there are 4 routers on the LAN: R1, R2, R3 and For example, assume there are 4 routers on the LAN: R1, R2, R3, and
R4, each announcing a DRLB-Cap option. R1, R2 and R3 have the same R4, each announcing a DRLB-Cap Option. R1, R2, and R3 have the same
DR priority while R4's DR priority is less preferred. In this DR priority, while R4's DR priority is less preferred. In this
example, R4 will not be eligible for GDR election, because R4 will example, R4 will not be eligible for GDR election, because R4 will
not become a PIM DR unless all of R1, R2 and R3 go out of service. not become a PIM DR unless all of R1, R2, and R3 go out of service.
Furthermore, assume router R1 wins the PIM DR election, R1 and R2 Furthermore, assume router R1 wins the PIM DR election, R1 and R2
advertise the same Hash Algorithm for GDR election, while R3 advertise the same hash algorithm for GDR election, while R3
advertises a different one. In this case, only R1 and R2 will be advertises a different one. In this case, only R1 and R2 will be
eligible for GDR election, while R3 will not. eligible for GDR election, while R3 will not.
As a DR, R1 will include its own Load Balancing Hash Masks and the As a DR, R1 will include its own Load-Balancing Hash Masks and the
identity of R1 and R2 (the GDR Candidates) in its DRLB-List Hello identity of R1 and R2 (the GDR Candidates) in its DRLB-List Hello
Option. Option.
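The R1 to R4 example above can be sketched as a simple filter. The `Router` type, the `gdr_candidates` function, and the numeric priority and algorithm values are assumptions chosen only for illustration; the sketch also assumes the DR itself advertises a DRLB-Cap Option.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Router:
    name: str
    dr_priority: int
    hash_algorithm: Optional[int]  # None if no DRLB-Cap Option was advertised


def gdr_candidates(dr, routers):
    """Keep routers that match the DR's priority and advertised hash algorithm."""
    return [
        r for r in routers
        if r.hash_algorithm is not None
        and r.dr_priority == dr.dr_priority
        and r.hash_algorithm == dr.hash_algorithm
    ]


# Placeholder values: R4 has a less preferred priority, R3 a different algorithm.
r1 = Router("R1", 10, 0)
r2 = Router("R2", 10, 0)
r3 = Router("R3", 10, 1)
r4 = Router("R4", 5, 0)
candidates = gdr_candidates(r1, [r1, r2, r3, r4])
names = [r.name for r in candidates]
```

Here `names` contains only R1 and R2, matching the outcome described in the text.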
5. Protocol Specification 5. Protocol Specification
5.1. Hash Mask and Hash Algorithm 5.1. Hash Mask and Hash Algorithm
A Hash Mask is used to extract a number of bits from the A hash mask is used to extract a number of bits from the
corresponding IP address field (32 for IPv4, 128 for IPv6) and corresponding IP address field (32 for IPv4, 128 for IPv6) and
calculate a hash value. A hash value is used to select a GDR from calculate a hash value. A hash value is used to select a GDR from
GDR Candidates advertised by the PIM DR. Hash masks allow for GDR Candidates advertised by the PIM DR. Hash masks allow for
certain flows to always be forwarded by the same GDR, by ignoring certain flows to always be forwarded by the same GDR, by ignoring
certain bits in the hash value calculation, so that the hash values certain bits in the hash value calculation, so that the hash values
are the same. For example, 0.0.255.0 defines a Hash Mask for an IPv4 are the same. For example, 0.0.255.0 defines a hash mask for an IPv4
address that masks the first, the second, and the fourth octets, address that masks the first, second, and fourth octets, which means
which means that only the third octet will influence the hash value that only the third octet will influence the hash value computed.
computed. Note that the masks need not be a contiguous set of bits. Note that the masks need not be a contiguous set of bits. For
E.g, for IPv4, 15.15.15.15 would be a valid mask. example, for IPv4, 15.15.15.15 would be a valid mask.
In the text below, a hash mask is in some places said to be zero. A In the text below, a hash mask is, in some places, said to be zero.
hash mask is zero if no bits are set. That is, 0.0.0.0 for IPv4 and A hash mask is zero if no bits are set, that is, 0.0.0.0 for IPv4 and
:: for IPv6. Also, a hash mask is said to be an all-bits-set mask if :: for IPv6. Also, a hash mask is said to be an all-bits-set mask if
it is 255.255.255.255 for IPv4 or it is 255.255.255.255 for IPv4 or
ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff for IPv6. ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff for IPv6.
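As a rough illustration of how a hash mask selects bits, the following sketch applies a mask to an address and tests for the zero and all-bits-set cases. The helper names are hypothetical, and the second sample address is arbitrary.

```python
import ipaddress


def apply_mask(address, mask):
    """Return the address bits kept by the mask, as an integer."""
    return int(ipaddress.ip_address(address)) & int(ipaddress.ip_address(mask))


def is_zero_mask(mask):
    """True if no bits are set (0.0.0.0 or ::)."""
    return int(ipaddress.ip_address(mask)) == 0


def is_all_bits_set(mask):
    """True if every bit is set for the mask's address family."""
    m = ipaddress.ip_address(mask)
    return int(m) == (1 << m.max_prefixlen) - 1


# With mask 0.0.255.0 only the third octet survives, so two addresses
# that share a third octet yield the same masked value.
same = apply_mask("192.0.2.1", "0.0.255.0") == apply_mask("10.9.2.77", "0.0.255.0")
```

A non-contiguous mask such as 15.15.15.15 works the same way, because the mask is only ever combined with the address by a bitwise AND.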
There are three Hash Masks defined: There are three hash masks defined:
o RP Hash Mask * RP Hash Mask
o Source Hash Mask * Source Hash Mask
o Group Hash Mask * Group Hash Mask
The hash masks need to be configured on the PIM routers that can The hash masks need to be configured on the PIM routers that can
potentially become a PIM DR, unless the implementation provides potentially become a PIM DR, unless the implementation provides
default hash mask values. An implementation SHOULD have default hash default hash mask values. An implementation SHOULD have default hash
mask values as follows. The default RP Hash Mask SHOULD be zero (no mask values as follows. The default RP Hash Mask SHOULD be zero (no
bits set). The default Source and Group Hash Masks SHOULD both be bits set). The default Source and Group Hash Masks SHOULD both be
all-bits-set masks. These default values are likely acceptable for all-bits-set masks. These default values are likely acceptable for
most deployments, and simplify configuration. There is only a need most deployments and simplify configuration. There is only a need to
to use other masks if one needs to ensure that certain flows are use other masks if one needs to ensure that certain flows are
forwarded by the same GDR. forwarded by the same GDR.
The DRLB-List Hello Option contains a list of GDR Candidates. The The DRLB-List Hello Option contains a list of GDR Candidates. The
first one listed has ordinal number 0, the second listed ordinal first one listed has ordinal number 0, the second listed ordinal
number 1, and the last one has ordinal number N - 1 if there are N number 1, and the last one has ordinal number N - 1 if there are N
candidates listed. The hash value computed will be the ordinal candidates listed. The hash value computed will be the ordinal
number of the GDR Candidate that is acting as GDR for the flow in number of the GDR Candidate that is acting as GDR for the flow in
question. question.
The input to be hashed is determined as follows: The input to be hashed is determined as follows:
o If the group is in ASM mode and the RP Hash Mask announced by the * If the group is in ASM mode and the RP Hash Mask announced by the
PIM DR is not zero (at least one bit is set), calculate the value PIM DR is not zero (at least one bit is set), calculate the value
of hashvalue_RP [Section 5.2] to determine the GDR. of hashvalue_RP (Section 5.2) to determine the GDR.
o If the group is in ASM mode and the RP Hash Mask announced by the * If the group is in ASM mode and the RP Hash Mask announced by the
PIM DR is zero (no bits are set), obtain the value of PIM DR is zero (no bits are set), obtain the value of
hashvalue_Group [Section 5.2] to determine the GDR. hashvalue_Group (Section 5.2) to determine the GDR.
o If the group is in SSM mode, use hashvalue_SG [Section 5.2] to * If the group is in SSM mode, use hashvalue_SG (Section 5.2) to
determine the GDR. determine the GDR.
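The three rules above amount to a small decision function. The sketch below only reports which hash value applies; the function name and the boolean `is_ssm` parameter are hypothetical.

```python
import ipaddress


def hash_value_kind(is_ssm, rp_mask):
    """Return the name of the hash value used to determine the GDR."""
    if is_ssm:
        return "hashvalue_SG"
    # ASM: the RP Hash Mask announced by the PIM DR decides the input.
    if int(ipaddress.ip_address(rp_mask)) != 0:
        return "hashvalue_RP"
    return "hashvalue_Group"
```

For example, an ASM group with an announced RP Hash Mask of 0.0.255.0 uses hashvalue_RP, while the same group with a zero RP Hash Mask falls back to hashvalue_Group.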
A simple Modulo Hash Algorithm is defined in this document. However, A simple modulo hash algorithm is defined in this document. However,
to allow another Hash Algorithms to be used, a 1-octet "Hash to allow another hash algorithm to be used, a 1-octet "Hash
Algorithm" field is included in the DRLB-Cap Hello Option to specify Algorithm" field is included in the DRLB-Cap Hello Option to specify
the Hash Algorithm used by the router. the hash algorithm used by the router.
If different Hash Algorithms are advertised among the routers on a If different hash algorithms are advertised among the routers on a
LAN, only the routers advertising the same Hash Algorithm as the DR LAN, only the routers advertising the same hash algorithm as the DR
(as well as having the same DR priority as the DR) are eligible for (as well as having the same DR priority as the DR) are eligible for
GDR election. GDR election.
5.2. Modulo Hash Algorithm 5.2. Modulo Hash Algorithm
As part of computing the hash, the notation LSZC(hash_mask) is used As part of computing the hash, the notation LSZC(hash_mask) is used
to denote the number of zeroes counted from the least significant bit to denote the number of zeroes counted from the least significant bit
of a Hash Mask hash_mask. As an example, LSZC(255.255.255.128) is 7 and of a hash mask hash_mask. As an example, LSZC(255.255.255.128) is 7 and
also LSZC(ffff:8000::) is 111. If all bits are set, LSZC will be 0. LSZC(ffff:8000::) is 111. If all bits are set, LSZC will be 0. If
If the mask is zero, then LSZC will be 32 for IPv4, and 128 for IPv6. the mask is zero, then LSZC will be 32 for IPv4 and 128 for IPv6.
The number of GDR Candidates is denoted as GDRC. The number of GDR Candidates is denoted as GDRC.
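One possible way to compute LSZC is to isolate the lowest set bit of the mask. This is a sketch only; the function name `lszc` and the use of Python's `ipaddress` module are choices made for this example.

```python
import ipaddress


def lszc(mask):
    """Count zero bits from the least significant end of a hash mask."""
    m = ipaddress.ip_address(mask)
    value = int(m)
    if value == 0:
        # Zero mask: every bit is zero (32 for IPv4, 128 for IPv6).
        return m.max_prefixlen
    # value & -value keeps only the lowest set bit; its position is the count.
    return (value & -value).bit_length() - 1
```

For instance, the IPv4 mask 255.255.255.128 has seven trailing zero bits, and ffff:8000:: has 111.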
The idea behind the Modulo Hash Algorithm is in simple terms that the The idea behind the modulo hash algorithm is, in simple terms, that
corresponding mask is applied to a value, then the result is shifted the corresponding mask is applied to a value, then the result is
right LSZC(mask) bits so that the least significant bits that were shifted right LSZC(mask) bits so that the least significant bits that
masked out are not considered. Then this result is masked by were masked out are not considered. Then, this result is masked by
0xffffffff, keeping only the last 32 bits of the result (this only 0xffffffff, keeping only the last 32 bits of the result (this only
makes a difference for IPv6). Finally, the hash value is this result makes a difference for IPv6). Finally, the hash value is this result
modulo the number of GDR Candidates (GDRC). modulo the number of GDR Candidates (GDRC).
The Modulo Hash Algorithm for computing the values hashvalue_RP, The modulo hash algorithm, for computing the values hashvalue_RP,
hashvalue_Group and hashvalue_SG is defined as follows. hashvalue_Group, and hashvalue_SG, is defined as follows.
hashvalue_RP is calculated as:

   (((RP_address & RP_mask) >> LSZC(RP_mask)) & 0xffffffff) % GDRC

RP_address is the address of the RP defined for the group, and
RP_mask is the RP Hash Mask.

hashvalue_Group is calculated as:

   (((Group_address & Group_mask) >> LSZC(Group_mask)) & 0xffffffff)
   % GDRC

Group_address is the group address, and Group_mask is the Group
Hash Mask.

hashvalue_SG is calculated as:

   ((((Source_address & Source_mask) >> LSZC(Source_mask)) &
   0xffffffff) ^ (((Group_address & Group_mask) >> LSZC(Group_mask))
   & 0xffffffff)) % GDRC

Source_address is the source address, and Source_mask is the Source
Hash Mask. Group_address is the group address, and Group_mask is the
Group Hash Mask.
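The three formulas above can be sketched in Python as follows. This
is an illustrative sketch only and not part of the protocol
definition. The function names, the use of textual addresses, and the
choice to return 0 from lszc() for an all-zero mask are assumptions
made for this example (with an all-zero mask the masked value is 0
regardless of the shift).

```python
import ipaddress


def lszc(mask: int) -> int:
    """Count the least significant zero bits of a mask.

    Returning 0 for an all-zero mask is an assumption of this sketch.
    """
    if mask == 0:
        return 0
    count = 0
    while mask & 1 == 0:
        mask >>= 1
        count += 1
    return count


def masked_value(address: str, mask: str) -> int:
    """Apply the mask, shift out the masked low bits, keep the last 32 bits."""
    addr = int(ipaddress.ip_address(address))
    m = int(ipaddress.ip_address(mask))
    return ((addr & m) >> lszc(m)) & 0xFFFFFFFF


def hashvalue_rp(rp_address: str, rp_mask: str, gdrc: int) -> int:
    return masked_value(rp_address, rp_mask) % gdrc


def hashvalue_group(group_address: str, group_mask: str, gdrc: int) -> int:
    return masked_value(group_address, group_mask) % gdrc


def hashvalue_sg(source_address: str, source_mask: str,
                 group_address: str, group_mask: str, gdrc: int) -> int:
    # XOR the two 32-bit values before taking the modulo.
    return (masked_value(source_address, source_mask)
            ^ masked_value(group_address, group_mask)) % gdrc
```

The same helper serves IPv4 and IPv6 because the final 32-bit mask is
a no-op for IPv4 values.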
5.2.1. Modulo Hash Algorithm Examples

To help illustrate the algorithm, consider this example. Router X
with IPv4 address 203.0.113.1 receives a DRLB-List Hello Option from
the DR that announces RP Hash Mask 0.0.255.0 and a list of GDR
Candidates, sorted by IP addresses from high to low: 203.0.113.3,
203.0.113.2, and 203.0.113.1. The ordinal number assigned to those
addresses would be:

   0 for 203.0.113.3; 1 for 203.0.113.2; 2 for 203.0.113.1 (Router X).

Assume there are 2 RPs: RP1 192.0.2.1 for Group1 and RP2 198.51.100.2
for Group2. Following the modulo hash algorithm:

*  LSZC(0.0.255.0) is 8, and GDRC is 3. The hashvalue_RP for Group1
   with RP RP1 is:

      ((((192.0.2.1 & 0.0.255.0) >> 8) & 0xffffffff) % 3)
      = 2 % 3
      = 2

   This matches the ordinal number assigned to Router X. Router X
   will be the GDR for Group1.

*  The hashvalue_RP for Group2 with RP RP2 is:

      ((((198.51.100.2 & 0.0.255.0) >> 8) & 0xffffffff) % 3)
      = 100 % 3
      = 1

   This is different from the ordinal number of Router X (2). Hence,
   Router X will not be GDR for Group2.

For IPv6, consider this example, similar to the above. Router X with
IPv6 address fe80::1 receives a DRLB-List Hello Option from the DR
that announces RP Hash Mask ::ffff:ffff:ffff:0 and a list of GDR
Candidates, sorted by IP addresses from high to low: fe80::3,
fe80::2, and fe80::1. The ordinal number assigned to those addresses
would be:

   0 for fe80::3; 1 for fe80::2; 2 for fe80::1 (Router X).
Assume there are 2 RPs: RP1 2001:db8::1:0:5678:1 for Group1 and RP2
2001:db8::1:0:1234:2 for Group2. Following the modulo hash
algorithm:

*  LSZC(::ffff:ffff:ffff:0) is 16, and GDRC is 3. The hashvalue_RP
   for Group1 with RP RP1 is:

      ((((2001:db8::1:0:5678:1 & ::ffff:ffff:ffff:0) >> 16) &
      0xffffffff) % 3)
      = (((::1:0:5678:0 >> 16) & 0xffffffff) % 3)
      = ((::1:0:5678 & 0xffffffff) % 3)
      = ::5678 % 3
      = 2

   This matches the ordinal number assigned to Router X. Router X
   will be the GDR for Group1.

*  The hashvalue_RP for Group2 with RP RP2 is:

      ((((2001:db8::1:0:1234:2 & ::ffff:ffff:ffff:0) >> 16) &
      0xffffffff) % 3)
      = (((::1:0:1234:0 >> 16) & 0xffffffff) % 3)
      = ((::1:0:1234 & 0xffffffff) % 3)
      = ::1234 % 3
      = 1

   This is different from the ordinal number of Router X (2). Hence,
   Router X will not be GDR for Group2.
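The examples above can be replayed with a short Python sketch that
maps the hash value back to a candidate through its ordinal number.
This is illustrative only; the helper name gdr_for_rp is hypothetical,
and the candidate list is taken in the order announced by the DR
(ordinal 0 first) rather than being re-sorted locally.

```python
import ipaddress


def gdr_for_rp(candidates, rp_address, rp_mask):
    """Return the candidate whose ordinal equals hashvalue_RP.

    candidates: addresses in the order announced by the DR.
    """
    mask = int(ipaddress.ip_address(rp_mask))
    # Trailing zero count of the mask; 0 for an all-zero mask (assumption).
    shift = (mask & -mask).bit_length() - 1 if mask else 0
    value = ((int(ipaddress.ip_address(rp_address)) & mask) >> shift) & 0xFFFFFFFF
    return candidates[value % len(candidates)]


v4_candidates = ["203.0.113.3", "203.0.113.2", "203.0.113.1"]
v6_candidates = ["fe80::3", "fe80::2", "fe80::1"]

group1_v4 = gdr_for_rp(v4_candidates, "192.0.2.1", "0.0.255.0")
group2_v4 = gdr_for_rp(v4_candidates, "198.51.100.2", "0.0.255.0")
group1_v6 = gdr_for_rp(v6_candidates, "2001:db8::1:0:5678:1", "::ffff:ffff:ffff:0")
group2_v6 = gdr_for_rp(v6_candidates, "2001:db8::1:0:1234:2", "::ffff:ffff:ffff:0")
```

In both address families, Router X (ordinal 2) is selected for Group1
and the candidate with ordinal 1 is selected for Group2.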
5.2.2. Limitations

The modulo hash algorithm has poor failover characteristics when a
shared LAN has more than two GDRs. In the case of more than two GDRs
on a LAN, when one GDR fails, all of the groups may be reassigned to
a different GDR, even if they were not assigned to the failed GDR.
However, many deployments use only two routers on a shared LAN for
redundancy purposes. Future work may define new hash algorithms
where only groups assigned to the failed GDR get reassigned.

The modulo hash algorithm will use, at most, 32 consecutive bits of
the input addresses for its computation. Exactly which bits of the
source, group, or RP addresses are used depends on the respective
masks. This limitation may be an issue for IPv6 deployments, since
not all bits of the IPv6 addresses are considered. If this causes
operational issues, a new hash algorithm would need to be defined.
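The failover limitation can be made concrete with a small Python
sketch. It is illustrative only: the candidate names "C", "B", and
"A" and the function names are hypothetical, and the integer values
stand in for the masked and shifted 32-bit hash inputs.

```python
def owner(value, candidates):
    """Candidate selected for a hash input under the modulo algorithm."""
    return candidates[value % len(candidates)]


def needless_moves(values, candidates, failed):
    """Count inputs that change GDR although their GDR did not fail."""
    survivors = [c for c in candidates if c != failed]
    moved = 0
    for value in values:
        before = owner(value, candidates)
        if before != failed and owner(value, survivors) != before:
            moved += 1
    return moved


# Three candidates: removing "B" also moves inputs that were on "C" or "A".
moved_with_three = needless_moves(range(12), ["C", "B", "A"], "B")

# Two candidates: only the inputs of the failed router move.
moved_with_two = needless_moves(range(12), ["B", "A"], "B")
```

With three candidates and inputs 0 through 11, half of the inputs that
were not on the failed router still change GDR; with two candidates,
none do.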
5.3. PIM Hello Options

PIM routers include a new option, called "Load-Balancing Capability
(DRLB-Cap)", in their PIM Hello messages.

Besides this DRLB-Cap Hello Option, the elected PIM DR also includes
a new "DR Load-Balancing List (DRLB-List) Hello Option". The
DRLB-List Hello Option consists of three hash masks, as defined
above, and also a list of GDR Candidate addresses on the LAN. It is
recommended that the GDR Candidate addresses are sorted in descending
order. This ensures that, when using algorithms such as the modulo
hash algorithm in this document, it is predictable which GDR is
responsible for which groups, regardless of the order in which the DR
learned about the candidates.
5.3.1. PIM DR Load-Balancing Capability (DRLB-Cap) Hello Option

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Type = 34           |          Length = 4           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Reserved                    |Hash Algorithm |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      Figure 3: PIM DR Load-Balancing Capability Hello Option

Type: 34

Length: 4

Reserved: Transmitted as zero, ignored on receipt.

Hash Algorithm: Hash algorithm type. A value listed in the IANA
   "PIM Designated Router Load-Balancing Hash Algorithms" registry.
   0 is used for the modulo hash algorithm defined in this document.

This DRLB-Cap Hello Option MUST be advertised by routers on all
interfaces where DR Load Balancing is enabled. Note that the option
is included, at most, once.
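As a rough illustration of the DRLB-Cap layout in Figure 3, the option
can be encoded and decoded with Python's struct module. This sketch is
not part of the protocol definition; the function names are
hypothetical, and the field widths (16-bit Type, 16-bit Length, 24
reserved bits, 8-bit Hash Algorithm) are read from the figure.

```python
import struct

DRLB_CAP_TYPE = 34
DRLB_CAP_LENGTH = 4


def encode_drlb_cap(hash_algorithm: int = 0) -> bytes:
    """Build the option; 0 selects the modulo hash algorithm."""
    if not 0 <= hash_algorithm <= 255:
        raise ValueError("Hash Algorithm is an 8-bit field")
    # "3x" emits the three Reserved bytes as zero.
    return struct.pack("!HH3xB", DRLB_CAP_TYPE, DRLB_CAP_LENGTH, hash_algorithm)


def decode_drlb_cap(data: bytes) -> int:
    """Return the Hash Algorithm value; Reserved bits are ignored."""
    opt_type, length, hash_algorithm = struct.unpack("!HH3xB", data)
    if opt_type != DRLB_CAP_TYPE or length != DRLB_CAP_LENGTH:
        raise ValueError("not a DRLB-Cap Hello Option")
    return hash_algorithm
```

The decoder deliberately skips the Reserved bytes without checking
them, matching "ignored on receipt".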
5.3.2. PIM DR Load-Balancing List (DRLB-List) Hello Option

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Type = 35           |            Length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Group Mask                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Source Mask                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            RP Mask                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   GDR Candidate Address(es)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 4: PIM DR Load-Balancing List Hello Option
Type: 35

Length: (3 + n) x (4 or 16) bytes, where n is the number of GDR
   Candidates.

Group Mask (32/128 bits): Mask applied to group addresses as part of
   hash computation.

Source Mask (32/128 bits): Mask applied to source addresses as part
   of hash computation.

RP Mask (32/128 bits): Mask applied to RP addresses as part of hash
   computation.

All masks MUST have the same number of bits as the IP source address
in the PIM Hello IP header.

GDR Candidate Address(es) (32/128 bits): List of GDR Candidate(s).

   All addresses MUST be in the same address family as the PIM Hello
   IP header. It is recommended that the addresses are sorted in
   descending order.

   If the "Interface ID" option, as specified in [RFC6395], is
   present in a GDR Candidate's PIM Hello message and the "Router
   Identifier" portion is non-zero:

   *  For IPv4, the "GDR Candidate Address" will be set directly to
      the "Router Identifier".

   *  For IPv6, the "GDR Candidate Address" will be 96 bits of
      zeroes, followed by the 32-bit Router Identifier.

   If the "Interface ID" option is not present in a GDR Candidate's
   PIM Hello message or if the "Interface ID" option is present but
   the "Router Identifier" field is zero, the "GDR Candidate Address"
   will be the IPv4 or IPv6 source address of the PIM Hello message.

   This DRLB-List Hello Option MUST only be advertised by the elected
   PIM DR. It MUST be ignored if received from a non-DR. The option
   MUST also be ignored if the hash masks are not the correct number
   of bits or GDR Candidate addresses are in the wrong address
   family.
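The DRLB-List layout and the rule for forming a GDR Candidate Address
can be sketched in Python as follows. This is illustrative only and
not part of the protocol definition. All function names are
hypothetical, and the decoder's length check is a simplification that
assumes the field width is derived from the address family of the
Hello's IP source address.

```python
import ipaddress
import struct

DRLB_LIST_TYPE = 35


def gdr_candidate_address(hello_source: str, router_id: int = 0) -> str:
    """Form the address the DR lists for a candidate."""
    source = ipaddress.ip_address(hello_source)
    if router_id == 0:
        return str(source)  # no Interface ID option, or zero Router Identifier
    if source.version == 4:
        return str(ipaddress.IPv4Address(router_id))
    # 96 zero bits followed by the 32-bit Router Identifier.
    return str(ipaddress.IPv6Address(router_id))


def encode_drlb_list(group_mask, source_mask, rp_mask, candidates) -> bytes:
    fields = [ipaddress.ip_address(f)
              for f in (group_mask, source_mask, rp_mask, *candidates)]
    if len({f.version for f in fields}) != 1:
        raise ValueError("masks and candidates must share one address family")
    value = b"".join(f.packed for f in fields)
    return struct.pack("!HH", DRLB_LIST_TYPE, len(value)) + value


def decode_drlb_list(data: bytes, hello_source: str):
    """Return the parsed option, or None if it is to be ignored."""
    size = len(ipaddress.ip_address(hello_source).packed)  # 4 or 16 bytes
    opt_type, length = struct.unpack("!HH", data[:4])
    value = data[4:4 + length]
    if (opt_type != DRLB_LIST_TYPE or len(value) != length
            or length % size != 0 or length < 3 * size):
        return None
    fields = [str(ipaddress.ip_address(value[i:i + size]))
              for i in range(0, length, size)]
    return {"group_mask": fields[0], "source_mask": fields[1],
            "rp_mask": fields[2], "candidates": fields[3:]}
```

A real implementation would also verify that the sender is the
elected DR before accepting the option; that check is outside this
sketch.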
5.4. PIM DR Operation

The DR election process is still the same as defined in [RFC7761].
The DR advertises the new DRLB-List Hello Option, which contains mask
values from user configuration (or default values), followed by a
list of GDR Candidate addresses. Note that if a router included the
"Interface ID" option in the hello message and the Router ID is
non-zero, the Router ID will be used to form the GDR Candidate
address of the router, as discussed in the previous section. It is
recommended that the list be sorted from the highest value to the
lowest value. The reason for sorting the list is to make the behavior
deterministic, regardless of the order in which the DR learns of new
candidates. Note that, as for non-DR routers, the DR also advertises
the DRLB-Cap Hello Option to indicate its ability to support the new
functionality and the type of GDR election hash algorithm it uses.

If a PIM DR receives a neighbor DRLB-Cap Hello Option that contains
the same hash algorithm as the DR and the neighbor has the same DR
priority as the DR, the PIM DR SHOULD consider the neighbor as a GDR
Candidate and insert the GDR Candidate's address into the list of the
DRLB-List Option. However, the DR may have policies limiting which
GDR Candidates, or how many GDR Candidates, to include. Likewise, the
DR SHOULD include itself in the list of GDR Candidates, but it is
permissible not to do so, for instance, if there is some policy
restricting the candidate set.

If a PIM neighbor included in the list expires, stops announcing the
DRLB-Cap Hello Option, changes DR priority, changes hash algorithm,
or otherwise becomes ineligible as a candidate, the DR SHOULD
immediately send a triggered hello with a new list in the DRLB-List
option, excluding the neighbor.

If a new router becomes eligible as a candidate, there is no urgency
in sending out an updated list. An updated list SHOULD be included
in the next hello.
5.5. PIM GDR Candidate Operation

When an IGMP/MLD report is received, a hash algorithm is used by the
GDR Candidates to determine which router is going to be responsible
for building forwarding trees on behalf of the host.

The router MUST include the DRLB-Cap Hello Option in all PIM Hello
messages sent on the interface. Note that the presence of the
DRLB-Cap Option in the PIM Hello does not guarantee that the router
will be considered as a GDR Candidate. Once the DR election is done,
the DRLB-List Hello Option is received from the current PIM DR
containing a list of the selected GDR Candidates.

A router only acts as a GDR Candidate if it is included in the GDR
Candidate list of the DRLB-List Hello Option. See the next section
for details.
5.6. DRLB-List Hello Option Processing

This section discusses processing of the DRLB-List Hello Option,
including the case where it was received in the previous hello but
not in the current hello. All routers MUST ignore the DRLB-List
Hello Option if it is received from a PIM router that is not the DR.
The option MUST only be processed by routers that are announcing the
DRLB-Cap Option and only if the hash algorithm announced by the DR is
the same as the local announcement. All GDR Candidates MUST use the
hash masks advertised in the Option, even if they differ from those
the candidate was configured with. The DR MUST also process its own
DRLB-List Hello Option.

A router stores the latest option contents that were announced, if
any, and deletes the previous contents. The router MUST also compare
the new contents with any previous contents and, if there are any
changes, continue processing as below. Note that if the option does
not pass the above checks, the below processing MUST be done as if
the option was not announced.

If the contents of the DRLB-List Option (the masks or the candidate
list) differ from the previously saved copy, if the option is
received for the first time, or if it is no longer being received or
accepted, the option MUST be processed as below.
1.  If the local router is included in the "GDR Candidate
    Address(es)" field (it will look for its own address, or for its
    Router ID if it announces a non-zero Router ID), then for each of
    the groups (or source and group pairs if the group is in SSM
    mode) with local receiver interest, the router MUST run the hash
    algorithm to determine which of them it is the GDR for.

    *  If there is no change in the GDR status, then no further
       action is required.

    *  If the router becomes the new GDR, then a multicast forwarding
       tree MUST be built [RFC7761].

    *  If the router is no longer the GDR, then it uses an Assert as
       explained in Section 5.7.

2.  If one of the following occurs:

    *  the local router is not included in the "GDR Candidate
       Address(es)" field,

    *  the DRLB-List Hello Option is no longer included in the DR's
       Hello, or

    *  the DR's Neighbor Liveness Timer expires [RFC7761],

    then for each group (or each source and group pair if the group
    is in SSM mode) with local receiver interest, for which the
    router is the GDR, the router uses an Assert as explained in
    Section 5.7.
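The two processing cases above can be summarized in a short Python
sketch. It is illustrative only and not part of the protocol
definition. The function name, the action labels, and the convention
of passing None when the option is absent, rejected, or the DR's
Neighbor Liveness Timer has expired are all assumptions of this
example. The hash inputs are assumed to be already masked and
shifted.

```python
def gdr_actions(local_id, candidates, hash_inputs, current_gdr_keys):
    """Return an action per group or (source, group) key.

    local_id: the local address, or Router ID if a non-zero one is announced.
    candidates: accepted candidate list in announced order, or None.
    hash_inputs: maps each key with local receiver interest to its
        masked and shifted 32-bit value.
    current_gdr_keys: keys for which the local router is GDR right now.
    """
    actions = {}
    selected = candidates is not None and local_id in candidates
    for key, value in hash_inputs.items():
        was_gdr = key in current_gdr_keys
        is_gdr = (selected
                  and value % len(candidates) == candidates.index(local_id))
        if is_gdr and not was_gdr:
            actions[key] = "build-tree"   # case 1, router becomes the new GDR
        elif was_gdr and not is_gdr:
            actions[key] = "assert"       # case 1 or case 2, see Section 5.7
        else:
            actions[key] = "none"         # no change in GDR status
    return actions
```

For example, with the candidate list from Section 5.2.1, Router X
would build a tree for a group whose hash input is 2 and use an
Assert for a group with hash input 100 that it was previously
forwarding.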
5.7. PIM Assert Modification

GDR changes may occur due to configuration change, GDR Candidates
going down, and also new routers coming up and becoming GDR
Candidates. This may occur while flows are being forwarded. If the
GDR for an active flow changes, there is likely to be some
disruption, such as packet loss or duplicates. By using asserts,
packet loss is minimized while allowing a small amount of duplicates.

When a router stops acting as the GDR for a group, or source and
group pair if SSM, it MUST set the Assert metric preference to
maximum (0x7fffffff) and the Assert metric to one less than maximum
(0xfffffffe). That is, whenever it sends or receives an Assert for
the group, it must use these values as the metric preference and
metric rather than the values provided by the unicast routing
protocol.

The rest of this section is just for illustration purposes and not
part of the protocol definition.

To illustrate the behavior when there is a GDR change, consider the
following scenario where there are two flows: G1 and G2. R1 is the
GDR for G1, and R2 is the GDR for G2. When R3 comes up, it is
possible that R3 becomes GDR for both G1 and G2; hence, R3 starts to
build the forwarding tree for G1 and G2. If R1 and R2 stop
forwarding before R3 completes the process, packet loss might occur.
On the other hand, if R1 and R2 continue forwarding while R3 is
building the forwarding trees, duplicates might occur.

When the role of GDR changes as above, instead of immediately
stopping forwarding, R1 and R2 continue forwarding to G1 and G2,
respectively, while, at the same time, R3 builds forwarding trees for
G1 and G2. This will lead to PIM Asserts.

For G1, using the functionality described in this document, R1 and R3
determine the new GDR, which is R3. With the modified Assert
behavior, R1 sets its Assert metric to the near maximum value, as
discussed above. That will make R3, which has a normal metric in its
Assert, the Assert winner.
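The effect of the modified Assert values in the R1 and R3 scenario
can be sketched in Python. This is illustrative only. The comparison
below is a simplification of the [RFC7761] Assert comparison (lower
metric preference wins, then lower metric, then higher IP address)
that ignores the RPT bit; the class and function names, the
addresses, and the unicast metric values are hypothetical.

```python
import ipaddress
from typing import NamedTuple

MAX_METRIC_PREFERENCE = 0x7FFFFFFF
NEAR_MAX_METRIC = 0xFFFFFFFE


class AssertMetric(NamedTuple):
    preference: int
    metric: int
    address: str


def assert_metric(address, unicast_preference, unicast_metric,
                  stopped_being_gdr):
    """Values a router uses in Asserts for the group."""
    if stopped_being_gdr:
        # Former GDR: override the unicast routing values.
        return AssertMetric(MAX_METRIC_PREFERENCE, NEAR_MAX_METRIC, address)
    return AssertMetric(unicast_preference, unicast_metric, address)


def assert_winner(a, b):
    """Simplified comparison: RPT bit handling is omitted."""
    def rank(m):
        return (m.preference, m.metric, -int(ipaddress.ip_address(m.address)))
    return min(a, b, key=rank)


# R1 stops being GDR for G1; R3 is the new GDR with ordinary metrics.
r1 = assert_metric("203.0.113.3", 110, 20, stopped_being_gdr=True)
r3 = assert_metric("203.0.113.1", 110, 20, stopped_being_gdr=False)
winner = assert_winner(r1, r3)
```

Even though R1 has the higher address and identical unicast metrics
in this sketch, R3 wins because R1 advertises the overridden values.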
5.8. Backward Compatibility

In the case of a hybrid Ethernet shared LAN (where some PIM routers
support the functionality defined in this document and some do not):

*  If the DR does not support the new functionality, then there will
   be no load balancing.

*  If non-DR routers do not support the new functionality, they will
   not be considered as GDR Candidates and will not take part in load
   balancing. Load balancing may still happen on the link.
6. Operational Considerations

An administrator needs to consider what the total bandwidth
requirements are and find a set of routers that together have enough
available capacity while making sure that each of the routers can
handle its part, assuming that the traffic is distributed roughly
equally among the routers. Ideally, one should also have enough
bandwidth to handle the case where at least one router fails. All
routers should have reachability to the sources, and to the RPs if
applicable, that is not via the LAN.

Care must be taken when choosing what hash masks to configure. One
would typically configure the same masks on all the routers so that
they are the same, regardless of which router is elected as DR. The
default masks are likely suitable for most deployments. The RP Hash
Mask must be configured (the default is no bits set) if one wishes to
hash based on the RP address rather than the group address for ASM.
The default masks will use the entire group addresses, and source
addresses if SSM, as part of the hash. An administrator may set
other masks that mask out part of the addresses to ensure that
certain flows always get hashed to the same router. How this is
achieved depends on how the group addresses are allocated.

Only the routers announcing the same hash algorithm as the DR would
be considered as GDR Candidates. Network administrators need to make
sure that the desired set of routers announce the same algorithm.
Migration between different algorithms is not considered in this
document.
7. IANA Considerations

IANA has made these assignments in the "PIM-Hello Options" registry:
value 34 for the PIM DR Load-Balancing Capability (DRLB-Cap) Hello
Option (with Length of 4), and value 35 for the PIM DR Load-Balancing
List (DRLB-List) Hello Option (with variable Length).

Per this document, IANA has created a registry called "PIM Designated
Router Load-Balancing Hash Algorithms" in the "Protocol Independent
Multicast (PIM)" branch of the registry tree. The registry lists
hash algorithms for use by PIM Designated Router Load Balancing.

7.1. Initial Registry

The initial content of the registry is as follows.

   +-------+------------+-----------+
   | Type  | Name       | Reference |
   +=======+============+===========+
   | 0     | Modulo     | RFC 8775  |
   +-------+------------+-----------+
   | 1-255 | Unassigned |           |
   +-------+------------+-----------+

                Table 1

7.2. Assignment of New Hash Algorithms

Assignment of new hash algorithms is done according to the "IETF
Review" procedure; see [RFC8126].
8. Security Considerations 8. Security Considerations
Security of the new DR Load Balancing PIM Hello Options is only Security of the new DR Load-Balancing PIM Hello Options is only
guaranteed by the security of PIM Hello messages, so the security guaranteed by the security of PIM Hello messages, so the security
considerations for PIM Hello messages as described in PIM-SM considerations for PIM Hello messages, as described in PIM-SM
[RFC7761] apply here. [RFC7761], apply here.
If the DR is subverted it could omit or add certain GDRs or announce If the DR is subverted, it could omit or add certain GDRs or announce
an unsupported algorithm. If another router is subverted, it could an unsupported algorithm. If another router is subverted, it could
be made DR and cause similar issues. While these issues are specific be made DR and cause similar issues. While these issues are specific
to this specification, they are not that different from existing to this specification, they are not that different from existing
attacks such as subverting a DR and lowering the DR priority, causing attacks, such as subverting a DR and lowering the DR priority,
a different router to become the DR. causing a different router to become the DR.
If for any reason, the DR includes a GDR in the announced list which If, for any reason, the DR includes a GDR in the announced list that
announces a different algorithm from what the DR announces, the GDR announces a different algorithm from what the DR announces, the GDR
is required to ignore the announcement, and there will be no router is required to ignore the announcement, and there will be no router
acting as the DR for the flows that hash to that GDR. acting as the DR for the flows that hash to that GDR.
If a GDR is subverted, it could potentially be made to stop If a GDR is subverted, it could potentially be made to stop
forwarding all the traffic it is expected to forward. This is also forwarding all the traffic it is expected to forward. This is also
similar today to if a DR is subverted. similar today to if a DR is subverted.
An administrator may be able to achieve the desired load-balancing of An administrator may be able to achieve the desired load balancing of
known flows, but an attacker may send a single high rate flow which known flows, but an attacker may send a single high rate flow that is
is served by a single GDR, or send multiple flows that are expected served by a single GDR or send multiple flows that are expected to be
to be hashed to the same GDR. hashed to the same GDR.
9. Acknowledgement
The authors would like to thank Steve Simlo and Taki Millonis for
helping with the original idea; Alia Atlas, Bill Atwood, Joe Clarke,
Alissa Cooper, Jake Holland, Bharat Joshi, Anish Kachinthaya, Anvitha
Kachinthaya, Benjamin Kaduk, Mirja Kuhlewind, Barry Leiba, Ben Niven-
Jenkins, Alvaro Retana, Adam Roach, Michael Scharf, Eric Vyncke and
Carl Wallace for reviews and comments; and Toerless Eckert and
Rishabh Parekh for helpful conversation on the document.
10. References 9. References
10.1. Normative References 9.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC6395] Gulrajani, S. and S. Venaas, "An Interface Identifier (ID) [RFC6395] Gulrajani, S. and S. Venaas, "An Interface Identifier (ID)
Hello Option for PIM", RFC 6395, DOI 10.17487/RFC6395, Hello Option for PIM", RFC 6395, DOI 10.17487/RFC6395,
October 2011, <https://www.rfc-editor.org/info/rfc6395>. October 2011, <https://www.rfc-editor.org/info/rfc6395>.
skipping to change at page 19, line 20 skipping to change at line 859
[RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for
Writing an IANA Considerations Section in RFCs", BCP 26, Writing an IANA Considerations Section in RFCs", BCP 26,
RFC 8126, DOI 10.17487/RFC8126, June 2017, RFC 8126, DOI 10.17487/RFC8126, June 2017,
<https://www.rfc-editor.org/info/rfc8126>. <https://www.rfc-editor.org/info/rfc8126>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>. May 2017, <https://www.rfc-editor.org/info/rfc8174>.
10.2. Informative References 9.2. Informative References
[RFC3376] Cain, B., Deering, S., Kouvelas, I., Fenner, B., and A. [RFC3376] Cain, B., Deering, S., Kouvelas, I., Fenner, B., and A.
Thyagarajan, "Internet Group Management Protocol, Version Thyagarajan, "Internet Group Management Protocol, Version
3", RFC 3376, DOI 10.17487/RFC3376, October 2002, 3", RFC 3376, DOI 10.17487/RFC3376, October 2002,
<https://www.rfc-editor.org/info/rfc3376>. <https://www.rfc-editor.org/info/rfc3376>.
[RFC3810] Vida, R., Ed. and L. Costa, Ed., "Multicast Listener [RFC3810] Vida, R., Ed. and L. Costa, Ed., "Multicast Listener
Discovery Version 2 (MLDv2) for IPv6", RFC 3810, Discovery Version 2 (MLDv2) for IPv6", RFC 3810,
DOI 10.17487/RFC3810, June 2004, DOI 10.17487/RFC3810, June 2004,
<https://www.rfc-editor.org/info/rfc3810>. <https://www.rfc-editor.org/info/rfc3810>.
skipping to change at page 19, line 42 skipping to change at line 881
[RFC4541] Christensen, M., Kimball, K., and F. Solensky, [RFC4541] Christensen, M., Kimball, K., and F. Solensky,
"Considerations for Internet Group Management Protocol "Considerations for Internet Group Management Protocol
(IGMP) and Multicast Listener Discovery (MLD) Snooping (IGMP) and Multicast Listener Discovery (MLD) Snooping
Switches", RFC 4541, DOI 10.17487/RFC4541, May 2006, Switches", RFC 4541, DOI 10.17487/RFC4541, May 2006,
<https://www.rfc-editor.org/info/rfc4541>. <https://www.rfc-editor.org/info/rfc4541>.
[RFC4607] Holbrook, H. and B. Cain, "Source-Specific Multicast for [RFC4607] Holbrook, H. and B. Cain, "Source-Specific Multicast for
IP", RFC 4607, DOI 10.17487/RFC4607, August 2006, IP", RFC 4607, DOI 10.17487/RFC4607, August 2006,
<https://www.rfc-editor.org/info/rfc4607>. <https://www.rfc-editor.org/info/rfc4607>.
Acknowledgements
The authors would like to thank Steve Simlo and Taki Millonis for
helping with the original idea; Alia Atlas, Bill Atwood, Joe Clarke,
Alissa Cooper, Jake Holland, Bharat Joshi, Anish Kachinthaya, Anvitha
Kachinthaya, Benjamin Kaduk, Mirja Kühlewind, Barry Leiba, Ben Niven-
Jenkins, Alvaro Retana, Adam Roach, Michael Scharf, Éric Vyncke, and
Carl Wallace for reviews and comments; and Toerless Eckert and
Rishabh Parekh for helpful conversation on the document.
Authors' Addresses Authors' Addresses
Yiqun Cai Yiqun Cai
Alibaba Group Alibaba Group
520 Almanor Avenue
Sunnyvale, CA 94085
United States of America
Email: yiqun.cai@alibaba-inc.com Email: yiqun.cai@alibaba-inc.com
Heidi Ou Heidi Ou
Alibaba Group Alibaba Group
520 Almanor Avenue
Sunnyvale, CA 94085
United States of America
Email: heidi.ou@alibaba-inc.com Email: heidi.ou@alibaba-inc.com
Sri Vallepalli Sri Vallepalli
Cisco Systems, Inc.
3625 Cisco Way
San Jose CA 95134
USA
Email: svallepa@cisco.com Email: vallepal@yahoo.com
Mankamana Mishra Mankamana Mishra
Cisco Systems, Inc. Cisco Systems, Inc.
821 Alder Drive, 821 Alder Drive,
Milpitas CA 95035 Milpitas, CA 95035
USA United States of America
Email: mankamis@cisco.com Email: mankamis@cisco.com
Stig Venaas Stig Venaas
Cisco Systems, Inc. Cisco Systems, Inc.
Tasman Drive Tasman Drive
San Jose CA 95134 San Jose, CA 95134
USA United States of America
Email: stig@cisco.com Email: stig@cisco.com
Andy Green Andy Green
British Telecom British Telecom
Adastral Park Adastral Park
Ipswich IP5 2RE Ipswich
IP5 2RE
United Kingdom United Kingdom
Email: andy.da.green@bt.com Email: andy.da.green@bt.com
 End of changes. 162 change blocks. 
343 lines changed or deleted 360 lines changed or added
