draft-ietf-pim-drlb-10.txt   draft-ietf-pim-drlb-11.txt 
Network Working Group Y. Cai Network Working Group Y. Cai
Internet-Draft H. Ou Internet-Draft H. Ou
Intended status: Standards Track Alibaba Group Intended status: Standards Track Alibaba Group
Expires: May 17, 2019 S. Vallepalli Expires: April 13, 2020 S. Vallepalli
M. Mishra M. Mishra
S. Venaas S. Venaas
Cisco Systems, Inc. Cisco Systems, Inc.
A. Green A. Green
British Telecom British Telecom
November 13, 2018 October 11, 2019
PIM Designated Router Load Balancing PIM Designated Router Load Balancing
draft-ietf-pim-drlb-10 draft-ietf-pim-drlb-11
Abstract Abstract
On a multi-access network, one of the PIM routers is elected as a On a multi-access network, one of the PIM-SM routers is elected as a
Designated Router (DR). On the last hop LAN, the PIM DR is Designated Router. One of the responsibilities of the Designated
responsible for tracking local multicast listeners and forwarding Router is to track local multicast listeners and forward data to
traffic to these listeners if the group is operating in PIM-SM. This these listeners if the group is operating in PIM-SM. This document
document specifies a modification to the PIM-SM protocol that allows specifies a modification to the PIM-SM protocol that allows more than
more than one of these last hop routers to be selected, so that the one of the PIM-SM routers to take on this responsibility so that the
forwarding load can be distributed among these routers. forwarding load can be distributed among multiple routers.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 17, 2019. This Internet-Draft will expire on April 13, 2020.
Copyright Notice Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the Copyright (c) 2019 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5
3. Applicability . . . . . . . . . . . . . . . . . . . . . . . . 5 3. Applicability . . . . . . . . . . . . . . . . . . . . . . . . 5
4. Functional Overview . . . . . . . . . . . . . . . . . . . . . 6 4. Functional Overview . . . . . . . . . . . . . . . . . . . . . 5
4.1. GDR Candidates . . . . . . . . . . . . . . . . . . . . . 6 4.1. GDR Candidates . . . . . . . . . . . . . . . . . . . . . 6
4.2. Hash Mask and Hash Algorithm . . . . . . . . . . . . . . 7 5. Protocol Specification . . . . . . . . . . . . . . . . . . . 7
4.3. Modulo Hash Algorithm . . . . . . . . . . . . . . . . . . 8 5.1. Hash Mask and Hash Algorithm . . . . . . . . . . . . . . 7
4.3.1. Limitations . . . . . . . . . . . . . . . . . . . . . 9 5.2. Modulo Hash Algorithm . . . . . . . . . . . . . . . . . . 8
4.4. PIM Hello Options . . . . . . . . . . . . . . . . . . . . 9 5.2.1. Modulo Hash Algorithm Example . . . . . . . . . . . . 9
5. Hello Option Formats . . . . . . . . . . . . . . . . . . . . 10 5.2.2. Limitations . . . . . . . . . . . . . . . . . . . . . 10
5.1. PIM DR Load Balancing Capability (DRLBC) Hello Option . . 10 5.3. PIM Hello Options . . . . . . . . . . . . . . . . . . . . 10
5.2. PIM DR Load Balancing GDR (DRLBGDR) Hello Option . . . . 10 5.3.1. PIM DR Load Balancing Capability (DRLB-Cap) Hello
6. Protocol Specification . . . . . . . . . . . . . . . . . . . 11 Option . . . . . . . . . . . . . . . . . . . . . . . 10
6.1. PIM DR Operation . . . . . . . . . . . . . . . . . . . . 11 5.3.2. PIM DR Load Balancing List (DRLB-List) Hello Option . 11
6.2. PIM GDR Candidate Operation . . . . . . . . . . . . . . . 12 5.4. PIM DR Operation . . . . . . . . . . . . . . . . . . . . 12
6.2.1. Router Receives New DRLBGDR . . . . . . . . . . . . . 13 5.5. PIM GDR Candidate Operation . . . . . . . . . . . . . . . 13
6.2.2. Router Receives Updated DRLBGDR . . . . . . . . . . . 13 5.6. DRLB-List Hello Option Processing . . . . . . . . . . . . 13
6.3. PIM Assert Modification . . . . . . . . . . . . . . . . . 14 5.7. PIM Assert Modification . . . . . . . . . . . . . . . . . 14
7. Compatibility . . . . . . . . . . . . . . . . . . . . . . . . 15 5.8. Backward Compatibility . . . . . . . . . . . . . . . . . 16
8. Manageability Considerations . . . . . . . . . . . . . . . . 16 6. Manageability Considerations . . . . . . . . . . . . . . . . 16
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 16 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17
9.1. Initial registry . . . . . . . . . . . . . . . . . . . . 16 7.1. Initial registry . . . . . . . . . . . . . . . . . . . . 17
9.2. Assignment of new hash algorithms . . . . . . . . . . . . 16 7.2. Assignment of new hash algorithms . . . . . . . . . . . . 17
10. Security Considerations . . . . . . . . . . . . . . . . . . . 16 8. Security Considerations . . . . . . . . . . . . . . . . . . . 17
11. Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 17 9. Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 18
12. References . . . . . . . . . . . . . . . . . . . . . . . . . 17 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 18
12.1. Normative References . . . . . . . . . . . . . . . . . . 17 10.1. Normative References . . . . . . . . . . . . . . . . . . 18
12.2. Informative References . . . . . . . . . . . . . . . . . 17 10.2. Informative References . . . . . . . . . . . . . . . . . 18
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 17 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 19
1. Introduction 1. Introduction
On a multi-access LAN such as an Ethernet, one of the PIM routers is On a multi-access LAN, such as an Ethernet, with one or more PIM-SM
elected as a DR. The PIM DR has two roles in the PIM-SM protocol. [RFC7761] routers, one of the PIM-SM routers is elected as a
On the first hop LAN, the PIM DR is responsible for registering an Designated Router (DR). The PIM DR has two responsibilities in the
active source with the Rendezvous Point (RP) if the group is PIM-SM protocol. For any active sources on a LAN, the PIM DR is
operating in PIM-SM. On the last hop LAN, the PIM DR is responsible responsible for registering with the Rendezvous Point (RP) if the
for tracking local multicast listeners and forwarding to these group is operating in PIM-SM. Also, the PIM DR is responsible for
listeners if the group is operating in PIM-SM. tracking local multicast listeners and forwarding to these listeners
if the group is operating in PIM-SM.
Consider the following last hop LAN in Figure 1: Consider the following LAN in Figure 1:
(core networks) (core networks)
| | | | | |
| | | | | |
R1 R2 R3 R1 R2 R3
| | | | | |
--(last hop LAN)-- ----(LAN)----
| |
| |
(many receivers) (many receivers)
Figure 1: Last Hop LAN Figure 1: LAN with receivers
Assume R1 is elected as the Designated Router. According to
[RFC7761], R1 will be responsible for forwarding traffic to that LAN
on behalf of any local members. In addition to keeping track of IGMP
and MLD membership reports, R1 is also responsible for initiating the
creation of source and/or shared trees towards the senders or the
RPs.
Forcing sole data plane forwarding responsibility on the PIM DR Assume R1 is elected as the DR. According to the PIM-SM protocol, R1
uncovers a limitation in the protocol. In comparison, even though an will be responsible for forwarding traffic to that LAN on behalf of
OSPF DR or an IS-IS DIS handles additional duties while running the any local members. In addition to keeping track of membership
OSPF or IS-IS protocols, they are not required to be solely reports, R1 is also responsible for initiating the creation of source
responsible for forwarding packets for the network. On the other and/or shared trees towards the senders or the RPs. The membership
hand, on a last hop LAN, only the PIM DR is asked to forward packets reports would be IGMP or MLD messages. This applies to any versions
while the other routers handle only control traffic (and perhaps drop of the IGMP and MLD protocols. The most recent versions are IGMPv3
packets due to RPF failures). Hence the forwarding load of a last [RFC3376] and MLDv2 [RFC3810].
hop LAN is concentrated on a single router.
This leads to several issues. One of the issues is that the Having a single router acting as DR and being responsible for data
aggregated bandwidth will be limited to what R1 can handle towards plane forwarding leads to several issues. One of the issues is that
this particular interface. It is very common that the last hop LAN the aggregated bandwidth will be limited to what R1 can handle with
consists of switches that run IGMP/MLD or PIM snooping. This allows regard to the capacity of incoming links, the interface on the LAN, and
total forwarding capacity. It is very common that a LAN consists of
switches that run IGMP/MLD or PIM snooping [RFC4541]. This allows
the forwarding of multicast packets to be restricted only to segments the forwarding of multicast packets to be restricted only to segments
leading to receivers who have indicated their interest in multicast leading to receivers who have indicated their interest in multicast
groups using either IGMP or MLD. The emergence of the switched groups using either IGMP or MLD. The emergence of the switched
Ethernet allows the aggregated bandwidth to exceed, sometimes by a Ethernet allows the aggregated bandwidth to exceed, sometimes by a
large number, that of a single link. For example, let us modify large number, that of a single link. For example, let us modify
Figure 1 and introduce an Ethernet switch in Figure 2. Figure 1 and introduce an Ethernet switch in Figure 2.
(core networks) (core networks)
| | | | | |
| | | | | |
R1 R2 R3 R1 R2 R3
| | | | | |
+=gi0===gi1===gi2=+ +=gi0===gi1===gi2=+
+ + + +
+ switch + + switch +
+ + + +
+=gi4===gi5===gi6=+ +=gi4===gi5===gi6=+
| | | | | |
H1 H2 H3 H1 H2 H3
Figure 2: Last Hop Network with Ethernet Switch Figure 2: LAN with Ethernet Switch
Let us assume that each individual link is a Gigabit Ethernet. Each Let us assume that each individual link is a Gigabit Ethernet. Each
router, R1, R2 and R3, and the switch have enough forwarding capacity router, R1, R2 and R3, and the switch have enough forwarding capacity
to handle hundreds of Gigabits of data. to handle hundreds of Gigabits of data.
Let us further assume that each of the hosts requests 500 Mbps of Let us further assume that each of the hosts requests 500 Mbps of
unique multicast data. This totals to 1.5 Gbps of data, which is unique multicast data. This totals to 1.5 Gbps of data, which is
less than what the switch or the combined uplink bandwidth across less than what the switch or the combined uplink bandwidth across
the routers can handle, even under failure of a single router. the routers can handle, even under failure of a single router.
On the other hand, the link between R1 and the switch, via port gi0, can On the other hand, the link between R1 and the switch, via port gi0, can
only handle a throughput of 1 Gbps. And if R1 is the only DR (the PIM only handle a throughput of 1 Gbps. And if R1 is the only DR (the PIM
DR elected using the procedure defined by [RFC7761]) at least 500 DR elected using the procedure defined by [RFC7761]) at least 500
Mbps worth of data will be lost because the only link that can be Mbps worth of data will be lost because the only link that can be
used to draw the traffic from the routers to the switch is via gi0. used to draw the traffic from the routers to the switch is via gi0.
In other words, the entire network's throughput is limited by the In other words, the entire network's throughput is limited by the
single connection between the PIM DR and the switch (or the last hop single connection between the PIM DR and the switch (or LAN as in
LAN as in Figure 1). Figure 1).
Another important issue is related to failover. If R1 is the only Another important issue is related to failover. If R1 is the only
forwarder on the last hop router for a shared LAN, when R1 goes out forwarder on a shared LAN, when R1 goes out of service, multicast
of service, multicast forwarding for the entire LAN has to be rebuilt forwarding for the entire LAN has to be rebuilt by the newly elected
by the newly elected PIM DR. However, if there was a way that PIM DR. However, if there was a way that allowed multiple routers to
allowed multiple routers to forward to the LAN for different groups, forward to the LAN for different groups, failure of one of the
failure of one of the routers would only lead to disruption to a routers would only lead to disruption to a subset of the flows,
subset of the flows, therefore improving the overall resilience of therefore improving the overall resilience of the network.
the network.
There is a limitation in the hash algorithm used in this document,
but this document provides the option to have different and more
consistent hash algorithms in the future.
This document specifies a modification to the PIM-SM protocol that This document specifies a modification to the PIM-SM protocol that
allows more than one of these routers, called Group Designated allows more than one of these routers, called Group Designated
Routers (GDR) to be selected so that the forwarding load can be Routers (GDR) to be selected so that the forwarding load can be
distributed among a number of routers. distributed among a number of routers.
2. Terminology 2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
document are to be interpreted as described in [RFC2119]. "OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
With respect to PIM, this document follows the terminology that has With respect to PIM-SM, this document follows the terminology that
been defined in [RFC7761]. has been defined in [RFC7761].
This document also introduces the following new acronyms: This document also introduces the following new acronyms:
o GDR: GDR stands for "Group Designated Router". For each multicast o GDR: Group Designated Router. For each multicast flow, either a
flow, either a (*,G) for ASM, or an (S,G) for SSM, a hash (*,G) for Any-Source Multicast (ASM), or an (S,G) for Source-
algorithm (described below) is used to select one of the routers Specific Multicast (SSM) [RFC4607], a hash algorithm (described
as a GDR. The GDR is responsible for initiating the forwarding below) is used to select one of the routers as a GDR. The GDR is
tree building process for the corresponding multicast flow. responsible for initiating the forwarding tree building process
for the corresponding multicast flow.
o GDR Candidate: a last hop router that has the potential to become o GDR Candidate: a router that has the potential to become a GDR.
a GDR. A GDR Candidate must have the same DR priority and must There might be multiple GDR Candidates on a LAN, but only one can
run the same GDR election hash algorithm as the DR router. It become the GDR for a specific multicast flow.
must send and process new PIM Hello Options as defined in this
document. There might be more than one GDR Candidate on a LAN,
but only one can become GDR for a specific multicast flow.
3. Applicability 3. Applicability
The extension specified in this document applies to PIM-SM last hop The extension specified in this document applies to PIM-SM routers when they
routers only. It does not alter the behavior of a PIM DR on the act as last hop routers (there are directly connected receivers). It
first hop network. This is because the source tree is built using does not alter the behavior of a PIM DR, or any other routers, on the
the IP address of the sender, not the IP address of the PIM DR that first hop network (directly connected sources). This is because the
sends the registers towards the RP. The load balancing between first source tree is built using the IP address of the sender, not the IP
hop routers can be achieved naturally if an IGP provides equal cost address of the PIM DR that sends the registers towards the RP. The
multiple paths (which it usually does in practice). Also load balancing between first hop routers can be achieved naturally if
distributing the load to do registering does not justify the an IGP provides equal cost multiple paths (which it usually does in
additional complexity required to support it. practice). Also distributing the load to do registering does not
justify the additional complexity required to support it.
4. Functional Overview 4. Functional Overview
In the PIM DR election as defined in [RFC7761], when multiple last In the PIM DR election as defined in [RFC7761], when multiple routers
hop routers are connected to a multi-access LAN (for example, an are connected to a multi-access LAN (for example, an Ethernet), one
Ethernet), one of them is elected to act as PIM DR. The PIM DR is of them is elected to act as PIM DR. The PIM DR is responsible for
responsible for sending local Join/Prune messages towards the RP or sending local Join/Prune messages towards the RP or source. In order
source. In order to elect the PIM DR, each PIM router on the LAN to elect the PIM DR, each PIM router on the LAN examines the received
examines the received PIM Hello messages and compares its own DR PIM Hello messages and compares its own DR priority and IP address
priority and IP address with those of its neighbors. The router with with those of its neighbors. The router with the highest DR priority
the highest DR priority is the PIM DR. If there are multiple such is the PIM DR. If there are multiple such routers, their IP
routers, their IP addresses are used as the tie-breaker, as described addresses are used as the tie-breaker, as described in [RFC7761].
in [RFC7761].
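The election described above can be sketched roughly as follows. This is an illustration only: the Neighbor type and the elect_dr function are hypothetical names, and the rule that the numerically highest IP address wins the tie-break is taken from [RFC7761] rather than from the text above. The sketch also ignores the [RFC7761] case where a router omits the DR Priority option.

```python
from dataclasses import dataclass
from ipaddress import ip_address


@dataclass(frozen=True)
class Neighbor:
    """Hypothetical record for a PIM router seen on the LAN (self included)."""
    address: str
    dr_priority: int


def elect_dr(routers):
    """Return the router with the highest DR priority.

    Ties are broken by the numerically highest IP address (RFC 7761 rule,
    assumed here). All routers are assumed to use one address family.
    """
    return max(routers, key=lambda r: (r.dr_priority, int(ip_address(r.address))))


routers = [
    Neighbor("203.0.113.1", 1),
    Neighbor("203.0.113.3", 1),
    Neighbor("203.0.113.2", 1),
    Neighbor("203.0.113.4", 0),  # lower priority, so never preferred here
]
print(elect_dr(routers).address)
```

The router at 203.0.113.4 has the highest address but loses because priority is compared first.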
In order to share forwarding load among last hop routers, besides the In order to share forwarding load among last hop routers, besides the
normal PIM DR election, the GDR is also elected on the last hop normal PIM DR election, the GDR is also elected on the multi-access
multi-access LAN. There is only one PIM DR on the multi-access LAN, LAN. There is only one PIM DR on the multi-access LAN, but there
but there might be multiple GDR Candidates. might be multiple GDR Candidates.
For each multicast flow, that is, (*,G) for ASM and (S,G) for SSM, a For each multicast flow, that is, (*,G) for ASM and (S,G) for SSM, a
hash algorithm is used to select one of the routers to be the GDR. A hash algorithm is used to select one of the routers to be the GDR. A
new DR Load Balancing Capability (DRLBC) PIM Hello Option, which new DR Load Balancing Capability (DRLB-Cap) PIM Hello Option, which
contains hash algorithm type, is announced by routers on interfaces contains hash algorithm type, is announced by routers on interfaces
where this specification is enabled. Last hop routers with the new where this specification is enabled. Routers with the new DRLB-Cap
DRLBC Option advertised in its Hello, and using the same GDR election Option advertised in their PIM Hello, using the same GDR election
hash algorithm and the same DR priority as the PIM DR, are considered hash algorithm and the same DR priority as the PIM DR, are considered
as GDR Candidates. as GDR Candidates.
Hash Masks are defined for Source, Group and RP separately, in order Hash Masks are defined for Source, Group and RP separately, in order
to handle PIM ASM/SSM. The masks, as well as a sorted list of GDR to handle PIM ASM/SSM. The masks, as well as a sorted list of GDR
Candidate Addresses, are announced by the DR in a new DR Load Candidate Addresses, are announced by the DR in a new DR Load
Balancing GDR (DRLBGDR) PIM Hello Option. Balancing List (DRLB-List) PIM Hello Option.
A hash algorithm based on the announced Source, Group, or RP masks A hash algorithm based on the announced Source, Group, or RP masks
allows one GDR to be assigned to a corresponding multicast state. allows one GDR to be assigned to a corresponding multicast state.
And that GDR is responsible for initiating the creation of the And that GDR is responsible for initiating the creation of the
multicast forwarding tree for multicast traffic. multicast forwarding tree for multicast traffic.
4.1. GDR Candidates 4.1. GDR Candidates
GDR is the new concept introduced by this specification. GDR GDR is the new concept introduced by this specification. GDR
Candidates are routers eligible for GDR election on the LAN. To Candidates are routers eligible for GDR election on the LAN. To
become a GDR Candidate, a router MUST support this specification, become a GDR Candidate, a router must have the same DR priority and
have the same DR priority and run the same GDR election hash run the same GDR election hash algorithm as the DR on the LAN.
algorithm as the DR on the LAN.
For example, assume there are 4 routers on the LAN: R1, R2, R3 and For example, assume there are 4 routers on the LAN: R1, R2, R3 and
R4, which all support this specification. R1, R2 and R3 have the R4, each announcing a DRLB-Cap option. R1, R2 and R3 have the same
same DR priority while R4's DR priority is less preferred. In this DR priority while R4's DR priority is less preferred. In this
example, R4 will not be eligible for GDR election, because R4 will example, R4 will not be eligible for GDR election, because R4 will
not become a PIM DR unless all of R1, R2 and R3 go out of service. not become a PIM DR unless all of R1, R2 and R3 go out of service.
Furthermore, assume router R1 wins the PIM DR election, R1 and R2 run Furthermore, assume router R1 wins the PIM DR election, R1 and R2 run
the same hash algorithm for GDR election, while R3 runs a different the same hash algorithm for GDR election, while R3 runs a different
one. In this case, only R1 and R2 will be eligible for GDR election, one. In this case, only R1 and R2 will be eligible for GDR election,
while R3 will not. while R3 will not.
As a DR, R1 will include its own Load Balancing Hash Masks and the As a DR, R1 will include its own Load Balancing Hash Masks and the
identity of R1 and R2 (the GDR Candidates) in its DRLBGDR Hello identity of R1 and R2 (the GDR Candidates) in its DRLB-List Hello
Option. Option.
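The R1 to R4 example above can be sketched in a few lines. Everything named here is an assumption made for illustration: the Router type, the gdr_candidates function, the addresses given to R1 through R4, and the algorithm numbers 1 and 2, which are placeholders and not registry values. The descending address order follows the "sorted by IP addresses from high to low" wording used in the hash example of this document.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from typing import Optional


@dataclass(frozen=True)
class Router:
    """Hypothetical view of a router as learned from its PIM Hello."""
    name: str
    address: str
    dr_priority: int
    hash_algorithm: Optional[int]  # None: no DRLB-Cap option announced


def gdr_candidates(dr, routers):
    """Routers announcing DRLB-Cap with the DR's hash algorithm and priority."""
    eligible = [
        r for r in routers
        if r.hash_algorithm is not None
        and r.hash_algorithm == dr.hash_algorithm
        and r.dr_priority == dr.dr_priority
    ]
    # Sorted from high to low address (assumed ordering, see lead-in).
    return sorted(eligible, key=lambda r: int(ip_address(r.address)), reverse=True)


r1 = Router("R1", "203.0.113.4", 1, 1)  # the elected DR in the example
r2 = Router("R2", "203.0.113.3", 1, 1)
r3 = Router("R3", "203.0.113.2", 1, 2)  # same priority, different algorithm
r4 = Router("R4", "203.0.113.1", 0, 1)  # less preferred DR priority
print([r.name for r in gdr_candidates(r1, [r1, r2, r3, r4])])
```

Only R1 and R2 survive both checks, which matches the list R1 would announce.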
4.2. Hash Mask and Hash Algorithm 5. Protocol Specification
5.1. Hash Mask and Hash Algorithm
A Hash Mask is used to extract a number of bits from the A Hash Mask is used to extract a number of bits from the
corresponding IP address field (32 for v4, 128 for v6) and calculate corresponding IP address field (32 for IPv4, 128 for IPv6) and
a hash value. A hash value is used to select a GDR from GDR calculate a hash value. A hash value is used to select a GDR from
Candidates advertised by PIM DR. For example, 0.0.255.0 defines a GDR Candidates advertised by PIM DR. For example, 0.0.255.0 defines
Hash Mask for an IPv4 address that masks the first, the second, and a Hash Mask for an IPv4 address that masks the first, the second, and
the fourth octets. the fourth octets. Hash masks allow for certain flows to always be
forwarded by the same GDR, since the hash values are the same. For
instance the mask 0.0.255.0 means that only the third octet will be
considered when hashing.
In the text below, a hash mask is in some places said to be zero. A
hash mask is zero if no bits are set. That is, 0.0.0.0 for IPv4 and
:: for IPv6. Also, a hash mask is said to be an all-bits-set mask if
it is 255.255.255.255 for IPv4 or
FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF for IPv6.
There are three Hash Masks defined: There are three Hash Masks defined:
o RP Hash Mask o RP Hash Mask
o Source Hash Mask o Source Hash Mask
o Group Hash Mask o Group Hash Mask
The hash masks need to be configured on the PIM routers that can The hash masks need to be configured on the PIM routers that can
potentially become a PIM DR, unless the implementation provides potentially become a PIM DR, unless the implementation provides
default Hash Mask values. An implementation SHOULD provide masks default hash mask values. An implementation SHOULD have default hash
with default values 255.255.255.255 (IPv4) and mask values as follows. The default RP Hash Mask SHOULD be zero (no
FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF (IPv6). bits set). The default Source and Group Hash Masks SHOULD both be
all-bits-set masks. These default values are likely acceptable for
most deployments, and simplify configuration.
The DRLB-List Hello Option contains a list of GDR Candidates. The
first one listed has ordinal number 0, the second listed ordinal
number 1, and the last one has ordinal number N - 1 if there are N
candidates listed. The hash value computed will be the ordinal
number of the GDR Candidate that is acting as GDR.
o If the group is in ASM mode and the RP Hash Mask announced by the o If the group is in ASM mode and the RP Hash Mask announced by the
PIM DR is not 0, calculate the value of hashvalue_RP [Section 4.3] PIM DR is not zero (at least one bit is set), calculate the value
to determine GDR. of hashvalue_RP [Section 5.2] to determine the GDR.
o If the group is in ASM mode and the RP Hash Mask announced by the o If the group is in ASM mode and the RP Hash Mask announced by the
PIM DR is 0, obtain the value of hashvalue_Group [Section 4.3 ] to PIM DR is zero (no bits are set), obtain the value of
determine GDR. hashvalue_Group [Section 5.2] to determine the GDR.
o If the group is in SSM mode, use hashvalue_SG [Section 4.3] to o If the group is in SSM mode, use hashvalue_SG [Section 5.2] to
determine GDR. determine the GDR.
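The three rules above amount to a small decision function. The sketch below is illustrative only; hash_to_use, is_zero_mask and the "ASM"/"SSM" string labels are hypothetical names, not part of the specification.

```python
from ipaddress import ip_address


def is_zero_mask(mask):
    """True if no bits are set, that is 0.0.0.0 for IPv4 or :: for IPv6."""
    return int(ip_address(mask)) == 0


def hash_to_use(mode, rp_mask):
    """Return the name of the hash value that selects the GDR.

    mode is "ASM" or "SSM"; rp_mask is the RP Hash Mask announced by the DR.
    """
    if mode == "SSM":
        return "hashvalue_SG"
    if mode == "ASM":
        return "hashvalue_Group" if is_zero_mask(rp_mask) else "hashvalue_RP"
    raise ValueError(f"unknown mode: {mode}")


print(hash_to_use("ASM", "0.0.255.0"))
print(hash_to_use("ASM", "0.0.0.0"))
print(hash_to_use("SSM", "0.0.0.0"))
```

Note that the RP Hash Mask plays no part in the SSM case.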
A simple Modulo hash algorithm is defined in this document. However, A simple Modulo hash algorithm is defined in this document. However,
to allow other hash algorithms to be used, a 1-octet "Hash to allow other hash algorithms to be used, a 1-octet "Hash
Algorithm" field is included in DRLBC Hello Option to specify the Algorithm" field is included in the DRLB-Cap Hello Option to specify
hash algorithm used by a last hop router. the hash algorithm used by the router.
If different hash algorithms are advertised among last hop routers, If different hash algorithms are advertised among the routers on a
only last hop routers running the same hash algorithm as the DR (and LAN, only the routers advertising the same hash algorithm as the DR
having the same DR priority as the DR) are eligible for GDR election. (as well as having the same DR priority as the DR) are eligible for
GDR election.
4.3. Modulo Hash Algorithm 5.2. Modulo Hash Algorithm
The Modulo hash algorithm is discussed here with a detailed As part of computing the hash, the notation LSZC(hash_mask) is used
description on hashvalue_RP. The same algorithm is described in to denote the number of zeroes counted from the least significant bit
brief for hashvalue_Group using the group address instead of the RP of a Hash Mask hash_mask. As an example, LSZC(255.255.128.0) is 15 and
address for an ASM group with zero RP_hashmask, and also with also LSZC(FFFF:8000::) is 111. If all bits are set, LSZC will be 0.
hashvalue_SG for the source address of an (S,G), instead of the RP If the mask is zero, then LSZC will be 32 for IPv4, and 128 for IPv6.
address.
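The LSZC count can be computed with a couple of integer operations. The following is a minimal sketch, assuming masks are given as address strings; the function name lszc is a hypothetical helper.

```python
from ipaddress import ip_address


def lszc(mask):
    """Count zero bits from the least significant end of a hash mask."""
    addr = ip_address(mask)
    value = int(addr)
    if value == 0:
        # Zero mask: every bit is a zero (32 for IPv4, 128 for IPv6).
        return addr.max_prefixlen
    # value & -value isolates the lowest set bit; its bit position is the count.
    return (value & -value).bit_length() - 1


for mask in ("0.0.255.0", "255.255.128.0", "255.255.255.255", "0.0.0.0",
             "FFFF:8000::", "::"):
    print(mask, lszc(mask))
```

For 0.0.255.0 this gives 8, and for FFFF:8000:: it gives 111.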
o For ASM groups, with a non-zero RP_Hash Mask, hash value is The number of GDR Candidates is denoted as GDRC.
calculated as:
hashvalue_RP = (((RP_address & RP_hashmask) >> N) & 0xFFFF) % M The idea behind the Modulo hash algorithm is, in simple terms, that the
corresponding mask is applied to a value, then the result is shifted
right LSZC(mask) bits so that the least significant bits that were
masked out are not considered. Then this result is masked by 0xFFFF,
keeping only the low-order 16 bits of the result (this matters when the
shifted value is wider than 16 bits). Finally, the hash value is this result modulo
the number of GDR Candidates (GDRC).
RP_address is the address of the RP defined for the group. N The Modulo hash algorithm for computing the values hashvalue_RP,
is the number of zeroes, counted from the least significant bit hashvalue_Group and hashvalue_SG is defined as follows.
of the RP_hashmask. M is the number of GDR Candidates.
For example, Router X with IPv4 address 203.0.113.1 receives a hashvalue_RP is calculated as:
DRLBGDR Hello Option from the DR, which announces RP Hash Mask
0.0.255.0 and a list of GDR Candidates, sorted by IP addresses
from high to low: 203.0.113.3, 203.0.113.2 and 203.0.113.1.
The ordinal number assigned to those addresses would be:
0 for 203.0.113.3; 1 for 203.0.113.2; 2 for 203.0.113.1 (Router (((RP_address & RP_mask) >> LSZC(RP_mask)) & 0xFFFF) % GDRC
X)
Assume there are 2 RPs: RP1 192.0.2.1 for Group1 and RP2 RP_address is the address of the RP defined for the group and
198.51.100.2 for Group2. Following the modulo hash algorithm: RP_mask is the RP Hash Mask.
N is 8 for 0.0.255.0, and M is 3 for the total number of GDR hashvalue_Group is calculated as:
Candidates. The hashvalue_RP for RP1 192.0.2.1 is:
(((192.0.2.1 & 0.0.255.0) >> 8) & 0xFFFF) % 3 = 2 % 3 = 2 (((Group_address & Group_mask) >> LSZC(Group_mask)) & 0xFFFF) %
GDRC
matches the ordinal number assigned to Router X. Router X will Group_address is the group address and Group_mask is the Group
be the GDR for Group1, which uses 192.0.2.1 as the RP. Hash Mask.
The hashvalue_RP for RP2 198.51.100.2 is: hashvalue_SG is calculated as:
(((198.51.100.2 & 0.0.255.0) >> 8) & 0xFFFF % 3) = 100 % 3 = 1 ((((Source_address & Source_mask) >> LSZC(Source_mask)) & 0xFFFF)
which is different from Router X's ordinal number(2) hence, ^ (((Group_address & Group_mask) >> LSZC(Group_mask)) & 0xFFFF)) %
Router X will not be GDR for Group2. GDRC
o If RP_hashmask is 0, a hash value for an ASM group is calculated Group_address is the group address and Group_mask is the Group
using the Group Hash Mask: Hash Mask.
hashvalue_Group = (((Group_address & Group_hashmask) >> N) & 5.2.1. Modulo Hash Algorithm Example
0xFFFF) % M
Compare hashvalue_Group with Ordinal number assigned to Router To help illustrate the algorithm, consider this example. Router X
X, to decide if Router X is the GDR. with IPv4 address 203.0.113.1 receives a DRLB-List Hello Option from
the DR, which announces RP Hash Mask 0.0.255.0 and a list of GDR
Candidates, sorted by IP addresses from high to low: 203.0.113.3,
203.0.113.2 and 203.0.113.1. The ordinal number assigned to those
addresses would be:
o For SSM groups, a hash value is calculated using both the Source 0 for 203.0.113.3; 1 for 203.0.113.2; 2 for 203.0.113.1 (Router X)
and Group Hash Mask:
hashvalue_SG = ((((Source_address & Source_hashmask) >> N_S) & Assume there are 2 RPs: RP1 192.0.2.1 for Group1 and RP2 198.51.100.2
0xFFFF) ^ (((Group_address & Group_hashmask) >> N_G) & 0xFFFF)) for Group2. Following the modulo hash algorithm:
% M
4.3.1. Limitations LSZC(0.0.255.0) is 8 and GDRC is 3. The hashvalue_RP for Group1 with
RP RP1 is:
((((192.0.2.1 & 0.0.255.0) >> 8) & 0xFFFF) % 3) = 2 % 3 = 2
which matches the ordinal number assigned to Router X. Router X will
be the GDR for Group1.
The hashvalue_RP for Group2 with RP RP2 is:
((((198.51.100.2 & 0.0.255.0) >> 8) & 0xFFFF) % 3) = 100 % 3 = 1
which is different from the ordinal number of router X (2). Hence,
Router X will not be GDR for Group2.
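The worked example above can be reproduced with a short Python sketch. This is an illustration only, not part of the specification: the function names lszc and modulo_hash are hypothetical, and returning 0 for LSZC of an all-zero mask is an assumption, since the text shown here does not define that case.

```python
import ipaddress


def lszc(mask: int) -> int:
    """Count the zero bits at the least significant end of a mask (LSZC)."""
    if mask == 0:
        return 0  # assumption: LSZC of an all-zero mask is not defined in the text
    count = 0
    while mask & 1 == 0:
        mask >>= 1
        count += 1
    return count


def modulo_hash(address: str, mask: str, gdrc: int) -> int:
    """Apply the mask, shift out the masked low bits, mask by 0xFFFF, take modulo GDRC."""
    a = int(ipaddress.ip_address(address))
    m = int(ipaddress.ip_address(mask))
    return (((a & m) >> lszc(m)) & 0xFFFF) % gdrc


# GDR Candidates in the order announced by the DR (high to low).
candidates = ["203.0.113.3", "203.0.113.2", "203.0.113.1"]
ordinal = {addr: i for i, addr in enumerate(candidates)}

hash_rp1 = modulo_hash("192.0.2.1", "0.0.255.0", len(candidates))
hash_rp2 = modulo_hash("198.51.100.2", "0.0.255.0", len(candidates))

# Router X is the GDR for a group when the hash equals its ordinal number.
router_x = "203.0.113.1"
x_is_gdr_group1 = hash_rp1 == ordinal[router_x]
x_is_gdr_group2 = hash_rp2 == ordinal[router_x]
```

With the inputs from the example, Router X is selected for Group1 and not for Group2.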
5.2.2. Limitations
The Modulo Hash Algorithm has poor failover characteristics when a The Modulo Hash Algorithm has poor failover characteristics when a
shared LAN has more than two GDRs. In the case of more than two GDRs shared LAN has more than two GDRs. In the case of more than two GDRs
on a LAN, when one GDR fails, all of the groups may be reassigned to on a LAN, when one GDR fails, all of the groups may be reassigned to
a new GDR, even if they were not assigned to the failed GDR. a different GDR, even if they were not assigned to the failed GDR.
However, many deployments use only two routers on a shared LAN for However, many deployments use only two routers on a shared LAN for
redundancy purposes. Future work may define new hash algorithms redundancy purposes. Future work may define new hash algorithms
where only groups assigned to the failed GDR get reassigned. where only groups assigned to the failed GDR get reassigned.
4.4. PIM Hello Options 5.3. PIM Hello Options
When a last hop PIM router sends a PIM Hello for an interface with
this specification enabled, it includes a new option, called "Load
Balancing Capability (DRLBC)".
Besides this DRLBC Hello Option, the elected PIM DR also includes a
new "DR Load Balancing GDR (DRLBGDR) Hello Option". The DRLBGDR
Hello Option consists of three Hash Masks as defined above and also a
sorted list of GDR Candidate addresses on the last hop LAN.
The elected PIM DR uses DRLBC Hello Option advertised by all routers When a PIM router sends a PIM Hello on an interface with this
on the last hop LAN to compose the DRLBGDR Option. The GDR specification enabled, it includes a new option, called "Load
Candidates use the DRLBGDR Hello Option advertised by the PIM DR to Balancing Capability (DRLB-Cap)".
calculate the hash value.
5. Hello Option Formats Besides this DRLB-Cap Hello Option, the elected PIM DR also includes
a new "DR Load Balancing List (DRLB-List) Hello Option". The DRLB-
List Hello Option consists of three Hash Masks as defined above and
also a sorted list of GDR Candidate addresses on the LAN.
5.1. PIM DR Load Balancing Capability (DRLBC) Hello Option 5.3.1. PIM DR Load Balancing Capability (DRLB-Cap) Hello Option
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type = TBD | Length = 4 | | Type = 34 | Length = 4 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved |Hash Algorithm | | Reserved |Hash Algorithm |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 3: Capability Hello Option Figure 3: PIM DR Load Balancing Capability Hello Option
Type: TBD Type: 34
Length: 4 Length: 4
Hash Algorithm: 0 for Modulo Reserved: Transmitted as zero, ignored on receipt.
This DRLBC Hello Option MUST be advertised by last hop routers on Hash Algorithm: Hash algorithm type. 0 for the Modulo algorithm
interfaces with this specification enabled. defined in this document.
5.2. PIM DR Load Balancing GDR (DRLBGDR) Hello Option This DRLB-Cap Hello Option MUST be advertised by routers on all
interfaces where DR Load Balancing is enabled.
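As an illustration of the DRLB-Cap layout, the following Python sketch encodes and decodes the option. It assumes the usual PIM Hello option framing of a 16-bit Type and 16-bit Length, and it assumes that the value is 24 Reserved bits followed by an 8-bit Hash Algorithm field (consistent with the 0-255 registry range). The function names are hypothetical.

```python
import struct

DRLB_CAP_TYPE = 34
HASH_ALG_MODULO = 0


def encode_drlb_cap(hash_algorithm: int = HASH_ALG_MODULO) -> bytes:
    """Build the option: Type, Length = 4, Reserved (zero), Hash Algorithm."""
    if not 0 <= hash_algorithm <= 255:
        raise ValueError("hash algorithm must fit in 8 bits (assumed field width)")
    # Reserved bits are transmitted as zero, so the 32-bit value is just the algorithm.
    return struct.pack("!HHI", DRLB_CAP_TYPE, 4, hash_algorithm)


def decode_drlb_cap(data: bytes) -> int:
    """Return the Hash Algorithm, ignoring the Reserved bits on receipt."""
    opt_type, length, value = struct.unpack("!HHI", data)
    if opt_type != DRLB_CAP_TYPE or length != 4:
        raise ValueError("not a DRLB-Cap Hello Option")
    return value & 0xFF


encoded = encode_drlb_cap()
```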
5.3.2. PIM DR Load Balancing List (DRLB-List) Hello Option
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type = TBD | Length | | Type = 35 | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Group Mask | | Group Mask |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Mask | | Source Mask |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RP Mask | | RP Mask |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| GDR Candidate Address(es) | | GDR Candidate Address(es) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 4: GDR Hello Option Figure 4: PIM DR Load Balancing List Hello Option
Type: TBD Type: 35
Length: (3 + n) x (4 or 16) where n is the number of GDR Length: (3 + n) x (4 or 16), where n is the number of GDR
candidates. candidates.
Group Mask (32/128 bits): Mask Group Mask (32/128 bits): Mask applied to group addresses as part
of hash computation.
Source Mask (32/128 bits): Mask Source Mask (32/128 bits): Mask applied to source addresses as
part of hash computation.
RP Mask (32/128 bits): Mask RP Mask (32/128 bits): Mask applied to RP addresses as part of
hash computation.
All masks MUST be in the same address family as the Hello IP All masks MUST have the same number of bits as the IP source
header. address in the PIM Hello IP header.
GDR Address (32/128 bits): Address(es) of GDR Candidate(s) GDR Address (32/128 bits): Address(es) of GDR Candidate(s)
All addresses must be in the same address family as the Hello All addresses MUST be in the same address family as the PIM
IP header. The addresses are sorted in descending order. The Hello IP header. It is RECOMMENDED that the addresses are
order is converted to the ordinal number associated with each sorted in descending order.
GDR candidate in hash value calculation. For example, if
addresses advertised are R3, R2, R1, the ordinal number
assigned to R3 is 0, to R2 is 1 and to R1 is 2.
If the "Interface ID" option, as specified in [RFC6395], is If the "Interface ID" option, as specified in [RFC6395], is
present in a GDR Candidate's PIM Hello message, and the "Router present in a GDR Candidate's PIM Hello message, and the "Router
ID" portion is non-zero: ID" portion is non-zero:
+ For IPv4, the "GDR Candidate Address" will be set directly + For IPv4, the "GDR Candidate Address" will be set directly
to the "Router ID". to the "Router ID".
+ For IPv6, the "GDR Candidate Address" will be set to the + For IPv6, the "GDR Candidate Address" will be 96 bits of
IPv4-IPv6 translated address of the "Router ID", as zeroes followed by the 32 bit Router ID.
described in [RFC4291], that is the "Router-ID" is appended
to the prefix of 96 bits of zeroes.
If the "Interface ID" option is not present in a GDR
Candidate's PIM Hello message, or if the "Interface ID" option
is present but the "Router ID" field is zero, the "GDR
Candidate Address" will be the IPv4 or IPv6 source address of
the PIM Hello message.
This DRLBGDR Hello Option MUST only be advertised by the If the "Interface ID" option is not present in a GDR Candidate's
elected PIM DR. PIM Hello message, or if the "Interface ID" option is present
but the "Router ID" field is zero, the "GDR Candidate Address"
will be the IPv4 or IPv6 source address of the PIM Hello
message.
6. Protocol Specification This DRLB-List Hello Option MUST only be advertised by the
elected PIM DR. It MUST be ignored if received from a non-DR.
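The DRLB-List rules above (address selection, ordering and length) can be sketched in Python as follows. This is a non-normative sketch: the function names are hypothetical, the mask values in the usage line are arbitrary, and the 16-bit Type and Length framing is the usual PIM Hello option framing rather than something restated in this section.

```python
import ipaddress
import struct

DRLB_LIST_TYPE = 35


def candidate_address(hello_source: str, router_id: int = 0):
    """Pick the GDR Candidate Address for a neighbor.

    router_id is the Router ID from the Interface ID option, or 0 when the
    option is absent or the Router ID field is zero.
    """
    src = ipaddress.ip_address(hello_source)
    if router_id == 0:
        return src
    if src.version == 4:
        return ipaddress.IPv4Address(router_id)
    # For IPv6: 96 bits of zeroes followed by the 32-bit Router ID.
    return ipaddress.IPv6Address(router_id)


def encode_drlb_list(group_mask, source_mask, rp_mask, candidates):
    """Encode three masks and the candidate list, sorted in descending order."""
    masks = [ipaddress.ip_address(m) for m in (group_mask, source_mask, rp_mask)]
    addrs = sorted((ipaddress.ip_address(c) for c in candidates),
                   key=int, reverse=True)
    fields = masks + addrs
    if len({f.version for f in fields}) != 1:
        raise ValueError("masks and addresses must share one address family")
    value = b"".join(f.packed for f in fields)
    # Length is (3 + n) x (4 or 16) bytes.
    return struct.pack("!HH", DRLB_LIST_TYPE, len(value)) + value


option = encode_drlb_list("255.255.255.255", "255.255.255.255", "0.0.0.0",
                          ["203.0.113.1", "203.0.113.3", "203.0.113.2"])
```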
6.1. PIM DR Operation 5.4. PIM DR Operation
The DR election process is still the same as defined in [RFC7761]. A The DR election process is still the same as defined in [RFC7761]. A
DR that has this specification enabled on an interface advertises the DR that has this specification enabled on an interface advertises the
new DRLBGDR Hello Option, which contains mask values from user new DRLB-List Hello Option, which contains mask values from user
configuration, followed by a sorted list of GDR Candidate Addresses, configuration (or default values), followed by a list of GDR
from the highest value to the lowest value. Moreover, same as non-DR Candidate Addresses. It is RECOMMENDED that the list is sorted, from
routers, the DR also advertises DRLBC Hello Option to indicate its the highest value to the lowest value. The reason for sorting the
list is to make the behavior deterministic, regardless of the order
the DR learns of new candidates. Note that, like non-DR routers,
the DR also advertises the DRLB-Cap Hello Option to indicate its
capability of supporting this specification and the type of its GDR capability of supporting this specification and the type of its GDR
election hash algorithm. election hash algorithm.
If a PIM DR receives a PIM Hello with the DRLBGDR Option, the PIM DR If a PIM DR receives a neighbor DRLB-Cap Hello Option, which contains
SHOULD ignore the TLV.
If a PIM DR receives a neighbor DRLBC Hello Option, which contains
the same hash algorithm as the DR, and the neighbor has the same DR the same hash algorithm as the DR, and the neighbor has the same DR
priority as the DR, PIM DR SHOULD consider the neighbor as a GDR priority as the DR, PIM DR SHOULD consider the neighbor as a GDR
Candidate and insert the GDR Candidate's Address into the sorted list Candidate and insert the GDR Candidate's Address into the list of the
of the DRLBGDR Option. However, the DR MAY have policies limiting DRLB-List Option. However, the DR may have policies limiting which
which GDR Candidates, or the number of GDR Candidates to include. GDR Candidates, or the number of GDR Candidates to include. The DR
would normally include itself in the list of GDR Candidates.
6.2. PIM GDR Candidate Operation If a PIM neighbor included in the list expires, stops announcing the
DRLB-Cap Hello Option, changes DR priority, changes hash algorithm or
otherwise becomes ineligible as a candidate, the DR should
immediately send a triggered hello with a new list in the DRLB-List
option, excluding the neighbor.
If a new router becomes eligible as a candidate, there is no urgency
in sending out an updated list. An updated list SHOULD be included
in the next hello.
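The DR behavior described in this section can be sketched in Python as follows. This is a hedged illustration: the Neighbor record and both function names are hypothetical, local policy limits are not modeled, and the DR is assumed to include itself in the list.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Neighbor:
    address: str                   # GDR Candidate Address
    dr_priority: int
    hash_algorithm: Optional[int]  # None when no DRLB-Cap option was announced


def build_candidate_list(dr: Neighbor, neighbors: List[Neighbor]) -> List[str]:
    """Select neighbors with the DR's hash algorithm and DR priority, sorted high to low."""
    eligible = [n for n in neighbors
                if n.hash_algorithm is not None
                and n.hash_algorithm == dr.hash_algorithm
                and n.dr_priority == dr.dr_priority]
    eligible.append(dr)  # the DR would normally include itself
    eligible.sort(key=lambda n: int(ipaddress.ip_address(n.address)), reverse=True)
    return [n.address for n in eligible]


def needs_triggered_hello(old_list: List[str], new_list: List[str]) -> bool:
    """A removed candidate calls for an immediate hello; an added one can wait."""
    return any(addr not in new_list for addr in old_list)


dr = Neighbor("203.0.113.3", 1, 0)
neighbors = [
    Neighbor("203.0.113.1", 1, 0),
    Neighbor("203.0.113.2", 1, 0),
    Neighbor("203.0.113.5", 0, 0),     # different DR priority: excluded
    Neighbor("203.0.113.6", 1, None),  # no DRLB-Cap option: excluded
]
gdr_list = build_candidate_list(dr, neighbors)
```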
5.5. PIM GDR Candidate Operation
When an IGMP/MLD report is received, without this specification, only When an IGMP/MLD report is received, without this specification, only
the PIM DR will handle the join and potentially run into the issues the PIM DR will handle the join and potentially run into the issues
described earlier. Using this specification, a hash algorithm is described earlier. Using this specification, a hash algorithm is
used by the GDR Candidates to determine which router is going to be used by the GDR Candidates to determine which router is going to be
responsible for building forwarding trees on behalf of the host. responsible for building forwarding trees on behalf of the host.
If this specification is enabled on an interface, the router MUST If this specification is enabled on an interface, the router MUST
include the DRLBC Hello Option in its PIM Hello on the interface. include the DRLB-Cap Hello Option in all PIM Hello messages sent on
Note that the presence of the DRLBC Option in PIM Hello does not that interface. Note that the presence of the DRLB-Cap Option in PIM
guarantee that this router would be considered as a GDR candidate. Hello does not guarantee that this router would be considered as a
Once DR election is done, the DRLBGDR Hello Option would be received GDR candidate. Once DR election is done, the DRLB-List Hello Option
from the current PIM DR on the link which would contain a list of would be received from the current PIM DR on the link which would
GDRs selected by the PIM DR. contain a list of GDR Candidates selected by the PIM DR.
A router only acts as a GDR candidate if it is included in the GDR
list of the DRLBGDR Hello Option.
A GDR Candidate may receive a DRLBGDR Hello Option from the PIM DR
with different Hash Masks from those the candidate was configured
with. The GDR Candidate MUST use the Hash Masks advertised by the
PIM DR to calculate the hash value.
A GDR Candidate MUST ignore the DRLBGDR Hello Option if it is
received from a PIM router which is not the DR.
If the PIM DR does not support this specification, GDR election will A router only acts as a GDR Candidate if it is included in the GDR
not take place, and only the PIM DR joins the multicast tree. Candidate list of the DRLB-List Hello Option. See next section for
details.
6.2.1. Router Receives New DRLBGDR 5.6. DRLB-List Hello Option Processing
The first time a router receives a DRLBGDR option from the PIM DR, it This section discusses processing of the DRLB-List Hello Option. All
MUST process the option and check if it is in the GDR list. routers MUST ignore the DRLB-List Hello Option if it is received from
a PIM router which is not the DR. The option MUST only be processed
by routers that are announcing the DRLB-Cap Option. Also, the
algorithm announced in the DRLB-Cap Option MUST be the same as what
was announced by the DR. All GDR Candidates MUST use the Hash Masks
advertised in the Option, even if they differ from those the
candidate was configured with.
1. If a router is not listed as a GDR candidate in DRLBGDR, no A router stores the latest option contents that were announced, if
action is needed. any, and deletes the previous contents. The router MUST also compare
the new contents with any previous contents, and if there are any
changes, continue processing as below. Note that if the option does
not pass the above checks, the below processing MUST be done as if
the option was not announced.
2. If a router is listed as a GDR candidate in DRLBGDR, then it MUST If the contents of the DRLB-List Option, the masks or the candidate
process each of the groups, or source and group pairs if SSM, in list, differs from the previously saved copy, it is received for the
the IGMP/MLD reports. The masks are announced in the PIM Hello first time, or it is no longer being received or accepted, the option
by the DR in the DRLBGDR Hello Option. For each group in the MUST be processed as below.
reports that is in ASM mode, and each source and group pair if
the group is in SSM mode, it (PIM Router) needs to run the hash
algorithm (described in section 4.3) based on the announced
Source, Group or RP masks to determine if it is the GDR for
specified group, or source and group pair. If the hash result is
to be the GDR for the multicast flow, it does build the multicast
forwarding tree. If it is not the GDR for the multicast flow, no
action is needed.
6.2.2. Router Receives Updated DRLBGDR 1. If the router was not included in the previous GDR list, or there
was no previous GDR list, but it is included in the new GDR list,
the router MUST for each of the groups, or source and group pairs
if the group is in SSM mode, with local receiver interest, run
the hash algorithm to determine which of them it is the GDR for.
If a router (GDR or non GDR) receives an unchanged DRLBGDR from the If it is not the GDR for a group, or source and group pair if
current PIM DR, no action is needed. SSM, no processing is required.
If a router (GDR or non GDR) receives a new or modified DRLBGDR from If it is hashed as the GDR, it needs to build a multicast
the current PIM DR, it requires processing as described below: forwarding tree.
1. If it was included in the previous GDR list, and still is 2. If the router was included in the previous GDR list, and still is
included in the new GDR list: It needs to process each of the included in the new GDR list: The router MUST for each of the
groups, or source and group pairs if the group is in SSM mode, groups, or source and group pairs if the group is in SSM mode,
and run the hash algorithm to check if it is still the GDR for with local receiver interest, run the hash algorithm to determine
the given group, or source and group pair if SSM. which of them it is the GDR for.
If it was the GDR for a group, or source and group pair if If it was the GDR for a group, or source and group pair if
SSM, and the new hash result chose it as the GDR, then no SSM, and the new hash result chose it as the GDR, then no
processing is required. processing is required.
If it was the GDR for a group, or source and group pair if If it was the GDR for a group, or source and group pair if
SSM, earlier and now it is no longer the GDR, then it sets its SSM, earlier and now it is no longer the GDR, then it sets the
assert metric for the multicast flow to be assert metric preference to maximum (0x7FFFFFFF) and the
(PIM_ASSERT_INFINITY - 1), as explained in Section 6.3. assert metric to one less than maximum (0xFFFFFFFE), as
explained in [Section 5.7].
If it was not the GDR for a group, or source and group pair if If it was not the GDR for a group, or source and group pair if
SSM, earlier, and the new hash does not make it GDR, then no SSM, earlier, and the new hash does not make it GDR, then no
processing is required. processing is required.
If it was not the GDR for an earlier group, or source and If it was not the GDR for an earlier group, or source and
group pair if SSM, and now becomes the GDR, it starts building group pair if SSM, and now becomes the GDR, it starts building
multicast forwarding tree for this flow. multicast forwarding tree for this flow.
2. If it was included in the previous GDR list, but is not included 3. If the router was included in the previous GDR list, but is not
in the new GDR list: It needs to process each of the groups, or included in the new GDR list, or there is no new GDR list: The
source and group pairs if the group is in SSM mode. router MUST for each of the groups, or source and group pairs if
the group is in SSM mode, with local receiver interest do as
follows.
If it was the GDR for a group, or source and group pair if If it was the GDR for a group, or source and group pair if
SSM, it sets its assert metric for the multicast flow to be SSM, it sets the assert metric preference to maximum
(PIM_ASSERT_INFINITY - 1), as explained in Section 6.3. (0x7FFFFFFF) and the assert metric to one less than maximum
(0xFFFFFFFE), as explained in [Section 5.7].
If it was not the GDR, then no processing is required. If it was not the GDR, then no processing is required.
3. If it was not included in the previous GDR list, but is included 5.7. PIM Assert Modification
in the new GDR list, the router MUST run the hash algorithm for
each of the groups, source and group pairs if SSM.
If it is not the GDR for a group, or source and group pair if
SSM, no processing is required.
If it is hashed as the GDR, it needs to build a multicast
forwarding tree.
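The three cases of DRLB-List Hello Option processing in Section 5.6 reduce to comparing, per flow, whether the router was the GDR before and whether it is the GDR now. The Python sketch below illustrates that reduction. It is not a definitive implementation: flows are opaque keys, flow_hash is a hypothetical callback standing in for the Modulo hash, the masks are assumed unchanged between the two lists, and the action names are labels invented for this example.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Optional


def is_gdr(me: str, candidates: Optional[List[str]], flow: Hashable,
           flow_hash: Callable[[Hashable, int], int]) -> bool:
    """True when the router is listed and the flow hashes to its ordinal number."""
    if not candidates or me not in candidates:
        return False
    return flow_hash(flow, len(candidates)) == candidates.index(me)


def process_drlb_list(me: str,
                      old_list: Optional[List[str]],
                      new_list: Optional[List[str]],
                      flows: Iterable[Hashable],
                      flow_hash: Callable[[Hashable, int], int]) -> Dict[Hashable, str]:
    """Return the action needed for each flow with local receiver interest."""
    actions: Dict[Hashable, str] = {}
    for flow in flows:
        was = is_gdr(me, old_list, flow, flow_hash)
        now = is_gdr(me, new_list, flow, flow_hash)
        if now and not was:
            actions[flow] = "build_forwarding_tree"
        elif was and not now:
            actions[flow] = "use_near_maximum_assert_metric"
        # Unchanged role (GDR before and after, or neither): no processing.
    return actions


# Toy hash for illustration only: integer flow identifiers modulo GDRC.
def toy_hash(flow, gdrc):
    return flow % gdrc


shrunk = process_drlb_list("R1", ["R3", "R2", "R1"], ["R2", "R1"], [2, 4, 5], toy_hash)
removed = process_drlb_list("R1", ["R3", "R2", "R1"], None, [2, 4, 5], toy_hash)
added = process_drlb_list("R1", None, ["R3", "R2", "R1"], [2, 4, 5], toy_hash)
```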
6.3. PIM Assert Modification
It is possible that the identity of the GDR might change in the
middle of an active flow. Examples when this could happen include:
When a new PIM router comes up GDR changes may occur due to configuration change, due to GDR
candidates going down, and also new routers coming up and becoming
GDR candidates. This may occur while flows are being forwarded. If
the GDR for an active flow changes, there is likely to be some
disruption, such as packet loss or duplicates. By using asserts,
packet loss is minimized, while allowing a small amount of
duplicates.
When a GDR restarts When a router stops acting as the GDR for a group, or source and
group pair if SSM, it MUST set the assert metric preference to
maximum (0x7FFFFFFF) and the assert metric to one less than maximum
(0xFFFFFFFE). This was also mentioned in the previous section. That
is, whenever it sends or receives an assert for the group, it must
use these values as the metric preference and metric rather than the
values provided by routing. This is similar to what is done for
AssertCancel Messages in [RFC7761], except that the metric value here
is one less.
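The effect of the modified assert values can be illustrated with a small Python sketch. The comparison below is a simplification of the assert comparison in [RFC7761] (lower metric preference wins, then lower metric, then higher IP address) and ignores the RPT bit. The names AssertMetric, assert_metric and assert_winner are hypothetical.

```python
import ipaddress
from typing import NamedTuple

MAX_METRIC_PREFERENCE = 0x7FFFFFFF
NEAR_MAX_METRIC = 0xFFFFFFFE  # one less than the maximum metric


class AssertMetric(NamedTuple):
    metric_preference: int
    metric: int
    address: str


def assert_metric(former_gdr: bool, routing_preference: int,
                  routing_metric: int, address: str) -> AssertMetric:
    """Use routing values normally, and the near maximum values once no longer GDR."""
    if former_gdr:
        return AssertMetric(MAX_METRIC_PREFERENCE, NEAR_MAX_METRIC, address)
    return AssertMetric(routing_preference, routing_metric, address)


def assert_winner(a: AssertMetric, b: AssertMetric) -> AssertMetric:
    """Simplified comparison: lower preference, then lower metric, then higher address."""
    def key(m: AssertMetric):
        return (m.metric_preference, m.metric, -int(ipaddress.ip_address(m.address)))
    return min(a, b, key=key)


# R1 has stopped acting as GDR; R3 is the new GDR with a worse routing metric.
r1 = assert_metric(True, 110, 20, "203.0.113.1")
r3 = assert_metric(False, 110, 30, "203.0.113.3")
winner = assert_winner(r1, r3)
```

Even though R1 has the better routing metric, R3 wins the assert because R1 advertises the near maximum values.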
When the GDR changes, existing traffic might be disrupted. The rest of this section is just for illustration purposes and not
Duplicates or packet loss might be observed. To illustrate the case, part of the protocol definition.
consider the following scenario where there are two flows G1 and G2.
R1 is the GDR for G1, and R2 is the GDR for G2. When R3 comes up
online, it is possible that R3 becomes GDR for both G1 and G2, hence
R3 starts to build the forwarding tree for G1 and G2. If R1 and R2
stop forwarding before R3 completes the process, packet loss might
occur. On the other hand, if R1 and R2 continue forwarding while R3
is building the forwarding trees, duplicates might occur.
This is not a typical deployment scenario but might still happen. To illustrate the behavior when there is a GDR change, consider the
Here we describe a mechanism to minimize the impact. We essentially following scenario where there are two flows G1 and G2. R1 is the
want to minimize packet loss. Therefore, we would allow a small GDR for G1, and R2 is the GDR for G2. When R3 comes up, it is
amount of duplicates and depend on PIM Assert to minimize the possible that R3 becomes GDR for both G1 and G2, hence R3 starts to
duplication. build the forwarding tree for G1 and G2. If R1 and R2 stop
forwarding before R3 completes the process, packet loss might occur.
On the other hand, if R1 and R2 continue forwarding while R3 is
building the forwarding trees, duplicates might occur.
When the role of GDR changes as above, instead of immediately When the role of GDR changes as above, instead of immediately
stopping forwarding, R1 and R2 continue forwarding to G1 and G2 stopping forwarding, R1 and R2 continue forwarding to G1 and G2
respectively, while, at the same time, R3 builds forwarding trees for respectively, while, at the same time, R3 builds forwarding trees for
G1 and G2. This will lead to PIM Asserts. G1 and G2. This will lead to PIM Asserts.
With the introduction of GDR, the following modification to the
Assert packet MUST be done: if a router enables this specification on
its downstream interface, but it is not a GDR (before network event
it was GDR), it would adjust its Assert metric to
(PIM_ASSERT_INFINITY - 1).
Using the above example, for G1, assume R1 and R3 agree on the new Using the above example, for G1, assume R1 and R3 agree on the new
GDR, which is R3. R1 will set its Assert metric as GDR, which is R3. With the new assert behavior, R1 sets its assert
(PIM_ASSERT_INFINITY - 1). That will make R3, which has normal metric to the near maximum value discussed above. That will make R3,
metric in its Assert as the Assert winner. which has normal metric in its Assert as the Assert winner.
For G2, assume it takes a slightly longer time for R2 to find out For G2, assume it takes a slightly longer time for R2 to find out
that R3 is the new GDR and still considers itself being the GDR while that R3 is the new GDR and still considers itself being the GDR while
R3 already has assumed the role of GDR. Since both R2 and R3 think R3 already has assumed the role of GDR. Since both R2 and R3 think
they are GDRs, they further compare their metric and IP addresses. they are GDRs, they further compare their metric and IP addresses.
If R3 has the better routing metric, or the same metric but a better If R3 has the better routing metric, or the same metric but a better
tie-breaker, the result will be consistent during GDR selection. If tie-breaker, the result will be consistent during GDR selection. If
unfortunately, R2 has the better metric or the same metric but a unfortunately, R2 has the better metric or the same metric but a
better tie-breaker, R2 will become the Assert winner and continues to better tie-breaker, R2 will become the Assert winner and continues to
forward traffic. This will continue until: forward traffic. Shortly afterwards, when R2 finds out that it is no
longer the GDR, R2 will change to using the near maximum assert
The next PIM Hello Option from DR selects R3 as the GDR. R3 will metric. Next time R2 sends an assert message, it will lose the
then build the forwarding tree and send an Assert. assert and stop forwarding. As assert winner, R2 would send periodic
assert messages per [RFC7761].
The process continues until R2 agrees to the selection of R3 as the
GDR, and sets its own Assert metric to (PIM_ASSERT_INFINITY - 1),
which will make R3 the Assert winner. During the process, we will
see intermittent duplication of traffic but packet loss will be
minimized. In the unlikely case that R2 never relinquishes its role
as GDR (while every other router thinks otherwise), the proposed
mechanism also helps to keep the duplication to a minimum until
manual intervention takes place to remedy the situation.
7. Compatibility 5.8. Backward Compatibility
In the case of a hybrid Ethernet shared LAN (where some PIM routers In the case of a hybrid Ethernet shared LAN (where some PIM routers
enable the specification defined in this document, and some do not) enable the specification defined in this document, and some do not).
o If a router which does not support this specification becomes the o If a router which does not support this specification becomes the
DR on the LAN, then it is the only router acting as a DR, and DR on the LAN, then it is the only router acting as a DR, and
there will be no load-balancing. there will be no load-balancing.
o If a router which does not support this specification becomes a o If a router which does not support this specification becomes a
non-DR on link, then it acts as non-DR defined in [RFC7761], and non-DR on link, then it acts as non-DR defined in [RFC7761], and
it will not take part in any load-balancing. it will not take part in any load-balancing. Load-balancing may
still happen.
8. Manageability Considerations 6. Manageability Considerations
An administrator needs to consider what the total bandwidth
requirements are and find a set of routers that together has enough
total capacity, while making sure that each of the routers can handle
its part, assuming that the traffic is distributed roughly equally
among the routers. Ideally, one should also have enough bandwidth to
handle the case where at least one router fails. Ideally all the
routers should have reachability to the sources, and RPs if
applicable, that is not via the LAN.
Care must be taken when choosing what hash masks to configure. One
would typically configure the same masks on all the routers, so that
they are the same, regardless of which router is elected as DR. The
default masks are likely suitable for most deployments. The RP Hash
Mask must be configured (the default is no bits set) if one wishes to
hash based on the RP address rather than the group address for ASM.
The default masks will use the entire group addresses, and source
addresses if SSM, as part of the hash. An administrator may set
other masks that mask out part of the addresses to ensure that
certain flows always get hashed to the same router. How this is
achieved depends on how the group addresses are allocated.
Only the routers announcing the same Hash Algorithm as the DR would Only the routers announcing the same Hash Algorithm as the DR would
be considered as GDR candidates. Network administrators need to make be considered as GDR candidates. Network administrators need to make
sure that the desired set of routers announce the same algorithm. sure that the desired set of routers announce the same algorithm.
Migration between different algorithms is not considered in this Migration between different algorithms is not considered in this
document. document.
9. IANA Considerations 7. IANA Considerations
IANA has temporarily assigned type 34 for the PIM DR Load Balancing IANA has temporarily assigned type 34 for the PIM DR Load Balancing
Capability (DRLBC) Hello Option, and type 35 for the PIM DR Load Capability (DRLB-Cap) Hello Option, and type 35 for the PIM DR Load
Balancing GDR (DRLBGDR) Hello Option in the PIM-Hello Options Balancing List (DRLB-List) Hello Option in the PIM-Hello Options
registry. IANA is requested to make these assignments permanent when registry. IANA is requested to make these assignments permanent when
this document is published as an RFC. The string TBD should be this document is published as an RFC. Note that the option names
replaced by the assigned values accordingly. have changed slightly since the temporary assignments were made.
Also, the length of option 34 is always 4, the registry currently
says it is variable.
This document requests IANA to create a registry called "Designated This document requests IANA to create a registry called "Designated
Router Load Balancing Hash Algorithms" in the "Protocol Independent Router Load Balancing Hash Algorithms" in the "Protocol Independent
Multicast (PIM)" branch of the registry tree. The registry lists Multicast (PIM)" branch of the registry tree. The registry lists
hash algorithms for use by PIM Designated Router Load Balancing. hash algorithms for use by PIM Designated Router Load Balancing.
9.1. Initial registry 7.1. Initial registry
The initial content of the registry should be as follows. The initial content of the registry should be as follows.
Type Name Reference Type Name Reference
------ ---------------------------------------- -------------------- ------ ---------------------------------------- --------------------
0 Modulo This document 0 Modulo This document
1-255 Unassigned 1-255 Unassigned
9.2. Assignment of new hash algorithms 7.2. Assignment of new hash algorithms
Assignment of new hash algorithms is done according to the "IETF Assignment of new hash algorithms is done according to the "IETF
Review" model, see [RFC5226]. Review" model, see [RFC8126].
10. Security Considerations 8. Security Considerations
Security of the new DR Load Balancing PIM Hello Options is only Security of the new DR Load Balancing PIM Hello Options is only
guaranteed by the security of PIM Hello messages, so the security guaranteed by the security of PIM Hello messages, so the security
considerations for PIM Hello messages as described in PIM-SM considerations for PIM Hello messages as described in PIM-SM
[RFC7761] apply here. [RFC7761] apply here.
11. Acknowledgement If the DR is subverted it could omit or add certain GDRs or announce
an unsupported algorithm. If another router is subverted, it could
be made DR and cause similar issues. While these issues are specific
to this specification, they are not that different from existing
attacks such as subverting a DR and lowering the DR priority, causing
a different router to become the DR.
The authors would like to thank Steve Simlo, Taki Millonis for If a GDR is subverted, it could potentially be made to stop
helping with the original idea, Bill Atwood, Bharat Joshi for review forwarding all the traffic it is expected to forward. This is also
comments, Toerless Eckert and Rishabh Parekh for helpful conversation similar today to if a DR is subverted.
on the document.
Special thanks to Anish Kachinthaya, Anvitha Kachinthaya and Jake 9. Acknowledgement
Holland for reviewing the document and providing comments.
12. References The authors would like to thank Steve Simlo and Taki Millonis for
helping with the original idea; Alia Atlas, Bill Atwood, Jake
Holland, Bharat Joshi, Anish Kachinthaya, Anvitha Kachinthaya and
Alvaro Retana for reviews and comments; and Toerless Eckert and
Rishabh Parekh for helpful conversation on the document.
12.1. Normative References 10. References
10.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing
Architecture", RFC 4291, DOI 10.17487/RFC4291, February
2006, <https://www.rfc-editor.org/info/rfc4291>.
[RFC6395] Gulrajani, S. and S. Venaas, "An Interface Identifier (ID) [RFC6395] Gulrajani, S. and S. Venaas, "An Interface Identifier (ID)
Hello Option for PIM", RFC 6395, DOI 10.17487/RFC6395, Hello Option for PIM", RFC 6395, DOI 10.17487/RFC6395,
October 2011, <https://www.rfc-editor.org/info/rfc6395>. October 2011, <https://www.rfc-editor.org/info/rfc6395>.
[RFC7761] Fenner, B., Handley, M., Holbrook, H., Kouvelas, I., [RFC7761] Fenner, B., Handley, M., Holbrook, H., Kouvelas, I.,
Parekh, R., Zhang, Z., and L. Zheng, "Protocol Independent Parekh, R., Zhang, Z., and L. Zheng, "Protocol Independent
Multicast - Sparse Mode (PIM-SM): Protocol Specification Multicast - Sparse Mode (PIM-SM): Protocol Specification
(Revised)", STD 83, RFC 7761, DOI 10.17487/RFC7761, March (Revised)", STD 83, RFC 7761, DOI 10.17487/RFC7761, March
2016, <https://www.rfc-editor.org/info/rfc7761>. 2016, <https://www.rfc-editor.org/info/rfc7761>.
12.2. Informative References [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an 10.2. Informative References
IANA Considerations Section in RFCs", RFC 5226,
DOI 10.17487/RFC5226, May 2008, [RFC3376] Cain, B., Deering, S., Kouvelas, I., Fenner, B., and A.
<https://www.rfc-editor.org/info/rfc5226>. Thyagarajan, "Internet Group Management Protocol, Version
3", RFC 3376, DOI 10.17487/RFC3376, October 2002,
<https://www.rfc-editor.org/info/rfc3376>.
[RFC3810] Vida, R., Ed. and L. Costa, Ed., "Multicast Listener
Discovery Version 2 (MLDv2) for IPv6", RFC 3810,
DOI 10.17487/RFC3810, June 2004,
<https://www.rfc-editor.org/info/rfc3810>.
[RFC4541] Christensen, M., Kimball, K., and F. Solensky,
"Considerations for Internet Group Management Protocol
(IGMP) and Multicast Listener Discovery (MLD) Snooping
Switches", RFC 4541, DOI 10.17487/RFC4541, May 2006,
<https://www.rfc-editor.org/info/rfc4541>.
[RFC4607] Holbrook, H. and B. Cain, "Source-Specific Multicast for
IP", RFC 4607, DOI 10.17487/RFC4607, August 2006,
<https://www.rfc-editor.org/info/rfc4607>.
[RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for
Writing an IANA Considerations Section in RFCs", BCP 26,
RFC 8126, DOI 10.17487/RFC8126, June 2017,
<https://www.rfc-editor.org/info/rfc8126>.
Authors' Addresses Authors' Addresses
Yiqun Cai Yiqun Cai
Alibaba Group Alibaba Group
Email: yiqun.cai@alibaba-inc.com Email: yiqun.cai@alibaba-inc.com
Heidi Ou Heidi Ou
Alibaba Group Alibaba Group
Email: heidi.ou@alibaba-inc.com
Sri Vallepalli Sri Vallepalli
Cisco Systems, Inc. Cisco Systems, Inc.
3625 Cisco Way 3625 Cisco Way
San Jose CA 95134 San Jose CA 95134
USA USA
Email: svallepa@cisco.com Email: svallepa@cisco.com
Mankamana Mishra Mankamana Mishra
Cisco Systems, Inc. Cisco Systems, Inc.
 End of changes. 130 change blocks. 
394 lines changed or deleted 459 lines changed or added
