draft-ietf-pim-drlb-13.txt | draft-ietf-pim-drlb-14.txt | |||
---|---|---|---|---|
Network Working Group Y. Cai | Network Working Group Y. Cai | |||
Internet-Draft H. Ou | Internet-Draft H. Ou | |||
Intended status: Standards Track Alibaba Group | Intended status: Standards Track Alibaba Group | |||
Expires: April 25, 2020 S. Vallepalli | Expires: June 13, 2020 S. Vallepalli | |||
M. Mishra | M. Mishra | |||
S. Venaas | S. Venaas | |||
Cisco Systems, Inc. | Cisco Systems, Inc. | |||
A. Green | A. Green | |||
British Telecom | British Telecom | |||
October 23, 2019 | December 11, 2019 | |||
PIM Designated Router Load Balancing | PIM Designated Router Load Balancing | |||
draft-ietf-pim-drlb-13 | draft-ietf-pim-drlb-14 | |||
Abstract | Abstract | |||
On a multi-access network, one of the PIM-SM routers is elected as a | On a multi-access network, one of the PIM-SM (PIM Sparse Mode) | |||
Designated Router. One of the responsibilities of the Designated | routers is elected as a Designated Router. One of the | |||
Router is to track local multicast listeners and forward data to | responsibilities of the Designated Router is to track local multicast | |||
these listeners if the group is operating in PIM-SM. This document | listeners and forward data to these listeners if the group is | |||
specifies a modification to the PIM-SM protocol that allows more than | operating in PIM-SM. This document specifies a modification to the | |||
one of the PIM-SM routers to take on this responsibility so that the | PIM-SM protocol that allows more than one of the PIM-SM routers to | |||
forwarding load can be distributed among multiple routers. | take on this responsibility so that the forwarding load can be | |||
distributed among multiple routers. | ||||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on April 25, 2020. | This Internet-Draft will expire on June 13, 2020. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2019 IETF Trust and the persons identified as the | Copyright (c) 2019 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
skipping to change at page 2, line 27 ¶ | skipping to change at page 2, line 27 ¶ | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
3. Applicability . . . . . . . . . . . . . . . . . . . . . . . . 5 | 3. Applicability . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
4. Functional Overview . . . . . . . . . . . . . . . . . . . . . 5 | 4. Functional Overview . . . . . . . . . . . . . . . . . . . . . 5 | |||
4.1. GDR Candidates . . . . . . . . . . . . . . . . . . . . . 6 | 4.1. GDR Candidates . . . . . . . . . . . . . . . . . . . . . 6 | |||
5. Protocol Specification . . . . . . . . . . . . . . . . . . . 7 | 5. Protocol Specification . . . . . . . . . . . . . . . . . . . 7 | |||
5.1. Hash Mask and Hash Algorithm . . . . . . . . . . . . . . 7 | 5.1. Hash Mask and Hash Algorithm . . . . . . . . . . . . . . 7 | |||
5.2. Modulo Hash Algorithm . . . . . . . . . . . . . . . . . . 8 | 5.2. Modulo Hash Algorithm . . . . . . . . . . . . . . . . . . 8 | |||
5.2.1. Modulo Hash Algorithm Examples . . . . . . . . . . . 9 | 5.2.1. Modulo Hash Algorithm Examples . . . . . . . . . . . 9 | |||
5.2.2. Limitations . . . . . . . . . . . . . . . . . . . . . 10 | 5.2.2. Limitations . . . . . . . . . . . . . . . . . . . . . 10 | |||
5.3. PIM Hello Options . . . . . . . . . . . . . . . . . . . . 10 | 5.3. PIM Hello Options . . . . . . . . . . . . . . . . . . . . 11 | |||
5.3.1. PIM DR Load Balancing Capability (DRLB-Cap) Hello | 5.3.1. PIM DR Load Balancing Capability (DRLB-Cap) Hello | |||
Option . . . . . . . . . . . . . . . . . . . . . . . 11 | Option . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
5.3.2. PIM DR Load Balancing List (DRLB-List) Hello Option . 11 | 5.3.2. PIM DR Load Balancing List (DRLB-List) Hello Option . 11 | |||
5.4. PIM DR Operation . . . . . . . . . . . . . . . . . . . . 13 | 5.4. PIM DR Operation . . . . . . . . . . . . . . . . . . . . 13 | |||
5.5. PIM GDR Candidate Operation . . . . . . . . . . . . . . . 13 | 5.5. PIM GDR Candidate Operation . . . . . . . . . . . . . . . 14 | |||
5.6. DRLB-List Hello Option Processing . . . . . . . . . . . . 14 | 5.6. DRLB-List Hello Option Processing . . . . . . . . . . . . 14 | |||
5.7. PIM Assert Modification . . . . . . . . . . . . . . . . . 15 | 5.7. PIM Assert Modification . . . . . . . . . . . . . . . . . 15 | |||
5.8. Backward Compatibility . . . . . . . . . . . . . . . . . 16 | 5.8. Backward Compatibility . . . . . . . . . . . . . . . . . 16 | |||
6. Operational Considerations . . . . . . . . . . . . . . . . . 16 | 6. Operational Considerations . . . . . . . . . . . . . . . . . 16 | |||
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 16 | 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17 | |||
7.1. Initial registry . . . . . . . . . . . . . . . . . . . . 17 | 7.1. Initial registry . . . . . . . . . . . . . . . . . . . . 17 | |||
7.2. Assignment of new Hash Algorithms . . . . . . . . . . . . 17 | 7.2. Assignment of new Hash Algorithms . . . . . . . . . . . . 17 | |||
8. Security Considerations . . . . . . . . . . . . . . . . . . . 17 | 8. Security Considerations . . . . . . . . . . . . . . . . . . . 17 | |||
9. Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 18 | 9. Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 18 | |||
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 18 | 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 18 | |||
10.1. Normative References . . . . . . . . . . . . . . . . . . 18 | 10.1. Normative References . . . . . . . . . . . . . . . . . . 18 | |||
10.2. Informative References . . . . . . . . . . . . . . . . . 18 | 10.2. Informative References . . . . . . . . . . . . . . . . . 19 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 19 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
1. Introduction | 1. Introduction | |||
On a multi-access LAN, such as an Ethernet, with one or more PIM-SM | On a multi-access LAN, such as an Ethernet, with one or more PIM-SM | |||
[RFC7761] routers, one of the PIM-SM routers is elected as a | (PIM Sparse Mode) [RFC7761] routers, one of the PIM-SM routers is | |||
Designated Router (DR). The PIM DR has two responsibilities in the | elected as a Designated Router (DR). The PIM DR has two | |||
PIM-SM protocol. For any active sources on a LAN, the PIM DR is | responsibilities in the PIM-SM protocol. For any active sources on a | |||
responsible for registering with the Rendezvous Point (RP) if the | LAN, the PIM DR is responsible for registering with the Rendezvous | |||
group is operating in PIM-SM. Also, the PIM DR is responsible for | Point (RP) if the group is operating in PIM-SM. Also, the PIM DR is | |||
tracking local multicast listeners and forwarding to these listeners | responsible for tracking local multicast listeners and forwarding to | |||
if the group is operating in PIM-SM. | these listeners if the group is operating in PIM-SM. | |||
Consider the following LAN in Figure 1: | Consider the following LAN in Figure 1: | |||
(core networks) | (core networks) | |||
| | | | | | | | |||
| | | | | | | | |||
R1 R2 R3 | R1 R2 R3 | |||
| | | | | | | | |||
----(LAN)---- | ----(LAN)---- | |||
| | | | |||
| | | | |||
(many receivers) | (many receivers) | |||
Figure 1: LAN with receivers | Figure 1: LAN with receivers | |||
Assume R1 is elected as the DR. According to the PIM-SM protocol, R1 | Assume R1 is elected as the DR. According to the PIM-SM protocol, R1 | |||
will be responsible for forwarding traffic to that LAN on behalf of | will be responsible for forwarding traffic to that LAN on behalf of | |||
any local members. In addition to keeping track of membership | all local members. In addition to keeping track of membership | |||
reports, R1 is also responsible for initiating the creation of source | reports, R1 is also responsible for initiating the creation of source | |||
and/or shared trees towards the senders or the RPs. The membership | and/or shared trees towards the senders or the RPs. The membership | |||
reports would be IGMP or MLD messages. This applies to any versions | reports would be IGMP or MLD messages. This applies to any versions | |||
of the IGMP and MLD protocols. The most recent versions are IGMPv3 | of the IGMP and MLD protocols. The most recent versions are IGMPv3 | |||
[RFC3376] and MLDv2 [RFC3810]. | [RFC3376] and MLDv2 [RFC3810]. | |||
Having a single router acting as DR and being responsible for data | Having a single router acting as DR and being responsible for data | |||
plane forwarding leads to several issues. One of the issues is that | plane forwarding leads to several issues. One of the issues is that | |||
the aggregated bandwidth will be limited to what R1 can handle with | the aggregated bandwidth will be limited to what R1 can handle with | |||
regards to capacity of incoming links, the interface on the LAN, and | regards to capacity of incoming links, the interface on the LAN, and | |||
total forwarding capacity. It is very common that a LAN consists of | total forwarding capacity. It is very common that a LAN consists of | |||
switches that run IGMP/MLD or PIM snooping [RFC4541]. This allows | switches that run IGMP/MLD or PIM snooping [RFC4541]. This allows | |||
the forwarding of multicast packets to be restricted only to segments | the forwarding of multicast packets to be restricted only to segments | |||
leading to receivers who have indicated their interest in multicast | leading to receivers that have indicated their interest in multicast | |||
groups using either IGMP or MLD. The emergence of the switched | groups using either IGMP or MLD. The emergence of the switched | |||
Ethernet allows the aggregated bandwidth to exceed, sometimes by a | Ethernet allows the aggregated bandwidth to exceed, sometimes by a | |||
large number, that of a single link. For example, let us modify | large number, that of a single link. For example, let us modify | |||
Figure 1 and introduce an Ethernet switch in Figure 2. | Figure 1 and introduce an Ethernet switch in Figure 2. | |||
(core networks) | (core networks) | |||
| | | | | | | | |||
| | | | | | | | |||
R1 R2 R3 | R1 R2 R3 | |||
| | | | | | | | |||
+=gi0===gi1===gi2=+ | +=gi1===gi2===gi3=+ | |||
+ + | + + | |||
+ switch + | + switch + | |||
+ + | + + | |||
+=gi4===gi5===gi6=+ | +=gi4===gi5===gi6=+ | |||
| | | | | | | | |||
H1 H2 H3 | H1 H2 H3 | |||
Figure 2: LAN with Ethernet Switch | Figure 2: LAN with Ethernet Switch | |||
Let us assume that each individual link is a Gigabit Ethernet. Each | Let us assume that each individual link is a Gigabit Ethernet. Each | |||
router, R1, R2 and R3, and the switch have enough forwarding capacity | router, R1, R2 and R3, and the switch have enough forwarding capacity | |||
to handle hundreds of Gigabits of data. | to handle hundreds of Gigabits of data. | |||
Let us further assume that each of the hosts requests 500 Mbps of | Let us further assume that each of the hosts requests 500 Mbps of | |||
unique multicast data. This totals to 1.5 Gbps of data, which is | unique multicast data. This totals to 1.5 Gbps of data, which is | |||
less than what each switch or the combined uplink bandwidth across | less than what each switch or the combined uplink bandwidth across | |||
the routers can handle, even under failure of a single router. | the routers can handle, even under failure of a single router. | |||
On the other hand, the link between R1 and switch, via port gi0, can | On the other hand, the link between R1 and switch, via port gi1, can | |||
only handle a throughput of 1 Gbps. And if R1 is the only DR (the PIM | only handle a throughput of 1 Gbps. And if R1 is the only DR (the PIM | |||
DR elected using the procedure defined by [RFC7761]) at least 500 | DR elected using the procedure defined by [RFC7761]) at least 500 | |||
Mbps worth of data will be lost because the only link that can be | Mbps worth of data will be lost because the only link that can be | |||
used to draw the traffic from the routers to the switch is via gi0. | used to draw the traffic from the routers to the switch is via gi1. | |||
In other words, the entire network's throughput is limited by the | In other words, the entire network's throughput is limited by the | |||
single connection between the PIM DR and the switch (or LAN as in | single connection between the PIM DR and the switch (or LAN as in | |||
Figure 1). | Figure 1). | |||
Another important issue is related to failover. If R1 is the only | Another important issue is related to failover. If R1 is the only | |||
forwarder on a shared LAN, when R1 goes out of service, multicast | forwarder on a shared LAN, when R1 goes out of service, multicast | |||
forwarding for the entire LAN has to be rebuilt by the newly elected | forwarding for the entire LAN has to be rebuilt by the newly elected | |||
PIM DR. However, if there was a way that allowed multiple routers to | PIM DR. However, if there were a way that allowed multiple routers | |||
forward to the LAN for different groups, failure of one of the | to forward to the LAN for different groups, failure of one of the | |||
routers would only lead to disruption to a subset of the flows, | routers would only lead to disruption to a subset of the flows, | |||
therefore improving the overall resilience of the network. | therefore improving the overall resilience of the network. | |||
This document specifies a modification to the PIM-SM protocol that | This document specifies a modification to the PIM-SM protocol that | |||
allows more than one of these routers, called Group Designated | allows more than one of these routers, called Group Designated | |||
Routers (GDR) to be selected so that the forwarding load can be | Routers (GDR) to be selected so that the forwarding load can be | |||
distributed among a number of routers. | distributed among a number of routers. | |||
2. Terminology | 2. Terminology | |||
skipping to change at page 5, line 33 ¶ | skipping to change at page 5, line 33 ¶ | |||
below) is used to select one of the routers as a GDR. The GDR is | below) is used to select one of the routers as a GDR. The GDR is | |||
responsible for initiating the forwarding tree building process | responsible for initiating the forwarding tree building process | |||
for the corresponding multicast flow. | for the corresponding multicast flow. | |||
o GDR Candidate: a router that has the potential to become a GDR. | o GDR Candidate: a router that has the potential to become a GDR. | |||
There might be multiple GDR Candidates on a LAN, but only one can | There might be multiple GDR Candidates on a LAN, but only one can | |||
become the GDR for a specific multicast flow. | become the GDR for a specific multicast flow. | |||
3. Applicability | 3. Applicability | |||
The extension specified in this document applies to PIM-SM when they | The extension specified in this document applies to PIM-SM routers | |||
act as last hop routers (there are directly connected receivers). It | acting as last hop routers (there are directly connected receivers). | |||
does not alter the behavior of a PIM DR, or any other routers, on the | It does not alter the behavior of a PIM DR, or any other routers, on | |||
first hop network (directly connected sources). This is because the | the first hop network (directly connected sources). This is because | |||
source tree is built using the IP address of the sender, not the IP | the source tree is built using the IP address of the sender, not the | |||
address of the PIM DR that sends the registers towards the RP. The | IP address of the PIM DR that sends PIM registers towards the RP. | |||
load balancing between first hop routers can be achieved naturally if | The load balancing between first hop routers can be achieved | |||
an IGP provides equal cost multiple paths (which it usually does in | naturally if an IGP provides equal cost multiple paths (which it | |||
practice). Also distributing the load to do registering does not | usually does in practice). Also distributing the load to do source | |||
justify the additional complexity required to support it. | registration does not justify the additional complexity required to | |||
support it. | ||||
4. Functional Overview | 4. Functional Overview | |||
In the PIM DR election as defined in [RFC7761], when multiple routers | In the PIM DR election as defined in [RFC7761], when multiple routers | |||
are connected to a multi-access LAN (for example, an Ethernet), one | are connected to a multi-access LAN (for example, an Ethernet), one | |||
of them is elected to act as PIM DR. The PIM DR is responsible for | of them is elected to act as PIM DR. The PIM DR is responsible for | |||
sending local Join/Prune messages towards the RP or source. In order | sending local Join/Prune messages towards the RP or source. In order | |||
to elect the PIM DR, each PIM router on the LAN examines the received | to elect the PIM DR, each PIM router on the LAN examines the received | |||
PIM Hello messages and compares its own DR priority and IP address | PIM Hello messages and compares its own DR priority and IP address | |||
with those of its neighbors. The router with the highest DR priority | with those of its neighbors. The router with the highest DR priority | |||
is the PIM DR. If there are multiple such routers, their IP | is the PIM DR. If there are multiple such routers, their IP | |||
addresses are used as the tie-breaker, as described in [RFC7761]. | addresses are used as the tie-breaker, as described in [RFC7761]. | |||
In order to share forwarding load among last hop routers, besides the | In order to share forwarding load among last hop routers, besides the | |||
normal PIM DR election, the GDR is also elected on the multi-access | normal PIM DR election, one or more GDRs are elected on the multi- | |||
LAN. There is only one PIM DR on the multi-access LAN, but there | access LAN. There is only one PIM DR on the multi-access LAN, but | |||
might be multiple GDR Candidates. | there might be multiple GDR Candidates. | |||
For each multicast flow, that is, (*,G) for ASM and (S,G) for SSM, a | For each multicast flow, that is, (*,G) for ASM and (S,G) for SSM, a | |||
Hash Algorithm is used to select one of the routers to be the GDR. | Hash Algorithm [Section 5.1] is used to select one of the routers to | |||
The new DR Load Balancing Capability (DRLB-Cap) PIM Hello Option is | be the GDR. The new DR Load Balancing Capability (DRLB-Cap) PIM | |||
used to announce the Capability as well as the Hash Algorithm type. | Hello Option is used to announce the Capability as well as the Hash | |||
Routers with the new DRLB-Cap Option advertised in their PIM Hello, | Algorithm type. Routers with the new DRLB-Cap Option advertised in | |||
using the same GDR election Hash Algorithm and the same DR priority | their PIM Hello, using the same GDR election Hash Algorithm and the | |||
as the PIM DR, are considered as GDR Candidates. | same DR priority as the PIM DR, are considered as GDR Candidates. | |||
Hash Masks are defined for Source, Group and RP separately, in order | Hash Masks are defined for Source, Group and RP separately, in order | |||
to handle PIM ASM/SSM. The masks, as well as a sorted list of GDR | to handle PIM ASM/SSM. The masks, as well as a sorted list of GDR | |||
Candidate Addresses, are announced by the DR in a new DR Load | Candidate Addresses, are announced by the DR in a new DR Load | |||
Balancing List (DRLB-List) PIM Hello Option. | Balancing List (DRLB-List) PIM Hello Option. | |||
A Hash Algorithm based on the announced Source, Group, or RP masks | A Hash Algorithm based on the announced Source, Group, or RP masks | |||
allows one GDR to be assigned to a corresponding multicast state. | allows one GDR to be assigned to a corresponding multicast state. | |||
And that GDR is responsible for initiating the creation of the | That GDR is responsible for initiating the creation of the multicast | |||
multicast forwarding tree for multicast traffic. | forwarding tree for multicast traffic. | |||
4.1. GDR Candidates | 4.1. GDR Candidates | |||
GDR is the new concept introduced by this specification. GDR | GDR is the new concept introduced by this specification. GDR | |||
Candidates are routers eligible for GDR election on the LAN. To | Candidates are routers eligible for GDR election on the LAN. To | |||
become a GDR Candidate, a router must have the same DR priority and | become a GDR Candidate, a router must have the same DR priority and | |||
run the same GDR election Hash Algorithm as the DR on the LAN. | run the same GDR election Hash Algorithm as the DR on the LAN. | |||
For example, assume there are 4 routers on the LAN: R1, R2, R3 and | For example, assume there are 4 routers on the LAN: R1, R2, R3 and | |||
R4, each announcing a DRLB-Cap option. R1, R2 and R3 have the same | R4, each announcing a DRLB-Cap option. R1, R2 and R3 have the same | |||
DR priority while R4's DR priority is less preferred. In this | DR priority while R4's DR priority is less preferred. In this | |||
example, R4 will not be eligible for GDR election, because R4 will | example, R4 will not be eligible for GDR election, because R4 will | |||
not become a PIM DR unless all of R1, R2 and R3 go out of service. | not become a PIM DR unless all of R1, R2 and R3 go out of service. | |||
Furthermore, assume router R1 wins the PIM DR election, R1 and R2 run | Furthermore, assume router R1 wins the PIM DR election, R1 and R2 | |||
the same Hash Algorithm for GDR election, while R3 runs a different | advertise the same Hash Algorithm for GDR election, while R3 | |||
one. In this case, only R1 and R2 will be eligible for GDR election, | advertises a different one. In this case, only R1 and R2 will be | |||
while R3 will not. | eligible for GDR election, while R3 will not. | |||
As a DR, R1 will include its own Load Balancing Hash Masks and the | As a DR, R1 will include its own Load Balancing Hash Masks and the | |||
identity of R1 and R2 (the GDR Candidates) in its DRLB-List Hello | identity of R1 and R2 (the GDR Candidates) in its DRLB-List Hello | |||
Option. | Option. | |||
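The eligibility rule in this example can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the Router record, its field names, the addresses, the priorities and the numeric Hash Algorithm values are all hypothetical. The high-to-low ordering by address follows the ordering used in the draft's own example list.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from typing import List, Optional


@dataclass(frozen=True)
class Router:
    """Hypothetical view of one PIM neighbor as learned from its Hello."""
    name: str
    address: str
    dr_priority: int
    hash_algorithm: Optional[int]  # None if no DRLB-Cap option was advertised


def gdr_candidates(routers: List[Router], dr: Router) -> List[Router]:
    """Return routers that match the DR's priority and Hash Algorithm."""
    eligible = [
        r for r in routers
        if r.hash_algorithm is not None
        and r.hash_algorithm == dr.hash_algorithm
        and r.dr_priority == dr.dr_priority
    ]
    # Sort by IP address, highest first (assumed ordering for the list).
    return sorted(eligible, key=lambda r: int(ip_address(r.address)), reverse=True)


# Illustrative values only: R3 advertises a different algorithm,
# R4 has a less preferred DR priority.
r1 = Router("R1", "203.0.113.3", 10, 0)
r2 = Router("R2", "203.0.113.2", 10, 0)
r3 = Router("R3", "203.0.113.1", 10, 1)
r4 = Router("R4", "203.0.113.4", 5, 0)

candidates = gdr_candidates([r1, r2, r3, r4], dr=r1)
print([r.name for r in candidates])
```

With these assumed inputs only R1 and R2 remain, matching the outcome described in the text.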
5. Protocol Specification | 5. Protocol Specification | |||
5.1. Hash Mask and Hash Algorithm | 5.1. Hash Mask and Hash Algorithm | |||
A Hash Mask is used to extract a number of bits from the | A Hash Mask is used to extract a number of bits from the | |||
corresponding IP address field (32 for IPv4, 128 for IPv6) and | corresponding IP address field (32 for IPv4, 128 for IPv6) and | |||
calculate a hash value. A hash value is used to select a GDR from | calculate a hash value. A hash value is used to select a GDR from | |||
GDR Candidates advertised by the PIM DR. Hash masks allow for | GDR Candidates advertised by the PIM DR. Hash masks allow for | |||
certain flows to always be forwarded by the same GDR, by ignoring | certain flows to always be forwarded by the same GDR, by ignoring | |||
certain bits in the hash value calculation, so that the hash values | certain bits in the hash value calculation, so that the hash values | |||
are the same. For example, 0.0.255.0 defines a Hash Mask for an IPv4 | are the same. For example, 0.0.255.0 defines a Hash Mask for an IPv4 | |||
address that masks the first, the second, and the fourth octets, | address that masks the first, the second, and the fourth octets, | |||
which means that only the third octet will influence the hash value | which means that only the third octet will influence the hash value | |||
computed. | computed. Note that the masks need not be a contiguous set of bits. | |||
E.g., for IPv4, 15.15.15.15 would be a valid mask. | ||||
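As a small illustration of how a Hash Mask causes bits to be ignored, the following Python sketch applies a mask to two addresses. The helper name apply_mask and the sample addresses are assumptions made for this example; it uses only the standard ipaddress module.

```python
import ipaddress


def apply_mask(address: str, mask: str) -> int:
    """Return the address with all bits outside the Hash Mask cleared."""
    addr = ipaddress.ip_address(address)
    msk = ipaddress.ip_address(mask)
    if addr.version != msk.version:
        raise ValueError("address and mask must be the same IP version")
    return int(addr) & int(msk)


# Both addresses share the third octet (7), so with mask 0.0.255.0 they
# produce the same masked value even though the other octets differ.
a = apply_mask("198.51.7.9", "0.0.255.0")
b = apply_mask("203.0.7.200", "0.0.255.0")
print(a == b)

# A mask does not have to be contiguous: 15.15.15.15 keeps the low
# four bits of every octet.
c = apply_mask("255.255.255.255", "15.15.15.15")
print(hex(c))
```

Because the masked values are equal, any hash computed from them is equal too, which is what keeps such flows on the same GDR.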
In the text below, a hash mask is in some places said to be zero. A | In the text below, a hash mask is in some places said to be zero. A | |||
hash mask is zero if no bits are set. That is, 0.0.0.0 for IPv4 and | hash mask is zero if no bits are set. That is, 0.0.0.0 for IPv4 and | |||
:: for IPv6. Also, a hash mask is said to be an all-bits-set mask if | :: for IPv6. Also, a hash mask is said to be an all-bits-set mask if | |||
it is 255.255.255.255 for IPv4 or | it is 255.255.255.255 for IPv4 or | |||
FFFF:FFFF:FFFF:FFFF:FFFFF:FFFF:FFFF:FFFF for IPv6. | ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff for IPv6. | |||
There are three Hash Masks defined: | There are three Hash Masks defined: | |||
o RP Hash Mask | o RP Hash Mask | |||
o Source Hash Mask | o Source Hash Mask | |||
o Group Hash Mask | o Group Hash Mask | |||
The hash masks need to be configured on the PIM routers that can | The hash masks need to be configured on the PIM routers that can | |||
potentially become a PIM DR, unless the implementation provides | potentially become a PIM DR, unless the implementation provides | |||
default hash mask values. An implementation SHOULD have default hash | default hash mask values. An implementation SHOULD have default hash | |||
mask values as follows. The default RP Hash Mask SHOULD be zero (no | mask values as follows. The default RP Hash Mask SHOULD be zero (no | |||
bits set). The default Source and Group Hash Masks SHOULD both be | bits set). The default Source and Group Hash Masks SHOULD both be | |||
all-bits-set masks. These default values are likely acceptable for | all-bits-set masks. These default values are likely acceptable for | |||
most deployments, and simplify configuration. | most deployments, and simplify configuration. There is only a need | |||
to use other masks if one needs to ensure that certain flows are | ||||
forwarded by the same GDR. | ||||
The DRLB-List Hello Option contains a list of GDR Candidates. The | The DRLB-List Hello Option contains a list of GDR Candidates. The | |||
first one listed has ordinal number 0, the second listed ordinal | first one listed has ordinal number 0, the second listed ordinal | |||
number 1, and the last one has ordinal number N - 1 if there are N | number 1, and the last one has ordinal number N - 1 if there are N | |||
candidates listed. The hash value computed will be the ordinal | candidates listed. The hash value computed will be the ordinal | |||
number of the GDR Candidate that is acting as GDR. | number of the GDR Candidate that is acting as GDR for the flow in | |||
question. | ||||
The input to be hashed is determined as follows: | ||||
o If the group is in ASM mode and the RP Hash Mask announced by the | o If the group is in ASM mode and the RP Hash Mask announced by the | |||
PIM DR is not zero (at least one bit is set), calculate the value | PIM DR is not zero (at least one bit is set), calculate the value | |||
of hashvalue_RP [Section 5.2] to determine the GDR. | of hashvalue_RP [Section 5.2] to determine the GDR. | |||
o If the group is in ASM mode and the RP Hash Mask announced by the | o If the group is in ASM mode and the RP Hash Mask announced by the | |||
PIM DR is zero (no bits are set), obtain the value of | PIM DR is zero (no bits are set), obtain the value of | |||
hashvalue_Group [Section 5.2] to determine the GDR. | hashvalue_Group [Section 5.2] to determine the GDR. | |||
o If the group is in SSM mode, use hashvalue_SG [Section 5.2] to | o If the group is in SSM mode, use hashvalue_SG [Section 5.2] to | |||
skipping to change at page 8, line 27 ¶ | skipping to change at page 8, line 35 ¶ | |||
If different Hash Algorithms are advertised among the routers on a | If different Hash Algorithms are advertised among the routers on a | |||
LAN, only the routers advertising the same Hash Algorithm as the DR | LAN, only the routers advertising the same Hash Algorithm as the DR | |||
(as well as having the same DR priority as the DR) are eligible for | (as well as having the same DR priority as the DR) are eligible for | |||
GDR election. | GDR election. | |||
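The rules in Section 5.1 for choosing which hash value determines the GDR can be summarized as a small decision function. This is a sketch only; the function name and the string labels for the mode are hypothetical, and the returned names refer to the values defined in Section 5.2.

```python
def select_hash_value(mode: str, rp_mask_is_zero: bool) -> str:
    """Name the hash value used to pick the GDR for a flow.

    mode is "ASM" or "SSM" (labels assumed for this sketch);
    rp_mask_is_zero tells whether the RP Hash Mask announced by the
    PIM DR has no bits set.
    """
    if mode == "SSM":
        return "hashvalue_SG"
    if mode == "ASM":
        # A non-zero RP Hash Mask selects hashing on the RP address;
        # a zero RP Hash Mask falls back to hashing on the group.
        return "hashvalue_Group" if rp_mask_is_zero else "hashvalue_RP"
    raise ValueError("mode must be 'ASM' or 'SSM'")


print(select_hash_value("ASM", rp_mask_is_zero=True))
```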
5.2. Modulo Hash Algorithm | 5.2. Modulo Hash Algorithm | |||
As part of computing the hash, the notation LSZC(hash_mask) is used | As part of computing the hash, the notation LSZC(hash_mask) is used | |||
to denote the number of zeroes counted from the least significant bit | to denote the number of zeroes counted from the least significant bit | |||
of a Hash Mask hash_mask. As an example, LSZC(255.255.128.0) is 15 and | of a Hash Mask hash_mask. As an example, LSZC(255.255.128.0) is 15 and | |||
also LSZC(FFFF:8000::) is 111. If all bits are set, LSZC will be 0. | also LSZC(ffff:8000::) is 111. If all bits are set, LSZC will be 0. | |||
If the mask is zero, then LSZC will be 32 for IPv4, and 128 for IPv6. | If the mask is zero, then LSZC will be 32 for IPv4, and 128 for IPv6. | |||
The number of GDR Candidates is denoted as GDRC. | The number of GDR Candidates is denoted as GDRC. | |||
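One possible way to compute LSZC is sketched below in Python. The function name lszc and the explicit width parameter (32 for IPv4, 128 for IPv6) are choices made for this illustration and are not defined by the draft.

```python
import ipaddress


def lszc(mask: int, width: int) -> int:
    """Count zero bits from the least significant end of a Hash Mask.

    mask is the Hash Mask as an integer; width is the address length
    in bits (32 for IPv4, 128 for IPv6).
    """
    if mask == 0:
        return width  # a zero mask has only zero bits
    count = 0
    while mask & 1 == 0:
        mask >>= 1
        count += 1
    return count


v6_mask = int(ipaddress.ip_address("ffff:8000::"))
print(lszc(v6_mask, 128))      # the IPv6 example mask from the text
print(lszc(0xffffffff, 32))    # all bits set
print(lszc(0, 32))             # zero IPv4 mask
```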
The idea behind the Modulo Hash Algorithm is in simple terms that the | The idea behind the Modulo Hash Algorithm is in simple terms that the | |||
corresponding mask is applied to a value, then the result is shifted | corresponding mask is applied to a value, then the result is shifted | |||
right LSZC(mask) bits so that the least significant bits that were | right LSZC(mask) bits so that the least significant bits that were | |||
masked out are not considered. Then this result is masked by | masked out are not considered. Then this result is masked by | |||
0xFFFFFFFF, keeping only the last 32 bits of the result (this only | 0xffffffff, keeping only the last 32 bits of the result (this only | |||
makes a difference for IPv6). Finally, the hash value is this result | makes a difference for IPv6). Finally, the hash value is this result | |||
modulo the number of GDR Candidates (GDRC). | modulo the number of GDR Candidates (GDRC). | |||
The Modulo Hash Algorithm for computing the values hashvalue_RP, | The Modulo Hash Algorithm for computing the values hashvalue_RP, | |||
hashvalue_Group and hashvalue_SG is defined as follows. | hashvalue_Group and hashvalue_SG is defined as follows. | |||
hashvalue_RP is calculated as: | hashvalue_RP is calculated as: | |||
(((RP_address & RP_mask) >> LSZC(RP_mask)) & 0xFFFFFFFF) % GDRC | (((RP_address & RP_mask) >> LSZC(RP_mask)) & 0xffffffff) % GDRC | |||
RP_address is the address of the RP defined for the group and | RP_address is the address of the RP defined for the group and | |||
RP_mask is the RP Hash Mask. | RP_mask is the RP Hash Mask. | |||
hashvalue_Group is calculated as: | hashvalue_Group is calculated as: | |||
(((Group_address & Group_mask) >> LSZC(Group_mask)) & 0xFFFFFFFF) | (((Group_address & Group_mask) >> LSZC(Group_mask)) & 0xffffffff) | |||
% GDRC | % GDRC | |||
Group_address is the group address and Group_mask is the Group | Group_address is the group address and Group_mask is the Group | |||
Hash Mask. | Hash Mask. | |||
hashvalue_SG is calculated as: | hashvalue_SG is calculated as: | |||
((((Source_address & Source_mask) >> LSZC(Source_mask)) & | ((((Source_address & Source_mask) >> LSZC(Source_mask)) & | |||
0xFFFFFFFF) ^ (((Group_address & Group_mask) >> LSZC(Group_mask)) | 0xffffffff) ^ (((Group_address & Group_mask) >> LSZC(Group_mask)) | |||
& 0xFFFFFFFF)) % GDRC | & 0xffffffff)) % GDRC | |||
Group_address is the group address and Group_mask is the Group | Group_address is the group address and Group_mask is the Group | |||
Hash Mask. | Hash Mask. | |||
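The three formulas above can be sketched in Python as shown below. This is a minimal, non-normative sketch: addresses and masks are assumed to be passed as plain integers, and the helper names (`lszc`, `masked_value`, `hashvalue_rp`, `hashvalue_group`, `hashvalue_sg`) and the `width` parameter are hypothetical names chosen for this illustration.

```python
MASK32 = 0xFFFFFFFF  # keeps only the last 32 bits (matters for IPv6 only)


def lszc(mask: int, width: int) -> int:
    """Number of zero bits counted from the least significant bit."""
    if mask == 0:
        return width
    return (mask & -mask).bit_length() - 1


def masked_value(address: int, mask: int, width: int) -> int:
    """Apply the mask, drop the masked-out low bits, keep the last 32 bits."""
    return ((address & mask) >> lszc(mask, width)) & MASK32


def hashvalue_rp(rp_address: int, rp_mask: int, gdrc: int, width: int = 32) -> int:
    if gdrc <= 0:
        raise ValueError("GDRC must be positive")
    return masked_value(rp_address, rp_mask, width) % gdrc


def hashvalue_group(group_address: int, group_mask: int, gdrc: int,
                    width: int = 32) -> int:
    if gdrc <= 0:
        raise ValueError("GDRC must be positive")
    return masked_value(group_address, group_mask, width) % gdrc


def hashvalue_sg(source_address: int, source_mask: int,
                 group_address: int, group_mask: int,
                 gdrc: int, width: int = 32) -> int:
    if gdrc <= 0:
        raise ValueError("GDRC must be positive")
    # XOR the two masked values before taking the modulo.
    combined = (masked_value(source_address, source_mask, width)
                ^ masked_value(group_address, group_mask, width))
    return combined % gdrc
```

The GDRC check is an addition of this sketch; the text above does not say what happens when the candidate list is empty.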
5.2.1. Modulo Hash Algorithm Examples | 5.2.1. Modulo Hash Algorithm Examples | |||
To help illustrate the algorithm, consider this example. Router X | To help illustrate the algorithm, consider this example. Router X | |||
with IPv4 address 203.0.113.1 receives a DRLB-List Hello Option from | with IPv4 address 203.0.113.1 receives a DRLB-List Hello Option from | |||
the DR, which announces RP Hash Mask 0.0.255.0 and a list of GDR | the DR, which announces RP Hash Mask 0.0.255.0 and a list of GDR | |||
Candidates, sorted by IP addresses from high to low: 203.0.113.3, | Candidates, sorted by IP addresses from high to low: 203.0.113.3, | |||
skipping to change at page 9, line 37 ¶ | skipping to change at page 9, line 44 ¶ | |||
addresses would be: | addresses would be: | |||
0 for 203.0.113.3; 1 for 203.0.113.2; 2 for 203.0.113.1 (Router X). | 0 for 203.0.113.3; 1 for 203.0.113.2; 2 for 203.0.113.1 (Router X). | |||
Assume there are 2 RPs: RP1 192.0.2.1 for Group1 and RP2 198.51.100.2 | Assume there are 2 RPs: RP1 192.0.2.1 for Group1 and RP2 198.51.100.2 | |||
for Group2. Following the Modulo Hash Algorithm: | for Group2. Following the Modulo Hash Algorithm: | |||
LSZC(0.0.255.0) is 8 and GDRC is 3. The hashvalue_RP for Group1 with | LSZC(0.0.255.0) is 8 and GDRC is 3. The hashvalue_RP for Group1 with | |||
RP RP1 is: | RP RP1 is: | |||
(((192.0.2.1 & 0.0.255.0) >> 8) & 0xFFFFFFFF % 3) = 2 % 3 = 2 | (((192.0.2.1 & 0.0.255.0) >> 8) & 0xffffffff % 3) = 2 % 3 = 2 | |||
which matches the ordinal number assigned to Router X. Router X will | which matches the ordinal number assigned to Router X. Router X will | |||
be the GDR for Group1. | be the GDR for Group1. | |||
The hashvalue_RP for Group2 with RP RP2 is: | The hashvalue_RP for Group2 with RP RP2 is: | |||
(((198.51.100.2 & 0.0.255.0) >> 8) & 0xFFFFFFFF % 3) = 100 % 3 = 1 | (((198.51.100.2 & 0.0.255.0) >> 8) & 0xffffffff % 3) = 100 % 3 = 1 | |||
which is different from the ordinal number of Router X (2). Hence, | ||||
which is different from the ordinal number of router X (2). Hence, | ||||
Router X will not be GDR for Group2. | Router X will not be GDR for Group2. | |||
For IPv6 consider this example, similar to the above. Router X with | For IPv6 consider this example, similar to the above. Router X with | |||
IPv6 address FE80::1 receives a DRLB-List Hello Option from the DR, | IPv6 address fe80::1 receives a DRLB-List Hello Option from the DR, | |||
which announces RP Hash Mask ::FFFF:FFFF:FFFF:0 and a list of GDR | which announces RP Hash Mask ::ffff:ffff:ffff:0 and a list of GDR | |||
Candidates, sorted by IP addresses from high to low: FE80::3, FE80::2 | Candidates, sorted by IP addresses from high to low: fe80::3, fe80::2 | |||
and FE80::1. The ordinal number assigned to those addresses would | and fe80::1. The ordinal number assigned to those addresses would | |||
be: | be: | |||
0 for FE80::3; 1 for FE80::2; 2 for FE80::1 (Router X). | 0 for fe80::3; 1 for fe80::2; 2 for fe80::1 (Router X). | |||
Assume there are 2 RPs: RP1 2001:DB8::1:0:5678:1 for Group1 and RP2 | Assume there are 2 RPs: RP1 2001:db8::1:0:5678:1 for Group1 and RP2 | |||
2001:DB8::1:0:1234:2 for Group2. Following the Modulo Hash | 2001:db8::1:0:1234:2 for Group2. Following the Modulo Hash | |||
Algorithm: | Algorithm: | |||
LSZC(::FFFF:FFFF:FFFF:0) is 16 and GDRC is 3. The hashvalue_RP for | LSZC(::ffff:ffff:ffff:0) is 16 and GDRC is 3. The hashvalue_RP for | |||
Group1 with RP RP1 is: | Group1 with RP RP1 is: | |||
(((2001:DB8::1:0:5678:1 & ::FFFF:FFFF:FFFF:0) >> 16) & 0xFFFFFFFF % | (((2001:db8::1:0:5678:1 & ::ffff:ffff:ffff:0) >> 16) & 0xffffffff % | |||
3) = ((::1:0:5678:0 >> 16) & 0xFFFFFFFF % 3) = (::1:0:5678 & | 3) = ((::1:0:5678:0 >> 16) & 0xffffffff % 3) = (::1:0:5678 & | |||
0xFFFFFFFF % 3) = ::5678 % 3 = 2 | 0xffffffff % 3) = ::5678 % 3 = 2 | |||
which matches the ordinal number assigned to Router X. Router X will | which matches the ordinal number assigned to Router X. Router X will | |||
be the GDR for Group1. | be the GDR for Group1. | |||
The hashvalue_RP for Group2 with RP RP2 is: | The hashvalue_RP for Group2 with RP RP2 is: | |||
(((2001:DB8::1:0:1234:2 & ::FFFF:FFFF:FFFF:0) >> 16) & 0xFFFFFFFF % | (((2001:db8::1:0:1234:2 & ::ffff:ffff:ffff:0) >> 16) & 0xffffffff % | |||
3) = ((::1:0:1234:0 >> 16) & 0xFFFFFFFF % 3) = (::1:0:1234 & | 3) = ((::1:0:1234:0 >> 16) & 0xffffffff % 3) = (::1:0:1234 & | |||
0xFFFFFFFF % 3) = ::1234 % 3 = 1 | 0xffffffff % 3) = ::1234 % 3 = 1 | |||
which is different from the ordinal number of router X (2). Hence, | which is different from the ordinal number of Router X (2). Hence, | |||
Router X will not be GDR for Group2. | Router X will not be GDR for Group2. | |||
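The IPv4 and IPv6 examples above can be replayed with Python's standard `ipaddress` module. This is a sketch for checking the arithmetic, not part of the drafts; the helper names `lszc`, `hashvalue_rp`, and `ordinals` are hypothetical.

```python
import ipaddress


def lszc(mask: int, width: int) -> int:
    """Number of zero bits counted from the least significant bit."""
    if mask == 0:
        return width
    return (mask & -mask).bit_length() - 1


def hashvalue_rp(rp: str, rp_mask: str, gdrc: int) -> int:
    """hashvalue_RP for textual addresses; the mask is written like an address."""
    mask_addr = ipaddress.ip_address(rp_mask)
    mask = int(mask_addr)
    value = int(ipaddress.ip_address(rp)) & mask
    value >>= lszc(mask, mask_addr.max_prefixlen)  # 32 for IPv4, 128 for IPv6
    return (value & 0xFFFFFFFF) % gdrc


def ordinals(candidates):
    """Assign ordinal numbers after sorting candidates from high to low."""
    ordered = sorted(candidates,
                     key=lambda a: int(ipaddress.ip_address(a)),
                     reverse=True)
    return {addr: n for n, addr in enumerate(ordered)}


v4 = ordinals(["203.0.113.1", "203.0.113.2", "203.0.113.3"])
v6 = ordinals(["fe80::1", "fe80::2", "fe80::3"])

# Router X is the GDR for a group when hashvalue_RP equals its ordinal.
for name, rp, mask, router_x, table in [
    ("IPv4 Group1", "192.0.2.1", "0.0.255.0", "203.0.113.1", v4),
    ("IPv4 Group2", "198.51.100.2", "0.0.255.0", "203.0.113.1", v4),
    ("IPv6 Group1", "2001:db8::1:0:5678:1", "::ffff:ffff:ffff:0", "fe80::1", v6),
    ("IPv6 Group2", "2001:db8::1:0:1234:2", "::ffff:ffff:ffff:0", "fe80::1", v6),
]:
    h = hashvalue_rp(rp, mask, len(table))
    print(name, "hash", h, "Router X is GDR:", h == table[router_x])
```

In both address families Router X has ordinal 2, Group1 hashes to 2, and Group2 hashes to 1, in agreement with the examples above.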
5.2.2. Limitations | 5.2.2. Limitations | |||
The Modulo Hash Algorithm has poor failover characteristics when a | The Modulo Hash Algorithm has poor failover characteristics when a | |||
shared LAN has more than two GDRs. In the case of more than two GDRs | shared LAN has more than two GDRs. In the case of more than two GDRs | |||
on a LAN, when one GDR fails, all of the groups may be reassigned to | on a LAN, when one GDR fails, all of the groups may be reassigned to | |||
a different GDR, even if they were not assigned to the failed GDR. | a different GDR, even if they were not assigned to the failed GDR. | |||
However, many deployments use only two routers on a shared LAN for | However, many deployments use only two routers on a shared LAN for | |||
redundancy purposes. Future work may define new Hash Algorithms | redundancy purposes. Future work may define new Hash Algorithms | |||
where only groups assigned to the failed GDR get reassigned. | where only groups assigned to the failed GDR get reassigned. | |||
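The failover limitation can be illustrated with a small Python sketch. It assumes the hash input has already been masked and shifted, so each value is a small integer, and that the candidate list is simply re-indexed after a failure. The router labels and the function names `gdr_for` and `reassigned_survivors` are hypothetical.

```python
def gdr_for(value: int, candidates: list) -> str:
    """Pick the candidate whose ordinal equals value modulo GDRC."""
    return candidates[value % len(candidates)]


def reassigned_survivors(values, candidates: list, failed: str) -> list:
    """Values that were NOT on the failed GDR but still change GDR."""
    remaining = [c for c in candidates if c != failed]
    return [v for v in values
            if gdr_for(v, candidates) != failed
            and gdr_for(v, candidates) != gdr_for(v, remaining)]


# Three candidates, ordinals 0, 1, 2; the candidate with ordinal 1 fails.
moved = reassigned_survivors(range(12), ["R3", "R2", "R1"], "R2")
print(moved)
```

In this toy case, four of the eight values that were not served by the failed router still move to a different GDR. With only two candidates, no surviving assignment changes, which is consistent with the remark about two-router deployments.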
5.3. PIM Hello Options | 5.3. PIM Hello Options | |||
All PIM routers include a new option, called "Load Balancing | PIM routers include a new option, called "Load Balancing Capability | |||
Capability (DRLB-Cap)" in their PIM Hello messages. | (DRLB-Cap)" in their PIM Hello messages. | |||
Besides this DRLB-Cap Hello Option, the elected PIM DR also includes | Besides this DRLB-Cap Hello Option, the elected PIM DR also includes | |||
a new "DR Load Balancing List (DRLB-List) Hello Option". The DRLB- | a new "DR Load Balancing List (DRLB-List) Hello Option". The DRLB- | |||
List Hello Option consists of three Hash Masks as defined above and | List Hello Option consists of three Hash Masks as defined above and | |||
also a sorted list of GDR Candidate addresses on the LAN. | also a list of GDR Candidate addresses on the LAN. It is recommended | |||
that the GDR Candidate addresses are sorted in descending order. | ||||
This ensures that when using algorithms such as the Modulo algorithm | ||||
in this document, that it is predictable which GDR is responsible for | ||||
which groups, regardless of the order the DR learned about the | ||||
candidates. | ||||
5.3.1. PIM DR Load Balancing Capability (DRLB-Cap) Hello Option | 5.3.1. PIM DR Load Balancing Capability (DRLB-Cap) Hello Option | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Type = 34 | Length = 4 | | | Type = 34 | Length = 4 | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Reserved |Hash Algorithm | | | Reserved |Hash Algorithm | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
Figure 3: PIM DR Load Balancing Capability Hello Option | Figure 3: PIM DR Load Balancing Capability Hello Option | |||
Type: 34 | Type: 34 | |||
Length: 4 | Length: 4 | |||
Reserved: Transmitted as zero, ignored on receipt. | Reserved: Transmitted as zero, ignored on receipt. | |||
Hash Algorithm: Hash Algorithm type. 0 for the Modulo algorithm | Hash Algorithm: Hash Algorithm type. A value listed in the IANA | |||
defined in this document. | Designated Router Load Balancing Hash Algorithms registry. 0 is | |||
used for the Modulo algorithm defined in this document. | ||||
This DRLB-Cap Hello Option MUST be advertised by routers on all | This DRLB-Cap Hello Option MUST be advertised by routers on all | |||
interfaces where DR Load Balancing is enabled. | interfaces where DR Load Balancing is enabled. Note that the option | |||
is included at most once. | ||||
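As an illustration, the DRLB-Cap Hello Option could be encoded and decoded as sketched below. The figure above has lost its column spacing in this rendering, so the field widths used here are assumptions: a 16-bit Type, a 16-bit Length, a 24-bit Reserved field, and an 8-bit Hash Algorithm field, consistent with Length = 4. The function names are hypothetical.

```python
import struct

DRLB_CAP_TYPE = 34
DRLB_CAP_LENGTH = 4


def encode_drlb_cap(hash_algorithm: int = 0) -> bytes:
    """Build the option; Reserved is transmitted as zero."""
    if not 0 <= hash_algorithm <= 0xFF:
        raise ValueError("Hash Algorithm is assumed to fit in 8 bits")
    # Network byte order: Type, Length, then Reserved (24) + Hash Algorithm (8).
    return struct.pack("!HHI", DRLB_CAP_TYPE, DRLB_CAP_LENGTH, hash_algorithm)


def decode_drlb_cap(data: bytes) -> int:
    """Return the Hash Algorithm; Reserved bits are ignored on receipt."""
    opt_type, length, value = struct.unpack("!HHI", data)
    if opt_type != DRLB_CAP_TYPE or length != DRLB_CAP_LENGTH:
        raise ValueError("not a DRLB-Cap Hello Option")
    return value & 0xFF
```

A real parser would read the option from within a full PIM Hello message rather than from a standalone 8-byte buffer.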
5.3.2. PIM DR Load Balancing List (DRLB-List) Hello Option | 5.3.2. PIM DR Load Balancing List (DRLB-List) Hello Option | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Type = 35 | Length | | | Type = 35 | Length | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Group Mask | | | Group Mask | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Source Mask | | | Source Mask | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| RP Mask | | | RP Mask | | |||
skipping to change at page 12, line 22 ¶ | skipping to change at page 12, line 40 ¶ | |||
RP Mask (32/128 bits): Mask applied to RP addresses as part of | RP Mask (32/128 bits): Mask applied to RP addresses as part of | |||
hash computation. | hash computation. | |||
All masks MUST have the same number of bits as the IP source | All masks MUST have the same number of bits as the IP source | |||
address in the PIM Hello IP header. | address in the PIM Hello IP header. | |||
GDR Candidate Address(es) (32/128 bits): List of GDR Candidate(s) | GDR Candidate Address(es) (32/128 bits): List of GDR Candidate(s) | |||
All addresses MUST be in the same address family as the PIM | All addresses MUST be in the same address family as the PIM | |||
Hello IP header. It is RECOMMENDED that the addresses are | Hello IP header. It is recommended that the addresses are | |||
sorted in descending order. | sorted in descending order. | |||
If the "Interface ID" option, as specified in [RFC6395], is | If the "Interface ID" option, as specified in [RFC6395], is | |||
present in a GDR Candidate's PIM Hello message, and the "Router | present in a GDR Candidate's PIM Hello message, and the "Router | |||
Identifier" portion is non-zero: | Identifier" portion is non-zero: | |||
+ For IPv4, the "GDR Candidate Address" will be set directly | + For IPv4, the "GDR Candidate Address" will be set directly | |||
to the "Router Identifier". | to the "Router Identifier". | |||
+ For IPv6, the "GDR Candidate Address" will be 96 bits of | + For IPv6, the "GDR Candidate Address" will be 96 bits of | |||
skipping to change at page 13, line 10 ¶ | skipping to change at page 13, line 28 ¶ | |||
elected PIM DR. It MUST be ignored if received from a non-DR. | elected PIM DR. It MUST be ignored if received from a non-DR. | |||
The option MUST also be ignored if the hash masks are not the | The option MUST also be ignored if the hash masks are not the | |||
correct number of bits, or GDR Candidate addresses are in the | correct number of bits, or GDR Candidate addresses are in the | |||
wrong address family. | wrong address family. | |||
5.4. PIM DR Operation | 5.4. PIM DR Operation | |||
The DR election process is still the same as defined in [RFC7761]. | The DR election process is still the same as defined in [RFC7761]. | |||
The DR advertises the new DRLB-List Hello Option, which contains mask | The DR advertises the new DRLB-List Hello Option, which contains mask | |||
values from user configuration (or default values), followed by a | values from user configuration (or default values), followed by a | |||
list of GDR Candidate Addresses. It is RECOMMENDED that the list be | list of GDR Candidate Addresses. Note that if a router included the | |||
sorted, from the highest value to the lowest value. The reason for | "Interface ID" option in the hello message, and the Router ID is non- | |||
sorting the list is to make the behavior deterministic, regardless of | zero, the Router ID will be used to form the GDR Candidate address of | |||
the order in which the DR learns of new candidates. Note that, as | the router, as discussed in the previous section. It is recommended | |||
non-DR routers, the DR also advertises the DRLB-Cap Hello Option to | that the list be sorted, from the highest value to the lowest value. | |||
indicate its ability to support the new functionality and the type of | The reason for sorting the list is to make the behavior | |||
GDR election Hash Algorithm. | deterministic, regardless of the order in which the DR learns of new | |||
candidates. Note that, as for non-DR routers, the DR also advertises | ||||
the DRLB-Cap Hello Option to indicate its ability to support the new | ||||
functionality and the type of GDR election Hash Algorithm it uses. | ||||
If a PIM DR receives a neighbor DRLB-Cap Hello Option, which contains | If a PIM DR receives a neighbor DRLB-Cap Hello Option, which contains | |||
the same Hash Algorithm as the DR, and the neighbor has the same DR | the same Hash Algorithm as the DR, and the neighbor has the same DR | |||
priority as the DR, PIM DR SHOULD consider the neighbor as a GDR | priority as the DR, PIM DR SHOULD consider the neighbor as a GDR | |||
Candidate and insert the GDR Candidate's Address into the list of the | Candidate and insert the GDR Candidate's Address into the list of the | |||
DRLB-List Option. However, the DR may have policies limiting which | DRLB-List Option. However, the DR may have policies limiting which | |||
GDR Candidates, or the number of GDR Candidates to include. | GDR Candidates, or the number of GDR Candidates to include. | |||
Likewise, the DR SHOULD include itself in the list of GDR Candidates, | Likewise, the DR SHOULD include itself in the list of GDR Candidates, | |||
but it is permissable not to do so, if for instance there is some | but it is permissible not to do so, if for instance there is some | |||
policy restricting the candidate set. | policy restricting the candidate set. | |||
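The candidate selection described above might look like the following sketch. The `Neighbor` record, its field names, and `build_candidate_list` are hypothetical; addresses are modeled as integers, and local policy is reduced to a single `include_self` flag.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Neighbor:
    address: int                   # GDR Candidate Address as an integer
    dr_priority: int
    hash_algorithm: Optional[int]  # None if no DRLB-Cap option was received


def build_candidate_list(dr: Neighbor, neighbors: List[Neighbor],
                         include_self: bool = True) -> List[int]:
    """Collect neighbors with the DR's Hash Algorithm and DR priority."""
    eligible = [n.address for n in neighbors
                if n.hash_algorithm is not None
                and n.hash_algorithm == dr.hash_algorithm
                and n.dr_priority == dr.dr_priority]
    if include_self:
        eligible.append(dr.address)
    # Sorting from highest to lowest is recommended, not required.
    return sorted(eligible, reverse=True)
```

A neighbor that announces no DRLB-Cap option, a different Hash Algorithm, or a different DR priority is left out of the list.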
If a PIM neighbor included in the list expires, stops announcing the | If a PIM neighbor included in the list expires, stops announcing the | |||
DRLB-Cap Hello Option, changes DR priority, changes Hash Algorithm or | DRLB-Cap Hello Option, changes DR priority, changes Hash Algorithm or | |||
otherwise becomes ineligible as a candidate, the DR SHOULD | otherwise becomes ineligible as a candidate, the DR SHOULD | |||
immediately send a triggered hello with a new list in the DRLB-List | immediately send a triggered hello with a new list in the DRLB-List | |||
option, excluding the neighbor. | option, excluding the neighbor. | |||
If a new router becomes eligible as a candidate, there is no urgency | If a new router becomes eligible as a candidate, there is no urgency | |||
in sending out an updated list. An updated list SHOULD be included | in sending out an updated list. An updated list SHOULD be included | |||
skipping to change at page 14, line 35 ¶ | skipping to change at page 15, line 6 ¶ | |||
changes, continue processing as below. Note that if the option does | changes, continue processing as below. Note that if the option does | |||
not pass the above checks, the below processing MUST be done as if | not pass the above checks, the below processing MUST be done as if | |||
the option was not announced. | the option was not announced. | |||
If the contents of the DRLB-List Option, the masks or the candidate | If the contents of the DRLB-List Option, the masks or the candidate | |||
list, differs from the previously saved copy, it is received for the | list, differs from the previously saved copy, it is received for the | |||
first time, or it is no longer being received or accepted, the option | first time, or it is no longer being received or accepted, the option | |||
MUST be processed as below. | MUST be processed as below. | |||
1. If the local router is included in the GDR Candidate Address(es) | 1. If the local router is included in the GDR Candidate Address(es) | |||
field, for each of the groups, or source and group pairs if the | field (it will look for its own address, or its Router ID if it | |||
group is in SSM mode, with local receiver interest, the router | announces a non-zero Router ID), for each of the groups, or | |||
MUST run the Hash Algorithm to determine which of them it is the | source and group pairs if the group is in SSM mode, with local | |||
GDR for. | receiver interest, the router MUST run the Hash Algorithm to | |||
determine which of them it is the GDR for. | ||||
If there is no change in the GDR status, then no further | If there is no change in the GDR status, then no further | |||
action is required. | action is required. | |||
If the router becomes the new GDR, then a multicast forwarding | If the router becomes the new GDR, then a multicast forwarding | |||
tree MUST be built [RFC7761]. | tree MUST be built [RFC7761]. | |||
If the router is no longer the GDR, then it uses an Assert as | If the router is no longer the GDR, then it uses an Assert as | |||
explained in [Section 5.7]. | explained in [Section 5.7]. | |||
skipping to change at page 15, line 20 ¶ | skipping to change at page 15, line 41 ¶ | |||
GDR changes may occur due to configuration change, due to GDR | GDR changes may occur due to configuration change, due to GDR | |||
candidates going down, and also new routers coming up and becoming | candidates going down, and also new routers coming up and becoming | |||
GDR candidates. This may occur while flows are being forwarded. If | GDR candidates. This may occur while flows are being forwarded. If | |||
the GDR for an active flow changes, there is likely to be some | the GDR for an active flow changes, there is likely to be some | |||
disruption, such as packet loss or duplicates. By using asserts, | disruption, such as packet loss or duplicates. By using asserts, | |||
packet loss is minimized, while allowing a small amount of | packet loss is minimized, while allowing a small amount of | |||
duplicates. | duplicates. | |||
When a router stops acting as the GDR for a group, or source and | When a router stops acting as the GDR for a group, or source and | |||
group pair if SSM, it MUST set the Assert metric preference to | group pair if SSM, it MUST set the Assert metric preference to | |||
maximum (0x7FFFFFFF) and the Assert metric to one less than maximum | maximum (0x7fffffff) and the Assert metric to one less than maximum | |||
(0xFFFFFFFE). This was also mentioned in the previous section. That | (0xfffffffe). That is, whenever it sends or receives an Assert for | |||
is, whenever it sends or receives an Assert for the group, it must | the group, it must use these values as the metric preference and | |||
use these values as the metric preference and metric rather than the | metric rather than the values provided by the unicast routing | |||
values provided by the unicast routing protocol. | protocol. | |||
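The Assert values used by a router that has stopped acting as GDR can be sketched as follows. The constant and function names are hypothetical, and the sketch covers only the choice of metric preference and metric, not the Assert state machine of [RFC7761].

```python
ASSERT_PREFERENCE_MAX = 0x7FFFFFFF      # maximum metric preference
ASSERT_METRIC_FORMER_GDR = 0xFFFFFFFE   # one less than the maximum metric


def assert_values(stopped_being_gdr: bool,
                  unicast_preference: int,
                  unicast_metric: int) -> tuple:
    """Return the (metric preference, metric) pair to use in Asserts."""
    if stopped_being_gdr:
        # Override what the unicast routing protocol provides.
        return (ASSERT_PREFERENCE_MAX, ASSERT_METRIC_FORMER_GDR)
    return (unicast_preference, unicast_metric)
```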
The rest of this section is just for illustration purposes and not | The rest of this section is just for illustration purposes and not | |||
part of the protocol definition. | part of the protocol definition. | |||
To illustrate the behavior when there is a GDR change, consider the | To illustrate the behavior when there is a GDR change, consider the | |||
following scenario where there are two flows G1 and G2. R1 is the | following scenario where there are two flows G1 and G2. R1 is the | |||
GDR for G1, and R2 is the GDR for G2. When R3 comes up, it is | GDR for G1, and R2 is the GDR for G2. When R3 comes up, it is | |||
possible that R3 becomes GDR for both G1 and G2, hence R3 starts to | possible that R3 becomes GDR for both G1 and G2, hence R3 starts to | |||
build the forwarding tree for G1 and G2. If R1 and R2 stop | build the forwarding tree for G1 and G2. If R1 and R2 stop | |||
forwarding before R3 completes the process, packet loss might occur. | forwarding before R3 completes the process, packet loss might occur. | |||
skipping to change at page 16, line 15 ¶ | skipping to change at page 16, line 31 ¶ | |||
5.8. Backward Compatibility | 5.8. Backward Compatibility | |||
In the case of a hybrid Ethernet shared LAN (where some PIM routers | In the case of a hybrid Ethernet shared LAN (where some PIM routers | |||
support the functionality defined in this document, and some do not): | support the functionality defined in this document, and some do not): | |||
o If the DR does not support the new functionality, then there will | o If the DR does not support the new functionality, then there will | |||
be no load-balancing. | be no load-balancing. | |||
o If non-DR routers do not support the new functionality, they will | o If non-DR routers do not support the new functionality, they will | |||
not be considered as Candidate GDRs and they will not take part in | not be considered as Candidate GDRs and they will not take part in | |||
an load-balancing. Load-balancing may still happen on the link. | load-balancing. Load-balancing may still happen on the link. | |||
6. Operational Considerations | 6. Operational Considerations | |||
An administrator needs to consider what the total bandwidth | An administrator needs to consider what the total bandwidth | |||
requirements are and find a set of routers that together has enough | requirements are and find a set of routers that together has enough | |||
total capacity, while making sure that each of the routers can handle | available capacity, while making sure that each of the routers can | |||
its part, assuming that the traffic is distributed roughly equally | handle its part, assuming that the traffic is distributed roughly | |||
among the routers. Ideally, one should also have enough bandwidth to | equally among the routers. Ideally, one should also have enough | |||
handle the case where at least one router fails. All routers should | bandwidth to handle the case where at least one router fails. All | |||
have reachability to the sources, and RPs if applicable, that is not | routers should have reachability to the sources, and RPs if | |||
via the LAN. | applicable, that is not via the LAN. | |||
Care must be taken when choosing what hash masks to configure. One | Care must be taken when choosing what hash masks to configure. One | |||
would typically configure the same masks on all the routers, so that | would typically configure the same masks on all the routers, so that | |||
they are the same, regardless of which router is elected as DR. The | they are the same, regardless of which router is elected as DR. The | |||
default masks are likely suitable for most deployments. The RP Hash | default masks are likely suitable for most deployments. The RP Hash | |||
Mask must be configured (the default is no bits set) if one wishes to | Mask must be configured (the default is no bits set) if one wishes to | |||
hash based on the RP address rather than the group address for ASM. | hash based on the RP address rather than the group address for ASM. | |||
The default masks will use the entire group addresses, and source | The default masks will use the entire group addresses, and source | |||
addresses if SSM, as part of the hash. An administrator may set | addresses if SSM, as part of the hash. An administrator may set | |||
other masks that mask out part of the addresses to ensure that | other masks that mask out part of the addresses to ensure that | |||
skipping to change at page 18, line 5 ¶ | skipping to change at page 18, line 23 ¶ | |||
If for any reason, the DR includes a GDR in the announced list which | If for any reason, the DR includes a GDR in the announced list which | |||
announces a different algorithm from what the DR announces, the GDR | announces a different algorithm from what the DR announces, the GDR | |||
is required to ignore the announcement, and there will be no router | is required to ignore the announcement, and there will be no router | |||
acting as the DR for the flows that hash to that GDR. | acting as the DR for the flows that hash to that GDR. | |||
If a GDR is subverted, it could potentially be made to stop | If a GDR is subverted, it could potentially be made to stop | |||
forwarding all the traffic it is expected to forward. This is also | forwarding all the traffic it is expected to forward. This is also | |||
similar to the situation today if a DR is subverted. | similar to the situation today if a DR is subverted. | |||
An administrator may be able to achieve the desired load-balancing of | ||||
known flows, but an attacker may send a single high rate flow which | ||||
is served by a single GDR, or send multiple flows that are expected | ||||
to be hashed to the same GDR. | ||||
9. Acknowledgement | 9. Acknowledgement | |||
The authors would like to thank Steve Simlo and Taki Millonis for | The authors would like to thank Steve Simlo and Taki Millonis for | |||
helping with the original idea; Alia Atlas, Bill Atwood, Jake | helping with the original idea; Alia Atlas, Bill Atwood, Joe Clarke, | |||
Holland, Bharat Joshi, Anish Kachinthaya, Anvitha Kachinthaya and | Alissa Cooper, Jake Holland, Bharat Joshi, Anish Kachinthaya, Anvitha | |||
Alvaro Retana for reviews and comments; and Toerless Eckert and | Kachinthaya, Benjamin Kaduk, Mirja Kuhlewind, Barry Leiba, Ben Niven- | |||
Jenkins, Alvaro Retana, Adam Roach, Michael Scharf, Eric Vyncke and | ||||
Carl Wallace for reviews and comments; and Toerless Eckert and | ||||
Rishabh Parekh for helpful conversation on the document. | Rishabh Parekh for helpful conversation on the document. | |||
10. References | 10. References | |||
10.1. Normative References | 10.1. Normative References | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
End of changes. 53 change blocks. | ||||
118 lines changed or deleted | 142 lines changed or added | |||