draft-ietf-dhc-dhcpv6-failover-design-02.txt | draft-ietf-dhc-dhcpv6-failover-design-03.txt | |||
---|---|---|---|---|
Dynamic Host Configuration (DHC) T. Mrugalski | Dynamic Host Configuration (DHC) T. Mrugalski | |||
Internet-Draft ISC | Internet-Draft ISC | |||
Intended status: Standards Track K. Kinnear | Intended status: Standards Track K. Kinnear | |||
Expires: April 25, 2013 Cisco | Expires: January 16, 2014 Cisco | |||
October 22, 2012 | July 15, 2013 | |||
DHCPv6 Failover Design | DHCPv6 Failover Design | |||
draft-ietf-dhc-dhcpv6-failover-design-02 | draft-ietf-dhc-dhcpv6-failover-design-03 | |||
Abstract | Abstract | |||
DHCPv6 defined in [RFC3315] does not offer server redundancy. This | DHCPv6 defined in [RFC3315] does not offer server redundancy. This | |||
document defines a design for DHCPv6 failover, a mechanism for | document defines a design for DHCPv6 failover, a mechanism for | |||
running two servers on the same network with capability for either | running two servers on the same network with capability for either | |||
server to take over clients' leases in case of server failure or | server to take over clients' leases in case of server failure or | |||
network partition. This is a DHCPv6 Failover design document, it is | network partition. This is a DHCPv6 Failover design document, it is | |||
not a protocol specification document. It is a second document in a | not a protocol specification document. It is a second document in a | |||
planned series of three documents. DHCPv6 failover requirements are | planned series of three documents. DHCPv6 failover requirements are | |||
specified in [I-D.ietf-dhc-dhcpv6-failover-requirements]. A protocol | specified in [I-D.ietf-dhc-dhcpv6-failover-requirements]. A protocol | |||
specification document is planned to follow this document. | specification document is planned to follow this document. | |||
Status of this Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on April 25, 2013. | This Internet-Draft will expire on January 16, 2014. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2012 IETF Trust and the persons identified as the | Copyright (c) 2013 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Requirements Language . . . . . . . . . . . . . . . . . . . . 4 | 1. Requirements Language . . . . . . . . . . . . . . . . . . . . 3 | |||
2. Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 2. Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
3. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 | 3. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
3.1. Additional Requirements . . . . . . . . . . . . . . . . . 6 | 3.1. Design Requirements . . . . . . . . . . . . . . . . . . . 6 | |||
3.2. Features out of Scope: Load Balancing . . . . . . . . . . 6 | 3.2. Features out of Scope: Load Balancing . . . . . . . . . . 6 | |||
4. Protocol Overview . . . . . . . . . . . . . . . . . . . . . . 6 | 4. Protocol Overview . . . . . . . . . . . . . . . . . . . . . . 6 | |||
4.1. Failover State Machine Overview . . . . . . . . . . . . . 8 | 4.1. Failover State Machine Overview . . . . . . . . . . . . . 8 | |||
4.2. Messages . . . . . . . . . . . . . . . . . . . . . . . . . 9 | 4.2. Messages . . . . . . . . . . . . . . . . . . . . . . . . 10 | |||
5. Connection Management . . . . . . . . . . . . . . . . . . . . 11 | 5. Connection Management . . . . . . . . . . . . . . . . . . . . 11 | |||
5.1. Creating Connections . . . . . . . . . . . . . . . . . . . 11 | 5.1. Creating Connections . . . . . . . . . . . . . . . . . . 11 | |||
5.2. Endpoint Identification . . . . . . . . . . . . . . . . . 12 | 5.2. Endpoint Identification . . . . . . . . . . . . . . . . . 13 | |||
6. Resource Allocation . . . . . . . . . . . . . . . . . . . . . 13 | 6. Resource Allocation . . . . . . . . . . . . . . . . . . . . . 13 | |||
6.1. Proportional Allocation . . . . . . . . . . . . . . . . . 14 | 6.1. Proportional Allocation . . . . . . . . . . . . . . . . . 14 | |||
6.2. Independent Allocation . . . . . . . . . . . . . . . . . . 16 | 6.2. Independent Allocation . . . . . . . . . . . . . . . . . 16 | |||
6.3. Choosing Allocation Algorithm . . . . . . . . . . . . . . 16 | 6.3. Choosing Allocation Algorithm . . . . . . . . . . . . . . 17 | |||
7. Information model . . . . . . . . . . . . . . . . . . . . . . 17 | 7. Information model . . . . . . . . . . . . . . . . . . . . . . 18 | |||
8. Failover Mechanisms . . . . . . . . . . . . . . . . . . . . . 21 | 8. Failover Mechanisms . . . . . . . . . . . . . . . . . . . . . 22 | |||
8.1. Time Skew . . . . . . . . . . . . . . . . . . . . . . . . 21 | 8.1. Time Skew . . . . . . . . . . . . . . . . . . . . . . . . 22 | |||
8.2. Time expression . . . . . . . . . . . . . . . . . . . . . 22 | 8.2. Time expression . . . . . . . . . . . . . . . . . . . . . 23 | |||
8.3. Lazy updates . . . . . . . . . . . . . . . . . . . . . . . 22 | 8.3. Lazy updates . . . . . . . . . . . . . . . . . . . . . . 23 | |||
8.4. MCLT concept . . . . . . . . . . . . . . . . . . . . . . . 22 | 8.4. MCLT concept . . . . . . . . . . . . . . . . . . . . . . 23 | |||
8.4.1. MCLT example . . . . . . . . . . . . . . . . . . . . . 24 | 8.4.1. MCLT example . . . . . . . . . . . . . . . . . . . . 25 | |||
8.5. Unreachability detection . . . . . . . . . . . . . . . . . 25 | 8.5. Unreachability detection . . . . . . . . . . . . . . . . 26 | |||
8.6. Re-allocating Leases . . . . . . . . . . . . . . . . . . . 25 | 8.6. Re-allocating Leases . . . . . . . . . . . . . . . . . . 26 | |||
8.7. Sending Binding Update . . . . . . . . . . . . . . . . . . 26 | 8.7. Sending Binding Update . . . . . . . . . . . . . . . . . 27 | |||
8.8. Receiving Binding Update . . . . . . . . . . . . . . . . . 28 | 8.8. Receiving Binding Update . . . . . . . . . . . . . . . . 29 | |||
8.9. Conflict Resolution . . . . . . . . . . . . . . . . . . . 28 | 8.9. Conflict Resolution . . . . . . . . . . . . . . . . . . . 30 | |||
8.10. Acknowledging Reception . . . . . . . . . . . . . . . . . 30 | 8.10. Acknowledging Reception . . . . . . . . . . . . . . . . . 32 | |||
9. Endpoint States . . . . . . . . . . . . . . . . . . . . . . . 30 | 9. Endpoint States . . . . . . . . . . . . . . . . . . . . . . . 32 | |||
9.1. State Machine Operation . . . . . . . . . . . . . . . . . 30 | 9.1. State Machine Operation . . . . . . . . . . . . . . . . . 32 | |||
9.2. State Machine Initialization . . . . . . . . . . . . . . . 33 | 9.2. State Machine Initialization . . . . . . . . . . . . . . 35 | |||
9.3. STARTUP State . . . . . . . . . . . . . . . . . . . . . . 33 | 9.3. STARTUP State . . . . . . . . . . . . . . . . . . . . . . 35 | |||
9.3.1. Operation in STARTUP State . . . . . . . . . . . . . . 34 | 9.3.1. Operation in STARTUP State . . . . . . . . . . . . . 36 | |||
9.3.2. Transition Out of STARTUP State . . . . . . . . . . . 34 | 9.3.2. Transition Out of STARTUP State . . . . . . . . . . . 36 | |||
9.4. PARTNER-DOWN State . . . . . . . . . . . . . . . . . . . . 35 | 9.4. PARTNER-DOWN State . . . . . . . . . . . . . . . . . . . 38 | |||
9.4.1. Operation in PARTNER-DOWN State . . . . . . . . . . . 35 | 9.4.1. Operation in PARTNER-DOWN State . . . . . . . . . . . 38 | |||
9.4.2. Transition Out of PARTNER-DOWN State . . . . . . . . . 36 | 9.4.2. Transition Out of PARTNER-DOWN State . . . . . . . . 39 | |||
9.5. RECOVER State . . . . . . . . . . . . . . . . . . . . . . 37 | 9.5. RECOVER State . . . . . . . . . . . . . . . . . . . . . . 40 | |||
9.5.1. Operation in RECOVER State . . . . . . . . . . . . . . 37 | 9.5.1. Operation in RECOVER State . . . . . . . . . . . . . 40 | |||
9.5.2. Transition Out of RECOVER State . . . . . . . . . . . 37 | 9.5.2. Transition Out of RECOVER State . . . . . . . . . . . 40 | |||
9.6. RECOVER-WAIT State . . . . . . . . . . . . . . . . . . . . 39 | 9.6. RECOVER-WAIT State . . . . . . . . . . . . . . . . . . . 41 | |||
9.6.1. Operation in RECOVER-WAIT State . . . . . . . . . . . 40 | 9.6.1. Operation in RECOVER-WAIT State . . . . . . . . . . . 41 | |||
9.6.2. Transition Out of RECOVER-WAIT State . . . . . . . . . 40 | 9.6.2. Transition Out of RECOVER-WAIT State . . . . . . . . 42 | |||
9.7. RECOVER-DONE State . . . . . . . . . . . . . . . . . . . . 40 | 9.7. RECOVER-DONE State . . . . . . . . . . . . . . . . . . . 42 | |||
9.7.1. Operation in RECOVER-DONE State . . . . . . . . . . . 41 | 9.7.1. Operation in RECOVER-DONE State . . . . . . . . . . . 42 | |||
9.7.2. Transition Out of RECOVER-DONE State . . . . . . . . . 41 | 9.7.2. Transition Out of RECOVER-DONE State . . . . . . . . 42 | |||
9.8. NORMAL State . . . . . . . . . . . . . . . . . . . . . . . 41 | 9.8. NORMAL State . . . . . . . . . . . . . . . . . . . . . . 43 | |||
9.8.1. Operation in NORMAL State . . . . . . . . . . . . . . 41 | 9.8.1. Operation in NORMAL State . . . . . . . . . . . . . . 43 | |||
9.8.2. Transition Out of NORMAL State . . . . . . . . . . . . 42 | 9.8.2. Transition Out of NORMAL State . . . . . . . . . . . 44 | |||
9.9. COMMUNICATIONS-INTERRUPTED State . . . . . . . . . . . . . 43 | 9.9. COMMUNICATIONS-INTERRUPTED State . . . . . . . . . . . . 44 | |||
9.9.1. Operation in COMMUNICATIONS-INTERRUPTED State . . . . 43 | 9.9.1. Operation in COMMUNICATIONS-INTERRUPTED State . . . . 45 | |||
9.9.2. Transition Out of COMMUNICATIONS-INTERRUPTED State . . 44 | 9.9.2. Transition Out of COMMUNICATIONS-INTERRUPTED State . 45 | |||
9.10. POTENTIAL-CONFLICT State . . . . . . . . . . . . . . . . . 45 | 9.10. POTENTIAL-CONFLICT State . . . . . . . . . . . . . . . . 47 | |||
9.10.1. Operation in POTENTIAL-CONFLICT State . . . . . . . . 46 | 9.10.1. Operation in POTENTIAL-CONFLICT State . . . . . . . 47 | |||
9.10.2. Transition Out of POTENTIAL-CONFLICT State . . . . . . 46 | 9.10.2. Transition Out of POTENTIAL-CONFLICT State . . . . . 47 | |||
9.11. RESOLUTION-INTERRUPTED State . . . . . . . . . . . . . . . 47 | 9.11. RESOLUTION-INTERRUPTED State . . . . . . . . . . . . . . 49 | |||
9.11.1. Operation in RESOLUTION-INTERRUPTED State . . . . . . 48 | 9.11.1. Operation in RESOLUTION-INTERRUPTED State . . . . . 49 | |||
9.11.2. Transition Out of RESOLUTION-INTERRUPTED State . . . . 48 | 9.11.2. Transition Out of RESOLUTION-INTERRUPTED State . . . 49 | |||
9.12. CONFLICT-DONE State . . . . . . . . . . . . . . . . . . . 48 | 9.12. CONFLICT-DONE State . . . . . . . . . . . . . . . . . . . 49 | |||
9.12.1. Operation in CONFLICT-DONE State . . . . . . . . . . . 48 | 9.12.1. Operation in CONFLICT-DONE State . . . . . . . . . . 50 | |||
9.12.2. Transition Out of CONFLICT-DONE State . . . . . . . . 49 | 9.12.2. Transition Out of CONFLICT-DONE State . . . . . . . 50 | |||
10. Proposed extensions . . . . . . . . . . . . . . . . . . . . . 49 | 10. Proposed extensions . . . . . . . . . . . . . . . . . . . . . 50 | |||
10.1. Active-active mode . . . . . . . . . . . . . . . . . . . . 49 | 10.1. Active-active mode . . . . . . . . . . . . . . . . . . . 50 | |||
11. Dynamic DNS Considerations . . . . . . . . . . . . . . . . . . 50 | 11. Dynamic DNS Considerations . . . . . . . . . . . . . . . . . 51 | |||
11.1. Relationship between failover and dynamic DNS update . . . 50 | 11.1. Relationship between failover and dynamic DNS update . . 51 | |||
11.2. Exchanging DDNS Information . . . . . . . . . . . . . . . 51 | 11.2. Exchanging DDNS Information . . . . . . . . . . . . . . 52 | |||
11.3. Adding RRs to the DNS . . . . . . . . . . . . . . . . . . 53 | 11.3. Adding RRs to the DNS . . . . . . . . . . . . . . . . . 54 | |||
11.4. Deleting RRs from the DNS . . . . . . . . . . . . . . . . 54 | 11.4. Deleting RRs from the DNS . . . . . . . . . . . . . . . 55 | |||
11.5. Name Assignment with No Update of DNS . . . . . . . . . . 54 | 11.5. Name Assignment with No Update of DNS . . . . . . . . . 55 | |||
12. Reservations and failover . . . . . . . . . . . . . . . . . . 55 | 12. Reservations and failover . . . . . . . . . . . . . . . . . . 56 | |||
13. Security Considerations . . . . . . . . . . . . . . . . . . . 56 | 13. Security Considerations . . . . . . . . . . . . . . . . . . . 57 | |||
14. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 56 | 14. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 58 | |||
15. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 56 | 15. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 58 | |||
16. References . . . . . . . . . . . . . . . . . . . . . . . . . . 57 | 16. References . . . . . . . . . . . . . . . . . . . . . . . . . 58 | |||
16.1. Normative References . . . . . . . . . . . . . . . . . . . 57 | 16.1. Normative References . . . . . . . . . . . . . . . . . . 58 | |||
16.2. Informative References . . . . . . . . . . . . . . . . . . 57 | 16.2. Informative References . . . . . . . . . . . . . . . . . 58 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 58 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 59 | |||
1. Requirements Language | 1. Requirements Language | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
document are to be interpreted as described in RFC 2119 [RFC2119]. | document are to be interpreted as described in RFC 2119 [RFC2119]. | |||
2. Glossary | 2. Glossary | |||
This is a supplemental glossary that should be combined with | This is a supplemental glossary that should be combined with | |||
definitions in Section 3 of | definitions in Section 3 of | |||
[I-D.ietf-dhc-dhcpv6-failover-requirements]. | [I-D.ietf-dhc-dhcpv6-failover-requirements]. | |||
o auto-partner-down - a capability where a failover server will move | ||||
from COMMUNICATIONS-INTERRUPTED state to PARTNER-DOWN state | ||||
automatically, without operator intervention. | ||||
o Failover endpoint - The failover protocol allows for there to be a | o Failover endpoint - The failover protocol allows for there to be a | |||
unique failover 'endpoint' for each failover relationship in which | unique failover 'endpoint' for each failover relationship in which | |||
a failover server participates. The failover relationship is | a failover server participates. The failover relationship is | |||
defined by a relationship name, and includes the failover partner | defined by a relationship name, and includes the failover partner | |||
IP address, the role this server takes with respect to that | IP address, the role this server takes with respect to that | |||
partner (primary or secondary), and the prefixes associated with | partner (primary or secondary), and the prefixes associated with | |||
that relationship. Note that a single prefix can only be | that relationship. Note that a single prefix can only be | |||
associated with a single failover relationship. This failover | associated with a single failover relationship. This failover | |||
endpoint can take actions and hold unique states. Typically, | endpoint can take actions and hold unique states. Typically, | |||
there is a one failover endpoint per partner (server), although | there is one failover endpoint per partner (server), although | |||
there may be more. 'Server' and 'failover endpoint' are | there may be more. 'Server' and 'failover endpoint' are | |||
synonymous only if the server participates in only one failover | synonymous only if the server participates in only one failover | |||
relationship. However, for the sake of simplicity 'Server' is | relationship. However, for the sake of simplicity 'Server' is | |||
used throughout the document to refer to a failover endpoint | used throughout the document to refer to a failover endpoint | |||
unless to do so would be confusing. | unless to do so would be confusing. | |||
o Failover transmission - all messages exchanged between partners. | o Failover communication - all messages exchanged between partners. | |||
o Independent Allocation - a prefix allocation algorithm to split | o Independent Allocation - an allocation algorithm that splits the | |||
the available pool of resources between the primary and secondary | available pool of resources between the primary and secondary | |||
servers that is particularly well suited for vast pools (i.e. when | servers that is particularly well suited for vast pools (i.e. when | |||
available resources are not expected to deplete). See Section 6.2 | available resources are not expected to deplete). See Section 6.2 | |||
for details. | for details. | |||
o Partner - name of the other DHCPv6 server that participates in | o Partner - name of the other DHCPv6 server that participates in | |||
failover relationship. When the role (primary or secondary) is | failover relationship. When the role (primary or secondary) is | |||
not important, the other server is referred to as a "failover | not important, the other server is referred to as a "failover | |||
partner" or simply partner. | partner" or simply partner. | |||
o Primary Server - First out of two DHCPv6 servers that participate | o Primary Server - First out of two DHCPv6 servers that participate | |||
in a failover relationship. In active-passive mode this is the | in a failover relationship. In active-passive mode this is the | |||
server that handles most of the client traffic. Its failover | server that handles most of the client traffic. Its failover | |||
partner is referred to as secondary server. | partner is referred to as secondary server. | |||
o Proportional Allocation - a prefix allocation algorithm that | o Proportional Allocation - an allocation algorithm that splits the | |||
splits the available resources (addresses or prefixes) between the | available resources (addresses or prefixes) between the primary | |||
primary and secondary servers that is particularly well suited for | and secondary servers and maintains proportions between available | |||
more limited resources. See Section 6.1 for details. | resources on both. It is particularly well suited for more | |||
limited resources. See Section 6.1 for details. | ||||
o Resource - Any type of resource that is assignable using DHCPv6. | o Resource - Any type of resource that is managed by DHCPv6. | |||
Currently there are two types of such resources defined: a non- | Currently there are two types of such resources defined: a non- | |||
temporary IPv6 address and an IPv6 prefix. Due to the nature of | temporary IPv6 address and an IPv6 prefix. Due to the nature of | |||
temporary addresses, they are not covered by the failover | temporary addresses, they are not covered by the failover | |||
mechanism. Other resource types may be defined in the future. | mechanism. Other resource types may be defined in the future. | |||
o Responsive - A server that is responsive will respond to DHCPv6 | o Responsive - A server that is responsive will respond to DHCPv6 | |||
client requests. | client requests. | |||
o Secondary Server - Second out of two DHCPv6 servers that | o Secondary Server - Second out of two DHCPv6 servers that | |||
participate in a failover relationship. Its failover partner is | participate in a failover relationship. Its failover partner is | |||
skipping to change at page 6, line 5 | skipping to change at page 5, line 48 | |||
of the two servers. | of the two servers. | |||
This protocol defines active-passive mode, sometimes also called a | This protocol defines active-passive mode, sometimes also called a | |||
hot standby model. This means that during normal operation one | hot standby model. This means that during normal operation one | |||
server is active (i.e. actively responds to clients' requests) while | server is active (i.e. actively responds to clients' requests) while | |||
the second is passive (i.e. it does receive clients' requests, but | the second is passive (i.e. it does receive clients' requests, but | |||
does not respond to them and only maintains a copy of lease database | does not respond to them and only maintains a copy of lease database | |||
and is ready to take over incoming queries in case of primary server | and is ready to take over incoming queries in case of primary server | |||
failure). Active-active mode (i.e. both servers actively handling | failure). Active-active mode (i.e. both servers actively handling | |||
clients' requests) is currently not supported for the sake of | clients' requests) is currently not supported for the sake of | |||
simplicity. Such mode may be defined as an extension at a later time. | simplicity. Such a mode is likely to be defined as an extension at a | |||
later time and will probably be based on | ||||
[I-D.ietf-dhc-dhcpv6-load-balancing]. | ||||
The failover protocol is designed to provide lease stability for | The failover protocol is designed to provide lease stability for | |||
leases with lease times beyond a short period. Due to the additional | leases with lease times beyond a short period. Due in part to the | |||
overhead required, failover is not suitable for leases shorter than | additional overhead required as well as requirements to handle time | |||
30 seconds. The DHCPv6 Failover protocol MUST NOT be used for leases | skew between failover partners (See Section 8.1), failover is not | |||
shorter than 30 seconds. | suitable for leases shorter than 30 seconds. The DHCPv6 Failover | |||
protocol MUST NOT be used for leases shorter than 30 seconds. | ||||
This design attempts to fulfill all DHCPv6 failover requirements | This design attempts to fulfill all DHCPv6 failover requirements | |||
defined in [I-D.ietf-dhc-dhcpv6-failover-requirements]. | defined in [I-D.ietf-dhc-dhcpv6-failover-requirements]. | |||
3.1. Additional Requirements | 3.1. Design Requirements | |||
The following requirements are not related to failover mechanism in | The following requirements are not related to failover mechanism in | |||
general, but rather to this particular design. | general, but rather to this particular design. | |||
1. Minimize Asymmetry - while there are two distinct roles in | 1. Minimize Asymmetry - while there are two distinct roles in | |||
failover (primary and secondary server), the differences between | failover (primary and secondary server), the differences between | |||
those two roles should be as small as possible. This will yield | those two roles should be as small as possible. This will yield | |||
a simpler design as well as a simpler implementation of that | a simpler design as well as a simpler implementation of that | |||
design. | design. | |||
skipping to change at page 7, line 11 | skipping to change at page 7, line 9 | |||
specified in Section 5.1 of DHCPv6 Bulk Leasequery [RFC5460], but | specified in Section 5.1 of DHCPv6 Bulk Leasequery [RFC5460], but | |||
uses different message types. New failover-specific message types | uses different message types. New failover-specific message types | |||
are listed in Section 4.2. All information is sent over the | are listed in Section 4.2. All information is sent over the | |||
connection as typical DHCPv6 messages that convey DHCPv6 options, | connection as typical DHCPv6 messages that convey DHCPv6 options, | |||
following format defined in Section 22.1 of [RFC3315]. | following format defined in Section 22.1 of [RFC3315]. | |||
After initialization, the primary server establishes a TCP connection | After initialization, the primary server establishes a TCP connection | |||
with its partner. The primary server sends a CONNECT message with | with its partner. The primary server sends a CONNECT message with | |||
initial parameters. Secondary server responds with CONNECTACK. | initial parameters. Secondary server responds with CONNECTACK. | |||
If the primary server cannot immediately establish a connection with | ||||
its partner, it will continue to attempt to establish a connection. | ||||
See Section 5.1 for details. | ||||
Depending on the failover state of each partner, they MUST initiate | Depending on the failover state of each partner, they MUST initiate | |||
one of the binding update procedures. Each server MAY send an UPDREQ | one of the binding update procedures. Each server MAY send an UPDREQ | |||
message to request its partner to send all updates that have not been | message to request its partner to send all updates that have not been | |||
sent yet (this case applies when the partner has an existing database | sent yet (this case applies when the partner has an existing database | |||
and wants to update it). Alternatively, a server MAY choose to send | and wants to update it). Alternatively, a server MAY choose to send | |||
an UPDREQALL message to request a full lease database transmission | an UPDREQALL message to request a full lease database transmission | |||
including all leases (this case applies in case of booting up new | including all leases (this case applies in case of booting up new | |||
server after installation, corruption or complete loss of database, | server after installation, corruption or complete loss of database, | |||
or other catastrophic failure). | or other catastrophic failure). | |||
skipping to change at page 7, line 43 | skipping to change at page 7, line 45 | |||
addresses by sending a POOLREQ message. The primary server assigns | addresses by sending a POOLREQ message. The primary server assigns | |||
addresses to the secondary by sending a series of BNDUPD messages. | addresses to the secondary by sending a series of BNDUPD messages. | |||
When this process is complete, the primary server sends a POOLRESP | When this process is complete, the primary server sends a POOLRESP | |||
message to the secondary server. The secondary server may initiate | message to the secondary server. The secondary server may initiate | |||
such pool request at any time when in communication with primary | such pool request at any time when in communication with primary | |||
server. | server. | |||
Failover servers use a lazy update mechanism to update their failover | Failover servers use a lazy update mechanism to update their failover | |||
partner about changes to their lease state database. After a server | partner about changes to their lease state database. After a server | |||
performs any modifications to its lease state database (assign a new | performs any modifications to its lease state database (assign a new | |||
lease, extend an existing one, release or expire a lease), it sends | lease, extend, release or expire existing lease), it sends its | |||
its response to the client's request first (performing the "regular" | response to the client's request first (performing the "regular" | |||
DHCPv6 operation) and then informs its failover partner using a | DHCPv6 operation) and then informs its failover partner using a | |||
BNDUPD message. This BNDUPD message SHOULD be sent soon after the | BNDUPD message. This BNDUPD message SHOULD be sent soon after the | |||
response is sent to the DHCPv6 client, but there is no specific | response is sent to the DHCPv6 client, but there is no specific | |||
requirement of a minimum time in which to do so. | requirement of a minimum time in which to do so. | |||
The major problem with lazy update mechanism is the case when the | The major problem with lazy update mechanism is the case when the | |||
server crashes after sending a response to client, but before sending | server crashes after sending a response to client, but before sending | |||
the lazy update to its partner (or when communication between | the lazy update to its partner (or when communication between | |||
partners is interrupted). To solve this problem, the concept known | partners is interrupted). To solve this problem, the concept known | |||
as the Maximum Client Lead Time (initially designed for DHCPv4 | as the Maximum Client Lead Time (initially designed for DHCPv4 | |||
skipping to change at page 8, line 35 | skipping to change at page 8, line 38 | |||
details. For complete description, see Section 9. In case of a | details. For complete description, see Section 9. In case of a | |||
disagreement between the simplified and complete description, please | disagreement between the simplified and complete description, please | |||
follow Section 9. | follow Section 9. | |||
Each server MUST be in one of the well-defined states. In each state | Each server MUST be in one of the well-defined states. In each state | |||
a server may be either responsive (responds to clients' queries) or | a server may be either responsive (responds to clients' queries) or | |||
unresponsive (clients' queries are ignored). | unresponsive (clients' queries are ignored). | |||
A server starts its operation in short-lived STARTUP state. A server | A server starts its operation in short-lived STARTUP state. A server | |||
determines its partner reachability and state and sets its own state | determines its partner reachability and state and sets its own state | |||
based on that determination. It frequently returns back to the state | based on that determination. It typically returns back to the state | |||
it was in before shutdown. | it was in before shutdown, though the details can be complicated. | |||
See Section 9.3.2. | ||||
During typical operation when servers maintain communication, both | During typical operation when servers maintain communication, both | |||
are in NORMAL state. In that state only the primary responds to | are in NORMAL state. In that state only the primary responds to | |||
clients' requests. A secondary server in unresponsive to DHCPv6 | clients' requests. A secondary server is unresponsive. | |||
clients. | ||||
If a server discovers that its partner is no longer reachable, it | If a server discovers that its partner is no longer reachable, it | |||
goes to COMMUNICATIONS-INTERRUPTED state. A server must be extra | goes to COMMUNICATIONS-INTERRUPTED state. A server must be extra | |||
cautious as it can't distinguish if its partner is down or just | cautious as it can't distinguish if its partner is down or just | |||
communication between servers is interrupted. Since communication | communication between servers is interrupted. Since communication | |||
between partners is not possible, a server must act on the assumption | between partners is not possible, a server must act on the assumption | |||
that its partner is up. A failover server must follow a defined | that its partner is up. A failover server must follow a defined | |||
procedure, in particular, it MUST NOT extend any lease more than the | procedure, in particular, it MUST NOT extend any lease more than the | |||
MCLT beyond its partner's knowledge of the lease expiration time. | MCLT beyond its partner's knowledge of the lease expiration time. | |||
This imposes an additional burden on the server, in that clients will | This imposes an additional burden on the server, in that clients will | |||
return to the server for lease renewals more frequently than they | return to the server for lease renewals more frequently than they | |||
would otherwise. Therefore it is not recommended to operate for | would otherwise. Therefore it is not recommended to operate for | |||
prolonged periods in this state. Once communication is | prolonged periods in this state. Once communication is | |||
reestablished, a server may go into NORMAL, POTENTIAL-CONFLICT or | reestablished, a server may go into NORMAL, POTENTIAL-CONFLICT or | |||
PARTNER-DOWN state. It may also stay in COMMUNICATIONS-INTERRUPTED | PARTNER-DOWN state. It may also stay in COMMUNICATIONS-INTERRUPTED | |||
state if certain conditions are met. | state if certain conditions are met. | |||
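The MCLT restriction described above can be illustrated with a minimal sketch. The function name capped_expiry and its parameters are hypothetical and are not defined by this document. The sketch also assumes that when the partner has no acknowledged expiration time for a lease, or that time is already in the past, the partner's knowledge is treated as the current time.

```python
from datetime import datetime, timedelta
from typing import Optional


def capped_expiry(now: datetime,
                  desired_lifetime: timedelta,
                  partner_known_expiry: Optional[datetime],
                  mclt: timedelta) -> datetime:
    """Return the latest expiration time a server may hand to a client."""
    # Assumption: with no acknowledged expiry, or one already in the past,
    # the partner's knowledge of the lease is treated as "now".
    if partner_known_expiry is None:
        base = now
    else:
        base = max(now, partner_known_expiry)
    # The lease MUST NOT extend more than MCLT beyond the partner's knowledge.
    limit = base + mclt
    # Grant the configured lifetime unless the MCLT limit is reached first.
    return min(now + desired_lifetime, limit)
```

With an MCLT of one hour and a configured lifetime of eight days, a client renewing in this sketch receives a lease of roughly one hour, which is why renewals arrive more frequently while partners cannot communicate.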
Once a server is switched into PARTNER-DOWN (when auto-partner-down | Once a server is switched into PARTNER-DOWN (when auto-partner-down | |||
is used or as a result of administrative action), it can extend | is used or as a result of administrative action), it can extend | |||
leases, regardless of the original server that initially granted the | leases, regardless of the original server that initially granted the | |||
lease. In that state server handles leases from its own pool, but is | lease. In that state server handles leases from its own pool, but | |||
also able to serve pool from its downed partner. MCLT restrictions | once its own pool is depleted is also able to serve pool from its | |||
no longer apply. Operation in this mode is less demanding for the | downed partner. MCLT restrictions no longer apply. Operation in | |||
server that remains operational, than in COMMUNICATIONS-INTERRUPTED | this mode is less demanding for the server that remains operational, | |||
state, but PARTNER-DOWN does not offer any kind of redundancy. | than in COMMUNICATIONS-INTERRUPTED state, but PARTNER-DOWN does not | |||
offer any kind of redundancy. Even when in PARTNER-DOWN state, a | ||||
failover server continues to attempt to connect with its failover | ||||
partner. | ||||
When a server does not have an intact lease state database (e.g. due | A server switches into RECOVER state when any of a variety of | |||
to first time run or catastrophic failure) or detects that is partner | conditions are encountered: | |||
is in PARTNER-DOWN state and additional conditions are met, it | ||||
switches to RECOVER state. In that state the server acknowledges | o When a backup server contacts its failover partner for the first | |||
that content of its database is doubtful and it needs to refresh its | time. | |||
database from its partner. Once this operation is complete, it | ||||
switches to RECOVER-WAIT and later to RECOVER-DONE. | o When either server discovers that its failover partner has | |||
contacted it before but it has no local record of this contact. | ||||
If the record of previous contact is held in the lease-state | ||||
database, then this situation implies that the server has lost its | ||||
lease state database. | ||||
o When its failover partner is in PARTNER-DOWN state. | ||||
Any of these conditions signal that the server needs to refresh its | ||||
lease-state database from its partner. Once this operation is | ||||
complete, it switches to RECOVER-WAIT and later to RECOVER-DONE. See | ||||
Section 9.6.2. | ||||
Once servers reestablish connection, they discover each others' | Once servers reestablish connection, they discover each others' | |||
state. Depending on the conditions, they may return to NORMAL or | state. Depending on the conditions, they may return to NORMAL or | |||
move to POTENTIAL-CONFLICT if the partner is in a state that doesn't | move to POTENTIAL-CONFLICT if the partner is in a state that doesn't | |||
allow a simple re-integration of the server's lease state databases. | allow a simple re-integration of the server's lease state databases. | |||
It is a goal of this protocol to minimize the possibility that | It is a goal of this protocol to minimize the possibility that | |||
POTENTIAL-CONFLICT state is ever entered. Servers running in | POTENTIAL-CONFLICT state is ever entered. Servers running in | |||
POTENTIAL-CONFLICT do not respond to clients' requests and work only | POTENTIAL-CONFLICT do not respond to clients' requests and work only | |||
on resolving potential conflicts. Once outstanding lease updates are | on resolving potential conflicts. Once outstanding lease updates are | |||
exchanged, servers move to CONFLICT-DONE or NORMAL states. | exchanged, servers move to CONFLICT-DONE or NORMAL states. | |||
Servers that are recovering from potential conflicts and lose | Servers that are recovering from potential conflicts and lose | |||
communication switch to RESOLUTION-INTERRUPTED. | communication switch to RESOLUTION-INTERRUPTED. | |||
A Server that is being shut down sends a DISCONNECT message. See | A server that is being shut down sends a DISCONNECT message. See | |||
Section 4.2. | Section 4.2. A server that receives a DISCONNECT message moves into | |||
COMMUNICATIONS-INTERRUPTED state. | ||||
4.2. Messages | 4.2. Messages | |||
The failover protocol is centered around the message exchanges used | The failover protocol is centered around the message exchanges used | |||
by one server to update its partner and respond to received updates. | by one server to update its partner and respond to received updates. | |||
The following list enumerates these messages. | ||||
It should be noted that no specific formats or message type values | It should be noted that no specific formats or message type values | |||
are assigned at this stage. Appropriate implementation details will | are assigned in this document. Appropriate implementation details | |||
be specified in a separate protocol specification document. | will be specified in a separate protocol specification document. The | |||
following list enumerates these messages: | ||||
o BNDUPD - The binding update message is used to send the binding | o BNDUPD - The binding update message is used to send the binding | |||
lease changes to the partner. One message may contain one or more | lease changes to the partner. One message may contain one or more | |||
lease updates. The partner is expected to respond with a BNDACK | lease updates. The partner is expected to respond with a BNDACK | |||
message. | message. | |||
o BNDACK - The binding acknowledgement is used for confirmation of | o BNDACK - The binding acknowledgement is used for confirmation of | |||
the received BNDUPD message. It may contain a positive or | the received BNDUPD message. It may contain a positive or | |||
negative response (e.g. due to detected lease conflict). | negative response (e.g. due to detected lease conflict). | |||
o POOLREQ - The Pool Request message is used by one server | o POOLREQ - The Pool Request message is used by one server | |||
(typically secondary) to request allocation of resources | (typically secondary) to request allocation of resources | |||
(addresses or prefixes) from its partner. The partner responds | (addresses or prefixes) from its partner. The partner responds | |||
with POOLRSP. | with POOLRESP. | |||
o POOLRSP - The Pool Response message is used by one server | o POOLRESP - The Pool Response message is used by one server | |||
(typically primary) to respond to its partner's request for | (typically primary) to respond to its partner's request for | |||
resources allocation. One POOLRSP message may contain more than | resources allocation. One POOLRESP message may contain more than | |||
one pool. | one pool. | |||
o UPDREQ - The update request message is used by one server to | o UPDREQ - The update request message is used by one server to | |||
request that its partner send all binding database changes that | request that its partner send all binding database changes that | |||
have not been sent and confirmed already. Requested partner is | have not been sent and confirmed already. Requested partner is | |||
expected to respond with zero or more BNDUPD messages, followed by | expected to respond with zero or more BNDUPD messages, followed by | |||
UPDDONE that signals end of updates. | UPDDONE that signals end of updates. | |||
o UPDREQALL - The update request all is used by one server to | o UPDREQALL - The update request all is used by one server to | |||
request that all binding database information be sent in order to | request that all binding database information be sent in order to | |||
skipping to change at page 11, line 48 | skipping to change at page 12, line 22 | |||
initiated the connection attempt MUST send a CONNECT message down the | initiated the connection attempt MUST send a CONNECT message down the | |||
connection. | connection. | |||
When a connection attempt is received, the only information that the | When a connection attempt is received, the only information that the | |||
receiving server has is the IP address of the partner initiating a | receiving server has is the IP address of the partner initiating a | |||
connection. If it has any relationships with the connecting server | connection. If it has any relationships with the connecting server | |||
for which it is a secondary server, it should just await the CONNECT | for which it is a secondary server, it should just await the CONNECT | |||
message to determine which relationship this connection is to serve. | message to determine which relationship this connection is to serve. | |||
If it has no secondary relationships with the connecting server, it | If it has no secondary relationships with the connecting server, it | |||
SHOULD drop the connection. | SHOULD drop the connection. The goal is to limit the resources | |||
expended dealing with attempts to create a spurious failover | ||||
connection. | ||||
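The decision a server makes on an incoming connection attempt can be sketched as follows. This is only an illustration: the Relationship record, its field names, and the return values "await_connect" and "drop" are hypothetical and are not part of the protocol.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Relationship:
    name: str          # unique failover relationship name
    partner_ip: str    # configured address of the failover partner
    local_role: str    # "primary" or "secondary" (role of this server)


def on_connection_attempt(peer_ip: str,
                          relationships: Iterable[Relationship]) -> str:
    """Decide what to do when a TCP connection arrives from peer_ip."""
    # Only the peer's IP address is known at this point, so look for any
    # relationship with that peer in which this server is the secondary.
    is_secondary_for_peer = any(
        r.partner_ip == peer_ip and r.local_role == "secondary"
        for r in relationships
    )
    # With at least one such relationship, wait for CONNECT to learn which
    # relationship the connection serves; otherwise drop the connection.
    return "await_connect" if is_secondary_for_peer else "drop"
```

Dropping early keeps the server from holding resources for peers that could never send a valid CONNECT message.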
To summarize -- a primary server MUST use a connection that it has | To summarize -- a primary server MUST use a connection that it has | |||
initiated in order to send a CONNECT message. Every server that is a | initiated in order to send a CONNECT message. Every server that is a | |||
secondary server in a relationship simply listens for connection | secondary server in a relationship simply listens for connection | |||
attempts from the primary server. | attempts from the primary server. | |||
Once a connection is established, the primary server MUST send a | Once a connection is established, the primary server MUST send a | |||
CONNECT message across the connection. A secondary server MUST wait | CONNECT message across the connection. A secondary server MUST wait | |||
for the CONNECT message from a primary server. If the secondary | for the CONNECT message from a primary server. If the secondary | |||
server doesn't receive a CONNECT message from the primary server in | server doesn't receive a CONNECT message from the primary server in | |||
skipping to change at page 12, line 45 | skipping to change at page 13, line 21 | |||
useful. A failover endpoint is always associated with a set of | useful. A failover endpoint is always associated with a set of | |||
DHCPv6 prefixes that are configured on the DHCPv6 server where the | DHCPv6 prefixes that are configured on the DHCPv6 server where the | |||
endpoint appears. A DHCPv6 prefix MUST NOT be associated with more | endpoint appears. A DHCPv6 prefix MUST NOT be associated with more | |||
than one failover endpoint. | than one failover endpoint. | |||
The failover protocol SHOULD be configured with one failover | The failover protocol SHOULD be configured with one failover | |||
relationship between each pair of failover servers. In this case | relationship between each pair of failover servers. In this case | |||
there is one failover endpoint for that relationship on each failover | there is one failover endpoint for that relationship on each failover | |||
partner. This failover relationship MUST have a unique name. | partner. This failover relationship MUST have a unique name. | |||
There is typically little need for addtional relationships between | There is typically little need for additional relationships between | |||
any two servers but there MAY be more than one failover relationship | any two servers but there MAY be more than one failover relationship | |||
between two servers -- however each MUST have a unique relationship | between two servers -- however each MUST have a unique relationship | |||
name. | name. | |||
Any failover endpoint can take actions and hold unique states. | Any failover endpoint can take actions and hold unique states. | |||
This document frequently describes the behavior of the protocol in | This document frequently describes the behavior of the protocol in | |||
terms of primary and secondary servers, not primary and secondary | terms of primary and secondary servers, not primary and secondary | |||
failover endpoints. However, it is important to remember that every | failover endpoints. However, it is important to remember that every | |||
'server' described in this document is in reality a failover endpoint | 'server' described in this document is in reality a failover endpoint | |||
skipping to change at page 13, line 32 | skipping to change at page 14, line 7 | |||
receiving server. | receiving server. | |||
6. Resource Allocation | 6. Resource Allocation | |||
Currently there are two allocation algorithms defined for resources | Currently there are two allocation algorithms defined for resources | |||
(addresses or prefixes). Additional allocation schemes may be | (addresses or prefixes). Additional allocation schemes may be | |||
defined as future extensions. | defined as future extensions. | |||
1. Proportional Allocation - This allocation algorithm is a direct | 1. Proportional Allocation - This allocation algorithm is a direct | |||
application of the algorithm defined in [dhcpv4-failover] to | application of the algorithm defined in [dhcpv4-failover] to | |||
DHCPv6. Available resources are split between the primary and | DHCPv6. Remaining available resources are split between the | |||
secondary servers. Released resources are always returned to the | primary and secondary servers in a configured proportion. | |||
primary server. Primary and secondary servers may initiate a | Released resources are always returned to the primary server. | |||
rebalancing procedure when disparity between resources available | Primary and secondary servers may initiate a rebalancing | |||
to each server reaches a preconfigured threshold. Only resources | procedure when disparity between resources available to each | |||
that are not leased to any clients are "owned" by one of the | server reaches a preconfigured threshold. Only resources that | |||
servers. This algorithm is particularly well suited for | are not leased to any clients are "owned" by one of the servers. | |||
scenarios where amount of available resources is limited, as may | This algorithm is particularly well suited for scenarios where | |||
be the case with prefix delegation. See Section 6.1 for details. | amount of available resources is limited, as may be the case with | |||
prefix delegation. See Section 6.1 for details. | ||||
2. Independent Allocation - This allocation algorithm assumes that | 2. Independent Allocation - This allocation algorithm assumes that | |||
available resources are split between primary and secondary | available resources are split between primary and secondary | |||
servers as well. In this case, however, resources are assigned | servers as well. In this case, however, resources are assigned | |||
to a specific server for all time, regardless if they are | to a specific server for all time, regardless if they are | |||
available or currently used. This algorithm is much simpler than | available or currently used. This algorithm is much simpler than | |||
proportional allocation, because resource imbalance doesn't have | proportional allocation, because resource imbalance doesn't have | |||
to be checked and there is no rebalancing for independent | to be checked and there is no rebalancing for independent | |||
allocation. This algorithm is particularly well suited for | allocation. This algorithm is particularly well suited for | |||
scenarios where there is an abundance of available resources | scenarios where there is an abundance of available resources | |||
which is typically the case for DHCPv6 address allocation. See | which is typically the case for DHCPv6 address allocation. See | |||
Section 6.2 for details. | Section 6.2 for details. | |||
6.1. Proportional Allocation | 6.1. Proportional Allocation | |||
In this allocation scheme, each server has its own pool of available | In this allocation scheme, each server has its own pool of available | |||
resources. Note that a resource is not "owned" by a particular | resources. Remaining available resources are split between the | |||
server throughout its entire lifetime. Only a resource which is | primary and secondary servers in a configured proportion. Note that | |||
available is "owned" by a particular server -- once it has been | a resource is not "owned" by a particular server throughout its | |||
leased to a client, it is not owned by either failover partner. When | entire lifetime. Only a resource which is available is "owned" by a | |||
it finally becomes available again, it will be owned initially by the | particular server -- once it has been leased to a client, it is not | |||
primary server, and it may or may not be allocated to the secondary | owned by either failover partner. When it finally becomes available | |||
server by the primary server. | again, it will be owned initially by the primary server, and it may | |||
or may not be allocated to the secondary server by the primary | ||||
server. | ||||
The flow of a resource is as follows: initially a resource is owned | The flow of a resource is as follows: initially a resource is owned | |||
by the primary server. It may be allocated to the secondary server | by the primary server. It may be allocated to the secondary server | |||
if it is available, and then it is owned by the secondary server. | if it is available, and then it is owned by the secondary server. | |||
Either server can allocate available resources which they own to | Either server can allocate available resources which they own to | |||
clients, in which case they cease to own them. When the client | clients, in which case they cease to own them. When the client | |||
releases the resource or the lease on it expires, it will again | releases the resource or the lease on it expires, it will again | |||
become available and will be owned by the primary. | become available and will be owned by the primary. | |||
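The flow of a resource under proportional allocation can be modelled as a small state machine. The sketch below is illustrative only; the class Resource and its method names are hypothetical, and error handling is reduced to raising ValueError on a transition the text does not allow.

```python
class Resource:
    """Ownership of one address or prefix under proportional allocation."""

    def __init__(self) -> None:
        # Initially a resource is available and owned by the primary.
        self.owner = "primary"   # "primary", "secondary", or None when leased
        self.leased = False

    def delegate_to_secondary(self) -> None:
        # Only an available resource owned by the primary can be delegated.
        if self.leased or self.owner != "primary":
            raise ValueError("resource is not available on the primary")
        self.owner = "secondary"

    def lease(self, server: str) -> None:
        # A server may only lease an available resource that it owns.
        if self.leased or self.owner != server:
            raise ValueError("server does not own an available resource")
        # Once leased, the resource is owned by neither partner.
        self.owner = None
        self.leased = True

    def release_or_expire(self) -> None:
        if not self.leased:
            raise ValueError("resource is not leased")
        # A freed resource always returns to the primary, regardless of
        # which server leased it.
        self.leased = False
        self.owner = "primary"
```

For example, a resource delegated to the secondary and leased by it ends up owned by the primary once the client releases it.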
A resource will not become owned by the server which allocated it | A resource will not become owned by the server which allocated it | |||
skipping to change at page 14, line 51 | skipping to change at page 15, line 28 | |||
The initial allocation when the servers first integrate is triggered | The initial allocation when the servers first integrate is triggered | |||
by the POOLREQ message from the secondary to the primary. This is | by the POOLREQ message from the secondary to the primary. This is | |||
followed by the POOLRESP message where the primary tells the | followed by the POOLRESP message where the primary tells the | |||
secondary how many resources it allocated to the secondary. Then, | secondary how many resources it allocated to the secondary. Then, | |||
the primary sends the allocated resources to the secondary via BNDUPD | the primary sends the allocated resources to the secondary via BNDUPD | |||
messages. The POOLREQ/POOLRESP message is a trigger to the primary | messages. The POOLREQ/POOLRESP message is a trigger to the primary | |||
to perform a scan of its database and to ensure that the secondary | to perform a scan of its database and to ensure that the secondary | |||
has enough resources (based on some configured ratio). | has enough resources (based on some configured ratio). | |||
The primary server SHOULD examine some or all of its database from | ||||
time to time to determine if resources should be shifted between the | ||||
primary and secondary (in either direction). The POOLREQ/POOLRESP | ||||
message exchange allows the secondary server to explicitly request | ||||
that the primary server examine the entirety of its database to | ||||
ensure that the secondary has the appropriate resources available. | ||||
Servers frequently have several kinds of resources available on a | Servers frequently have several kinds of resources available on a | |||
particular network segment. The failover protocol assumes that both | particular network segment. The failover protocol assumes that both | |||
primary and secondary servers are configured in such a way that each | primary and secondary servers are configured in such a way that each | |||
knows the type and number of resources on every network segment | knows the type and number of resources on every network segment | |||
participating in the failover protocol. The primary server is | participating in the failover protocol. The primary server is | |||
responsible for allocating the secondary server the correct | responsible for allocating the secondary server the correct | |||
proportion of available resources of each kind, and the secondary | proportion of available resources of each kind, and the secondary | |||
server is responsible for being configured in such a way that it can | server MUST be configured in such a way that it can tell the kind of | |||
tell the kind of every resource based solely on the IP or prefix | every resource based solely on the IP or prefix address itself. | |||
address itself. | ||||
The resources are delegated to the secondary using the BNDUPD message | The resources are delegated to the secondary using the BNDUPD message | |||
with a state of FREE_BACKUP, which indicates the resource is now | with a state of FREE_BACKUP, which indicates the resource is now | |||
available for allocation by the secondary. Once the message is sent, | available for allocation by the secondary. Once the message is sent, | |||
the primary MUST NOT use these resources for allocation to DHCPv6 | the primary MUST NOT use these resources for allocation to DHCPv6 | |||
clients. | clients. | |||
Available resources can be delegated back to the primary server in | Available resources can be delegated back to the primary server in | |||
certain cases. BNDUPD will contain state FREE for leases that were | certain cases. BNDUPD will contain state FREE for leases that were | |||
previously in FREE_BACKUP state. | previously in FREE_BACKUP state. | |||
The POOLREQ/POOLRESP message exchange initiated by the secondary is | The POOLREQ/POOLRESP message exchange initiated by the secondary is | |||
valid at any time, and the primary server SHOULD, whenever it | valid at any time both partners remain in contact, and the primary | |||
receives the POOLREQ message, scan its database of prefixes and | server SHOULD, whenever it receives the POOLREQ message, scan its | |||
determine if the secondary needs more resources from any of the | database of prefixes and determine if the secondary needs more | |||
prefixes. | resources from any of the prefixes. | |||
In order to support a reasonably dynamic balance of the resources | In order to support a reasonably dynamic balance of the resources | |||
between the failover partners, the primary server needs to do | between the failover partners, the primary server needs to do | |||
additional work to ensure that the secondary server has as many | additional work to ensure that the secondary server has as many | |||
resources as it needs (but that it doesn't have more than it needs). | resources as it needs (but that it doesn't have more than it needs). | |||
The primary server SHOULD examine the balance of available resources | The primary server SHOULD examine the balance of available resources | |||
between the primary and secondary for a particular prefix whenever | between the primary and secondary for a particular prefix whenever | |||
the number of available resources for either the primary or secondary | the number of available resources for either the primary or secondary | |||
changes by more than a configured limit. The primary server SHOULD | changes by more than a configured limit. The primary server SHOULD | |||
skipping to change at page 16, line 4 | skipping to change at page 16, line 37 | |||
to minimize the overhead of maintaining this balance. | to minimize the overhead of maintaining this balance. | |||
An example of a threshold approach is: do not attempt to re-balance | An example of a threshold approach is: do not attempt to re-balance | |||
the prefixes on the primary and secondary until the out of balance | the prefixes on the primary and secondary until the out of balance | |||
value exceeds a configured value. | value exceeds a configured value. | |||
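One possible reading of the threshold approach is sketched below. The function rebalance_delta, the integer percentage used to express the configured proportion, and the definition of the out of balance value as the distance between the secondary's available count and its target are all assumptions made for illustration.

```python
def rebalance_delta(primary_free: int,
                    secondary_free: int,
                    secondary_percent: int,
                    threshold: int) -> int:
    """Return how many available resources to move for one prefix.

    A positive result means move that many from primary to secondary,
    a negative result means take that many back from the secondary,
    and zero means the imbalance is within the configured threshold.
    """
    total = primary_free + secondary_free
    # Target share of available resources for the secondary (assumed to be
    # configured as an integer percentage).
    target = total * secondary_percent // 100
    # Out of balance value: how far the secondary is from its target.
    delta = target - secondary_free
    # Do not attempt to re-balance until the threshold is exceeded.
    return delta if abs(delta) > threshold else 0
```

A larger threshold means fewer BNDUPD messages spent on rebalancing, at the cost of tolerating a larger imbalance between the partners.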
The primary server can, at any time, send an available resource to | The primary server can, at any time, send an available resource to | |||
the secondary using a BNDUPD with the state BACKUP. The primary | the secondary using a BNDUPD with the state BACKUP. The primary | |||
server can attempt to take an available resource away from the | server can attempt to take an available resource away from the | |||
secondary by sending a BNDUPD with the state FREE. If the secondary | secondary by sending a BNDUPD with the state FREE. If the secondary | |||
accepts the BNDUPD, then it is now available to the PRIMARY and not | accepts the BNDUPD, then the resource is now available to the primary | |||
available to the secondary. Of course, the secondary MUST reject | and not available to the secondary. Of course, the secondary MUST | |||
that BNDUPD if it has already used that resource for a DHCP client. | reject that BNDUPD if it has already used that resource for a DHCP | |||
client. | ||||
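The secondary's handling of a BNDUPD that tries to take a resource back can be sketched as follows. The function name, the two sets used to track resources, and the string results are hypothetical. How the result is encoded in the BNDACK is left to the protocol specification.

```python
from typing import Set


def secondary_handle_free(resource: str,
                          free_backup: Set[str],
                          leased_by_secondary: Set[str]) -> str:
    """Process a BNDUPD with state FREE received by the secondary."""
    # The secondary MUST reject the update if it has already used the
    # resource for a client.
    if resource in leased_by_secondary:
        return "reject"
    # Otherwise the resource stops being available to the secondary and
    # becomes available to the primary.
    free_backup.discard(resource)
    return "accept"
```

The rejection covers the race in which the secondary leases a resource at about the same time the primary decides to reclaim it.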
6.2. Independent Allocation | 6.2. Independent Allocation | |||
In this allocation scheme, available resources are permanently (until | In this allocation scheme, available resources are permanently (until | |||
server configuration changes) split between servers. Available | server configuration changes) split between servers. Available | |||
resources are split between the primary and secondary servers as part | resources are split between the primary and secondary servers as part | |||
of initial connection establishment. Once resources are allocated to | of initial connection establishment. Once resources are allocated to | |||
each server, there is no need to reassign them. This algorithm is | each server, there is no need to reassign them. The resource | |||
simpler than proportional allocation since it requires similar | allocation is algorithmic in nature, and does not require a message | |||
initial communication, but does not require a rebalancing mechanism. | exchange for each resource allocated. This algorithm is simpler | |||
It assumes that the pool assigned to each server will never deplete. | than proportional allocation since it requires similar initial | |||
communication, but does not require a rebalancing mechanism. It | ||||
assumes that the pool assigned to each server will never deplete. | ||||
That is often a reasonable assumption for IPv6 addresses (e.g. | That is often a reasonable assumption for IPv6 addresses (e.g. | |||
servers are often assigned a /64 pool that contains many more | servers are often assigned a /64 pool that contains many more | |||
addresses than existing electronic devices on Earth). This | addresses than existing electronic devices on Earth). This | |||
allocation mechanism SHOULD be used for IPv6 addresses, unless the | allocation mechanism SHOULD be used for IPv6 addresses, unless the | |||
configured address pool is small or is otherwise administratively | configured address pool is small or is otherwise administratively | |||
limited. | limited. | |||
Once each server is assigned a resource pool during initial | Once each server is assigned a resource pool during initial | |||
connection establishment, it may allocate assigned resources to | connection establishment, it may allocate assigned resources to | |||
clients. Once a client release a resource or its lease is expired, | clients. Once a client releases a resource or its lease is expired, | |||
the returned resource returns to pool for the server that leased it. | the returned resource returns to pool for the server that leased it. | |||
Resources never change servers. | Resources never change servers. | |||
Resources using the independent allocation approach are ignored when | ||||
a server processes a POOLREQ message. | ||||
During COMMUNICATIONS-INTERRUPTED events, a partner MAY continue | During COMMUNICATIONS-INTERRUPTED events, a partner MAY continue | |||
extending existing leases when requested by clients. A healthy | extending existing leases when requested by clients. A healthy | |||
partner MUST NOT lease resources that were assigned to its downed | partner MUST NOT lease resources that were assigned to its downed | |||
partner and later released by a client unless it is in PARTNER-DOWN | partner and later released by a client unless it is in PARTNER-DOWN | |||
state. Server SHOULD use its own pool first before starting new | state. Server SHOULD use its own pool first before starting new | |||
assignments from its downed partner's pool. As the assumption is | assignments from its downed partner's pool. As the assumption is | |||
that independent allocation should be used only when available | that independent allocation should be used only when available | |||
resources are vast and not expected to be fully used at any given | resources are vast and not expected to be fully used at any given | |||
time, it is very unlikely that the server will ever need to use its | time, it is very unlikely that the server will ever need to use its | |||
downed partner pools. | downed partner pools. This makes a recovery even after prolonged | |||
down-time much easier. | ||||
6.3. Choosing Allocation Algorithm | 6.3. Choosing Allocation Algorithm | |||
All implementations MUST support proportional allocation algorithm | All implementations MUST support proportional allocation algorithm | |||
and SHOULD support independent allocation. If the implementation | and SHOULD support independent allocation. If the implementation | |||
implement both and let the user configure it, the default algorithm | implements both and lets the user choose between them, the default | |||
used SHOULD be proportional allocation algorithm. | algorithm used SHOULD be proportional allocation algorithm. | |||
Proportional allocation mechanism is more flexible as it can | Proportional allocation mechanism is more flexible as it can | |||
dynamically rebalance available resources between servers. That | dynamically rebalance available resources between servers. That | |||
balance includes additional burden for the servers and generates more | balance includes additional burden for the servers and generates more | |||
traffic between servers. Proportional algorithm can be considered as | traffic between servers. Proportional algorithm can be considered | |||
managing available resources more efficiently than independent. That | more efficient at managing available resources, compared to | |||
is an important aspect when working in a network that is nearing address | independent. That is an important aspect when working in a network that | |||
and/or prefix depletion. | is nearing address and/or prefix depletion. | |||
Independent allocation can be used when the number of available | Independent allocation can be used when the number of available | |||
resources are large and there is no realistic danger of running out | resources are large and there is no realistic danger of running out | |||
of resources. Use of the independent allocation makes communication | of resources. Use of the independent allocation makes communication | |||
between partners simpler. | between partners simpler. It also makes recovery easier and | |||
potential conflict less likely to appear. | ||||
Typically indepentent allocation is used for IPv6 addresses, because | Typically independent allocation is used for IPv6 addresses, because | |||
even for /64 pools a server will never run out of addresses to | even for /64 pools a server will never run out of addresses to | |||
assign, so there is no need to rebalance. For the prefix delegation | assign, so there is no need to rebalance. For the prefix delegation | |||
mechanism, available resources are much smaller, so there is a danger | mechanism, available resources are typically much smaller, so there | |||
of running out of addresses. Therefore typically proportional | is a danger of running out of prefixes. Therefore typically | |||
allocation will be used for prefix delegations. Independent | proportional allocation will be used for prefix delegations. | |||
allocation may be used, but the implication must be well understood. | Independent allocation still may be used, but the implication must be | |||
For example in a network that delegates /64 prefixes out of a /48 | well understood. For example in a network that delegates /64 | |||
prefix (so there can be up to 65536 prefixes delegated) and has 1000 | prefixes out of a /48 prefix (so there can be up to 65536 prefixes | |||
requesting routers, it is safe to use independent allocation. | delegated) and has 1000 requesting routers, it is safe to use | |||
independent allocation. | ||||
It should be stressed that the independent allocation algorithm | It should be stressed that the independent allocation algorithm | |||
SHOULD NOT be used when number of resources is limited and there is a | SHOULD NOT be used when number of resources is limited and there is a | |||
realistic danger of depleting resources. If this recommendation is | realistic danger of depleting resources. If this recommendation is | |||
violated, it may lead to a case, when one server denies clients due | violated, it may lead to a case, when one server denies clients due | |||
to pool depletion despite the fact that the other partner still has | to pool depletion despite the fact that the other partner still has | |||
many resources available. | many resources available. | |||
With independent allocation it is very unlikely to remaining healthy | ||||
server to allocate resources from its unavailable partner's pool. | ||||
That makes recovery easier and any potential conflicts are less | ||||
likely to appear. | ||||
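The /48 example above can be checked with a few lines of arithmetic.
This is only an illustrative sketch: the function names, the
documentation prefix 2001:db8::/48, and the even split of the pool
between the two partners are assumptions made for the example, not
something this design mandates.

```python
import ipaddress


def delegable_prefixes(pool: str, delegated_len: int) -> int:
    """Count how many prefixes of length delegated_len fit into pool."""
    net = ipaddress.ip_network(pool)
    if delegated_len < net.prefixlen:
        raise ValueError("delegated prefix cannot be shorter than the pool")
    return 2 ** (delegated_len - net.prefixlen)


def per_server_share(total: int, servers: int = 2) -> int:
    """Assumed even split of the pool between failover partners."""
    return total // servers


total = delegable_prefixes("2001:db8::/48", 64)
share = per_server_share(total)
requesting_routers = 1000
# Even a single partner's share far exceeds the number of requesting routers.
independent_allocation_plausible = share > requesting_routers
```

With 65536 delegable prefixes and 1000 requesting routers, each
partner's share alone covers every router many times over, which is
the situation in which independent allocation is described as safe.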
7. Information model | 7. Information model | |||
In most DHCP servers a resource (an IP address or a prefix) can take | In most DHCP servers a resource (an IP address or a prefix) can take | |||
on several different binding-status values, sometimes also called | on several different binding-status values, sometimes also called | |||
lease states. While no two DHCP servers probably have exactly the | lease states. While no two DHCP server implementations probably have | |||
same possible binding-status values, the DHCP RFC enforces some | exactly the same possible binding-status values, [RFC3315] enforces | |||
commonality among the general semantics of the binding-status values | some commonality among the general semantics of the binding-status | |||
used by various DHCP server implementations. | values used by various DHCP server implementations. | |||
In order to transmit binding database updates between one server and | In order to transmit binding database updates between one server and | |||
another using the failover protocol, some common denominator binding- | another using the failover protocol, some common denominator binding- | |||
status values must be defined. It is not expected that these values | status values must be defined. It is not expected that these values | |||
correspond with any actual implementation of the DHCP protocol in a | correspond with any actual implementation of the DHCP protocol in a | |||
DHCP server, but rather that the binding-status values defined in | DHCP server, but rather that the binding-status values defined in | |||
this document should be a common denominator of those in use by many | this document should be a common denominator of those in use by many | |||
DHCP server implementations. | DHCP server implementations. | |||
The lease binding-status values defined for the failover protocol are | The lease binding-status values defined for the failover protocol are | |||
listed below. Unless otherwise noted below, there MAY be client | listed below. Unless otherwise noted below, there MAY be client | |||
information associated with each of these binding-status values. | information associated with each of these binding-status values. | |||
ACTIVE -- The lease is assigned to a client. Client identification | ACTIVE -- The lease is assigned to a client. Client identification | |||
data MUST appear. | data MUST appear. | |||
EXPIRED -- indicates that a client's binding on a given lease has | EXPIRED -- indicates that a client's binding on a given lease has | |||
expired. When the partner acks the BNDUPD of an expired lease, | expired. When the partner acks the BNDUPD of an expired lease, | |||
the server sets its internal state to FREE*. Client | the server sets its internal state to FREE*. Client identification | |||
identification SHOULD appear. | SHOULD appear. | |||
RELEASED -- indicates that a client sent in RELEASE message. When | RELEASED -- indicates that a client sent in RELEASE message. When | |||
the partner acks the BNDUPD of a released lease, the server sets | the partner acks the BNDUPD of a released lease, the server sets | |||
its internal state to FREE*. Client identification SHOULD appear. | its internal state to FREE*. Client identification SHOULD appear. | |||
FREE* -- Once a lease is expired or released, its state becomes | FREE* -- Once a lease is expired or released, its state becomes | |||
FREE*. Depending on which algorithm and which pool was used to | FREE*. Depending on which algorithm and which pool was used to | |||
allocate a given lease, FREE* may either mean FREE or FREE_BACKUP. | allocate a given lease, FREE* may either mean FREE or FREE_BACKUP. | |||
Implementations do not have to implement this FREE* state, but may | Implementations do not have to implement this FREE* state, but may | |||
choose to switch to the destination state directly. For clarity | choose to switch to the destination state directly. For clarity | |||
of representation, this transitional FREE* state is treated as a | of representation, this transitional FREE* state is treated as a | |||
separate state. | separate state. | |||
FREE -- Is used when a DHCP server needs to communicate that a | FREE -- Is used when a DHCP server needs to communicate that a | |||
resource is unused by any client, but it was not just released, | resource is unused by any client, but it was not just released, | |||
expired or reset by a network administrator. When the partner | expired or reset by a network administrator. When the partner | |||
acks the BNDUPD of a FREE lease, the server marks the lease as | acks the BNDUPD of a FREE lease, the server marks the lease as | |||
skipping to change at page 19, line 6 | skipping to change at page 20, line 8 | |||
DHCP system. The primary reason for entering such state is | DHCP system. The primary reason for entering such state is | |||
reception of DECLINE message for said lease. Client | reception of DECLINE message for said lease. Client | |||
identification MUST NOT appear. | identification MUST NOT appear. | |||
RESET -- indicates that this resource was previously abandoned, but | RESET -- indicates that this resource was previously abandoned, but | |||
was made available by operator command. This is a distinct state | was made available by operator command. This is a distinct state | |||
so that the reason that the resource became FREE can be | so that the reason that the resource became FREE can be | |||
determined. Client identification MAY appear. | determined. Client identification MAY appear. | |||
The lease state machine has been presented in Figure 1. Most states | The lease state machine has been presented in Figure 1. Most states | |||
are stationary, i.e. the lease stays in a given state untile external | are stationary, i.e. the lease stays in a given state until external | |||
event triggers transition to another state. The only transitive | event triggers transition to another state. The only transitive | |||
state is FREE*. Once it is reached, the state machine immediately | state is FREE*. Once it is reached, the state machine immediately | |||
transitions to either FREE or FREE_BACKUP state. | transitions to either FREE or FREE_BACKUP state. | |||
+---------+ | +---------+ | |||
/------------->| ACTIVE |<--------------\ | /------------->| ACTIVE |<--------------\ | |||
| +---------+ | | | +---------+ | | |||
| | | | | | | | | | | | |||
| /--(8)--/ (3) \--(9)-\ | | | /--(8)--/ (3) \--(9)-\ | | |||
| | | | | | | | | | | | |||
| V V V | | | V V V | | |||
| +-------+ +--------+ +---------+ | | | +-------+ +--------+ +---------+ | | |||
| |EXPIRED| |RELEASED| |ABANDONED| | | | |EXPIRED| |RELEASED| |ABANDONED| | | |||
| +-------+ +--------+ +---------+ | | | +-------+ +--------+ +---------+ | | |||
| | | | | | | | | | | | |||
| | | (10) | | | | | (10) | | |||
| | | V | | | | | V | | |||
| | | +---------+ | | | | | +---------+ | | |||
| | | | RESET | | | | | | | RESET | | | |||
| | | +---------+ | | | | | +---------+ | | |||
| | | | | | | | | | | | |||
| \--(4)--\ (4) /--(4)--/ | | | \--(4)--\ (4) /--(4)--/ | | |||
| | | | | | | | | | | | |||
(1) V V V (2) | (1) V V V (2) | |||
| /---------\ | | | /---------\ | | |||
| | FREE* | | | | | FREE* | | | |||
| \---------/ | | | \---------/ | | |||
| | | | | | | | | | |||
| /-(5)--/ \-(6)-\ | | | /-(5)--/ \-(6)-\ | | |||
| | | | | | | | | | |||
| V V | | | V V | | |||
| +-------+ +-----------+ | | | +-------+ +-----------+ | | |||
\----| FREE |<--(7)-->|FREE_BACKUP|-----/ | \----| FREE |<--(7)-->|FREE_BACKUP|-----/ | |||
+-------+ +-----------+ | +-------+ +-----------+ | |||
FREE* transition | ||||
Figure 1: Lease State Machine | Figure 1: Lease State Machine | |||
Transitions between states are results of the following events: | Transitions between states are results of the following events: | |||
1. Primary server allocates a lease. | 1. Primary server allocates a lease. | |||
2. Secondary server allocates a lease. | 2. Secondary server allocates a lease. | |||
3. Client sends RELEASE and the lease is released. | 3. Client sends RELEASE and the lease is released. | |||
4. Partner acknowledges state change. This transition MAY also | 4. Partner acknowledges state change. This transition MAY also | |||
occur if the server is in PARTNER-DOWN state and the MCLT has | occur if the server is in PARTNER-DOWN state and the MCLT has | |||
passed since the entry in RELEASED, EXPIRED, or RESET states. | passed since the entry in RELEASED, EXPIRED, or RESET states. | |||
5. The lease belongs to a pool that is governed by the | 5. The lease belongs to a pool that is governed by the | |||
proportional allocation, or independent allocation is used and | proportional allocation, or independent allocation is used and | |||
this lease belongs to primary server. | this lease belongs to primary server pool. | |||
6. The lease belongs to a pool that is governed by the | 6. The lease belongs to a pool that is governed by the | |||
independent allocation and the lease belongs to the secondary | independent allocation and the lease belongs to the secondary | |||
server. | server. | |||
7. Pool rebalance event occurs (POOLREQ/POOLRSP messages are | 7. Pool rebalance event occurs (POOLREQ/POOLRESP messages are | |||
exchanged). Addresses (or prefixes) belonging to the primary | exchanged). Addresses (or prefixes) belonging to the primary | |||
server can be assigned to the secondary server pool (transition | server can be assigned to the secondary server pool (transition | |||
from FREE to FREE_BACKUP) or vice versa. | from FREE to FREE_BACKUP) or vice versa. | |||
8. The lease is expired. | 8. The lease has expired. | |||
9. DECLINE message is received or a lease is deemed unusable for | 9. DECLINE message is received or a lease is deemed unusable for | |||
other reasons. | other reasons. | |||
10. An administrative action is taken to recover an abandoned | 10. An administrative action is taken to recover an abandoned | |||
lease back to usable state. This transition MAY occur due to an | lease back to usable state. This transition MAY occur due to an | |||
implementation specific handling on ABANDONED resource. One | implementation specific handling on ABANDONED resource. One | |||
possible example of such use is a Neighbor Discovery or ICMP Echo | possible example of such use is a Neighbor Discovery or ICMP Echo | |||
check if the address is still in use. | check if the address is still in use. | |||
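The transitions listed above can be sketched as a small lookup table.
This is a hedged illustration, not part of the design: the names
LeaseState, resolve_free_star and next_state are hypothetical, the
event numbers are the ones used in the list above, and the transient
FREE* state is collapsed directly into its destination, which the
text permits.

```python
from enum import Enum


class LeaseState(Enum):
    ACTIVE = "ACTIVE"
    EXPIRED = "EXPIRED"
    RELEASED = "RELEASED"
    ABANDONED = "ABANDONED"
    RESET = "RESET"
    FREE = "FREE"
    FREE_BACKUP = "FREE_BACKUP"


def resolve_free_star(algorithm: str, pool_owner: str) -> LeaseState:
    """Pick the destination of FREE* (events 5 and 6)."""
    if pool_owner not in ("primary", "secondary"):
        raise ValueError("unknown pool owner: " + pool_owner)
    if algorithm == "proportional":
        return LeaseState.FREE
    if algorithm == "independent":
        if pool_owner == "primary":
            return LeaseState.FREE
        return LeaseState.FREE_BACKUP
    raise ValueError("unknown algorithm: " + algorithm)


# (current state, event number) -> next state, for the simple transitions.
SIMPLE_TRANSITIONS = {
    (LeaseState.FREE, 1): LeaseState.ACTIVE,         # primary allocates
    (LeaseState.FREE_BACKUP, 2): LeaseState.ACTIVE,  # secondary allocates
    (LeaseState.ACTIVE, 3): LeaseState.RELEASED,     # client sends RELEASE
    (LeaseState.ACTIVE, 8): LeaseState.EXPIRED,      # lease expires
    (LeaseState.ACTIVE, 9): LeaseState.ABANDONED,    # DECLINE or unusable
    (LeaseState.ABANDONED, 10): LeaseState.RESET,    # administrative recovery
    (LeaseState.FREE, 7): LeaseState.FREE_BACKUP,    # pool rebalance
    (LeaseState.FREE_BACKUP, 7): LeaseState.FREE,    # pool rebalance
}


def next_state(state, event, algorithm="proportional", pool_owner="primary"):
    """Return the state reached from `state` when `event` occurs."""
    # Event 4: partner acknowledged the change; pass through FREE*.
    if event == 4 and state in (
        LeaseState.EXPIRED, LeaseState.RELEASED, LeaseState.RESET
    ):
        return resolve_free_star(algorithm, pool_owner)
    try:
        return SIMPLE_TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError("no transition for this state and event") from None
```

A table-driven form keeps each numbered event visible next to the
edge it represents in Figure 1, and rejects any pair that the figure
does not show.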
The resource that is no longer in use (due to expiration or release), | The resource that is no longer in use (due to expiration or release), | |||
becomes FREE*. Depending on what allocation algorithm is used, the | becomes FREE*. Depending on what allocation algorithm is used, the | |||
resource that is no longer in use returns to primary (FREE) or | resource that is no longer in use returns to primary (FREE) or | |||
secondary pool (FREE_BACKUP). The conditions for specific | secondary pool (FREE_BACKUP). The conditions for specific | |||
transitions are depicted in Figure 2. | transitions are depicted in Figure 2. | |||
+---------------+---------+-----------+ | +---------------+---------+-----------+ | |||
| \ Pool owner| | | | | \ Pool owner| | | | |||
| \-------\ | Primary | Secondary | | | \-------\ | Primary | Secondary | | |||
|Algorithm \ | | | | |Algorithm \ | | | | |||
+---------------+---------+-----------+ | +---------------+---------+-----------+ | |||
| Proportional | FREE | FREE | | | Proportional | FREE | FREE | | |||
skipping to change at page 21, line 33 | skipping to change at page 22, line 37 | |||
length of expected downtime of the primary server, and is not | length of expected downtime of the primary server, and is not | |||
directly influenced by the total number of DHCP clients supported by | directly influenced by the total number of DHCP clients supported by | |||
the server pair. | the server pair. | |||
8. Failover Mechanisms | 8. Failover Mechanisms | |||
This section lays out an overview of the communication between | This section lays out an overview of the communication between | |||
partners and other mechanisms required for failover operation. As | partners and other mechanisms required for failover operation. As | |||
this is a design document, not a protocol specification, high level | this is a design document, not a protocol specification, high level | |||
ideas are presented without implementation specific details (e.g. on- | ideas are presented without implementation specific details (e.g. on- | |||
wire protocol formats). Specific protocol details are out of the | wire protocol formats). | |||
scope of this document, and may be specified in a separate draft. | ||||
8.1. Time Skew | 8.1. Time Skew | |||
Partners exchange information about known lease states. To reliably | Partners exchange information about known lease states. To reliably | |||
compare a known lease state with an update received from a partner, | compare a known lease state with an update received from a partner, | |||
servers must be able to reliably compare the times stored in the | servers must be able to reliably compare the times stored in the | |||
known lease state with the times received in the update. Although a | known lease state with the times received in the update. Although a | |||
simple approach would be to require both partners to use synchronized | simple approach would be to require both partners to use synchronized | |||
time, e.g. by using NTP, such a service may not always be available | time, e.g. by using NTP, such a service may not always be available | |||
in some scenarios that failover expects to cover. Therefore a | in some scenarios that failover expects to cover. Therefore a | |||
skipping to change at page 22, line 29 | skipping to change at page 23, line 30 | |||
used in creation of DUID-LLT (see Section 9.2 of [RFC3315]). | used in creation of DUID-LLT (see Section 9.2 of [RFC3315]). | |||
Time differences are expressed in seconds and are signed. | Time differences are expressed in seconds and are signed. | |||
8.3. Lazy updates | 8.3. Lazy updates | |||
Lazy update refers to the requirement placed on a server implementing | Lazy update refers to the requirement placed on a server implementing | |||
a failover protocol to update its failover partner whenever the | a failover protocol to update its failover partner whenever the | |||
binding database changes. A failover protocol which didn't support | binding database changes. A failover protocol which didn't support | |||
lazy update would require the failover partner update to complete | lazy update would require the failover partner update to complete | |||
before a DHCPv6 server could respond to a DHCPv6 client request. The | before a DHCPv6 server could respond to a DHCPv6 client request. | |||
lazy update mechanism allows a server to allocate a new or extend an | Such approach is often referred to as 'lockstep' and is the opposite | |||
existing lease and then update its failover partner as time permits. | of lazy updates. The lazy update mechanism allows a server to | |||
allocate a new or extend an existing lease and then update its | ||||
failover partner as time permits. | ||||
Although the lazy update mechanism does not introduce additional | Although the lazy update mechanism does not introduce additional | |||
delays in server response times, it introduces other difficulties. | delays in server response times, it introduces other difficulties. | |||
The key problem with lazy update is that when a server fails after | The key problem with lazy update is that when a server fails after | |||
updating a client with a particular lease time and before updating | updating a client with a particular lease time and before updating | |||
its partner, the partner will believe that a lease has expired even | its partner, the partner will believe that a lease has expired even | |||
though the client still retains a valid lease on that address or | though the client still retains a valid lease on that address or | |||
prefix. | prefix. | |||
8.4. MCLT concept | 8.4. MCLT concept | |||
skipping to change at page 23, line 17 | skipping to change at page 24, line 26 | |||
partner with a potential expiration time which is longer than the | partner with a potential expiration time which is longer than the | |||
lease time previously given to the client and which is longer than | lease time previously given to the client and which is longer than | |||
the lease time that the server has been configured to give a client. | the lease time that the server has been configured to give a client. | |||
This allows that server to give a longer lease time to the client the | This allows that server to give a longer lease time to the client the | |||
next time the client renews its lease, since the time that it will | next time the client renews its lease, since the time that it will | |||
give to the client will not exceed the MCLT beyond the potential | give to the client will not exceed the MCLT beyond the potential | |||
expiration time acknowledged by its partner. | expiration time acknowledged by its partner. | |||
The fundamental relationship on which much of the correctness of this | The fundamental relationship on which much of the correctness of this | |||
protocol depends is that the lease expiration time known to a DHCPv6 | protocol depends is that the lease expiration time known to a DHCPv6 | |||
client MUST NOT under any circumstances be more than the maximum | client MUST NOT be greater by more than the MCLT beyond the potential | |||
client lead time (MCLT) greater than the potential expiration time | expiration time known to that server's failover partner. | |||
known to a server's partner. | ||||
The remainder of this section makes the above fundamental | The remainder of this section makes the above fundamental | |||
relationship more explicit. | relationship more explicit. | |||
This protocol requires a DHCPv6 server to deal with several different | This protocol requires a DHCPv6 server to deal with several different | |||
lease intervals and places specific restrictions on their | lease intervals and places specific restrictions on their | |||
relationships. The purpose of these restrictions is to allow the | relationships. The purpose of these restrictions is to allow the | |||
other server in the pair to be able to make certain assumptions in | other server in the pair to be able to make certain assumptions in | |||
the absence of an ability to communicate between servers. | the absence of an ability to communicate between servers. | |||
skipping to change at page 24, line 21 | skipping to change at page 25, line 26 | |||
The following example demonstrates the MCLT concept in practice. The | The following example demonstrates the MCLT concept in practice. The | |||
values used are arbitrarily chosen and are not a recommendation for | values used are arbitrarily chosen and are not a recommendation for | |||
actual values. The MCLT in this case is 1 hour. The desired valid | actual values. The MCLT in this case is 1 hour. The desired valid | |||
lifetime is 3 days, and its renewal time is half the valid lifetime. | lifetime is 3 days, and its renewal time is half the valid lifetime. | |||
When a server makes an offer for a new lease on an IP address to a | When a server makes an offer for a new lease on an IP address to a | |||
DHCPv6 client, it determines the desired valid lifetime (in this | DHCPv6 client, it determines the desired valid lifetime (in this | |||
case, 3 days). It then examines the acknowledged potential valid | case, 3 days). It then examines the acknowledged potential valid | |||
lifetime (which in this case is zero) and determines the remainder of | lifetime (which in this case is zero) and determines the remainder of | |||
the time left to run, which is also zero. To this it adds the MCLT. | the time left to run, which is also zero. It adds the MCLT to this | |||
Since the actual valid lifetime cannot be allowed to exceed the | value. Since the actual valid lifetime cannot be allowed to exceed | |||
remainder of the current acknowledged potential valid lifetime plus | the remainder of the current acknowledged potential valid lifetime | |||
the MCLT, the offer made to the client is for the remainder of the | plus the MCLT, the offer made to the client is for the remainder of | |||
current acknowledged potential valid lifetime (i.e., zero) plus the | the current acknowledged potential valid lifetime (i.e. zero) plus | |||
MCLT. Thus, the actual valid lifetime is 1 hour. | the MCLT. Thus, the actual valid lifetime is 1 hour. | |||
Once the server has sent the REPLY to the DHCPv6 client, it will | Once the server has sent the REPLY to the DHCPv6 client, it will | |||
update its failover partner with the lease information. However, the | update its failover partner with the lease information. However, the | |||
desired potential valid lifetime will be composed of one half of the | desired potential valid lifetime will be composed of one half of the | |||
current actual valid lifetime added to the desired valid lifetime. | current actual valid lifetime added to the desired valid lifetime. | |||
Thus, the failover partner is updated with a BNDUPD with a potential | Thus, the failover partner is updated with a BNDUPD with a potential | |||
valid lifetime of 3 days + 1/2 hour. | valid lifetime of 3 days + 1/2 hour. | |||
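The arithmetic of the first exchange in this example can be written
out as follows. This is a sketch under the example's own numbers
(MCLT of 1 hour, desired valid lifetime of 3 days); the function
names are hypothetical and the code is not a normative algorithm.

```python
from datetime import timedelta


def actual_valid_lifetime(desired, acked_potential_remaining, mclt):
    """Lifetime offered to the client: never more than the remaining
    acknowledged potential valid lifetime plus the MCLT."""
    return min(desired, acked_potential_remaining + mclt)


def potential_valid_lifetime(desired, actual):
    """Potential valid lifetime sent to the partner in a BNDUPD: one half
    of the current actual valid lifetime added to the desired lifetime."""
    return desired + actual / 2


mclt = timedelta(hours=1)
desired = timedelta(days=3)
acked_remaining = timedelta(0)  # new lease: nothing acknowledged yet

actual = actual_valid_lifetime(desired, acked_remaining, mclt)
potential = potential_valid_lifetime(desired, actual)
```

Here actual evaluates to 1 hour and potential to 3 days plus 30
minutes, matching the values stated in the text.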
When the primary server receives a BNDACK to its update of the | When the primary server receives a BNDACK to its update of the | |||
secondary server's (partner's) potential valid lifetime, it records | secondary server's (partner's) potential valid lifetime, it records | |||
skipping to change at page 25, line 27 | skipping to change at page 26, line 32 | |||
Once the initial actual client valid lifetime of the MCLT is past, | Once the initial actual client valid lifetime of the MCLT is past, | |||
the protocol operates effectively like the DHCPv6 protocol does today | the protocol operates effectively like the DHCPv6 protocol does today | |||
in its behavior concerning valid lifetimes. However, the guarantee | in its behavior concerning valid lifetimes. However, the guarantee | |||
that the actual client valid lifetime will never exceed the remaining | that the actual client valid lifetime will never exceed the remaining | |||
acknowledged partner server potential valid lifetime by more than the | acknowledged partner server potential valid lifetime by more than the | |||
MCLT allows full recovery from a variety of failures. | MCLT allows full recovery from a variety of failures. | |||
8.5. Unreachability detection | 8.5. Unreachability detection | |||
Each partner maintains an FO_SEND timer for each partner connection. | Each partner MUST maintain a FO_SEND timer for each failover | |||
The FO_SEND timer is reset every time any message is transmitted. If | connection. The FO_SEND timer is reset every time any message is | |||
the timer reaches the FO_SEND_MAX value, a CONTACT message is | transmitted. If the timer reaches the FO_SEND_MAX value, a CONTACT | |||
transmitted and timer is reset. The CONTACT message may be | message is transmitted and timer is reset. The CONTACT message may | |||
transmitted at any time. | be transmitted at any time. Implementation MAY use additional | |||
mechanisms to detect partner unreachability. | ||||
8.6. Re-allocating Leases | Implementors are advised to keep in mind that the timer based CONTACT | |||
message mechanism is not perfect and may not detect some failures. | ||||
In particular, if the partner is using one interface to reach clients | ||||
("downlink") and another to reach its partner ("uplink"), it is | ||||
possible that communication with the clients will break, yet the | ||||
mechanism will still claim full reachability. For that reason it is | ||||
beneficial to share the same interface for client traffic and | ||||
communication with the failover partner. That approach may have | ||||
drawbacks in some network topologies. | ||||
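The FO_SEND behaviour described above can be sketched as a small
timer object. The class name FoSendTimer, its method names, and the
use of caller-supplied timestamps in seconds are assumptions made
for illustration; only FO_SEND, FO_SEND_MAX and CONTACT come from
the text.

```python
class FoSendTimer:
    """Tracks time since the last transmitted message on one connection."""

    def __init__(self, fo_send_max, send_contact, now):
        self.fo_send_max = fo_send_max    # FO_SEND_MAX, in seconds
        self.send_contact = send_contact  # callback that transmits CONTACT
        self.last_sent = now              # time of the last transmission

    def on_message_sent(self, now):
        """Any transmitted message resets the timer."""
        self.last_sent = now

    def tick(self, now):
        """Send CONTACT and reset the timer once FO_SEND_MAX is reached.

        Returns True if a CONTACT message was sent.
        """
        if now - self.last_sent >= self.fo_send_max:
            self.send_contact()
            self.last_sent = now
            return True
        return False


sent = []
timer = FoSendTimer(fo_send_max=10, send_contact=lambda: sent.append("CONTACT"), now=0)
results = [timer.tick(5), timer.tick(10)]  # idle for 10 s: CONTACT goes out
timer.on_message_sent(15)                  # e.g. a BNDUPD was transmitted
results += [timer.tick(20), timer.tick(25)]
```

Passing timestamps in explicitly keeps the sketch deterministic; a
real server would drive tick() from its own event loop or clock.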
8.6. Re-allocating Leases | ||||
When in PARTNER-DOWN state there is a waiting period after which a | When in PARTNER-DOWN state there is a waiting period after which a | |||
resource can be re-allocated to another client. For resources which | resource can be re-allocated to another client. For resources which | |||
are available when the server enters PARTNER-DOWN state, the period | are available when the server enters PARTNER-DOWN state, the period | |||
is the MCLT from entry into PARTNER-DOWN state. For resources which | is the MCLT from the entry into PARTNER-DOWN state. For resources | |||
are not available when the server enters PARTNER-DOWN state, the | which are not available when the server enters PARTNER-DOWN state, | |||
period is the MCLT after the later of the following times: the | the period is the MCLT after the later of the following times: the | |||
potential valid lifetime, the most recently transmitted potential | potential valid lifetime, the most recently transmitted potential | |||
valid lifetime, the most recently received acknowledged potential | valid lifetime, the most recently received acknowledged potential | |||
valid lifetime, and the most recently transmitted acknowledged | valid lifetime, and the most recently transmitted acknowledged | |||
potential valid lifetime. If this time would be earlier than the | potential valid lifetime. If this time would be earlier than the | |||
current time plus the MCLT, then the time the server entered PARTNER- | current time plus the MCLT, then the time the server entered PARTNER- | |||
DOWN state plus the maximum-client-lead-time is used. | DOWN state plus the maximum-client-lead-time is used. | |||
In any other state, a server cannot reallocate a resource from one | In any other state, a server cannot reallocate a resource from one | |||
client to another without first notifying its partner (through a | client to another without first notifying its partner (through a | |||
BNDUPD message) and receiving acknowledgement (through a BNDACK mes- | BNDUPD message) and receiving acknowledgement (through a BNDACK mes- | |||
skipping to change at page 26, line 23 | skipping to change at page 27, line 40 | |||
RELEASED respectively. The partner server would then be notified | RELEASED respectively. The partner server would then be notified | |||
that this resource was EXPIRED or RELEASED through a BNDUPD. When | that this resource was EXPIRED or RELEASED through a BNDUPD. When | |||
the sending server received the BNDACK for that resource showing it | the sending server received the BNDACK for that resource showing it | |||
was FREE, it would move the resource from EXPIRED or RELEASED to | was FREE, it would move the resource from EXPIRED or RELEASED to | |||
FREE, and it would be available for allocation by the primary server | FREE, and it would be available for allocation by the primary server | |||
to any clients. | to any clients. | |||
A server MAY reallocate a resource in the EXPIRED or RELEASED state | A server MAY reallocate a resource in the EXPIRED or RELEASED state | |||
to the same client with no restrictions provided it has not sent a | to the same client with no restrictions provided it has not sent a | |||
BNDUPD message to its partner. This situation would exist if the | BNDUPD message to its partner. This situation would exist if the | |||
lease expired or was released after the transition into PARTNER- DOWN | lease expired or was released after the transition into PARTNER-DOWN | |||
state, for instance. | state, for instance. | |||
8.7. Sending Binding Update | 8.7. Sending Binding Update | |||
This and the following section is written as though every BNDUPD | This and the following section is written as though every BNDUPD | |||
message contains only a single binding update transaction in order to | message contains only a single binding update transaction in order to | |||
reduce the complexity of the discussion. Note that while a server | reduce the complexity of the discussion. Note that while a server | |||
MAY generate BNDUPD messages with multiple binding update | MAY generate BNDUPD messages with multiple binding update | |||
transactions, every server MUST be able to process a BNDUPD message | transactions, every server MUST be able to process a BNDUPD message | |||
which contains multiple binding update transactions and generate the | which contains multiple binding update transactions and generate the | |||
corresponding BNDACK messages with status for multiple binding update | corresponding BNDACK messages with status for multiple binding update | |||
transactions. | transactions. | |||
Each server updates its failover partner about recent changes in | Each server updates its failover partner about recent changes in | |||
lease states. Each update MUST include at least the following | lease states. Each update MUST include at least the following | |||
information: | information: | |||
1. resource type - non-temporary address or a prefix. Resource | 1. resource type - non-temporary address or a prefix. Resource | |||
type can be indicated by the container that conveys the actual | type can be indicated by the container that conveys the actual | |||
resource (e.g. an IA_NA option indicates non-temporary IPv6 | resource (e.g. an IA_NA option indicates non-temporary IPv6 | |||
address). | address); | |||
2. resource information - the actual address or prefix. That is | 2. resource information - the actual address or prefix. That is | |||
conveyed using the appropriate option, e.g. an IAADDR for an | conveyed using the appropriate option, e.g. an IAADDR for an | |||
address or an IAPREFIX for prefix. | address or an IAPREFIX for a prefix; | |||
3. valid life time requested by client | 3. valid life time requested by client*; | |||
4. valid life time sent to client | ||||
4. valid life time sent to client*; | ||||
5. IAID - Identity Association used by the client, while obtaining | 5. IAID - Identity Association used by the client, while obtaining | |||
a given lease. (Note1: one client may use many IAIDs | a given lease. (Note1: one client may use many IAIDs | |||
simultaneously. Note2: IAID for IA, TA and PD are orthogonal | simultaneously. Note2: IAID for IA, TA and PD are orthogonal | |||
number spaces.) | number spaces.)*; | |||
6. Next Expected Client Transmission - time interval since Client | 6. Next Expected Client Transmission - time interval since Client | |||
Last Transmission Time, when a response from a client is | Last Transmission Time, when a response from a client is | |||
expected. | expected*; | |||
7. potential valid life time - a lifetime that the server is | 7. potential valid life time - a lifetime that the server is | |||
willing to set if there were no MCLT/failover restrictions | willing to set if there were no MCLT/failover restrictions | |||
imposed. | imposed*; | |||
8. preferred life time sent to client - the actual value sent back | 8. preferred life time sent to client - the actual value sent back | |||
to the client | to the client*; | |||
9. CLTT - Client Last Transaction Time, a timestamp of the last | 9. CLTT - Client Last Transaction Time, a timestamp of the last | |||
received transmission from a client | received transmission from a client*; | |||
10. Client DUID | 10. Client DUID*. | |||
Items marked with an asterisk MUST appear only if the lease is/was | ||||
associated with a client. Otherwise they MUST NOT appear, e.g. for | ||||
updates from FREE to FREE_BACKUP state. A server MUST reject updates | ||||
that omit any of the information required above. | ||||
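The presence rules for the update fields can be illustrated with a short Python sketch. It is not part of either draft revision. The field names, the dictionary representation, and the `has_client` argument are assumptions made for illustration only; the draft lists the information to be conveyed, not a wire format.

```python
from typing import List

# Hypothetical field names; items 1-2 of the list are always required.
ALWAYS_REQUIRED = ("resource_type", "resource")

# Items 3-10 (marked with an asterisk in the -03 text) are tied to a client.
CLIENT_ONLY = (
    "requested_valid_lifetime",
    "sent_valid_lifetime",
    "iaid",
    "next_expected_client_transmission",
    "potential_valid_lifetime",
    "sent_preferred_lifetime",
    "cltt",
    "client_duid",
)


def validate_update(update: dict, has_client: bool) -> List[str]:
    """Return a list of problems; an empty list means the update is acceptable."""
    problems = []
    for name in ALWAYS_REQUIRED:
        if update.get(name) is None:
            problems.append(f"missing {name}")
    for name in CLIENT_ONLY:
        present = update.get(name) is not None
        if has_client and not present:
            problems.append(f"missing {name}")
        elif not has_client and present:
            problems.append(f"unexpected {name}")
    return problems
```

A receiving server following this sketch would reject any update for which the returned list is non-empty.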
The BNDUPD message MAY contain additional information related to the | The BNDUPD message MAY contain additional information related to the | |||
updated lease. The additional information MAY include, but is not | updated lease. The additional information MAY include, but is not | |||
limited to: | limited to: | |||
1. assigned FQDN name, defined in [RFC4704] | 1. assigned FQDN name, defined in [RFC4704]; | |||
2. Options Requested by the client, i.e. content of the ORO | 2. Options Requested by the client, i.e. content of the ORO; | |||
3. Remote-ID, defined in [RFC4649] | 3. Remote-ID, defined in [RFC4649]; | |||
4. Relay-ID, defined in [RFC5460], section 5.4.1 | 4. Relay-ID, defined in [RFC5460], section 5.4.1; | |||
5. Link-layer address | 5. Link-layer address [RFC6939]; | |||
[I-D.ietf-dhc-dhcpv6-client-link-layer-addr-opt] | ||||
6. Any other options the updating partner deems useful. | 6. Any other options the updating partner deems useful. | |||
Receiving partner MAY store received additional information, but it | Receiving partner MAY store received additional information, but it | |||
MAY choose to ignore them as well. Some information may be useful, | MAY choose to ignore them as well. Some information may be useful, | |||
so it is a good idea to keep or update them. One reason is FQDN | so it is a good idea to keep or update it. One reason is FQDN | |||
information. A server SHOULD be prepared to clean up DNS information | information. A server SHOULD be prepared to clean up DNS information | |||
once the lease expires or is released. Another reason the partner | once the lease expires or is released. See Section 11 for | |||
may be interested in keepin additional data is a better support for | detailed discussion about Dynamic DNS. Another reason the partner | |||
may be interested in keeping additional data is a better support for | ||||
leasequery [RFC5007] or bulk leasequery [RFC5460], which features | leasequery [RFC5007] or bulk leasequery [RFC5460], which features | |||
queries based on Relay-ID, by link address and by Remote-ID. | queries based on Relay-ID, by link address and by Remote-ID. | |||
8.8. Receiving Binding Update | 8.8. Receiving Binding Update | |||
When a server receives a BNDUPD message, it needs to decide how to | When a server receives a BNDUPD message, it needs to decide how to | |||
process the binding update transaction it contains and whether that | process the binding update transaction it contains and whether that | |||
transaction represents a conflict of any sort. The conflict | transaction represents a conflict of any sort. The conflict | |||
resolution process MUST be used on the receipt of every BNDUPD | resolution process MUST be used on the receipt of every BNDUPD | |||
message, not just those that are received while in POTENTIAL-CONFLICT | message, not just those that are received while in POTENTIAL-CONFLICT | |||
skipping to change at page 28, line 25 | skipping to change at page 30, line 7 | |||
1. Two clients, one resource - This is the duplicate resource | 1. Two clients, one resource - This is the duplicate resource | |||
allocation conflict. There are two different clients, each allocated | allocation conflict. There are two different clients, each allocated | |||
the same resource. See Section 8.9. | the same resource. See Section 8.9. | |||
2. Two resources, one client conflict - This conflict exists when a | 2. Two resources, one client conflict - This conflict exists when a | |||
client on one server is associated with one resource, and on | client on one server is associated with one resource, and on | |||
the other server with a different resource in the same or related | the other server with a different resource in the same or related | |||
subnet. This does not refer to the case where a single client | subnet. This does not refer to the case where a single client | |||
has resources in multiple different subnets or administrative | has resources in multiple different subnets or administrative | |||
domains, but rather the case where on the same subnet the client | domains (i.e. a mobile client that changed its location), but | |||
has a lease on one IP address in one server and on a different IP | rather the case where on the same subnet the client has a lease | |||
address on the other server. | on one IP address in one server and on a different IP address on | |||
the other server. | ||||
This conflict may or may not be a problem for a given DHCP server | This conflict may or may not be a problem for a given DHCP server | |||
implementation and policy. If implementations and policies | implementation and policy. If implementations and policies | |||
allow, both resources can be assigned to a given client. In the | allow, both resources can be assigned to a given client. In the | |||
event that a DHCP server requires that a DHCP client have only | event that a DHCP server requires that a DHCP client have only | |||
one outstanding lease of a given type, the conflict MUST be | one outstanding lease of a given type, the conflict MUST be | |||
resolved by accepting the lease which has the latest CLTT. | resolved by accepting the lease which has the latest CLTT. | |||
It should be further clarified that DHCPv6 protocol makes | ||||
assignments based on (client DUID, resource type, iaid) triplet. | ||||
The possibility of using different IAIDs was omitted in this | ||||
paragraph for clarity. If one client is assigned multiple | ||||
resources of the same type, but with different IAIDs, there is no | ||||
conflict. Also, iaid values for different resource types are | ||||
orthogonal, i.e. IA_NA with iaid=1 is different than IA_PD with | ||||
iaid=1 and there is no conflict. | ||||
3. binding-status conflict - This is normal conflict, where one | 3. binding-status conflict - This is normal conflict, where one | |||
server is updating the other with newer information. See | server is updating the other with newer information. See | |||
Section 8.9 for details of how to resolve these conflicts. | Section 8.9 for details of how to resolve these conflicts. | |||
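The "two resources, one client" case can be sketched in Python as follows. This is an illustration only, not text from the draft. The `Lease` type and its field names are hypothetical, the same-subnet condition is left out for brevity, and the tie-breaking choice when both CLTT values are equal is an assumption, since the draft does not address it.

```python
from typing import NamedTuple


class Lease(NamedTuple):
    duid: str            # client DUID
    resource_type: str   # e.g. "IA_NA" or "IA_PD"
    iaid: int
    resource: str        # address or prefix
    cltt: float          # client last transaction time


def conflicts(a: Lease, b: Lease) -> bool:
    """True if both leases share the (DUID, type, IAID) triplet but differ in resource."""
    same_identity = (a.duid, a.resource_type, a.iaid) == (b.duid, b.resource_type, b.iaid)
    return same_identity and a.resource != b.resource


def resolve(a: Lease, b: Lease) -> Lease:
    """Keep the lease with the latest CLTT (ties keep `a`; that choice is an assumption)."""
    return a if a.cltt >= b.cltt else b
```

Leases that differ in IAID or in resource type are never reported as conflicting, which matches the statement that those number spaces are orthogonal.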
8.9. Conflict Resolution | 8.9. Conflict Resolution | |||
The server receiving a lease update from its partner must evaluate | The server receiving a lease update from its partner must evaluate | |||
the received lease information to see if it is consistent with | the received lease information to see if it is consistent with | |||
already known state and decide which information - the previously | already known state and decide which information - the previously | |||
known or that just received - is "better". The server should take | known or that just received - is "better". The server should take | |||
into consideration the following aspects: if the lease is already | into consideration the following aspects: if the lease is already | |||
assigned to a specific client, who had contact with client recently, | assigned to a specific client, who had contact with client recently, | |||
start time of the lease, etc. | start time of the lease, etc. | |||
When analyzing a BNDUPD message from a partner server, if there is | When analyzing a BNDUPD message from a partner server, if there is | |||
insufficient information in the BNDUPD to process it, then reject the | insufficient information in the BNDUPD to process it, then reject the | |||
BNDUPD with reject-reason 3: "Missing binding information". | BNDUPD with reject-reason "Missing binding information". | |||
If the resource in the BNDUPD is not a resource associated with the | If the resource in the BNDUPD is not a resource associated with the | |||
failover endpoint which received the BNDUPD message, then reject it | failover endpoint which received the BNDUPD message, then reject it | |||
with reject-reason 1: "Illegal IP address (not part of any address | with reject-reason "Illegal IP address or prefix (not part of any | |||
pool)". | address or prefix pool)". | |||
Every BNDUPD message SHOULD contain a client-last-transaction-time | Every BNDUPD message SHOULD contain a client-last-transaction-time | |||
option, which MUST, if it appears, be the time that the server last | option, which MUST, if it appears, be the time that the server last | |||
interacted with the DHCP client. It MUST NOT be, for instance, the | interacted with the DHCP client. It MUST NOT be, for instance, the | |||
time that the lease on an IP address expired. If there has been no | time that the lease on an IP address expired. If there has been no | |||
interaction with the DHCP client in question (or there is no DHCP | interaction with the DHCP client in question (or there is no DHCP | |||
client presently associated with this resource), then there will be | client presently associated with this resource), then there will be | |||
no client-last-transaction-time option in the BNDUPD message. | no client-last-transaction-time option in the BNDUPD message. | |||
The list in Figure 3 presents the conflict resolution outcome. To | The list in Figure 3 presents the conflict resolution outcome. To | |||
skipping to change at page 29, line 37 | skipping to change at page 31, line 32 | |||
for those rules that are listed with "time" -- if a BNDUPD doesn't | for those rules that are listed with "time" -- if a BNDUPD doesn't | |||
have a client-last-transaction-time value, then it MUST NOT be | have a client-last-transaction-time value, then it MUST NOT be | |||
considered later than the client-last-transaction-time in the | considered later than the client-last-transaction-time in the | |||
receiving server's binding. If the BNDUPD contains a client-last- | receiving server's binding. If the BNDUPD contains a client-last- | |||
transaction-time value and the receiving server's binding does not, | transaction-time value and the receiving server's binding does not, | |||
then the client-last-transaction-time value in the BNDUPD MUST be | then the client-last-transaction-time value in the BNDUPD MUST be | |||
considered later than the server's. | considered later than the server's. | |||
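The handling of a missing client-last-transaction-time can be written as a small Python comparison. This is a sketch with hypothetical names; representing an absent value as `None` and timestamps as floats are assumptions.

```python
from typing import Optional


def bndupd_cltt_is_later(update_cltt: Optional[float],
                         local_cltt: Optional[float]) -> bool:
    """Decide whether the CLTT in a BNDUPD counts as later than the local one."""
    if update_cltt is None:
        # A BNDUPD without a CLTT MUST NOT be considered later.
        return False
    if local_cltt is None:
        # A BNDUPD with a CLTT MUST be considered later than a binding without one.
        return True
    return update_cltt > local_cltt
```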
binding-status in received BNDUPD. | binding-status in received BNDUPD. | |||
binding-status | binding-status | |||
in receiving FREE RESET | in receiving FREE RESET | |||
server ACTIVE EXPIRED RELEASED FREE_BACKUP ABANDONED | server ACTIVE EXPIRED RELEASED FREE_BACKUP ABANDONED | |||
ACTIVE accept(5) time(2) time(1) time(2) accept | ACTIVE accept(5) time(2) time(1) time(2) accept | |||
EXPIRED time(1) accept accept accept accept | EXPIRED time(1) accept accept accept accept | |||
RELEASED time(1) time(1) accept accept accept | RELEASED time(1) time(1) accept accept accept | |||
FREE/FREE_BACKUP accept accept accept accept accept | FREE/FREE_BACKUP accept accept accept accept accept | |||
RESET time(3) accept accept accept accept | RESET time(3) accept accept accept accept | |||
ABANDONED reject(4) reject(4) reject(4) reject(4) accept | ABANDONED reject(4) reject(4) reject(4) reject(4) accept | |||
Figure 3: Conflict Resolution | Figure 3: Conflict Resolution | |||
time(1): If the client-last-transaction-time in the BNDUPD is later | time(1): If the client-last-transaction-time in the BNDUPD is later | |||
than the client-last-transaction-time in the receiving server's | than the client-last-transaction-time in the receiving server's | |||
binding, accept it, else reject it. | binding, accept it, else reject it. | |||
time(2): If the current time is later than the receiving servers' | time(2): If the current time is later than the receiving server's | |||
lease-expiration-time, accept it, else reject it. | lease-expiration-time, accept it, else reject it. | |||
time(3): If the client-last-transaction-time in the BNDUPD is later | time(3): If the client-last-transaction-time in the BNDUPD is later | |||
than the start-time-of-state in the receiving server's binding, | than the start-time-of-state in the receiving server's binding, | |||
accept it, else reject it. | accept it, else reject it. | |||
(1,2,3): If rejecting, use reject reason "Outdated binding | (1,2,3): If rejecting, use reject reason "Outdated binding | |||
information". | information". | |||
(4): Use reject reason "Less critical binding information". | (4): Use reject reason "Less critical binding information". | |||
skipping to change at page 30, line 33 | skipping to change at page 32, line 30 | |||
change the flag in a lease that says that it should be transmitted to | change the flag in a lease that says that it should be transmitted to | |||
the failover partner. If this flag is set, then it should be | the failover partner. If this flag is set, then it should be | |||
transmitted, but if it is not already set, the rejection of a lease | transmitted, but if it is not already set, the rejection of a lease | |||
state update SHOULD NOT trigger an automatic update of the failover | state update SHOULD NOT trigger an automatic update of the failover | |||
partner sending the rejected update. The potential for update storms | partner sending the rejected update. The potential for update storms | |||
is too great, and in the unusual case where the servers simply can't | is too great, and in the unusual case where the servers simply can't | |||
agree, that disagreement is better than an update storm. | agree, that disagreement is better than an update storm. | |||
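The decision table in Figure 3 can be sketched in Python as a lookup followed by the time(1), time(2) and time(3) checks. This is an illustration under stated assumptions, not a normative implementation: the two-line column header is read as "FREE or FREE_BACKUP" and "RESET or ABANDONED", note (5) is not reproduced in this excerpt and is treated as a plain accept, and the `LocalBinding` type and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

OUTDATED = "Outdated binding information"
LESS_CRITICAL = "Less critical binding information"

# Columns: binding-status in the received BNDUPD (assumed grouping, see above).
COLUMNS = ("ACTIVE", "EXPIRED", "RELEASED", "FREE", "ABANDONED")
COLUMN_ALIAS = {"FREE_BACKUP": "FREE", "RESET": "ABANDONED"}

# Rows: binding-status in the receiving server.
TABLE = {
    "ACTIVE":    ("accept", "time2", "time1", "time2", "accept"),
    "EXPIRED":   ("time1", "accept", "accept", "accept", "accept"),
    "RELEASED":  ("time1", "time1", "accept", "accept", "accept"),
    "FREE":      ("accept",) * 5,
    "RESET":     ("time3", "accept", "accept", "accept", "accept"),
    "ABANDONED": ("reject4",) * 4 + ("accept",),
}


@dataclass
class LocalBinding:
    status: str
    cltt: Optional[float] = None
    lease_expiration: Optional[float] = None
    start_time_of_state: Optional[float] = None


def resolve(local: LocalBinding, update_status: str,
            update_cltt: Optional[float], now: float) -> Tuple[bool, Optional[str]]:
    """Return (accepted, reject_reason) for one binding update transaction."""
    row = "FREE" if local.status == "FREE_BACKUP" else local.status
    col = COLUMNS.index(COLUMN_ALIAS.get(update_status, update_status))
    rule = TABLE[row][col]
    if rule == "accept":
        return True, None
    if rule == "reject4":
        return False, LESS_CRITICAL
    if rule == "time1":
        # A missing CLTT in the update is never "later"; a missing local one always loses.
        ok = update_cltt is not None and (local.cltt is None or update_cltt > local.cltt)
    elif rule == "time2":
        # Assumption: a binding with no recorded expiration is treated as expired.
        ok = local.lease_expiration is None or now > local.lease_expiration
    else:  # time3
        # Assumption: a missing start-time-of-state does not block the update.
        ok = update_cltt is not None and (
            local.start_time_of_state is None or update_cltt > local.start_time_of_state)
    return (True, None) if ok else (False, OUTDATED)
```

For example, an update reporting EXPIRED for a binding that the receiver holds as ACTIVE is rejected with "Outdated binding information" until the receiver's own lease-expiration-time has passed.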
8.10. Acknowledging Reception | 8.10. Acknowledging Reception | |||
Upon acceptance of a binding update, a server must notify its partner | ||||
that it updated its database. A server SHOULD NOT send a BNDACK before | ||||
its database is updated. A BNDACK MUST contain at least the minimum set | ||||
of information required to unambiguously identify the BNDUPD. | ||||
9. Endpoint States | 9. Endpoint States | |||
9.1. State Machine Operation | 9.1. State Machine Operation | |||
Each server (or, more accurately, failover endpoint) can take on a | Each server (or, more accurately, failover endpoint) can take on a | |||
variety of failover states. These states play a crucial role in | variety of failover states. These states play a crucial role in | |||
determining the actions that a server will perform when processing a | determining the actions that a server will perform when processing a | |||
request from a DHCPv6 client as well as dealing with changing | request from a DHCPv6 client as well as dealing with changing | |||
external conditions (e.g., loss of connection to a failover partner). | external conditions (e.g., loss of connection to a failover partner). | |||
The failover state in which a server is running controls the | The failover state in which a server is running controls the | |||
following behaviors: | following behaviors: | |||
o Responsiveness -- the server is either responsive to DHCPv6 client | o Responsiveness -- the server is either responsive to DHCPv6 client | |||
requests or it is not. | requests or it is not. | |||
o Allocation Pool -- which pool of addresses (or prefixes) can be | o Allocation Pool -- which pool of addresses (or prefixes) can be | |||
used for allocation on receipt of a SOLICIT message. | used for advertisement on receipt of a SOLICIT or allocation on | |||
receipt of a REQUEST message. | ||||
o MCLT -- ensure that valid lifetimes are not beyond what the | o MCLT -- ensure that valid lifetimes are not beyond what the | |||
partner has acked plus the MCLT (or not). | partner has acked plus the MCLT (or not). | |||
A server will transition from one failover state to another based on | A server will transition from one failover state to another based on | |||
the specific values held by the following state variables: | the specific values held by the following state variables: | |||
o Current failover state. | o Current failover state. | |||
o Communications status (OK or not OK). | o Communications status (OK or not OK). | |||
o Partner's failover state (if known). | o Partner's failover state (if known). | |||
Several events can cause the transition from one failover state to | Several events can cause the transition from one failover state to | |||
another. | another. | |||
o Change in communications status (OK or not OK). | o Change in communications status (OK or not OK); | |||
o Change in partner's failover state. | o Change in partner's failover state; | |||
o Receipt of particular messages. | o Explicit administrative action; | |||
o Receipt of particular messages; | ||||
o Expiration of timers. | o Expiration of timers. | |||
Whenever either of the last two of the above state variables changes | Whenever either of the last two of the above state variables changes | |||
state, the state machine is invoked, which may then trigger a change | state, the state machine is invoked, which may then trigger a change | |||
in the current failover state. Thus, whenever the communications | in the current failover state. Thus, whenever the communications | |||
status changes, the state machine processing is invoked. This may or | status changes, the state machine processing is invoked. This may or | |||
may not result in a change in the current failover state. | may not result in a change in the current failover state. | |||
Whenever a server transitions to a new failover state, the new state | Whenever a server transitions to a new failover state, the new state | |||
skipping to change at page 32, line 5 | skipping to change at page 34, line 5 | |||
the communications status is OK. In addition, whenever a server | the communications status is OK. In addition, whenever a server | |||
makes a transition into a new state, it MUST record the new state, | makes a transition into a new state, it MUST record the new state, | |||
its current understanding of its partner's state, and the time at | its current understanding of its partner's state, and the time at | |||
which it entered the new state in stable storage. | which it entered the new state in stable storage. | |||
The following state transition diagram gives a condensed view of the | The following state transition diagram gives a condensed view of the | |||
state machine. If there is a difference between the words describing | state machine. If there is a difference between the words describing | |||
a particular state and the diagram below, the words should be | a particular state and the diagram below, the words should be | |||
considered authoritative. | considered authoritative. | |||
+---------------+ V +--------------+ | In the state transition diagram below, the "+" or "-" in the upper | |||
| RECOVER -|+| | | STARTUP - | | right corner of each state is a notation about whether communication | |||
|(unresponsive) | +->+(unresponsive)| | is ongoing with the other server. | |||
+------+--------+ +--------------+ | ||||
+-Comm. OK +-----------------+ | +---------------+ V +--------------+ | |||
| Other State: | PARTNER DOWN - +<----------------------+ | | RECOVER -|+| | | STARTUP - | | |||
| RESOLUTION-INTER. | (responsive) | ^ | |(unresponsive) | +->+(unresponsive)| | |||
All POTENTIAL- +----+------------+ | | +------+--------+ +--------------+ | |||
Others CONFLICT------------ | --------+ | | +-Comm. OK +-----------------+ | |||
| CONFLICT-DONE Comm. OK | +--------------+ | | | Other State: | PARTNER DOWN - +<---------------------+ | |||
UPDREQ or Other State: | +--+ RESOLUTION - | | | | RESOLUTION-INTER. | (responsive) | ^ | |||
UPDREQALL | | | | | INTERRUPTED | | | All POTENTIAL- +----+------------+ | | |||
Rcv UPDDONE RECOVER All | | | (responsive) | | | Others CONFLICT------------ | --------+ | | |||
| +---------------+ | Others | | +------------+-+ | | | CONFLICT-DONE Comm. OK | +--------------+ | | |||
+->+RECOVER-WAIT +-| RECOVER | | | ^ | | | UPDREQ or Other State: | +--+ RESOLUTION - | | | |||
|(unresponsive) | WAIT or | | Comm. | Ext. | | UPDREQALL | | | | | INTERRUPTED | | | |||
+-----------+---+ DONE | | OK Comm. Cmd----->+ | Rcv UPDDONE RECOVER All | | | (responsive) | | | |||
Comm.---+ Wait MCLT | V V V Failed | | | +---------------+ | Others | | +------------+-+ | | |||
Changed | V +---+ +---+-----+--+-+ | | | +->+RECOVER-WAIT +-| RECOVER | | | ^ | | | |||
| +---+----------++ | | POTENTIAL + +-------+ | | |(unresponsive) | WAIT or | | Comm. | Ext. | | |||
| |RECOVER-DONE +-| Wait | CONFLICT +------+ | | +-----------+---+ DONE | | OK Comm. Cmd---->+ | |||
+->+(unresponsive) | for |(unresponsive)| Primary | | Comm.---+ Wait MCLT | V V V Failed | | |||
+------+--------+ Other +>+----+--------++ resolve Comm. | | Changed | V +---+ +---+-----+--+-+ | | | |||
Comm. OK State: | | ^ conflict Changed | | | +---+----------++ | | POTENTIAL + +-------+ | | |||
+---Other State:-+ RECOVER | Secondary | V V | | | | |RECOVER-DONE +-| Wait | CONFLICT +------+ | | |||
| | | DONE | resolve | ++----------+---++ | | +->+(unresponsive) | for |(unresponsive)| Primary | | |||
| All Others: POTENT. | | conflict | |CONFLICT-DONE-|+| | | +------+--------+ Other +>+----+--------++ resolve Comm. | | |||
| Wait for CONFLICT- | ----+ see (9.10) | | (responsive) | | | Comm. OK State: | | ^ conflict Changed| | |||
| Other State: V V | +------+---------+ | | +---Other State:-+ RECOVER | Secondary | V V | | | |||
| NORMAL or RECOVER ++------------+---+ Other State: NORMAL | | | | | DONE | resolve | ++----------+---++ | | |||
| | DONE | NORMAL + +<--------------+ | | | All Others: POTENT. | | conflict | |CONFLICT-DONE-|+| | | |||
| +--+----------+-->+ (balanced) +-------External Command--->+ | | Wait for CONFLICT--|-----+ | | | (responsive) | | | |||
| ^ ^ +--------+--------+ | | | Other State: V V | +------+---------+ | | |||
| | | | | | | | NORMAL or RECOVER ++------------+---+ | Other State: NORMAL | | |||
| Wait for Comm. OK Comm. Failed | | | | | DONE | NORMAL + +<--------------+ | | |||
| Other Other | External | | +--+----------+-->+ (balanced) +-------External Command-->+ | |||
| State: State: | | Command | | ^ ^ +--------+--------+ | | |||
| RECOVER-DONE NORMAL Start Safe Comm. OK or | | | | | | | | |||
| | COMM. INT. Period Timer Other State: Safe | | Wait for Comm. OK Comm. Failed | | | |||
| Comm. OK. | V All Others Period | | Other Other | | External | |||
| Other State: | +---------+--------+ | expiration | | State: State: | | Command | |||
| RECOVER +--+ COMMUNICATIONS - +----+ | | | RECOVER-DONE NORMAL Start Safe Comm. OK or | |||
| +-------------+ INTERRUPTED | | | | | COMM. INT. Period Timer Other State: Safe | |||
RECOVER | (responsive) +-------------------------->+ | | Comm. OK. | V All Others Period | |||
RECOVER-WAIT--------->+------------------+ | | Other State: | +---------+--------+ | expiration | |||
| RECOVER +--+ COMMUNICATIONS - +----+ | | ||||
| +-------------+ INTERRUPTED | | | ||||
RECOVER | (responsive) +------------------------->+ | ||||
RECOVER-WAIT--------->+------------------+ | ||||
Figure 4: Failover Endpoint State Machine | Figure 4: Failover Endpoint State Machine | |||
9.2. State Machine Initialization | 9.2. State Machine Initialization | |||
The state machine is characterized by storage (in stable storage) of | The state machine is characterized by storage (in stable storage) of | |||
at least the following information: | at least the following information: | |||
o Current failover state. | o Current failover state. | |||
skipping to change at page 34, line 43 | skipping to change at page 36, line 48 | |||
available. In this case, the newly commissioned failover server will | available. In this case, the newly commissioned failover server will | |||
not operate until its partner comes online -- but it has operational | not operate until its partner comes online -- but it has operational | |||
responsibilities as a DHCP server nonetheless. To properly handle | responsibilities as a DHCP server nonetheless. To properly handle | |||
this situation, a server SHOULD be configurable in such a way as to | this situation, a server SHOULD be configurable in such a way as to | |||
move directly into PARTNER-DOWN state after the startup period | move directly into PARTNER-DOWN state after the startup period | |||
expires if it has been unable to contact its partner during the | expires if it has been unable to contact its partner during the | |||
startup period. | startup period. | |||
Step 2: | Step 2: | |||
If the previous state is one where communications was "OK", then set | Implementations will differ in the ways that they deal with the state | |||
the previous state to the state that is the result of the | machine for failover endpoint states. In many cases, state | |||
communications failed state transition (if such transition exists -- | transitions will occur when communications goes from "OK" to failed, | |||
some states don't have a communications failed state transition, | or from failed to "OK", and some implementations will implement a | |||
since they allow both communications OK and failed). | portion of their state machine processing based on these changes. | |||
In these cases, during startup, if the previous state is one where | ||||
communications was "OK", then set the previous state to the state | ||||
that is the result of the communications failed state transition when | ||||
in that state (if such transition exists -- some states don't have a | ||||
communications failed state transition, since they allow both | ||||
communications OK and failed). | ||||
Step 3: | Step 3: | |||
Start the STARTUP state timer. The time that a server remains in the | Start the STARTUP state timer. The time that a server remains in the | |||
STARTUP state (absent any communications with its partner) is | STARTUP state (absent any communications with its partner) is | |||
implementation dependent but SHOULD be short. It SHOULD be long | implementation dependent but SHOULD be short. It SHOULD be long | |||
enough for a TCP connection to be created to a heavily loaded partner | enough for a TCP connection to be created to a heavily loaded partner | |||
across a slow network. | across a slow network. | |||
Step 4: | Step 4: | |||
skipping to change at page 35, line 44 | skipping to change at page 38, line 12 | |||
If the startup time expires the server SHOULD transition to the | If the startup time expires the server SHOULD transition to the | |||
PREVIOUS-STATE. | PREVIOUS-STATE. | |||
9.4. PARTNER-DOWN State | 9.4. PARTNER-DOWN State | |||
PARTNER-DOWN state is a state either server can enter. When in this | PARTNER-DOWN state is a state either server can enter. When in this | |||
state, the server assumes that it is the only server operating and | state, the server assumes that it is the only server operating and | |||
serving the client base. If one server is in PARTNER-DOWN state, the | serving the client base. If one server is in PARTNER-DOWN state, the | |||
other server MUST NOT be operating. | other server MUST NOT be operating. | |||
A server can enter PARTNER-DOWN state either as a result of operator | ||||
intervention (when an operator determines that the server's partner | ||||
is, indeed, down), or as a result of the auto-partner-down capability | ||||
where PARTNER-DOWN state is entered automatically after a server has | ||||
been in COMMUNICATIONS-INTERRUPTED state for a pre-determined period | ||||
of time. | ||||
9.4.1. Operation in PARTNER-DOWN State | 9.4.1. Operation in PARTNER-DOWN State | |||
The server MUST be responsive in PARTNER-DOWN state. | The server MUST be responsive in PARTNER-DOWN state, regardless of | |||
whether it is primary or secondary. | ||||
It will allow renewal of all outstanding leases on IP addresses. For | It will allow renewal of all outstanding leases on addresses or | |||
those IP addresses for which the server is using proportional | prefixes. For those resources for which the server is using | |||
allocation, it will allocate IP addresses from its own pool, and | proportional allocation, it will allocate resources from its own | |||
after a fixed period of time (the MCLT interval) has elapsed from | pool, and after a fixed period of time (the MCLT interval) has | |||
entry into PARTNER-DOWN state, it will allocate IP addresses from the | elapsed from entry into PARTNER-DOWN state, it may allocate IP | |||
set of all available IP addresses. | addresses from the set of all available pools. The server SHOULD | |||
fully deplete its own pool before starting allocations from its | ||||
downed partner's pool. | ||||
Any IP address tagged as available for allocation by the other server | Any resource tagged as available for allocation by the other server | |||
(at entry to PARTNER-DOWN state) MUST NOT be allocated to a new | (at entry to PARTNER-DOWN state) MUST NOT be allocated to a new | |||
client until the maximum-client-lead-time beyond the entry into | client until the MCLT beyond the entry into PARTNER-DOWN state has | |||
PARTNER-DOWN state has elapsed. | elapsed. | |||
A server in PARTNER-DOWN state MUST NOT allocate an IP address to a | A server in PARTNER-DOWN state MUST NOT allocate a resource to a DHCP | |||
DHCP client different from that to which it was allocated at the | client different from that to which it was allocated at the entrance | |||
entrance to PARTNER-DOWN state until the maximum-client-lead-time | to PARTNER-DOWN state until the MCLT beyond the maximum of the | |||
beyond the maximum of the following times: client expiration time, | following times: client expiration time, most recently transmitted | |||
most recently transmitted potential-expiration-time, most recently | potential-expiration-time, most recently received ack of potential- | |||
received ack of potential-expiration-time from the partner, and most | expiration-time from the partner, and most recently acked potential- | |||
recently acked potential-expiration-time to the partner. If this | expiration-time to the partner. If this time would be earlier than | |||
time would be earlier than the current time plus the maximum-client- | the current time plus the maximum-client-lead-time, then the time the | |||
lead-time, then the time the server entered PARTNER-DOWN state plus | server entered PARTNER-DOWN state plus the maximum-client-lead-time | |||
the maximum-client-lead-time is used. | is used. | |||
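The reallocation rule above can be sketched in Python as follows. This is a minimal illustration only, not part of the design: the `LeaseTimes` record, its field names, and the function `earliest_reallocation_time` are hypothetical, and all times are assumed to be seconds on a common clock.

```python
from dataclasses import dataclass


@dataclass
class LeaseTimes:
    """Per-resource timestamps named in the rule (field names are hypothetical)."""
    client_expiration: float
    sent_potential_expiration: float          # most recently transmitted to the partner
    partner_acked_potential_expiration: float  # most recently acked by the partner
    acked_to_partner_potential_expiration: float  # most recently acked to the partner


def earliest_reallocation_time(lease: LeaseTimes, mclt: float,
                               partner_down_entry: float, now: float) -> float:
    """Return the earliest time the resource may be given to a different client."""
    latest = max(
        lease.client_expiration,
        lease.sent_potential_expiration,
        lease.partner_acked_potential_expiration,
        lease.acked_to_partner_potential_expiration,
    )
    candidate = latest + mclt
    # If the candidate is earlier than now + MCLT, the rule falls back to
    # the PARTNER-DOWN entry time plus MCLT.
    if candidate < now + mclt:
        return partner_down_entry + mclt
    return candidate
```

A caller would compare the returned value with the current time before offering the resource to a new client.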
The server is not restricted by the MCLT when offering lease times | The server is not restricted by the MCLT when offering lease times | |||
while in PARTNER-DOWN state. | while in PARTNER-DOWN state. | |||
In the unlikely case, when there are two servers operating in a | In the unlikely case, when there are two servers operating in a | |||
PARTNER-DOWN state, there is a chance of duplicate leases assigned. | PARTNER-DOWN state, there is a chance of duplicate leases assigned. | |||
This leads to a POTENTIAL-CONFLICT (unresponsive) state when they re- | This leads to a POTENTIAL-CONFLICT (unresponsive) state when they re- | |||
establish contact. The duplicate lease issue can be postponed to a | establish contact. The duplicate lease issue can be postponed to a | |||
large extent by the server granting new leases first from its own | large extent by the server granting new leases first from its own | |||
pool. Therefore the server operating in PARTNER-DOWN state MUST use | pool. Therefore the server operating in PARTNER-DOWN state MUST use | |||
skipping to change at page 40, line 27 | skipping to change at page 42, line 21 | |||
the server will transition into RECOVER-DONE state. | the server will transition into RECOVER-DONE state. | |||
This is to allow any clients holding IP addresses that were allocated | This is to allow any clients holding IP addresses that were allocated | |||
by this server prior to loss of its client binding information in | by this server prior to loss of its client binding information in | |||
stable storage to contact the other server or to time out. | stable storage to contact the other server or to time out. | |||
If this is the first time this server has run failover -- as | If this is the first time this server has run failover -- as | |||
determined by the information received from the partner, not | determined by the information received from the partner, not | |||
necessarily only as determined by this server's stable storage (as | necessarily only as determined by this server's stable storage (as | |||
that may have been lost), then the waiting time discussed above may | that may have been lost), then the waiting time discussed above may | |||
be skipped, and the server may transition immediately to RECOVER-DONE | be skipped, and the server MAY transition immediately to RECOVER-DONE | |||
state. | state. | |||
If the server has never before run failover, then there is no need to | If the server has never before run failover, then there is no need to | |||
wait in this state -- but, again, to determine if this server has run | wait in this state -- but, again, to determine if this server has run | |||
failover it is vital that the information provided by the partner be | failover it is vital that the information provided by the partner be | |||
utilized, since the stable storage of this server may have been lost. | utilized, since the stable storage of this server may have been lost. | |||
If communications fails while a server is in RECOVER-WAIT state, it | If communications fails while a server is in RECOVER-WAIT state, it | |||
has no effect on the operation of this state. The server SHOULD | has no effect on the operation of this state. The server SHOULD | |||
continue to operate its timer, and the timer expires during the | continue to operate its timer, and the timer expires during the | |||
skipping to change at page 41, line 7 | skipping to change at page 42, line 46 | |||
timer. | timer. | |||
9.7. RECOVER-DONE State | 9.7. RECOVER-DONE State | |||
This state exists to allow an interlocked transition for one server | This state exists to allow an interlocked transition for one server | |||
from RECOVER state and another server from PARTNER-DOWN or | from RECOVER state and another server from PARTNER-DOWN or | |||
COMMUNICATIONS-INTERRUPTED state into NORMAL state. | COMMUNICATIONS-INTERRUPTED state into NORMAL state. | |||
9.7.1. Operation in RECOVER-DONE State | 9.7.1. Operation in RECOVER-DONE State | |||
A server in RECOVER-DONE state MUST respond only to DHCPREQUEST/ | A server in RECOVER-DONE state MUST respond only to RENEW, REBIND, | |||
RENEWAL and DHCPREQUEST/REBINDING DHCP messages. | CONFIRM and INFORMATION-REQUEST client messages. | |||
9.7.2. Transition Out of RECOVER-DONE State | 9.7.2. Transition Out of RECOVER-DONE State | |||
When a server in RECOVER-DONE state determines that its partner | When a server in RECOVER-DONE state determines that its partner | |||
server has entered NORMAL or RECOVER-DONE state, then it will | server has entered NORMAL or RECOVER-DONE state, then it will | |||
transition into NORMAL state. | transition into NORMAL state. | |||
If communications fails while in RECOVER-DONE state, a server will | If communication fails while in RECOVER-DONE state, a server will | |||
stay in RECOVER-DONE state. | stay in RECOVER-DONE state. | |||
9.8. NORMAL State | 9.8. NORMAL State | |||
NORMAL state is the state used by a server when it is communicating | NORMAL state is the state used by a server when it is communicating | |||
with the other server, and any required resynchronization has been | with the other server, and any required resynchronization has been | |||
performed. While some bindings database synchronization is performed | performed. While some bindings database synchronization is performed | |||
in NORMAL state, potential conflicts are resolved prior to entry into | in NORMAL state, potential conflicts are resolved prior to entry into | |||
NORMAL state as is binding database data loss. | NORMAL state as is binding database data loss. | |||
When entering NORMAL state, a server will send to the other server | When entering NORMAL state, a server will send to the other server | |||
all currently unacknowledged binding updates as BNDUPD messages. | all currently unacknowledged binding updates as BNDUPD messages. | |||
When the above process is complete, if the server entering NORMAL | When the above process is complete, if the server entering NORMAL | |||
state is a secondary server, then it will request IP addresses for | state is a secondary server, then it will request resources | |||
allocation using the POOLREQ message. | (addresses and/or prefixes) for allocation using the POOLREQ message. | |||
9.8.1. Operation in NORMAL State | 9.8.1. Operation in NORMAL State | |||
When in NORMAL state a server will operate in the following manner: | The primary server is responsive in NORMAL state. The secondary is | |||
unresponsive in NORMAL state. | ||||
When in NORMAL state a primary server will operate in the following | ||||
manner: | ||||
Lease time calculations | Lease time calculations | |||
As discussed in Section 8.4, the lease interval given to a DHCP | As discussed in Section 8.4, the lease interval given to a DHCP | |||
client can never be more than the MCLT greater than the most | client can never be more than the MCLT greater than the most | |||
recently received potential- expiration-time from the failover | recently received potential-expiration-time from the failover | |||
partner or the current time, whichever is later. | partner or the current time, whichever is later. | |||
As long as a server adheres to this constraint, the specifics of | As long as a server adheres to this constraint, the specifics of | |||
the lease interval that it gives to a DHCP client or the value of | the lease interval that it gives to a DHCP client or the value of | |||
the potential-expiration-time sent to its failover partner are | the potential-expiration-time sent to its failover partner are | |||
implementation dependent. | implementation dependent. | |||
Lazy update of partner server | Lazy update of partner server | |||
After sending a REPLY that includes a lease update to a client, the | After sending a REPLY that includes a lease update to a client, the | |||
server servicing a DHCP client request attempts to update its | server servicing a DHCP client request attempts to update its | |||
partner with the new binding information. The server transmits both | partner with the new binding information. The server transmits both | |||
the desired valid lifetime and the actual valid lifetime. | the desired valid lifetime and the actual valid lifetime. | |||
Reallocation of IP addresses between clients | Reallocation of resources between clients | |||
Whenever a client binding is released or expires, a BNDUPD message | Whenever a client binding is released or expires, a BNDUPD message | |||
must be sent to the partner, setting the binding state to | must be sent to the partner, setting the binding state to RELEASED | |||
RELEASED or EXPIRED. However, until a BNDACK is received for this | or EXPIRED. However, until a BNDACK is received for this message, | |||
message, the IP address cannot be allocated to another client. It | the resource cannot be allocated to another client. It cannot be | |||
cannot be allocated to the same client again if a BNDUPD was sent, | allocated to the same client again if a BNDUPD was sent, otherwise | |||
otherwise it can. See Section 8.6. | it can. See Section 8.6 for details. | |||
In normal state, each server receives binding updates from its | In NORMAL state, each server receives binding updates from its | |||
partner server in BNDUPD messages. It records these in its client | partner server in BNDUPD messages. It records these in its client | |||
binding database in stable storage and then sends a corresponding | binding database in stable storage and then sends a corresponding | |||
BNDACK message to its partner server. | BNDACK message to its partner server. | |||
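The lease time constraint from the "Lease time calculations" item can be illustrated with a short Python sketch. It is an assumption-laden example rather than a specified algorithm: the function names `max_lease_interval` and `grant_lease_interval` are hypothetical, times are in seconds, and `None` stands for "no potential-expiration-time received from the partner yet".

```python
from typing import Optional


def max_lease_interval(now: float, mclt: float,
                       partner_potential_expiration: Optional[float] = None) -> float:
    """Longest lease interval allowed by the MCLT constraint.

    The limit is MCLT beyond the later of the current time and the most
    recently received potential-expiration-time from the partner.
    """
    base = now
    if partner_potential_expiration is not None:
        base = max(now, partner_potential_expiration)
    return base + mclt - now


def grant_lease_interval(desired: float, now: float, mclt: float,
                         partner_potential_expiration: Optional[float] = None) -> float:
    """Clamp the interval the server would like to give to the allowed maximum."""
    return min(desired, max_lease_interval(now, mclt, partner_potential_expiration))
```

With no information from the partner, the first lease is limited to the MCLT; later renewals can grow as the partner's view of the potential expiration time advances.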
9.8.2. Transition Out of NORMAL State | 9.8.2. Transition Out of NORMAL State | |||
If an external command is received by a server in NORMAL state | If an external command is received by a server in NORMAL state | |||
informing it that its partner is down, then transition into PARTNER- | informing it that its partner is down, then transition into PARTNER- | |||
DOWN state. Generally, this would be an unusual situation, where | DOWN state. Generally, this would be an unusual situation, where | |||
some external agency knew the partner server was down. Using the | some external agency knew the partner server was down. Using the | |||
command in this case would be appropriate if the polling interval and | command in this case would be appropriate if the polling interval and | |||
timeout were long. | timeout were long. | |||
If a server in NORMAL state fails to receive acks to messages sent to | If a server in NORMAL state fails to receive acks to messages sent to | |||
its partner for an implementation dependent period of time, it MAY | its partner for an implementation dependent period of time, it MAY | |||
move into COMMUNICATIONS-INTERRUPTED state. This situation might | move into COMMUNICATIONS-INTERRUPTED state. This situation might | |||
occur if the partner server was capable of maintaining the TCP | occur if the partner server was capable of maintaining the TCP | |||
connection between the servers and also capable of sending a CONTACT | connection between the servers and also capable of sending a CONTACT | |||
message every tSend seconds, but was (for some reason) incapable of | message periodically, but was (for some reason) incapable of | |||
processing BNDUPD messages. | processing BNDUPD messages. | |||
If the communications is determined to not be "ok" (as defined in | If the communications is determined to not be "ok" (as defined in | |||
Section 8.5), then transition into COMMUNICATIONS-INTERRUPTED state. | Section 8.5), then transition into COMMUNICATIONS-INTERRUPTED state. | |||
If a server in NORMAL state receives any messages from its partner | If a server in NORMAL state receives any messages from its partner | |||
where the partner has changed state from that expected by the server | where the partner has changed state from that expected by the server | |||
in NORMAL state, then the server should transition into | in NORMAL state, then the server should transition into | |||
COMMUNICATIONS-INTERRUPTED state and take the appropriate state | COMMUNICATIONS-INTERRUPTED state and take the appropriate state | |||
transition from there. For example, it would be expected for the partner | transition from there. For example, it would be expected for the partner | |||
skipping to change at page 43, line 12 | skipping to change at page 45, line 7 | |||
partner, the server should transition into COMMUNICATIONS-INTERRUPTED | partner, the server should transition into COMMUNICATIONS-INTERRUPTED | |||
state. | state. | |||
9.9. COMMUNICATIONS-INTERRUPTED State | 9.9. COMMUNICATIONS-INTERRUPTED State | |||
A server goes into COMMUNICATIONS-INTERRUPTED state whenever it is | A server goes into COMMUNICATIONS-INTERRUPTED state whenever it is | |||
unable to communicate with its partner. Primary and secondary | unable to communicate with its partner. Primary and secondary | |||
servers cycle automatically (without administrative intervention) | servers cycle automatically (without administrative intervention) | |||
between NORMAL and COMMUNICATIONS-INTERRUPTED state as the network | between NORMAL and COMMUNICATIONS-INTERRUPTED state as the network | |||
connection between them fails and recovers, or as the partner server | connection between them fails and recovers, or as the partner server | |||
cycles between operational and non-operational. No duplicate IP | cycles between operational and non-operational. No duplicate | |||
address allocation can occur while the servers cycle between these | resource allocation can occur while the servers cycle between these | |||
states. | states. | |||
When a server enters COMMUNICATIONS-INTERRUPTED state, if it has been | When a server enters COMMUNICATIONS-INTERRUPTED state, if it has been | |||
configured to support an automatic transition out of COMMUNICATIONS- | configured to support an automatic transition out of COMMUNICATIONS- | |||
INTERRUPTED state and into PARTNER-DOWN state (i.e., a "safe period" | INTERRUPTED state and into PARTNER-DOWN state (i.e., a "safe period" | |||
has been configured, see section 10), then a timer MUST be started | has been configured, see section TODO), then a timer MUST be started | |||
for the length of the configured safe period. | for the length of the configured safe period. | |||
A server transitioning into the COMMUNICATIONS-INTERRUPTED state from | A server transitioning into the COMMUNICATIONS-INTERRUPTED state from | |||
the NORMAL state SHOULD raise some alarm condition to alert | the NORMAL state SHOULD raise some alarm condition to alert | |||
administrative staff to a potential problem in the DHCP subsystem. | administrative staff to a potential problem in the DHCP subsystem. | |||
9.9.1. Operation in COMMUNICATIONS-INTERRUPTED State | 9.9.1. Operation in COMMUNICATIONS-INTERRUPTED State | |||
In this state a server MUST respond to all DHCP client requests. | In this state a server MUST respond to all DHCP client requests. | |||
When allocating new leases, each server allocates from its own pool, | When allocating new leases, each server allocates from its own pool, | |||
skipping to change at page 43, line 45 | skipping to change at page 45, line 40 | |||
given out by the receiving server or not, although the renewal period | given out by the receiving server or not, although the renewal period | |||
MUST NOT exceed the maximum client lead time (MCLT) beyond the latest | MUST NOT exceed the maximum client lead time (MCLT) beyond the latest | |||
of: 1) the potential valid lifetime already acknowledged by the other | of: 1) the potential valid lifetime already acknowledged by the other | |||
server, or 2) the actual valid lifetime sent to the DHCPv6 client, or | server, or 2) the actual valid lifetime sent to the DHCPv6 client, or | |||
3) the potential valid lifetime received from the partner server. | 3) the potential valid lifetime received from the partner server. | |||
However, since the server cannot communicate with its partner in this | However, since the server cannot communicate with its partner in this | |||
state, the acknowledged potential valid lifetime will not be updated | state, the acknowledged potential valid lifetime will not be updated | |||
in any new bindings. This is likely to eventually cause the actual | in any new bindings. This is likely to eventually cause the actual | |||
valid lifetimes to be the current time plus the MCLT (unless this is | valid lifetimes to be the current time plus the MCLT (unless this is | |||
greater than the desired-client-lease- time). | greater than the desired-client-lease-time). | |||
The server should continue to try to establish a connection with its | The server should continue to try to establish a connection with its | |||
partner. | partner. | |||
9.9.2. Transition Out of COMMUNICATIONS-INTERRUPTED State | 9.9.2. Transition Out of COMMUNICATIONS-INTERRUPTED State | |||
If the safe period timer expires while a server is in the | If the safe period timer expires while a server is in the | |||
COMMUNICATIONS-INTERRUPTED state, it will transition immediately into | COMMUNICATIONS-INTERRUPTED state, it will transition immediately into | |||
PARTNER-DOWN state. | PARTNER-DOWN state. | |||
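The safe period behaviour described for COMMUNICATIONS-INTERRUPTED state can be sketched as a small timer check. This is only an illustrative sketch: the `State` enum, the function name `state_after_safe_period_check`, and the use of `None` for "no safe period configured" are assumptions, and other transitions out of this state are deliberately not modelled.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    COMMUNICATIONS_INTERRUPTED = "COMMUNICATIONS-INTERRUPTED"
    PARTNER_DOWN = "PARTNER-DOWN"


def state_after_safe_period_check(entered_at: float, now: float,
                                  safe_period: Optional[float]) -> State:
    """Apply the safe period timer started on entry to COMMUNICATIONS-INTERRUPTED.

    If a safe period is configured and has elapsed, the server moves to
    PARTNER-DOWN; otherwise it stays in COMMUNICATIONS-INTERRUPTED.
    """
    if safe_period is not None and now - entered_at >= safe_period:
        return State.PARTNER_DOWN
    return State.COMMUNICATIONS_INTERRUPTED
```

Leaving the safe period unconfigured keeps the transition to PARTNER-DOWN under operator control.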
skipping to change at page 48, line 44 | skipping to change at page 50, line 6 | |||
If communications is restored with the other server, then the server | If communications is restored with the other server, then the server | |||
in RESOLUTION-INTERRUPTED state will transition into POTENTIAL- | in RESOLUTION-INTERRUPTED state will transition into POTENTIAL- | |||
CONFLICT state. | CONFLICT state. | |||
9.12. CONFLICT-DONE State | 9.12. CONFLICT-DONE State | |||
This state indicates that during the process where the two servers | This state indicates that during the process where the two servers | |||
are attempting to re-integrate with each other, the primary server | are attempting to re-integrate with each other, the primary server | |||
has received all of the updates from the secondary server. It makes a | has received all of the updates from the secondary server. It makes a | |||
transition into CONFLICT-DONE state in order that it may be totally | transition into CONFLICT-DONE state in order that it may be totally | |||
responsive to the client load, as opposed to NORMAL state where it | responsive to the client load. There is no operational difference | |||
would be in a "balanced" responsive state, running the load balancing | between CONFLICT-DONE and NORMAL for primary as in both states it | |||
algorithm. | responds to all clients' requests. The distinction between CONFLICT- | |||
DONE and NORMAL states will be more apparent once the load balancing | ||||
extension is defined. | ||||
9.12.1. Operation in CONFLICT-DONE State | 9.12.1. Operation in CONFLICT-DONE State | |||
A primary server in CONFLICT-DONE state is fully responsive to all | A primary server in CONFLICT-DONE state is fully responsive to all | |||
DHCP clients (similar to the situation in COMMUNICATIONS-INTERRUPTED | DHCP clients (similar to the situation in COMMUNICATIONS-INTERRUPTED | |||
state). | state). | |||
If communications fails, remain in CONFLICT-DONE state. If | If communications fails, remain in CONFLICT-DONE state. If | |||
communications becomes OK, remain in CONFLICT-DONE state until the | communications becomes OK, remain in CONFLICT-DONE state until the | |||
conditions for transition out become satisfied. | conditions for transition out become satisfied. | |||
skipping to change at page 49, line 46 | skipping to change at page 51, line 11 | |||
Discussion: Do DHCPv6 clients actually do this? DHCPv4 clients were | Discussion: Do DHCPv6 clients actually do this? DHCPv4 clients were | |||
rumored to wait for a "while" to accept the best offer, but to a | rumored to wait for a "while" to accept the best offer, but to a | |||
first approximation, they all take the first offer they receive that | first approximation, they all take the first offer they receive that | |||
is even acceptable. | is even acceptable. | |||
The benefit of this approach, compared to the "basic" active--passive | The benefit of this approach, compared to the "basic" active--passive | |||
solution is that there is no delay between primary failure and the | solution is that there is no delay between primary failure and the | |||
moment when secondary starts serving requests. | moment when secondary starts serving requests. | |||
Discussion: Setting both servers' preference to an equal value could | ||||
theoretically work as a crude attempt to provide load balancing. It | ||||
wouldn't do much good on its own, as one (faster) server could be | ||||
chosen more frequently (assuming that, with equal preference set, | ||||
clients will pick the first responding server, which is not mandated | ||||
by DHCPv6). We could design a simple mechanism of dynamically | ||||
updating preference depending on usage of available resources. This | ||||
concept hasn't been investigated in detail yet. | ||||
11. Dynamic DNS Considerations | 11. Dynamic DNS Considerations | |||
DHCP servers (and clients) can use DNS Dynamic Updates as described | DHCP servers (and clients) can use DNS Dynamic Updates as described | |||
in RFC 2136 [RFC2136] to maintain DNS name-mappings as they maintain | in RFC 2136 [RFC2136] to maintain DNS name-mappings as they maintain | |||
DHCP leases. Many different administrative models for DHCP-DNS | DHCP leases. Many different administrative models for DHCP-DNS | |||
integration are possible. Descriptions of several of these models, | integration are possible. Descriptions of several of these models, | |||
and guidelines that DHCP servers and clients should follow in | and guidelines that DHCP servers and clients should follow in | |||
carrying them out, are laid out in RFC 4704 [RFC4704]. | carrying them out, are laid out in RFC 4704 [RFC4704]. | |||
The nature of the failover protocol introduces some issues concerning | The nature of the failover protocol introduces some issues concerning | |||
skipping to change at page 56, line 19 | skipping to change at page 57, line 22 | |||
resource is no longer reserved. | resource is no longer reserved. | |||
13. Security Considerations | 13. Security Considerations | |||
DHCPv6 failover is an extension of a standard DHCPv6 protocol, so all | DHCPv6 failover is an extension of a standard DHCPv6 protocol, so all | |||
security considerations from [RFC3315], Section 23 and [RFC3633], | security considerations from [RFC3315], Section 23 and [RFC3633], | |||
Section 15 related to the server apply. | Section 15 related to the server apply. | |||
As traffic exchange between clients and server is not encrypted, an | As traffic exchange between clients and server is not encrypted, an | |||
attacker that has penetrated the network and is able to intercept | attacker that has penetrated the network and is able to intercept | |||
traffic, will not gain anything by also sniffing communication | traffic, will not gain any additional information by also sniffing | |||
between partners. | communication between partners. | |||
An attacker that can impersonate one partner can efficiently perform | An attacker that is able to impersonate one partner can efficiently | |||
a denial of service attack on the remaining uncompromised server. | perform a denial of service attack on the remaining uncompromised | |||
Several techniques may be used: pretending that conflict resolution | server. Several techniques may be used: pretending that conflict | |||
is required, requesting rebalance, claiming that a valid lease was | resolution is required, requesting rebalance, claiming that a valid | |||
released or declined etc. For that reason the communication between | lease was released or declined etc. For that reason the | |||
servers SHOULD support failover connections over TLS, as explained in | communication between servers SHOULD support failover connections | |||
Section 5.1. Such a secure connection SHOULD be optional and | over TLS, as explained in Section 5.1. Such a secure | |||
configurable by the administrator. | connection SHOULD be optional and configurable by the administrator. | |||
A server MUST NOT operate in PARTNER-DOWN if its partner is up. | ||||
The network administrator is expected to switch the remaining active | ||||
server to PARTNER-DOWN state only if he or she is sure that the other | ||||
server is indeed down. Failing to obey this requirement will likely | ||||
result in both servers assigning duplicate leases to different | ||||
clients. Implementors should take that into consideration if they | ||||
decide to implement timer-based transition to PARTNER-DOWN state. | ||||
Running a network protected by DHCPv6 failover requires more | ||||
resources than running without it. In particular, some of the | ||||
resources are allocated to the secondary server and they are not | ||||
usable in normal (i.e., non-failure) operation. While limiting | ||||
this pool may be preferable from a resource utilisation perspective, | ||||
the pool must be reasonably large, so that the secondary may take | ||||
over once the primary becomes unavailable. | ||||
TODO: Security considerations section contains loose notes and will | TODO: Security considerations section contains loose notes and will | |||
be transformed into consistent text once the core design solidifies. | be transformed into consistent text once the core design solidifies. | |||
14. IANA Considerations | 14. IANA Considerations | |||
IANA is not requested to perform any actions at this time. | IANA is not requested to perform any actions at this time. | |||
15. Acknowledgements | 15. Acknowledgements | |||
This document extensively uses concepts, definitions and other parts | This document extensively uses concepts, definitions and other parts | |||
of the [dhcpv4-failover] document. The authors would like to thank Shawn | of the [dhcpv4-failover] document. The authors would like to thank Shawn | |||
Routher, Greg Rabil, and Bernie Volz for their significant | Routher, Greg Rabil, and Bernie Volz for their significant | |||
involvement and contributions. Authors would like to thank | involvement and contributions. Authors would like to thank Marcin | |||
VithalPrasad Gaitonde for his insightful comments. | Siodelski for his thorough review and VithalPrasad Gaitonde for his | |||
insightful comments. | ||||
This work has been partially supported by Department of Computer | This work has been partially supported by Department of Computer | |||
Communications (a division of Gdansk University of Technology) and | Communications (a division of Gdansk University of Technology) and | |||
the Polish Ministry of Science and Higher Education under the | the Polish Ministry of Science and Higher Education under the | |||
European Regional Development Fund, Grant No. POIG.01.01.02-00-045/ | European Regional Development Fund, Grant No. POIG.01.01.02-00-045/ | |||
09-00 (Future Internet Engineering Project). | 09-00 (Future Internet Engineering Project). | |||
16. References | 16. References | |||
16.1. Normative References | 16.1. Normative References | |||
[I-D.ietf-dhc-dhcpv6-client-link-layer-addr-opt] | ||||
Halwasia, G., Systems, C., and W. Dec, "Client Link-layer | ||||
Address Option in DHCPv6", | ||||
draft-ietf-dhc-dhcpv6-client-link-layer-addr-opt-03 (work | ||||
in progress), October 2012. | ||||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
[RFC3315] Droms, R., Bound, J., Volz, B., Lemon, T., Perkins, C., | [RFC3315] Droms, R., Bound, J., Volz, B., Lemon, T., Perkins, C., | |||
and M. Carney, "Dynamic Host Configuration Protocol for | and M. Carney, "Dynamic Host Configuration Protocol for | |||
IPv6 (DHCPv6)", RFC 3315, July 2003. | IPv6 (DHCPv6)", RFC 3315, July 2003. | |||
[RFC3633] Troan, O. and R. Droms, "IPv6 Prefix Options for Dynamic | [RFC3633] Troan, O. and R. Droms, "IPv6 Prefix Options for Dynamic | |||
Host Configuration Protocol (DHCP) version 6", RFC 3633, | Host Configuration Protocol (DHCP) version 6", RFC 3633, | |||
December 2003. | December 2003. | |||
[RFC4703] Stapp, M. and B. Volz, "Resolution of Fully Qualified | [RFC4703] Stapp, M. and B. Volz, "Resolution of Fully Qualified | |||
Domain Name (FQDN) Conflicts among Dynamic Host | Domain Name (FQDN) Conflicts among Dynamic Host | |||
Configuration Protocol (DHCP) Clients", RFC 4703, | Configuration Protocol (DHCP) Clients", RFC 4703, October | |||
October 2006. | 2006. | |||
[RFC4704] Volz, B., "The Dynamic Host Configuration Protocol for | [RFC4704] Volz, B., "The Dynamic Host Configuration Protocol for | |||
IPv6 (DHCPv6) Client Fully Qualified Domain Name (FQDN) | IPv6 (DHCPv6) Client Fully Qualified Domain Name (FQDN) | |||
Option", RFC 4704, October 2006. | Option", RFC 4704, October 2006. | |||
[RFC6939] Halwasia, G., Bhandari, S., and W. Dec, "Client Link-Layer | ||||
Address Option in DHCPv6", RFC 6939, May 2013. | ||||
16.2. Informative References | 16.2. Informative References | |||
[I-D.ietf-dhc-dhcpv6-failover-requirements] | [I-D.ietf-dhc-dhcpv6-failover-requirements] | |||
Mrugalski, T. and K. Kinnear, "DHCPv6 Failover | Mrugalski, T. and K. Kinnear, "DHCPv6 Failover | |||
Requirements", | Requirements", draft-ietf-dhc-dhcpv6-failover- | |||
draft-ietf-dhc-dhcpv6-failover-requirements-02 (work in | requirements-06 (work in progress), July 2013. | |||
progress), September 2012. | ||||
[I-D.ietf-dhc-dhcpv6-load-balancing] | ||||
Kostur, A., "DHC Load Balancing Algorithm for DHCPv6", | ||||
draft-ietf-dhc-dhcpv6-load-balancing-00 (work in | ||||
progress), December 2012. | ||||
[RFC2136] Vixie, P., Thomson, S., Rekhter, Y., and J. Bound, | [RFC2136] Vixie, P., Thomson, S., Rekhter, Y., and J. Bound, | |||
"Dynamic Updates in the Domain Name System (DNS UPDATE)", | "Dynamic Updates in the Domain Name System (DNS UPDATE)", | |||
RFC 2136, April 1997. | RFC 2136, April 1997. | |||
[RFC4649] Volz, B., "Dynamic Host Configuration Protocol for IPv6 | [RFC4649] Volz, B., "Dynamic Host Configuration Protocol for IPv6 | |||
(DHCPv6) Relay Agent Remote-ID Option", RFC 4649, | (DHCPv6) Relay Agent Remote-ID Option", RFC 4649, August | |||
August 2006. | 2006. | |||
[RFC5007] Brzozowski, J., Kinnear, K., Volz, B., and S. Zeng, | [RFC5007] Brzozowski, J., Kinnear, K., Volz, B., and S. Zeng, | |||
"DHCPv6 Leasequery", RFC 5007, September 2007. | "DHCPv6 Leasequery", RFC 5007, September 2007. | |||
[RFC5460] Stapp, M., "DHCPv6 Bulk Leasequery", RFC 5460, | [RFC5460] Stapp, M., "DHCPv6 Bulk Leasequery", RFC 5460, February | |||
February 2009. | 2009. | |||
[dhcpv4-failover] | [dhcpv4-failover] | |||
Droms, R., Kinnear, K., Stapp, M., Volz, B., Gonczi, S., | Droms, R., Kinnear, K., Stapp, M., Volz, B., Gonczi, S., | |||
Rabil, G., Dooley, M., and A. Kapur, "DHCP Failover | Rabil, G., Dooley, M., and A. Kapur, "DHCP Failover | |||
Protocol", draft-ietf-dhc-failover-12 (work in progress), | Protocol", draft-ietf-dhc-failover-12 (work in progress), | |||
March 2003. | March 2003. | |||
Authors' Addresses | Authors' Addresses | |||
Tomasz Mrugalski | Tomasz Mrugalski | |||
End of changes. 126 change blocks. | ||||
406 lines changed or deleted | 528 lines changed or added | |||