draft-ietf-rtgwg-backoff-algo-04.txt   draft-ietf-rtgwg-backoff-algo-05.txt 
Network Working Group B. Decraene Network Working Group B. Decraene
Internet-Draft Orange Internet-Draft Orange
Intended status: Standards Track S. Litkowski Intended status: Standards Track S. Litkowski
Expires: July 13, 2017 Orange Business Service Expires: November 1, 2017 Orange Business Service
H. Gredler H. Gredler
RtBrick Inc RtBrick Inc
A. Lindem A. Lindem
Cisco Systems Cisco Systems
P. Francois P. Francois
C. Bowers C. Bowers
Juniper Networks, Inc. Juniper Networks, Inc.
January 9, 2017 April 30, 2017
SPF Back-off algorithm for link state IGPs SPF Back-off algorithm for link state IGPs
draft-ietf-rtgwg-backoff-algo-04 draft-ietf-rtgwg-backoff-algo-05
Abstract Abstract
This document defines a standard algorithm to back-off link-state IGP This document defines a standard algorithm to back-off link-state IGP
SPF computations. SPF computations.
Having one standard algorithm improves interoperability by reducing Having one standard algorithm improves interoperability by reducing
the probability and/or duration of transient forwarding loops during the probability and/or duration of transient forwarding loops during
the IGP convergence when the IGP reacts to multiple proximate IGP the IGP convergence when the IGP reacts to multiple temporally close
events. IGP events.
Requirements Language Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. document are to be interpreted as described in [RFC2119].
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
skipping to change at page 2, line 4 skipping to change at page 2, line 4
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on July 13, 2017. This Internet-Draft will expire on November 1, 2017.
Copyright Notice Copyright Notice
Copyright (c) 2017 IETF Trust and the persons identified as the Copyright (c) 2017 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 26 skipping to change at page 2, line 26
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. High level goals . . . . . . . . . . . . . . . . . . . . . . 3 2. High level goals . . . . . . . . . . . . . . . . . . . . . . 3
3. Definitions and parameters . . . . . . . . . . . . . . . . . 4 3. Definitions and parameters . . . . . . . . . . . . . . . . . 4
4. Principles of SPF delay algorithm . . . . . . . . . . . . . . 4 4. Principles of SPF delay algorithm . . . . . . . . . . . . . . 5
5. Specification of the SPF delay state machine . . . . . . . . 5 5. Specification of the SPF delay state machine . . . . . . . . 5
5.1. States . . . . . . . . . . . . . . . . . . . . . . . . . 5 5.1. States . . . . . . . . . . . . . . . . . . . . . . . . . 5
5.2. States Transitions . . . . . . . . . . . . . . . . . . . 6 5.2. States Transitions . . . . . . . . . . . . . . . . . . . 6
5.3. FSM Events . . . . . . . . . . . . . . . . . . . . . . . 7 5.3. FSM Events . . . . . . . . . . . . . . . . . . . . . . . 7
6. Parameters . . . . . . . . . . . . . . . . . . . . . . . . . 8 6. Parameters . . . . . . . . . . . . . . . . . . . . . . . . . 9
7. Impact on micro-loops . . . . . . . . . . . . . . . . . . . . 9 7. Partial Deployment . . . . . . . . . . . . . . . . . . . . . 9
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9 8. Impact on micro-loops . . . . . . . . . . . . . . . . . . . . 10
9. Security considerations . . . . . . . . . . . . . . . . . . . 9 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10
10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 9 10. Security considerations . . . . . . . . . . . . . . . . . . . 10
11. References . . . . . . . . . . . . . . . . . . . . . . . . . 9 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 10
11.1. Normative References . . . . . . . . . . . . . . . . . . 9 12. References . . . . . . . . . . . . . . . . . . . . . . . . . 10
11.2. Informative References . . . . . . . . . . . . . . . . . 9 12.1. Normative References . . . . . . . . . . . . . . . . . . 10
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 10 12.2. Informative References . . . . . . . . . . . . . . . . . 11
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 11
1. Introduction 1. Introduction
Link state IGPs, such as IS-IS [ISO10589-Second-Edition] and OSPF Link state IGPs, such as IS-IS [ISO10589-Second-Edition] and OSPF
[RFC2328], perform distributed route computation on all routers in [RFC2328], perform distributed route computation on all routers in
the area/level. In order to have consistent routing tables across the area/level. In order to have consistent routing tables across
the network, such distributed computation requires that all routers the network, such distributed computation requires that all routers
have the same version of the network topology (Link State DataBase have the same version of the network topology (Link State DataBase
(LSDB)) and perform their computation at the same time. (LSDB)) and perform their computation at the same time.
In general, when the network is stable, there is a desire to compute In general, when the network is stable, there is a desire to compute
a new SPF as soon as a failure is detected in order to quickly route a new SPF as soon as a failure is detected in order to quickly route
around the failure. However, when the network is experiencing around the failure. However, when the network is experiencing
multiple proximate failures over a short period of time, there is a multiple temporally close failures over a short period of time, there
conflicting desire to limit the frequency of SPF computations. is a conflicting desire to limit the frequency of SPF computations.
Indeed, this allows a reduction in control plane resources used by Indeed, this allows a reduction in control plane resources used by
IGPs and all protocols/subsystems reacting on the attendant route IGPs and all protocols/subsystems reacting on the attendant route
change, such as LDP, RSVP-TE, BGP, Fast ReRoute computations, FIB change, such as LDP, RSVP-TE, BGP, Fast ReRoute computations, FIB
updates... This also reduces the churn on routers and in the network updates... This also reduces the churn on routers and in the network
and, in particular, reduces the side effects such as micro-loops that and, in particular, reduces the side effects such as micro-loops that
ensue during IGP convergence. ensue during IGP convergence.
To allow for this, IGPs implement an SPF back-off algorithm. To allow for this, IGPs implement an SPF back-off algorithm.
However, different implementations have chosen different algorithms. However, different implementations have chosen different algorithms.
Hence, in a multi-vendor network, it's not possible to ensure that Hence, in a multi-vendor network, it's not possible to ensure that
skipping to change at page 3, line 35 skipping to change at page 3, line 38
computations for the same duration, this document specifies a computations for the same duration, this document specifies a
standard algorithm. Optionally, implementations may offer standard algorithm. Optionally, implementations may offer
alternative algorithms. alternative algorithms.
2. High level goals 2. High level goals
The high level goals of this algorithm are the following: The high level goals of this algorithm are the following:
o Very fast convergence for a single event (e.g., link failure). o Very fast convergence for a single event (e.g., link failure).
o Paced fast convergence for multiple proximate IGP events while IGP o Paced fast convergence for multiple temporally close IGP events
stability is considered acceptable. while IGP stability is considered acceptable.
o Delayed convergence when IGP stability is problematic. This will o Delayed convergence when IGP stability is problematic. This will
allow the IGP and related processes to conserve resources during allow the IGP and related processes to conserve resources during
the period of instability. the period of instability.
o Always try to avoid different SPF_DELAY timer values across o Always try to avoid different SPF_DELAY timer values across
different routers in the area/level, even though not all routers different routers in the area/level, even though not all routers
will receive IGP messages at the same time, due to differences will receive IGP messages at the same time, due to differences
both in the distance from the originator of the IGP event and in both in the distance from the originator of the IGP event and in
flooding implementations. flooding implementations.
skipping to change at page 4, line 43 skipping to change at page 4, line 43
seconds. Note that this allows the IGP network to stabilize. seconds. Note that this allows the IGP network to stabilize.
TIME_TO_LEARN_INTERVAL: This is the maximum duration typically needed TIME_TO_LEARN_INTERVAL: This is the maximum duration typically needed
to learn all the IGP events related to a single component failure to learn all the IGP events related to a single component failure
(e.g., router failure, SRLG failure), e.g., 1 second. It's mostly (e.g., router failure, SRLG failure), e.g., 1 second. It's mostly
dependent on failure detection time variation between all routers dependent on failure detection time variation between all routers
that are adjacent to the failure. Additionally, it may depend on the that are adjacent to the failure. Additionally, it may depend on the
different IGP implementations across the network, related to different IGP implementations across the network, related to
origination and flooding of their link state advertisements. origination and flooding of their link state advertisements.
HOLD_DOWN_INTERVAL: The time required with no received IGP events HOLDDOWN_INTERVAL: The time required with no received IGP events
before considering the IGP to be stable again and allowing the before considering the IGP to be stable again and allowing the
SPF_DELAY to be restored to INITIAL_WAIT. e.g., 3 seconds. SPF_DELAY to be restored to INITIAL_SPF_DELAY. e.g., 3 seconds.
SPF_TIMER: The Finite State Machine (FSM) abstract timer that uses
the computed SPF delay. Upon expiration, the Route Table Computation
(as defined above) is performed.
4. Principles of SPF delay algorithm 4. Principles of SPF delay algorithm
For this first IGP event, we assume that there has been a single For this first IGP event, we assume that there has been a single
simple change in the network which can be taken into account using a simple change in the network which can be taken into account using a
single routing computation (e.g., link failure, prefix (metric) single routing computation (e.g., link failure, prefix (metric)
change) and we optimize for very fast convergence, delaying the change) and we optimize for very fast convergence, delaying the
routing computation by INITIAL_SPF_DELAY. Under this assumption, routing computation by INITIAL_SPF_DELAY. Under this assumption,
there is no benefit in delaying the routing computation. In a there is no benefit in delaying the routing computation. In a
typical network, this is the most common type of IGP event. Hence, typical network, this is the most common type of IGP event. Hence,
skipping to change at page 8, line 21 skipping to change at page 9, line 7
Event 6: HOLDDOWN_TIMER expiration, while in SHORT_WAIT. Event 6: HOLDDOWN_TIMER expiration, while in SHORT_WAIT.
Actions on event 6: Actions on event 6:
o Deactivate LEARN_TIMER. o Deactivate LEARN_TIMER.
o Transition to QUIET state. o Transition to QUIET state.
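As an illustration only, the actions on event 6 could be sketched in Python roughly as follows. The class name SpfDelayFsm, the boolean timer flags, and the method name are hypothetical and are not taken from the draft. Only the QUIET and SHORT_WAIT states and the two listed actions come from the text above; the other states and events of the FSM are deliberately left out.

```python
from enum import Enum


class State(Enum):
    # Only the two states named in event 6 are modelled in this sketch.
    QUIET = "QUIET"
    SHORT_WAIT = "SHORT_WAIT"


class SpfDelayFsm:
    """Hypothetical, partial model of the SPF delay state machine."""

    def __init__(self):
        self.state = State.QUIET
        # Timers are reduced to "active / not active" flags (an assumption
        # made for brevity; a real implementation would use actual timers).
        self.learn_timer_active = False
        self.holddown_timer_active = False

    def on_holddown_timer_expired(self):
        """Apply event 6; return True if the event 6 actions were taken."""
        # The timer that just expired is no longer running.
        self.holddown_timer_active = False
        if self.state is not State.SHORT_WAIT:
            # Expiration in other states is outside the scope of this sketch.
            return False
        self.learn_timer_active = False  # Deactivate LEARN_TIMER.
        self.state = State.QUIET         # Transition to QUIET state.
        return True
```

Returning a boolean is a design choice of this sketch so that a caller can tell whether the expiration was handled here or belongs to an event that is not modelled.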
6. Parameters 6. Parameters
All the parameters MUST be configurable. All the delays All the parameters MUST be configurable [I-D.ietf-isis-yang-isis-cfg]
[I-D.ietf-ospf-yang] at the protocol instance granularity. They MAY
be configurable at the area/level granularity. All the delays
(INITIAL_SPF_DELAY, SHORT_SPF_DELAY, LONG_SPF_DELAY, (INITIAL_SPF_DELAY, SHORT_SPF_DELAY, LONG_SPF_DELAY,
TIME_TO_LEARN_INTERVAL, HOLD_DOWN_INTERVAL) SHOULD be configurable at TIME_TO_LEARN_INTERVAL, HOLDDOWN_INTERVAL) SHOULD be configurable at
the millisecond granularity. They MUST be configurable at least at the millisecond granularity. They MUST be configurable at least at
the tenth of second granularity. The configurable range for all the the tenth of second granularity. The configurable range for all the
parameters SHOULD at least be from 0 milliseconds to 60 seconds. parameters SHOULD at least be from 0 milliseconds to 60 seconds.
This document does not propose default values for the parameters This document does not propose default values for the parameters
because these values are expected to be context dependent. because these values are expected to be context dependent.
Implementations are free to propose their own default values. Implementations are free to propose their own default values.
In order to satisfy the goals stated in Section 2, operators are
RECOMMENDED to configure delay intervals such that INITIAL_SPF_DELAY
<= SHORT_SPF_DELAY and SHORT_SPF_DELAY <= LONG_SPF_DELAY.
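The configuration guidance above could be checked with a small helper along the following lines. This is a sketch under stated assumptions, not part of the draft: the SpfBackoffParams class, the check_params function, and the choice to report a violated RECOMMENDED ordering as a warning rather than an error are all hypothetical. Values are held in milliseconds, matching the finest granularity the text asks for.

```python
from dataclasses import dataclass

# Upper end of the range the text says SHOULD at least be configurable.
MAX_COMMON_DELAY_MS = 60_000


@dataclass(frozen=True)
class SpfBackoffParams:
    """Delay parameters in milliseconds (field names mirror the draft)."""
    initial_spf_delay: int
    short_spf_delay: int
    long_spf_delay: int
    time_to_learn_interval: int
    holddown_interval: int


def check_params(p):
    """Return (errors, warnings) as two lists of strings."""
    errors, warnings = [], []
    for name, value in vars(p).items():
        if value < 0:
            errors.append(f"{name} must not be negative")
        elif value > MAX_COMMON_DELAY_MS:
            # Assumption: larger values are legal but may not be
            # configurable on every implementation.
            warnings.append(f"{name} exceeds 60 seconds")
    # RECOMMENDED ordering of the three SPF delays.
    if p.initial_spf_delay > p.short_spf_delay:
        warnings.append("INITIAL_SPF_DELAY > SHORT_SPF_DELAY")
    if p.short_spf_delay > p.long_spf_delay:
        warnings.append("SHORT_SPF_DELAY > LONG_SPF_DELAY")
    return errors, warnings
```

For example, check_params(SpfBackoffParams(50, 200, 5000, 1000, 10000)) reports neither errors nor warnings; the numeric values here are arbitrary illustrations, not defaults proposed by the draft.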
When setting (default) values, one SHOULD consider the customers and When setting (default) values, one SHOULD consider the customers and
their application requirements, the computational power of the their application requirements, the computational power of the
routers, the size of the network, and, in particular, the number of routers, the size of the network, and, in particular, the number of
IP prefixes advertised in the IGP, the frequency and number of IGP IP prefixes advertised in the IGP, the frequency and number of IGP
events, the number of protocols reactions/computations triggered by events, the number of protocols reactions/computations triggered by
IGP SPF (e.g., BGP, PCEP, Traffic Engineering CSPF, Fast ReRoute IGP SPF (e.g., BGP, PCEP, Traffic Engineering CSPF, Fast ReRoute
computations). computations).
Note that some or all of these factors may change over the life of Note that some or all of these factors may change over the life of
the network. In case of doubt, it's RECOMMENDED to start with the network. In case of doubt, it's RECOMMENDED to start with
conservative, i.e., longer, timers. conservative, i.e., longer, timers.
For the standard algorithm to be effective in mitigating micro-loops, For the standard algorithm to be effective in mitigating micro-loops,
it is RECOMMENDED that all routers in the IGP domain, or at least all it is RECOMMENDED that all routers in the IGP domain, or at least all
the routers in the same area/level, have exactly the same configured the routers in the same area/level, have exactly the same configured
values. values.
7. Impact on micro-loops 7. Partial Deployment
In general, the SPF delay algorithm is only effective in mitigating
micro-loops if it is deployed on all routers in the IGP domain or, at
least, all routers in an IGP area/level. The impact of partial
deployment is based on the particular event, topology, and the SPF
algorithm(s) used on other routers in the IGP area/level. In cases
where the previous SPF algorithm was implemented uniformly, partial
deployment will increase the frequency and duration of micro-loops.
Hence, it is RECOMMENDED that all routers in the IGP domain or at
least within the same area/level be migrated to the SPF algorithm
described herein at roughly the same time.
Note that this is not a new consideration as, over time, network
operators have changed SPF delay parameters in order to accommodate
new customer requirements for fast convergence, as permitted by new
software and hardware. They may also have progressively replaced an
implementation with a given SPF delay algorithm by another
implementation with a different one.
8. Impact on micro-loops
Micro-loops during IGP convergence are due to a non-synchronized or Micro-loops during IGP convergence are due to a non-synchronized or
non-ordered update of the forwarding information tables (FIB) non-ordered update of the forwarding information tables (FIB)
[RFC5715] [RFC6976] [I-D.ietf-rtgwg-spf-uloop-pb-statement]. FIBs [RFC5715] [RFC6976] [I-D.ietf-rtgwg-spf-uloop-pb-statement]. FIBs
are installed after multiple steps such as SPF wait time, SPF are installed after multiple steps such as SPF wait time, SPF
computation, FIB distribution, and FIB update. This document only computation, FIB distribution, and FIB update. This document only
addresses the first contribution. This standardized procedure addresses the first contribution. This standardized procedure
reduces the probability and/or duration of micro-loops when IGPs reduces the probability and/or duration of micro-loops when IGPs
experience multiple proximate events. It does not prevent all micro- experience multiple temporally close events. It does not prevent all
loops. However, it is beneficial and is less complex and costly to micro-loops. However, it is beneficial and is less complex and
implement when compared to full solutions such as [RFC5715] or costly to implement when compared to full solutions such as [RFC5715]
[RFC6976]. or [RFC6976].
8. IANA Considerations 9. IANA Considerations
No IANA actions required. No IANA actions required.
9. Security considerations 10. Security considerations
The algorithm presented in this document does not compromise IGP The algorithm presented in this document does not compromise IGP
security. An attacker having the ability to generate IGP events security. An attacker having the ability to generate IGP events
would be able to delay the IGP convergence time. The LONG_SPF_DELAY would be able to delay the IGP convergence time. The LONG_SPF_DELAY
state may help mitigate the effects of Denial-of-Service (DOS) state may help mitigate the effects of Denial-of-Service (DOS)
attacks generating many IGP events. attacks generating many IGP events.
10. Acknowledgements 11. Acknowledgements
We would like to acknowledge Les Ginsberg, Uma Chunduri, and Mike We would like to acknowledge Les Ginsberg, Uma Chunduri, Mike Shand
Shand for the discussions and comments related to this document. and Alexander Vainshtein for the discussions and comments related to
this document.
11. References 12. References
11.1. Normative References 12.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<http://www.rfc-editor.org/info/rfc2119>. <http://www.rfc-editor.org/info/rfc2119>.
11.2. Informative References 12.2. Informative References
[I-D.ietf-isis-yang-isis-cfg]
Litkowski, S., Yeung, D., Lindem, A., Zhang, Z., and L.
Lhotka, "YANG Data Model for IS-IS protocol", draft-ietf-
isis-yang-isis-cfg-17 (work in progress), March 2017.
[I-D.ietf-ospf-yang]
Yeung, D., Qu, Y., Zhang, Z., Chen, I., and A. Lindem,
"Yang Data Model for OSPF Protocol", draft-ietf-ospf-
yang-07 (work in progress), March 2017.
[I-D.ietf-rtgwg-spf-uloop-pb-statement] [I-D.ietf-rtgwg-spf-uloop-pb-statement]
Litkowski, S., Decraene, B., and M. Horneffer, "Link State Litkowski, S., Decraene, B., and M. Horneffer, "Link State
protocols SPF trigger and delay algorithm impact on IGP protocols SPF trigger and delay algorithm impact on IGP
micro-loops", draft-ietf-rtgwg-spf-uloop-pb-statement-02 micro-loops", draft-ietf-rtgwg-spf-uloop-pb-statement-03
(work in progress), December 2015. (work in progress), March 2017.
[ISO10589-Second-Edition] [ISO10589-Second-Edition]
International Organization for Standardization, International Organization for Standardization,
"Intermediate system to Intermediate system intra-domain "Intermediate system to Intermediate system intra-domain
routeing information exchange protocol for use in routeing information exchange protocol for use in
conjunction with the protocol for providing the conjunction with the protocol for providing the
connectionless-mode Network Service (ISO 8473)", ISO/ connectionless-mode Network Service (ISO 8473)", ISO/
IEC 10589:2002, Second Edition, Nov 2002. IEC 10589:2002, Second Edition, Nov 2002.
[RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, [RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328,
 End of changes. 24 change blocks. 
39 lines changed or deleted 81 lines changed or added
