draft-ietf-tcpm-alternativebackoff-ecn-07.txt | draft-ietf-tcpm-alternativebackoff-ecn-08.txt | |||
---|---|---|---|---|
Network Working Group N. Khademi | Network Working Group N. Khademi | |||
Internet-Draft M. Welzl | Internet-Draft M. Welzl | |||
Intended status: Experimental University of Oslo | Intended status: Experimental University of Oslo | |||
Expires: September 21, 2018 G. Armitage | Expires: February 8, 2019 G. Armitage | |||
Swinburne University of Technology | Netflix | |||
G. Fairhurst | G. Fairhurst | |||
University of Aberdeen | University of Aberdeen | |||
March 20, 2018 | August 7, 2018 | |||
TCP Alternative Backoff with ECN (ABE) | TCP Alternative Backoff with ECN (ABE) | |||
draft-ietf-tcpm-alternativebackoff-ecn-07 | draft-ietf-tcpm-alternativebackoff-ecn-08 | |||
Abstract | Abstract | |||
Active Queue Management (AQM) mechanisms allow for burst tolerance | Active Queue Management (AQM) mechanisms allow for burst tolerance | |||
while enforcing short queues to minimise the time that packets spend | while enforcing short queues to minimise the time that packets spend | |||
enqueued at a bottleneck. This can cause noticeable performance | enqueued at a bottleneck. This can cause noticeable performance | |||
degradation for TCP connections traversing such a bottleneck, | degradation for TCP connections traversing such a bottleneck, | |||
especially if there are only a few flows or their bandwidth-delay- | especially if there are only a few flows or their bandwidth-delay- | |||
product is large. An Explicit Congestion Notification (ECN) signal | product is large. The reception of a Congestion Experienced (CE) ECN | |||
indicates that an AQM mechanism is used at the bottleneck, and | mark indicates that an AQM mechanism is used at the bottleneck, and | |||
therefore the bottleneck network queue is likely to be short. This | therefore the bottleneck network queue is likely to be short. | |||
document therefore proposes an update to RFC3168, which changes the | Feedback of this signal allows the TCP sender-side ECN reaction in | |||
TCP sender-side ECN reaction in congestion avoidance to reduce the | congestion avoidance to reduce the Congestion Window (cwnd) by a | |||
Congestion Window (cwnd) by a smaller amount than the congestion | smaller amount than the congestion control algorithm's reaction to | |||
control algorithm's reaction to inferred packet loss. | inferred packet loss. This specification therefore defines an | |||
experimental change to the TCP reaction specified in RFC3168, as | ||||
permitted by RFC 8311. | ||||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on September 21, 2018. | This Internet-Draft will expire on February 8, 2019. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2018 IETF Trust and the persons identified as the | Copyright (c) 2018 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 3 | 2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
3. Specification . . . . . . . . . . . . . . . . . . . . . . . . 3 | 3. Specification . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
4. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 3.1. Choice of ABE Multiplier . . . . . . . . . . . . . . . . 4 | |||
4.1. Why Use ECN to Vary the Degree of Backoff? . . . . . . . 4 | 4. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
4.2. Focus on ECN as Defined in RFC3168 . . . . . . . . . . . 5 | 4.1. Why Use ECN to Vary the Degree of Backoff? . . . . . . . 6 | |||
4.3. Choice of ABE Multiplier . . . . . . . . . . . . . . . . 5 | 4.2. An RTT-based response to indicated congestion . . . . . . 7 | |||
5. ABE Deployment Requirements . . . . . . . . . . . . . . . . . 7 | 5. ABE Deployment Requirements . . . . . . . . . . . . . . . . . 7 | |||
6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 8 | 6. ABE Experiment Goals . . . . . . . . . . . . . . . . . . . . 8 | |||
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 | 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 8 | |||
8. Implementation Status . . . . . . . . . . . . . . . . . . . . 8 | 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9 | |||
9. Security Considerations . . . . . . . . . . . . . . . . . . . 9 | 9. Implementation Status . . . . . . . . . . . . . . . . . . . . 9 | |||
10. Revision Information . . . . . . . . . . . . . . . . . . . . 9 | 10. Security Considerations . . . . . . . . . . . . . . . . . . . 9 | |||
11. References . . . . . . . . . . . . . . . . . . . . . . . . . 10 | 11. Revision Information . . . . . . . . . . . . . . . . . . . . 9 | |||
11.1. Normative References . . . . . . . . . . . . . . . . . . 10 | 12. References . . . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
11.2. Informative References . . . . . . . . . . . . . . . . . 11 | 12.1. Normative References . . . . . . . . . . . . . . . . . . 11 | |||
12.2. Informative References . . . . . . . . . . . . . . . . . 11 | ||||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 12 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
1. Introduction | 1. Introduction | |||
Explicit Congestion Notification (ECN) [RFC3168] makes it possible | Explicit Congestion Notification (ECN) [RFC3168] makes it possible | |||
for an Active Queue Management (AQM) mechanism to signal the presence | for an Active Queue Management (AQM) mechanism to signal the presence | |||
of incipient congestion without incurring packet loss. This lets the | of incipient congestion without necessarily incurring packet loss. | |||
network deliver some packets to an application that would have been | This lets the network deliver some packets to an application that | |||
dropped if the application or transport did not support ECN. This | would have been dropped if the application or transport did not | |||
packet loss reduction is the most obvious benefit of ECN, but it is | support ECN. This packet loss reduction is the most obvious benefit | |||
often relatively modest. Other benefits of deploying ECN have been | of ECN, but it is often relatively modest. Other benefits of | |||
documented in RFC8087 [RFC8087]. | deploying ECN have been documented in RFC8087 [RFC8087]. | |||
The rules for ECN were originally written to be very conservative, | The rules for ECN were originally written to be very conservative, | |||
and required the congestion control algorithms of ECN-Capable | and required the congestion control algorithms of ECN-Capable | |||
transport protocols to treat ECN congestion signals exactly the same | transport protocols to treat indications of congestion signalled by | |||
as they would treat an inferred packet loss [RFC3168]. | ECN exactly the same as they would treat an inferred packet loss | |||
[RFC3168]. Research has demonstrated the benefits of reducing | ||||
Research has demonstrated the benefits of reducing network delays | network delays that are caused by interaction of loss-based TCP | |||
that are caused by interaction of loss-based TCP congestion control | congestion control and excessive buffering [BUFFERBLOAT]. This has | |||
and excessive buffering [BUFFERBLOAT]. This has led to the creation | led to the creation of AQM mechanisms like PIE [RFC8033] and CoDel | |||
of new AQM mechanisms like PIE [RFC8033] and CoDel | ||||
[CODEL2012][RFC8289], which prevent bloated queues that are common | [CODEL2012][RFC8289], which prevent bloated queues that are common | |||
with unmanaged and excessively large buffers deployed across the | with unmanaged and excessively large buffers deployed across the | |||
Internet [BUFFERBLOAT]. | Internet [BUFFERBLOAT]. | |||
The AQM mechanisms mentioned above aim to keep a sustained queue | The AQM mechanisms mentioned above aim to keep a sustained queue | |||
short while tolerating transient (short-term) packet bursts. | short while tolerating transient (short-term) packet bursts. | |||
However, currently used loss-based congestion control mechanisms | However, currently used loss-based congestion control mechanisms | |||
cannot always utilise a bottleneck link well where there are short | cannot always utilise a bottleneck link well where there are short | |||
queues. For example, a TCP sender must be able to store at least an | queues. For example, a TCP sender using the Reno congestion control | |||
end-to-end bandwidth-delay product (BDP) worth of data at the | needs to be able to store at least an end-to-end bandwidth-delay | |||
bottleneck buffer if it is to maintain full path utilisation in the | product (BDP) worth of data at the bottleneck buffer if it is to | |||
face of loss-induced reduction of cwnd [RFC5681], which effectively | maintain full path utilisation in the face of loss-induced reduction | |||
doubles the amount of data that can be in flight, the maximum round- | of the congestion window (cwnd) [RFC5681], which effectively doubles | |||
trip time (RTT) experienced, and the path's effective RTT using the | the amount of data that can be in flight, the maximum round-trip time | |||
network path. | (RTT) experienced, and the path's effective RTT using the network | |||
path. | ||||
Modern AQM mechanisms can use ECN to signal the early signs of | Modern AQM mechanisms can use ECN to signal the early signs of | |||
impending queue buildup long before a tail-drop queue would be forced | impending queue buildup long before a tail-drop queue would be forced | |||
to resort to dropping packets. It is therefore appropriate for the | to resort to dropping packets. It is therefore appropriate for the | |||
transport protocol congestion control algorithm to have a more | transport protocol congestion control algorithm to have a more | |||
measured response when an early-warning signal of congestion is | measured response when it receives an indication with an early- | |||
received in the form of an ECN CE-marked packet. Recognizing these | warning of congestion after the remote endpoint receives an ECN CE- | |||
changes in modern AQM practices, more recent rules have relaxed the | marked packet. Recognizing these changes in modern AQM practices, | |||
strict requirement that ECN signals be treated identically to | the strict requirement that ECN CE signals be treated identically to | |||
inferred packet loss [RFC8311]. Following these newer, more flexible | inferred packet loss has been relaxed [RFC8311]. This document | |||
rules, this document defines a new sender-side-only congestion | therefore defines a new sender-side-only congestion control response, | |||
control response, called "ABE" (Alternative Backoff with ECN). ABE | called "ABE" (Alternative Backoff with ECN). ABE improves TCP's | |||
improves TCP's average throughput when routers use AQM controlled | average throughput when routers use AQM controlled buffers that allow | |||
buffers that allow for short queues only. | for short queues only. | |||
2. Definitions | 2. Definitions | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
document are to be interpreted as described in RFC 2119 [RFC2119]. | document are to be interpreted as described in RFC 2119 [RFC2119]. | |||
3. Specification | 3. Specification | |||
This specification updates the congestion control algorithm of an | This specification changes the congestion control algorithm of an | |||
ECN-Capable TCP transport protocol by changing the TCP sender | ECN-Capable TCP transport protocol by changing the TCP sender | |||
response to feedback from the TCP receiver that indicates reception | response to feedback from the TCP receiver that indicates reception | |||
of a CE-marked packet, i.e., receipt of a packet with the ECN-Echo | of a CE-marked packet, i.e., receipt of a packet with the ECN-Echo | |||
flag (defined in [RFC3168]) set. | flag (defined in [RFC3168]) set, following the process defined in | |||
[RFC8311]. | ||||
It updates the following text in section 6.1.2 of the ECN | The TCP sender response is currently specified in section 6.1.2 of | |||
specification [RFC3168] : | the ECN specification [RFC3168], updated by [RFC8311]: | |||
The indication of congestion should be treated just as a | The indication of congestion should be treated just as a | |||
congestion loss in non-ECN-Capable TCP. That is, the TCP source | congestion loss in non-ECN-Capable TCP. That is, the TCP source | |||
halves the congestion window "cwnd" and reduces the slow start | halves the congestion window "cwnd" and reduces the slow start | |||
threshold "ssthresh". | threshold "ssthresh", unless otherwise specified by an | |||
Experimental RFC in the IETF document stream. | ||||
Replacing this with: | This is replaced with: | |||
Receipt of a packet with the ECN-Echo flag SHOULD trigger the TCP | Receipt of a packet with the ECN-Echo flag SHOULD trigger the TCP | |||
source to set the slow start threshold (ssthresh) to 0.8 times the | source to set the slow start threshold (ssthresh) to 0.8 times the | |||
FlightSize, with a lower bound of 2 * SMSS applied to the result. | FlightSize, with a lower bound of 2 * SMSS applied to the result. | |||
As in [RFC5681], the TCP sender also reduces the cwnd value to no | As in [RFC5681], the TCP sender also reduces the cwnd value to no | |||
more than the new ssthresh value. RFC 3168 section 6.1.2 provides | more than the new ssthresh value. RFC 3168 section 6.1.2 provides | |||
guidance on setting a cwnd less than 2 * SMSS. | guidance on setting a cwnd less than 2 * SMSS. | |||
3.1. Choice of ABE Multiplier | ||||
ABE decouples the reaction of a TCP sender to inferred packet loss | ||||
and indication of ECN-signalled congestion in the congestion | ||||
avoidance phase. To achieve this, ABE uses a different scaling | ||||
factor in Equation 4 in Section 3.1 of [RFC5681]. The description | ||||
respectively uses beta_{loss} and beta_{ecn} to refer to the | ||||
multiplicative decrease factors applied in response to inferred | ||||
packet loss, and in response to a receiver indicating ECN-signalled | ||||
congestion. For non-ECN-enabled TCP connections, only beta_{loss} | ||||
applies. | ||||
In other words, in response to inferred packet loss: | ||||
ssthresh = max (FlightSize * beta_{loss}, 2 * SMSS) | ||||
and in response to an indication of an ECN-signalled congestion: | ||||
ssthresh = max (FlightSize * beta_{ecn}, 2 * SMSS) | ||||
and | ||||
cwnd = ssthresh | ||||
(If ssthresh == 2 * SMSS, RFC 3168 section 6.1.2 provides guidance | ||||
on setting a cwnd lower than 2 * SMSS.) | ||||
where FlightSize is the amount of outstanding data in the network, | ||||
upper-bounded by the smaller of the sender's cwnd and the receiver's | ||||
advertised window (rwnd) [RFC5681]. The higher the values of | ||||
beta_{loss} and beta_{ecn}, the less aggressive the response of any | ||||
individual backoff event. | ||||
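
The two backoff rules above can be illustrated with a minimal Python sketch. It is not taken from the draft or from any TCP implementation: the function names `new_ssthresh` and `react_to_ecn` and the constants are hypothetical, all sizes are assumed to be in bytes, and the multipliers are held as exact fractions so that the results do not depend on floating-point rounding. The special case where ssthresh equals 2 * SMSS (RFC 3168 section 6.1.2) is deliberately left out.

```python
from fractions import Fraction

# Multipliers named after the draft's beta_{loss} and beta_{ecn}.
BETA_LOSS = Fraction(1, 2)  # standard RFC 5681 value (0.5)
BETA_ECN = Fraction(4, 5)   # value recommended by ABE (0.8)


def new_ssthresh(flight_size: int, smss: int, beta: Fraction) -> int:
    """Return max(FlightSize * beta, 2 * SMSS), rounded down to whole bytes."""
    return max(int(flight_size * beta), 2 * smss)


def react_to_ecn(flight_size: int, smss: int) -> tuple[int, int]:
    """Return (ssthresh, cwnd) after an indication of ECN-signalled congestion.

    Assumption: cwnd is simply set to the new ssthresh; the guidance in
    RFC 3168 section 6.1.2 for going below 2 * SMSS is not modelled.
    """
    ssthresh = new_ssthresh(flight_size, smss, BETA_ECN)
    return ssthresh, ssthresh


if __name__ == "__main__":
    flight = 100 * 1460  # 100 full-sized segments outstanding
    print("loss:", new_ssthresh(flight, 1460, BETA_LOSS))
    print("ecn: ", react_to_ecn(flight, 1460))
```

With the same FlightSize, the ECN response keeps a larger window than the loss response, which is the intended effect of choosing beta_{ecn} larger than beta_{loss}.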
The appropriate choice for beta_{loss} and beta_{ecn} values is a | ||||
balancing act between path utilisation and draining the bottleneck | ||||
queue. More aggressive backoff (smaller beta_*) risks underutilising | ||||
the path, while less aggressive backoff (larger beta_*) can result in | ||||
slower draining of the bottleneck queue. | ||||
The Internet has already been running with at least two different | ||||
beta_{loss} values for several years: the standard value is 0.5 | ||||
[RFC5681], and the Linux implementation of CUBIC [RFC8312] has used a | ||||
multiplier of 0.7 since kernel version 2.6.25 released in 2008. ABE | ||||
does not change the value of beta_{loss} used by current TCP | ||||
implementations. | ||||
The recommendation in this document specifies a value of | ||||
beta_{ecn}=0.8. This recommended beta_{ecn} value is only applicable | ||||
for the standard TCP congestion control [RFC5681]. The selection of | ||||
beta_{ecn} enables tuning the response of a TCP connection to shallow | ||||
AQM marking thresholds. beta_{loss} characterizes the response of a | ||||
congestion control algorithm to packet loss, i.e., exhaustion of | ||||
buffers (of unknown depth). Different values for beta_{loss} have | ||||
been suggested for TCP congestion control algorithms. Consequently, | ||||
beta_{ecn} is likely to be an algorithm-specific parameter rather | ||||
than a constant multiple of the algorithm's existing beta_{loss}. | ||||
A range of tests (section IV, [ABE2017]) with NewReno and CUBIC over | ||||
CoDel and PIE in lightly-multiplexed scenarios have explored this | ||||
choice of parameter. The results of these tests indicate that CUBIC | ||||
connections benefit from beta_{ecn} of 0.85 (cf. beta_{loss} = 0.7), | ||||
and NewReno connections see improvements with beta_{ecn} in the range | ||||
0.7 to 0.85 (cf. beta_{loss} = 0.5). | ||||
4. Discussion | 4. Discussion | |||
Much of the technical background to ABE can be found in a research | Much of the technical background to ABE can be found in a research | |||
paper [ABE2017]. This paper used a mix of experiments, theory and | paper [ABE2017]. This paper used a mix of experiments, theory and | |||
simulations with NewReno [RFC5681] and CUBIC [RFC8312] to evaluate | simulations with NewReno [RFC5681] and CUBIC [RFC8312] to evaluate | |||
the technique. The technique was shown to present "...significant | the technique. The technique was shown to present "...significant | |||
performance gains in lightly-multiplexed [few concurrent flows] | performance gains in lightly-multiplexed [few concurrent flows] | |||
scenarios, without losing the delay-reduction benefits of deploying | scenarios, without losing the delay-reduction benefits of deploying | |||
CoDel or PIE". The performance improvement is achieved when reacting | CoDel or PIE". The performance improvement is achieved when reacting | |||
to ECN-Echo in congestion avoidance (when cwnd > ssthresh) by | to ECN-Echo in congestion avoidance (when cwnd > ssthresh) by | |||
skipping to change at page 5, line 18 | skipping to change at page 6, line 46 | |||
ECN-Capable packets with an ECN CE-mark. The reception of a CE-mark | ECN-Capable packets with an ECN CE-mark. The reception of a CE-mark | |||
feedback not only indicates congestion on the network path, it also | feedback not only indicates congestion on the network path, it also | |||
indicates that an AQM mechanism exists at the bottleneck along the | indicates that an AQM mechanism exists at the bottleneck along the | |||
path, and hence the CE-mark likely came from a bottleneck with a | path, and hence the CE-mark likely came from a bottleneck with a | |||
controlled short queue. Reacting differently to an ECN-signalled | controlled short queue. Reacting differently to an ECN-signalled | |||
congestion than to an inferred packet loss can then yield the benefit | congestion than to an inferred packet loss can then yield the benefit | |||
of a reduced back-off when queues are short. Using ECN can also be | of a reduced back-off when queues are short. Using ECN can also be | |||
advantageous for several other reasons [RFC8087]. | advantageous for several other reasons [RFC8087]. | |||
The idea of reacting differently to inferred packet loss and | The idea of reacting differently to inferred packet loss and | |||
detection of an ECN-signalled congestion pre-dates this document. | detection of an ECN-signalled congestion pre-dates this | |||
For example, previous research proposed using ECN CE-marked feedback | specification. For example, previous research proposed using ECN CE- | |||
to modify TCP congestion control behaviour via a larger | marked feedback to modify TCP congestion control behaviour via a | |||
multiplicative decrease factor in conjunction with a smaller additive | larger multiplicative decrease factor in conjunction with a smaller | |||
increase factor [ICC2002]. The goal of this former work was to | additive increase factor [ICC2002]. The goal of this former work was | |||
operate across AQM bottlenecks using Random Early Detection (RED) | to operate across AQM bottlenecks using Random Early Detection (RED) | |||
that were not necessarily configured to emulate a short queue (the | that were not necessarily configured to emulate a short queue (the | |||
current usage of RED as an Internet AQM method is limited [RFC7567]). | current usage of RED as an Internet AQM method is limited [RFC7567]). | |||
4.2. Focus on ECN as Defined in RFC3168 | 4.2. An RTT-based response to indicated congestion | |||
Some transport protocol mechanisms rely on ECN semantics that differ | ||||
from the original ECN definition [RFC3168]. For instance, Accurate | ||||
ECN [I-D.ietf-tcpm-accurate-ecn] permits more frequent and detailed | ||||
feedback. Use of such mechanisms (including Accurate ECN, Datacenter | ||||
TCP (DCTCP) [RFC8257], or Congestion Exposure (ConEx) [RFC7713]) is | ||||
out of scope for this document. This specification focuses on ECN as | ||||
defined in [RFC3168]. | ||||
4.3. Choice of ABE Multiplier | ||||
ABE decouples the reaction of a TCP sender to inferred packet loss | ||||
and ECN-signalled congestion in the congestion avoidance phase. To | ||||
achieve this, ABE uses a different scaling factor in Equation 4 in | ||||
Section 3.1 of [RFC5681]. The description respectively uses | ||||
beta_{loss} and beta_{ecn} to refer to the multiplicative decrease | ||||
factors applied in response to inferred packet loss, and in response | ||||
to a receiver indicating ECN-signalled congestion. For non-ECN- | ||||
enabled TCP connections, only beta_{loss} applies. | ||||
In other words, in response to inferred packet loss: | ||||
ssthresh = max (FlightSize * beta_{loss}, 2 * SMSS) | ||||
and in response to an indication of an ECN-signalled congestion: | ||||
ssthresh = max (FlightSize * beta_{ecn}, 2 * SMSS) | ||||
and | ||||
cwnd = ssthresh | ||||
(If ssthresh == 2 * SMSS, RFC 3168 section 6.1.2 provides guidance | ||||
on setting a cwnd lower than 2 * SMSS.) | ||||
where FlightSize is the amount of outstanding data in the network, | ||||
upper-bounded by the smaller of the sender's cwnd and the receiver's | ||||
advertised window (rwnd) [RFC5681]. The higher the values of | ||||
beta_{loss} and beta_{ecn}, the less aggressive the response of any | ||||
individual backoff event. | ||||
The appropriate choice for beta_{loss} and beta_{ecn} values is a | ||||
balancing act between path utilisation and draining the bottleneck | ||||
queue. More aggressive backoff (smaller beta_*) risks underutilising | ||||
the path, while less aggressive backoff (larger beta_*) can result in | ||||
slower draining of the bottleneck queue. | ||||
The Internet has already been running with at least two different | ||||
beta_{loss} values for several years: the standard value is 0.5 | ||||
[RFC5681], and the Linux implementation of CUBIC [RFC8312] has used a | ||||
multiplier of 0.7 since kernel version 2.6.25 released in 2008. ABE | ||||
proposes no change to beta_{loss} used by current TCP | ||||
implementations. | ||||
The recommendation in Section 3 in this document corresponds to a | This specification applies to the use of ECN feedback as defined in | |||
value of beta_{ecn}=0.8. This recommended beta_{ecn} value is only | [RFC3168], which specifies a response to indicated congestion that is | |||
applicable for the standard TCP congestion control [RFC5681]. The | no more frequent than once per path round trip time. Since ABE | |||
selection of beta_{ecn} enables tuning the response of a TCP | responds to indicated congestion once per RTT, it therefore does not | |||
connection to shallow AQM marking thresholds. beta_{loss} | respond to any further loss within the same RTT, because an ABE | |||
characterizes the response of a congestion control algorithm to | sender has already reduced the congestion window. If congestion | |||
packet loss, i.e., exhaustion of buffers (of unknown depth). | persists after such reduction, ABE continues to reduce the congestion | |||
Different values for beta_{loss} have been suggested for TCP | window in each consecutive RTT. This consecutive reduction can | |||
congestion control algorithms. Consequently, beta_{ecn} is likely to | protect the network against long-standing unfairness in the case of | |||
be an algorithm-specific parameter rather than a constant multiple of | AQM algorithms that do not keep a small average queue length. The | |||
the algorithm's existing beta_{loss}. | mechanism does not rely on Accurate ECN | |||
([I-D.ietf-tcpm-accurate-ecn]). | ||||
A range of tests (section IV, [ABE2017]) with NewReno and CUBIC over | In contrast, transport protocol mechanisms can also be designed to | |||
CoDel and PIE in lightly-multiplexed scenarios have explored this | utilise more frequent and detailed ECN feedback (e.g., Accurate ECN | |||
choice of parameter. The results of these tests indicate that CUBIC | [I-D.ietf-tcpm-accurate-ecn]), which then permit a congestion control | |||
connections benefit from beta_{ecn} of 0.85 (cf. beta_{loss} = 0.7), | response that adjusts the sending rate more frequently. Datacenter | |||
and NewReno connections see improvements with beta_{ecn} in the range | TCP (DCTCP) [RFC8257] is an example of this approach. | |||
0.7 to 0.85 (cf. beta_{loss} = 0.5). | ||||
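
The once-per-RTT response described in the -08 text of Section 4.2 can be sketched as follows. This is an illustrative assumption, not the draft's algorithm text: the class `EcnReactor`, its `recover` field and the `on_ack` signature are hypothetical names. The sketch approximates "once per RTT" as "once per window of data" by remembering the highest sequence number sent at the time of the last reduction and ignoring further ECN-Echo feedback until an ACK passes that point. All sizes are in bytes.

```python
from fractions import Fraction

BETA_ECN = Fraction(4, 5)  # ABE multiplier of 0.8, held as an exact fraction


class EcnReactor:
    """Hypothetical helper that limits the ECN response to once per window."""

    def __init__(self, smss: int, cwnd: int) -> None:
        self.smss = smss
        self.cwnd = cwnd
        self.ssthresh = cwnd
        # ACKs at or below this sequence number cannot trigger a new reduction.
        self.recover = 0

    def on_ack(self, ack_seq: int, snd_nxt: int, flight_size: int, ece: bool) -> bool:
        """Process one ACK and return True if the window was reduced."""
        if not ece or ack_seq <= self.recover:
            return False  # no congestion indication, or already reacted this window
        self.ssthresh = max(int(flight_size * BETA_ECN), 2 * self.smss)
        self.cwnd = min(self.cwnd, self.ssthresh)
        self.recover = snd_nxt  # suppress further reductions for one window of data
        return True


if __name__ == "__main__":
    r = EcnReactor(smss=1000, cwnd=100_000)
    print(r.on_ack(1_000, 101_000, 100_000, ece=True), r.cwnd)   # first mark: reduce
    print(r.on_ack(2_000, 102_000, 99_000, ece=True), r.cwnd)    # same window: ignored
    print(r.on_ack(102_000, 182_000, 80_000, ece=True), r.cwnd)  # next window: reduce again
```

If marking persists, each new window of data triggers another reduction, which matches the consecutive per-RTT reductions described in the text.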
5. ABE Deployment Requirements | 5. ABE Deployment Requirements | |||
This update is a sender-side only change. Like other changes to | This update is a sender-side only change. Like other changes to | |||
congestion control algorithms, it does not require any change to the | congestion control algorithms, it does not require any change to the | |||
TCP receiver or to network devices. It does not require any ABE- | TCP receiver or to network devices. It does not require any ABE- | |||
specific changes in routers or the use of Accurate ECN feedback | specific changes in routers or the use of Accurate ECN feedback | |||
[I-D.ietf-tcpm-accurate-ecn] by a receiver. | [I-D.ietf-tcpm-accurate-ecn] by a receiver. | |||
RFC3168 states that the congestion control response to an ECN- | ||||
signalled congestion is the same as the response to a dropped packet | ||||
[RFC3168]. [RFC8311] updates this specification to allow systems to | ||||
provide a different behaviour when they experience ECN-signalled | ||||
congestion rather than packet loss. The present specification | ||||
defines such an experiment and has thus been assigned an Experimental | ||||
status before being proposed as a Standards-Track update. | ||||
The purpose of the Internet experiment is to collect experience with | ||||
deployment of ABE, and confirm the safety in deployed networks using | ||||
this update to TCP congestion control. | ||||
When used with bottlenecks that do not support ECN-marking, the | ||||
specification does not modify the transport protocol. | ||||
To evaluate the benefit, this experiment therefore requires support | ||||
in AQM routers for ECN-marking of packets carrying the ECN-Capable | ||||
Transport, ECT(0), codepoint [RFC3168]. | ||||
If the method is only deployed by some senders, and not by others, | If the method is only deployed by some senders, and not by others, | |||
the senders that use this method can gain some advantage, possibly at | the senders that use this method can gain some advantage, possibly at | |||
the expense of other flows that do not use this updated method. | the expense of other flows that do not use this updated method. | |||
Because this advantage applies only to ECN-marked packets and not to | Because this advantage applies only to ECN-marked packets and not to | |||
packet loss indications, an ECN-Capable bottleneck will still fall | packet loss indications, an ECN-Capable bottleneck will still fall | |||
back to dropping packets if a TCP sender using ABE is too | back to dropping packets if a TCP sender using ABE is too | |||
aggressive, and the result is no different than if the TCP sender was | aggressive, and the result is no different than if the TCP sender was | |||
using traditional loss-based congestion control. | using traditional loss-based congestion control. | |||
A TCP sender reacts to loss or ECN marks only once per round-trip | When used with bottlenecks that do not support ECN-marking, the | |||
time. Hence, if a sender would first be notified of an ECN mark and | specification does not modify the transport protocol. | |||
then learn about loss in the same round-trip, it would only react to | ||||
the first notification (ECN) but not to the second (loss). RFC3168 | ||||
specified a reaction to ECN that was equal to the reaction to loss | ||||
[RFC3168]. | ||||
ABE also responds to congestion once per RTT, and therefore it does | 6. ABE Experiment Goals | |||
not respond to further loss within the same RTT, since ABE has | ||||
already reduced the congestion window. If congestion persists after | RFC3168 states that the congestion control response following an | |||
such reduction, ABE continues to reduce the congestion window in each | indication of ECN-signalled congestion is the same as the response to | |||
consecutive RTT. This consecutive reduction can protect the network | a dropped packet [RFC3168]. [RFC8311] updates this specification to | |||
against long-standing unfairness in the case of AQM algorithms that | allow systems to provide a different behaviour when they experience | |||
do not keep a small average queue length. | ECN-signalled congestion rather than packet loss. The present | |||
specification defines such an experiment and has thus been assigned | ||||
an Experimental status before being proposed as a Standards-Track | ||||
update. | ||||
The purpose of the Internet experiment is to collect experience with | ||||
deployment of ABE, and confirm acceptable safety in deployed networks | ||||
that use this update to TCP congestion control. To evaluate ABE, | ||||
this experiment therefore requires support in AQM routers for ECN- | ||||
marking of packets carrying the ECN-Capable Transport, ECT(0), | ||||
codepoint [RFC3168]. | ||||
The result of this Internet experiment ought to include an | The result of this Internet experiment ought to include an | |||
investigation of the implications of experiencing an ECN-CE mark | investigation of the implications of experiencing an ECN-CE mark | |||
followed by loss within the same RTT. At the end of the experiment, | followed by loss within the same RTT. At the end of the experiment, | |||
this will be reported to the TCPM WG (or IESG). | this will be reported to the TCPM WG or IESG. | |||
6. Acknowledgements | 7. Acknowledgements | |||
Authors N. Khademi, M. Welzl and G. Fairhurst were part-funded by | Authors N. Khademi, M. Welzl and G. Fairhurst were part-funded by | |||
the European Community under its Seventh Framework Programme through | the European Community under its Seventh Framework Programme through | |||
the Reducing Internet Transport Latency (RITE) project (ICT-317700). | the Reducing Internet Transport Latency (RITE) project (ICT-317700). | |||
The views expressed are solely those of the authors. | The views expressed are solely those of the authors. | |||
Author G. Armitage performed most of his work on this document while | ||||
employed by Swinburne University of Technology, Melbourne, Australia. | ||||
The authors would like to thank Stuart Cheshire for many suggestions | The authors would like to thank Stuart Cheshire for many suggestions | |||
when revising the draft, and the following people for their | when revising the draft, and the following people for their | |||
contributions to [ABE2017]: Chamil Kulatunga, David Ros, Stein | contributions to [ABE2017]: Chamil Kulatunga, David Ros, Stein | |||
Gjessing, Sebastian Zander. Thanks also to (in alphabetical order) | Gjessing, Sebastian Zander. Thanks also to (in alphabetical order) | |||
Roland Bless, Bob Briscoe, David Black, Markku Kojo, John Leslie, | Roland Bless, Bob Briscoe, David Black, Markku Kojo, John Leslie, | |||
Lawrence Stewart, Dave Taht and the TCPM working group for providing | Lawrence Stewart, Dave Taht and the TCPM working group for providing | |||
valuable feedback on this document. | valuable feedback on this document. | |||
The authors would finally like to thank everyone who provided | The authors would finally like to thank everyone who provided | |||
feedback on the congestion control behaviour specified in this update | feedback on the congestion control behaviour specified in this update | |||
received from the IRTF Internet Congestion Control Research Group | received from the IRTF Internet Congestion Control Research Group | |||
(ICCRG). | (ICCRG). | |||
7. IANA Considerations | 8. IANA Considerations | |||
XX RFC ED - PLEASE REMOVE THIS SECTION XXX | XX RFC ED - PLEASE REMOVE THIS SECTION XXX | |||
This document includes no request to IANA. | This document includes no request to IANA. | |||
8. Implementation Status | 9. Implementation Status | |||
ABE is implemented as a patch for Linux and FreeBSD. It is meant for | ABE is implemented as a patch for Linux and FreeBSD. It is meant for | |||
research and available for download from | research and available for download from | |||
http://heim.ifi.uio.no/naeemk/research/ABE/. This code was used to | http://heim.ifi.uio.no/naeemk/research/ABE/. This code was used to | |||
produce the test results that are reported in [ABE2017]. The FreeBSD | produce the test results that are reported in [ABE2017]. The FreeBSD | |||
code has been committed to the mainline kernel on March 19, 2018 | code has been committed to the mainline kernel on March 19, 2018 | |||
[ABE-FreeBSD]. | [ABE-FreeBSD]. | |||
9. Security Considerations | 10. Security Considerations | |||
The described method is a sender-side only transport change, and does | The described method is a sender-side only transport change, and does | |||
not change the protocol messages exchanged. The security | not change the protocol messages exchanged. The security | |||
considerations for ECN [RFC3168] therefore still apply. | considerations for ECN [RFC3168] therefore still apply. | |||
This is a change to TCP congestion control with ECN that will | This is a change to TCP congestion control with ECN that will | |||
typically lead to a change in the capacity achieved when flows share | typically lead to a change in the capacity achieved when flows share | |||
a network bottleneck. This could result in some flows receiving more | a network bottleneck. This could result in some flows receiving more | |||
than their fair share of capacity. Similar unfairness in the way | than their fair share of capacity. Similar unfairness in the way | |||
that capacity is shared is also exhibited by other congestion control | that capacity is shared is also exhibited by other congestion control | |||
mechanisms that have been in use in the Internet for many years | mechanisms that have been in use in the Internet for many years | |||
(e.g., CUBIC [RFC8312]). Unfairness may also be a result of other | (e.g., CUBIC [RFC8312]). Unfairness may also be a result of other | |||
factors, including the round trip time experienced by a flow. ABE | factors, including the round trip time experienced by a flow. ABE | |||
applies only when ECN-marked packets are received, not when packets | applies only when ECN-marked packets are received, not when packets | |||
are lost, hence use of ABE cannot lead to congestion collapse. | are lost, hence use of ABE cannot lead to congestion collapse. | |||
10. Revision Information | 11. Revision Information | |||
XX RFC ED - PLEASE REMOVE THIS SECTION XXX | XX RFC ED - PLEASE REMOVE THIS SECTION XXX | |||
-08. Addressed comments from AD review on the document structure, | ||||
and relationship to existing RFCs. | ||||
-07. Addressed comments following WGLC. | -07. Addressed comments following WGLC. | |||
o Updated Reference citations | o Updated Reference citations. | |||
o Removed paragraph containing a wrong statement related to timeout | o Removed paragraph containing a wrong statement related to timeout | |||
in section 4.1. | in section 4.1. | |||
o Discuss what happens when cwnd <= ssthresh | o Discuss what happens when cwnd <= ssthresh. | |||
o Added text on Concern about lower bound of 2*SMSS | o Added text on Concern about lower bound of 2*SMSS. | |||
-06. Addressed Michael Scharf's comments. | -06. Addressed Michael Scharf's comments. | |||
-05. Refined the description of the experiment based on feedback at | -05. Refined the description of the experiment based on feedback at | |||
IETF-100. Incorporated comments from David Black. | IETF-100. Incorporated comments from David Black. | |||
-04. Incorporates review comments from Lawrence Stewart and the | -04. Incorporates review comments from Lawrence Stewart and the | |||
remaining comments from Roland Bless. References are updated. | remaining comments from Roland Bless. References are updated. | |||
-03. Several review comments from Roland Bless are addressed. | -03. Several review comments from Roland Bless are addressed. | |||
Consistent terminology and equations. Clarification on the scope of | Consistent terminology and equations. Clarification on the scope of | |||
recommended beta_{ecn} value. | recommended beta_{ecn} value. | |||
-02. Corrected the equations in Section 4.3. Updated the | -02. Corrected the equations in Section 3.1. Updated the | |||
affiliations. Lower bound for cwnd is defined. A recommendation for | affiliations. Lower bound for cwnd is defined. A recommendation for | |||
window-based transport protocols is changed to cover all transport | window-based transport protocols is changed to cover all transport | |||
protocols that implement a congestion control reduction to an ECN | protocols that implement a congestion control reduction to an ECN | |||
congestion signal. Added text about ABE's FreeBSD mainline kernel | congestion signal. Added text about ABE's FreeBSD mainline kernel | |||
status including a reference to the FreeBSD code review page. | status including a reference to the FreeBSD code review page. | |||
References are updated. | References are updated. | |||
-01. Text improved, mainly incorporating comments from Stuart | -01. Text improved, mainly incorporating comments from Stuart | |||
Cheshire. The reference to a technical report has been updated to a | Cheshire. The reference to a technical report has been updated to a | |||
published version of the tests [ABE2017]. Used "AQM Mechanism" | published version of the tests [ABE2017]. Used "AQM Mechanism" | |||
skipping to change at page 10, line 33 ¶ | skipping to change at page 11, line 5 ¶ | |||
allowing experiments. As a result, some of the motivating and | allowing experiments. As a result, some of the motivating and | |||
discussing text that was moved from draft-khademi-alternativebackoff- | discussing text that was moved from draft-khademi-alternativebackoff- | |||
ecn-03 to draft-khademi-tsvwg-ecn-response-00 has now been re- | ecn-03 to draft-khademi-tsvwg-ecn-response-00 has now been re- | |||
inserted here. | inserted here. | |||
Individual draft -00. draft-khademi-tsvwg-ecn-response-00 and draft- | Individual draft -00. draft-khademi-tsvwg-ecn-response-00 and draft- | |||
khademi-tcpm-alternativebackoff-ecn-00 replace draft-khademi- | khademi-tcpm-alternativebackoff-ecn-00 replace draft-khademi- | |||
alternativebackoff-ecn-03, following discussion in the TSVWG and TCPM | alternativebackoff-ecn-03, following discussion in the TSVWG and TCPM | |||
working groups. | working groups. | |||
11. References | 12. References | |||
11.1. Normative References | 12.1. Normative References | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | |||
of Explicit Congestion Notification (ECN) to IP", | of Explicit Congestion Notification (ECN) to IP", | |||
RFC 3168, DOI 10.17487/RFC3168, September 2001, | RFC 3168, DOI 10.17487/RFC3168, September 2001, | |||
<https://www.rfc-editor.org/info/rfc3168>. | <https://www.rfc-editor.org/info/rfc3168>. | |||
skipping to change at page 11, line 20 ¶ | skipping to change at page 11, line 38 ¶ | |||
[RFC8257] Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L., | [RFC8257] Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L., | |||
and G. Judd, "Data Center TCP (DCTCP): TCP Congestion | and G. Judd, "Data Center TCP (DCTCP): TCP Congestion | |||
Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257, | Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257, | |||
October 2017, <https://www.rfc-editor.org/info/rfc8257>. | October 2017, <https://www.rfc-editor.org/info/rfc8257>. | |||
[RFC8311] Black, D., "Relaxing Restrictions on Explicit Congestion | [RFC8311] Black, D., "Relaxing Restrictions on Explicit Congestion | |||
Notification (ECN) Experimentation", RFC 8311, | Notification (ECN) Experimentation", RFC 8311, | |||
DOI 10.17487/RFC8311, January 2018, | DOI 10.17487/RFC8311, January 2018, | |||
<https://www.rfc-editor.org/info/rfc8311>. | <https://www.rfc-editor.org/info/rfc8311>. | |||
11.2. Informative References | 12.2. Informative References | |||
[ABE-FreeBSD] | [ABE-FreeBSD] | |||
"ABE patch review in FreeBSD", | "ABE patch review in FreeBSD", | |||
<https://svnweb.freebsd.org/ | <https://svnweb.freebsd.org/ | |||
base?view=revision&revision=331214>. | base?view=revision&revision=331214>. | |||
[ABE2017] Khademi, N., Armitage, G., Welzl, M., Fairhurst, G., | [ABE2017] Khademi, N., Armitage, G., Welzl, M., Fairhurst, G., | |||
Zander, S., and D. Ros, "Alternative Backoff: Achieving | Zander, S., and D. Ros, "Alternative Backoff: Achieving | |||
Low Latency and High Throughput with ECN and AQM", IFIP | Low Latency and High Throughput with ECN and AQM", IFIP | |||
NETWORKING 2017, Stockholm, Sweden, June 2017. | NETWORKING 2017, Stockholm, Sweden, June 2017. | |||
skipping to change at page 13, line 4 ¶ | skipping to change at page 13, line 19 ¶ | |||
Email: naeemk@ifi.uio.no | Email: naeemk@ifi.uio.no | |||
Michael Welzl | Michael Welzl | |||
University of Oslo | University of Oslo | |||
PO Box 1080 Blindern | PO Box 1080 Blindern | |||
Oslo N-0316 | Oslo N-0316 | |||
Norway | Norway | |||
Email: michawe@ifi.uio.no | Email: michawe@ifi.uio.no | |||
Grenville Armitage | Grenville Armitage | |||
Internet For Things (I4T) Research Group | Netflix Inc. | |||
Swinburne University of Technology | ||||
PO Box 218 | ||||
John Street, Hawthorn | ||||
Victoria 3122 | ||||
Australia | ||||
Email: garmitage@swin.edu.au | Email: garmitage@netflix.com | |||
Godred Fairhurst | Godred Fairhurst | |||
University of Aberdeen | University of Aberdeen | |||
School of Engineering, Fraser Noble Building | School of Engineering, Fraser Noble Building | |||
Aberdeen AB24 3UE | Aberdeen AB24 3UE | |||
UK | UK | |||
Email: gorry@erg.abdn.ac.uk | Email: gorry@erg.abdn.ac.uk | |||
End of changes. 42 change blocks. | ||||
189 lines changed or deleted | 193 lines changed or added | |||