draft-ietf-tcpm-rtorestart-02.txt   draft-ietf-tcpm-rtorestart-03.txt 
TCP Maintenance and Minor Extensions (tcpm) P. Hurtig TCP Maintenance and Minor Extensions (tcpm) P. Hurtig
Internet-Draft A. Brunstrom Internet-Draft A. Brunstrom
Intended status: Experimental Karlstad University Intended status: Experimental Karlstad University
Expires: August 18, 2014 A. Petlund Expires: January 5, 2015 A. Petlund
Simula Research Laboratory AS Simula Research Laboratory AS
M. Welzl M. Welzl
University of Oslo University of Oslo
February 14, 2014 July 4, 2014
TCP and SCTP RTO Restart TCP and SCTP RTO Restart
draft-ietf-tcpm-rtorestart-02 draft-ietf-tcpm-rtorestart-03
Abstract Abstract
This document describes a modified algorithm for managing the TCP and This document describes a modified algorithm for managing the TCP and
SCTP retransmission timers that provides faster loss recovery when SCTP retransmission timers that provides faster loss recovery when
there is a small amount of outstanding data for a connection. The there is a small amount of outstanding data for a connection. The
modification allows the transport to restart its retransmission timer modification, RTO Restart (RTOR), allows the transport to restart its
more aggressively in situations where fast retransmit cannot be used. retransmission timer more aggressively in situations where fast
This enables faster loss detection and recovery for connections that retransmit cannot be used. This enables faster loss detection and
are short-lived or application-limited. recovery for connections that are short-lived or application-limited.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 18, 2014. This Internet-Draft will expire on January 5, 2015.
Copyright Notice Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 19 skipping to change at page 2, line 19
described in the Simplified BSD License. described in the Simplified BSD License.
1. Introduction 1. Introduction
TCP uses two mechanisms to detect segment loss. First, if a segment TCP uses two mechanisms to detect segment loss. First, if a segment
is not acknowledged within a certain amount of time, a retransmission is not acknowledged within a certain amount of time, a retransmission
timeout (RTO) occurs, and the segment is retransmitted [RFC6298]. timeout (RTO) occurs, and the segment is retransmitted [RFC6298].
While the RTO is based on measured round-trip times (RTTs) between While the RTO is based on measured round-trip times (RTTs) between
the sender and receiver, it also has a conservative lower bound of 1 the sender and receiver, it also has a conservative lower bound of 1
second to ensure that delayed segments are not mistaken as lost. second to ensure that delayed segments are not mistaken as lost.
Second, when a sender receives duplicate acknowledgments, the fast Second, when a sender receives duplicate acknowledgments (dupACKs), the fast retransmit algorithm
retransmit algorithm infers segment loss and triggers a infers segment loss and triggers a retransmission [RFC5681].
retransmission [RFC5681]. Duplicate acknowledgments are generated by Duplicate acknowledgments are generated by a receiver when out-of-
a receiver when out-of-order segments arrive. As both segment loss order segments arrive. As both segment loss and segment reordering
and segment reordering cause out-of-order arrival, fast retransmit cause out-of-order arrival, fast retransmit waits for three dupACKs
waits for three duplicate acknowledgments before considering the before considering the segment as lost. In some situations, however,
segment as lost. In some situations, however, the number of the number of outstanding segments is not enough to trigger three
outstanding segments is not enough to trigger three duplicate dupACKs, and the sender must rely on lengthy RTOs for loss recovery.
acknowledgments, and the sender must rely on lengthy RTOs for loss
recovery.
The number of outstanding segments can be small for several reasons: The number of outstanding segments can be small for several reasons:
(1) The connection is limited by the congestion control when the (1) The connection is limited by the congestion control when the
path has a low total capacity (bandwidth-delay product) or the path has a low total capacity (bandwidth-delay product) or the
connection's share of the capacity is small. It is also limited connection's share of the capacity is small. It is also limited
by the congestion control in the first few RTTs of a connection by the congestion control in the first few RTTs of a connection
or after an RTO when the available capacity is probed using or after an RTO when the available capacity is probed using
slow-start. slow-start.
(2) The connection is limited by the receiver's available buffer (2) The connection is limited by the receiver's available buffer
space. space.
(3) The connection is limited by the application if the available (3) The connection is limited by the application if the available
capacity of the path is not fully utilized (e.g. interactive capacity of the path is not fully utilized (e.g. interactive
applications), or at the end of a transfer. applications), or at the end of a transfer.
While the reasons listed above are valid for any flow, the third While the reasons listed above are valid for any flow, the third
reason is common for applications that transmit short flows, or use a reason is most common for applications that transmit short flows, or
low transmission rate. Typical examples of applications that produce use a bursty transmission pattern. Typical examples of applications
short flows are web servers. [RJ10] shows that 70% of all web that produce short flows are web-based applications. [RJ10] shows
objects, found at the top 500 sites, are too small for fast that 70% of all web objects, found at the top 500 sites, are too
retransmit to work. [FDT13] shows that about 77% of all small for fast retransmit to work. [FDT13] shows that about 77% of
retransmissions sent by a major web service are sent after RTO all retransmissions sent by a major web service are sent after RTO
expiry. Applications have a low transmission rate when data is sent expiry. Applications with bursty transmission patterns often send
in response to actions, or as a reaction to real life events. data in response to actions, or as a reaction to real life events.
Typical examples of such applications are stock trading systems, Typical examples of such applications are stock trading systems,
remote computer operations and online games. What is special about remote computer operations, online games, and web-based applications
this class of applications is that they are time-dependent, and extra using persistent connections. What is special about this class of
latency can reduce the application service level [P09]. Although applications is that they often are time-dependent, and extra latency
such applications may represent a small amount of data sent on the can reduce the application service level [P09].
network, a considerable number of flows have such properties and the
importance of low latency is high.
The RTO restart approach outlined in this document makes the RTO The RTO Restart (RTOR) mechanism described in this document makes the
slightly more aggressive when the number of outstanding segments is RTO slightly more aggressive when the number of outstanding segments
small, in an attempt to enable faster loss recovery for all segments is too small for fast retransmit to work, in an attempt to enable
while being robust to reordering. While it still conforms to the faster loss recovery for all segments while being robust to
requirement in [RFC6298] that segments must not be retransmitted reordering. While RTOR still conforms to the requirement in
earlier than RTO seconds after their original transmission, it could [RFC6298] that segments must not be retransmitted earlier than RTO
increase the risk of spurious timeout. Spurious timeouts typically seconds after their original transmission, it could increase the risk
degrade the performance of flows with multiple bursts of data, as a of spurious timeout. Spurious timeouts typically degrade the
burst following a spurious timeout might not fit within the reduced performance of flows with multiple bursts of data, as a burst
following a spurious timeout might not fit within the reduced
congestion window (cwnd). congestion window (cwnd).
While this document focuses on TCP, the described changes are also While this document focuses on TCP, the described changes are also
valid for the Stream Control Transmission Protocol (SCTP) [RFC4960] valid for the Stream Control Transmission Protocol (SCTP) [RFC4960]
which has similar loss recovery and congestion control algorithms. which has similar loss recovery and congestion control algorithms.
1.1. Requirements Language 2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119]. document are to be interpreted as described in RFC 2119 [RFC2119].
2. RTO Restart Overview This document introduces the following variable:
RTO Restart threshold (rrthresh): RTOR is enabled whenever the number
of outstanding segments is below this threshold.
3. RTO Restart Overview
The RTO management algorithm described in [RFC6298] recommends that The RTO management algorithm described in [RFC6298] recommends that
the retransmission timer is restarted when an acknowledgment (ACK) the retransmission timer is restarted when an acknowledgment (ACK)
that acknowledges new data is received and there is still outstanding that acknowledges new data is received and there is still outstanding
data. The restart is conducted to guarantee that unacknowledged data. The restart is conducted to guarantee that unacknowledged
segments will be retransmitted after approximately RTO seconds. segments will be retransmitted after approximately RTO seconds.
However, by restarting the timer on each incoming acknowledgment, However, by restarting the timer on each incoming ACK,
retransmissions are not typically triggered RTO seconds after their retransmissions are not typically triggered RTO seconds after their
previous transmission but rather RTO seconds after the last ACK previous transmission but rather RTO seconds after the last ACK
arrived. The duration of this extra delay depends on several factors arrived. The duration of this extra delay depends on several factors
but is in most cases approximately one RTT. Hence, in most but is in most cases approximately one RTT. Hence, in most
situations, the time before a retransmission is triggered is equal to situations, the time before a retransmission is triggered is equal to
"RTO + RTT". "RTO + RTT".
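The arithmetic behind "RTO + RTT" can be sketched as follows. This is only an illustration, not part of the specification: the function name and the example values (RTO of 1.0 second, RTT of 0.2 seconds) are assumptions chosen for the example.

```python
def standard_loss_detection_time(rto, ack_delay):
    """Effective loss detection time with the RFC 6298 restart.

    The timer is restarted to a full RTO when the ACK arrives, which
    happens ack_delay seconds after the lost segment was sent
    (roughly one RTT in the scenario described above).
    """
    return ack_delay + rto


# Illustrative values only (assumed, not taken from the draft).
rto = 1.0   # seconds
rtt = 0.2   # seconds
print(standard_loss_detection_time(rto, rtt))
```

With these assumed values the lost segment is retransmitted about 1.2 seconds after its original transmission, rather than after 1.0 second.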
The extra delay can be significant, especially for applications that The extra delay can be significant, especially for applications that
use a lower RTOmin than the standard of 1 second and/or in use a lower RTOmin than the standard of 1 second and/or in
environments with high RTTs, e.g. mobile networks. The restart environments with high RTTs, e.g. mobile networks. The restart
approach is illustrated in Figure 1 where a TCP sender transmits approach is illustrated in Figure 1 where a TCP sender transmits
three segments to a receiver. The arrival of the first and second three segments to a receiver. The arrival of the first and second
segment triggers a delayed ACK [RFC1122], which restarts the RTO segment triggers a delayed ACK (delACK) [RFC1122], which restarts the
timer at the sender. RTO restart is performed approximately one RTT RTO timer at the sender. RTO restart is performed approximately one
after the transmission of the third segment. Thus, if the third RTT after the transmission of the third segment. Thus, if the third
segment is lost, as indicated in Figure 1, the effective loss segment is lost, as indicated in Figure 1, the effective loss
detection time is "RTO + RTT" seconds. In some situations, the detection time is "RTO + RTT" seconds. In some situations, the
effective loss detection time becomes even longer. Consider a effective loss detection time becomes even longer. Consider a
scenario where only two segments are outstanding. If the second scenario where only two segments are outstanding. If the second
segment is lost, the time to expire the delayed ACK timer will also segment is lost, the time to expire the delACK timer will also be
be included in the effective loss detection time. included in the effective loss detection time.
Sender Receiver Sender Receiver
... ...
DATA [SEG 1] ----------------------> (ack delayed) DATA [SEG 1] ----------------------> (ack delayed)
DATA [SEG 2] ----------------------> (send ack) DATA [SEG 2] ----------------------> (send ack)
DATA [SEG 3] ----X /-------- ACK DATA [SEG 3] ----X /-------- ACK
(restart RTO) <----------/ (restart RTO) <----------/
... ...
(RTO expiry) (RTO expiry)
DATA [SEG 3] ----------------------> DATA [SEG 3] ---------------------->
skipping to change at page 4, line 46 skipping to change at page 4, line 46
drastically lower the congestion window compared to fast retransmit. drastically lower the congestion window compared to fast retransmit.
The current approach can therefore be beneficial -- it is described The current approach can therefore be beneficial -- it is described
in [EL04] to act as a "safety margin" that compensates for some of in [EL04] to act as a "safety margin" that compensates for some of
the problems that the authors have identified with the standard RTO the problems that the authors have identified with the standard RTO
calculation. Notably, the authors of [EL04] also state that "this calculation. Notably, the authors of [EL04] also state that "this
safety margin does not exist for highly interactive applications safety margin does not exist for highly interactive applications
where often only a single packet is in flight." where often only a single packet is in flight."
Although fast retransmit is preferable, there are situations where Although fast retransmit is preferable, there are situations where
timeouts are appropriate, or the only choice. For example, if the timeouts are appropriate, or the only choice. For example, if the
network is severely congested and no segments arrive, RTO-based network is severely congested and no segments arrive, RTO-based
recovery should be used. In this situation, the time to recover from recovery should be used. In this situation, the time to recover from
the loss(es) will not be the performance bottleneck. However, for the loss(es) will not be the performance bottleneck. However, for
connections that do not utilize enough capacity to enable fast connections that do not utilize enough capacity to enable fast
retransmit, RTO-based loss detection is the only choice and the time retransmit, RTO-based loss detection is the only choice and the time
required for this can become a serious performance bottleneck. required for this can become a serious performance bottleneck.
3. RTO Restart Algorithm 4. RTOR Algorithm
To enable faster loss recovery for connections that are unable to use To enable faster loss recovery for connections that are unable to use
fast retransmit, an alternative restart can be used. By resetting fast retransmit, RTOR can be used. By resetting the timer to "RTO -
the timer to "RTO - T_earliest", where T_earliest is the time elapsed T_earliest", where T_earliest is the time elapsed since the earliest
since the earliest outstanding segment was transmitted, outstanding segment was transmitted, retransmissions will always
retransmissions will always occur after exactly RTO seconds. This occur after exactly RTO seconds. This approach makes the RTO more
approach makes the RTO more aggressive than the standardized approach aggressive than the standardized approach in [RFC6298] but still
in [RFC6298] but still conforms to the requirement in [RFC6298] that conforms to the requirement in [RFC6298] that segments must not be
segments must not be retransmitted earlier than RTO seconds after retransmitted earlier than RTO seconds after their original
their original transmission. transmission.
This document specifies an OPTIONAL sender-only modification to TCP This document specifies an OPTIONAL sender-only modification to TCP
and SCTP which updates step 5.3 in Section 5 of [RFC6298] (and a and SCTP which updates step 5.3 in Section 5 of [RFC6298] (and a
similar update in Section 6.3.2 of [RFC4960] for SCTP). A sender similar update in Section 6.3.2 of [RFC4960] for SCTP). A sender
that implements this method MUST follow the algorithm below: that implements this method MUST follow the algorithm below:
When an ACK is received that acknowledges new data: When an ACK is received that acknowledges new data, or when a new
data segment has been sent:
(1) Set T_earliest = 0. (1) Set T_earliest = 0.
(2) If the following two conditions hold: (2) If the following two conditions hold:
(a) The number of outstanding segments is less than an RTO (a) The number of outstanding segments is less than an RTOR
restart threshold (rrthresh). The rrthresh SHOULD be threshold (rrthresh). The rrthresh SHOULD be set to
set to four. four.
(b) There is no unsent data ready for transmission. (b) There is no unsent data ready for transmission.
set T_earliest to the time elapsed since the earliest set T_earliest to the time elapsed since the earliest
outstanding segment was sent. outstanding segment was sent.
(3) Restart the retransmission timer so that it will expire after (3) Restart the retransmission timer so that it will expire after
"RTO - T_earliest" seconds (for the current value of RTO). "RTO - T_earliest" seconds (for the current value of RTO).
This update needs TCP implementations to track the time elapsed since This update needs TCP implementations to track the time elapsed since
the transmission of the earliest outstanding segment (T_earliest). the transmission of the earliest outstanding segment (T_earliest).
The modified restart is only necessary to conduct when fast As RTOR is only used when the amount of outstanding data is less than
retransmit cannot be triggered, i.e., when there are fewer than four rrthresh segments, TCP implementations also need to track whether the
segments outstanding. Therefore, only four segments need to be amount of outstanding data is greater than, equal to, or less than rrthresh
tracked by the TCP implementation. Furthermore, some implementations segments. Although some packet-based TCP implementations (e.g.
of TCP (e.g. Linux TCP) already track the transmission times of all Linux TCP) already track both the transmission times of all segments
segments. and also the number of outstanding segments, not all implementations
do. Section 5.3 describes how to implement segment tracking for a
general TCP implementation.
4. Discussion 5. Discussion
In this section, we discuss the applicability and a number of issues In this section, we discuss the applicability and a number of issues
surrounding the modified RTO restart. surrounding RTOR.
4.1. Applicability 5.1. Applicability
The currently standardized algorithm has been shown to add at least The currently standardized algorithm has been shown to add at least
one RTT to the loss recovery process in TCP [LS00] and SCTP one RTT to the loss recovery process in TCP [LS00] and SCTP
[HB11][PBP09]. For applications that have strict timing requirements [HB11][PBP09]. For applications that have strict timing requirements
(e.g. interactive web and gaming) rather than throughput (e.g. interactive web) rather than throughput requirements, using
requirements, the modified restart approach could be important RTOR could be beneficial because the RTT and also the delACK timer of
because the RTT and also the delayed ACK timer of receivers are often receivers are often large components of the effective loss recovery
large components of the effective loss recovery time. Measurements time. Measurements in [HB11] have shown that the total transfer time
in [HB11] have shown that the total transfer time of a lost segment of a lost segment (including the original transmission time and the
(including the original transmission time and the loss recovery time) loss recovery time) can be reduced by 35% using RTOR. These results
can be reduced by 35% using the suggested approach. These results match those presented in [PGH06][PBP09], where RTOR is shown to
match those presented in [PGH06][PBP09], where the modified restart significantly reduce retransmission latency.
approach is shown to significantly reduce retransmission latency.
There are also traffic types that do not benefit from a modified There are also traffic types that do not benefit from RTOR. One
restart behavior of the timer. One example of such traffic is bulk example of such traffic is bulk transmission. The reason why bulk
transmission. The reason why bulk traffic does not benefit from RTO traffic does not benefit from RTOR is related to the number of outstanding
restart is related to the number of outstanding segments that such segments that such flows usually have. Fast retransmit [RFC5681],
flows usually have. Fast retransmit [RFC5681], the preferred loss the preferred loss recovery mechanism, is triggered whenever three
recovery mechanism, is triggered whenever three duplicate dupACKs arrive at a TCP sender. Duplicate acknowledgments are
acknowledgments arrive at a TCP sender. Duplicate acknowledgments generated by a receiver when out-of-order segments arrive. As both
are generated by a receiver when out-of-order segments arrive. As segment loss and segment reordering cause out-of-order arrival, fast
both segment loss and segment reordering cause out-of-order arrival, retransmit waits for three dupACKs before regarding the segment as
fast retransmit waits for three duplicate acknowledgments before lost. Considering this, bulk flows will mostly use fast retransmit
regarding the segment as lost. Considering this, bulk flows will as they often have three or more outstanding segments. Moreover, as
mostly use fast retransmit as they often have three or more RTOR is not activated as long as there are rrthresh, or more,
outstanding segments. Moreover, as the modified restart behavior is segments outstanding, the risk of recovering loss using timeouts
not activated when there are four, or more, segments outstanding instead of fast retransmits can be controlled.
there is no increased risk of recovering loss using timeouts instead
of fast retransmits.
Given RTO restart's ability to only work when it is beneficial for Given RTOR's ability to only work when it is beneficial for the loss
the loss recovery process, it is suitable as a system-wide default recovery process, it is suitable as a system-wide default mechanism
mechanism for TCP traffic. for TCP traffic.
4.2. Spurious Timeouts 5.2. Spurious Timeouts
This document describes a modified RTO restart behavior that, in some RTOR can in some situations reduce the loss detection time and
situations, reduces the loss detection time and thereby increases the thereby increase the risk of spurious timeouts. In theory, the
risk of spurious timeouts. In theory, the retransmission timer has a retransmission timer has a lower bound of 1 second [RFC6298], which
lower bound of 1 second [RFC6298], which limits the risk of having limits the risk of having spurious timeouts. However, in practice
spurious timeouts. However, in practice most implementations use a most implementations use a significantly lower value. Initial
significantly lower value. Initial measurements, conducted by the measurements, conducted by the authors, show slight increases in the
authors, show slight increases in the number of spurious timeouts number of spurious timeouts when such lower values are used.
when such lower values are used. However, further experiments, in However, further experiments, in different environments and with
different environments and with different types of traffic, are different types of traffic, are encouraged to quantify such increases
encouraged to quantify such increases more reliably. more reliably.
Does a slightly increased risk matter? Generally, spurious timeouts Does a slightly increased risk matter? Generally, spurious timeouts
have a negative effect on TCP/SCTP performance as the congestion have a negative effect on TCP/SCTP performance as the congestion
window is reduced to one segment [RFC5681], limiting an application's window is reduced to one segment [RFC5681], limiting an application's
ability to transmit large amounts of data instantaneously. However, ability to transmit large amounts of data instantaneously. However,
with respect to RTO restart, spurious timeouts are only a problem for with respect to RTOR, spurious timeouts are only a problem for
applications transmitting multiple bursts of data within a single applications transmitting multiple bursts of data within a single
flow. Other types of flows, e.g. long-lived bulk flows, are not flow. Other types of flows, e.g. long-lived bulk flows, are not
affected as the algorithm is only applied when the amount of affected as the algorithm is only applied when the amount of
outstanding segments is less than four and no previously unsent data outstanding segments is less than rrthresh and no previously unsent
is available. Furthermore, short-lived and application-limited flows data is available. Furthermore, short-lived and application-limited
are typically not affected as they are too short to experience the flows are typically not affected as they are too short to experience
effect of congestion control or have a transmission rate that is the effect of congestion control or have a transmission rate that is
quickly attainable. quickly attainable.
While a slight increase in spurious timeouts has been observed using While a slight increase in spurious timeouts has been observed using
the modified RTO restart approach, it is not clear whether the RTOR, it is not clear whether the effects of this increase mandate
effects of this increase mandate any future algorithmic changes or any future algorithmic changes or not -- especially since most modern
not -- especially since most modern operating systems already include operating systems already include mechanisms to detect
mechanisms to detect [RFC3522][RFC3708][RFC5682] and resolve [RFC3522][RFC3708][RFC5682] and resolve [RFC4015] possible problems
[RFC4015] possible problems with spurious retransmissions. Further with spurious retransmissions. Further experimentation is needed to
experimentation is needed to determine this and thereby move this determine this and thereby move this specification from experimental
specification from experimental to proposed standard. to proposed standard.
5. Related Work 5.3. Tracking Outstanding Segments
Section 3.2 of [RFC5827] outlines a general method of tracking the
number of outstanding segments. This method can be used by TCP
implementations that do not natively track this number. The basic
idea is to track the segment boundaries of the last transmitted
segments (rrthresh segments for RTOR). In practice this could be
achieved by keeping a circular list of the last rrthresh segment
boundaries. Then, cumulative ACKs that do not fall within this
region indicate that at least rrthresh segments are outstanding.
Similarly, when cumulative ACKs fall within this region, the number
of outstanding segments is smaller.
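One possible way to keep such a circular list is sketched below. It is an illustration only: the class and method names are hypothetical, each boundary is taken to be the sequence number just past the end of a segment, and both sequence-number wraparound and retransmissions are ignored for brevity.

```python
from collections import deque


class OutstandingTracker:
    """Tracks whether fewer than rrthresh segments are outstanding."""

    def __init__(self, rrthresh=4):
        self.rrthresh = rrthresh
        # Circular list holding the end boundaries of the last
        # rrthresh transmitted segments; older entries drop out.
        self._ends = deque(maxlen=rrthresh)
        self.snd_una = 0  # highest cumulative ACK seen so far

    def on_send(self, seq_end):
        """Record the end boundary of a newly transmitted segment."""
        self._ends.append(seq_end)

    def on_ack(self, ack):
        """Record a cumulative ACK (ignores wraparound, by assumption)."""
        self.snd_una = max(self.snd_una, ack)

    def fewer_than_rrthresh_outstanding(self):
        """True if the cumulative ACK falls within the tracked region."""
        unacked = sum(1 for end in self._ends if end > self.snd_una)
        return unacked < self.rrthresh
```

A sender could consult fewer_than_rrthresh_outstanding() when evaluating condition (a) of the RTOR algorithm. Because only rrthresh boundaries are stored, the memory cost per connection is constant.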
6. Related Work
There are several proposals that address the problem of not having There are several proposals that address the problem of not having
enough ACKs for loss recovery. In what follows, we explain why the enough ACKs for loss recovery. In what follows, we explain why the
mechanism described here is complementary to these approaches: mechanism described here is complementary to these approaches:
The limited transmit mechanism [RFC3042] allows a TCP sender to The limited transmit mechanism [RFC3042] allows a TCP sender to
transmit a previously unsent segment for each of the first two transmit a previously unsent segment for each of the first two
duplicate acknowledgments. By transmitting new segments, the sender dupACKs. By transmitting new segments, the sender attempts to
attempts to generate additional duplicate acknowledgments to enable generate additional dupACKs to enable fast retransmit. However,
fast retransmit. However, limited transmit does not help if no limited transmit does not help if no previously unsent data is ready
previously unsent data is ready for transmission or if the receiver for transmission or if the receiver has no buffer space. [RFC5827]
has no buffer space. [RFC5827] specifies an early retransmit specifies an early retransmit algorithm to enable fast loss recovery
algorithm to enable fast loss recovery in such situations. By in such situations. By dynamically lowering the number of dupACKs
dynamically lowering the number of duplicate acknowledgments needed needed for fast retransmit (dupthresh), based on the number of
for fast retransmit (dupthresh), based on the number of outstanding outstanding segments, a smaller number of dupACKs is needed to
segments, a smaller number of duplicate acknowledgments are needed to
trigger a retransmission. In some situations, however, the algorithm trigger a retransmission. In some situations, however, the algorithm
is of no use or might not work properly. First, if a single segment is of no use or might not work properly. First, if a single segment
is outstanding, and lost, it is impossible to use early retransmit. is outstanding, and lost, it is impossible to use early retransmit.
Second, if ACKs are lost, the early retransmit cannot help. Third, Second, if ACKs are lost, the early retransmit cannot help. Third,
if the network path reorders segments, the algorithm might cause more if the network path reorders segments, the algorithm might cause more
unnecessary retransmissions than fast retransmit. unnecessary retransmissions than fast retransmit.
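The dupthresh adjustment described above can be sketched as follows. This is a minimal illustration, not text from the draft: it assumes the segment-based variant of early retransmit in [RFC5827], where the threshold becomes the number of outstanding segments minus one when fewer than four segments are outstanding and no previously unsent data can be sent. The function name and its parameters are hypothetical.

```python
from typing import Optional

STANDARD_DUPTHRESH = 3  # fast retransmit threshold assumed by the draft [RFC5681]


def early_retransmit_dupthresh(outstanding_segments: int,
                               new_data_available: bool) -> Optional[int]:
    """Return the number of dupACKs needed to trigger a retransmission.

    Hypothetical helper: returns None when no dupACK-based recovery is
    possible, i.e. at most one segment is outstanding, so its loss can
    only be repaired by the retransmission timer.
    """
    if outstanding_segments <= 1:
        # A single lost segment generates no dupACKs at all.
        return None
    if outstanding_segments < 4 and not new_data_available:
        # Early retransmit: lower the threshold to (outstanding - 1).
        return outstanding_segments - 1
    # Enough segments in flight, or limited transmit can send new data.
    return STANDARD_DUPTHRESH


if __name__ == "__main__":
    for oseg in (1, 2, 3, 4):
        print(oseg, early_retransmit_dupthresh(oseg, new_data_available=False))
```

The None case corresponds to the first limitation listed above: with one outstanding segment, only the RTO can recover the loss, which is the situation RTOR targets.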
Following the fast retransmit mechanism standardized in [RFC5681] Following the fast retransmit mechanism standardized in [RFC5681]
this draft assumes a value of 3 for dupthresh, which is used as basis this draft assumes a value of 3 for dupthresh, which is used as a
for rrthresh. However, by considering a dynamic value for dupthresh basis for rrthresh. However, by considering a dynamic value for
a tighter integration with early retransmit (or other experimental dupthresh a tighter integration with early retransmit (or other
algorithms) could also be possible. experimental algorithms) could also be possible.
Tail Loss Probe [TLP] is a proposal to send up to two "probe Tail Loss Probe [TLP] is a proposal to send up to two "probe
segments" when a timer fires which is set to a value smaller than the segments" when a timer fires which is set to a value smaller than the
RTO. A "probe segment" is a new segment if new data is available, RTO. A "probe segment" is a new segment if new data is available,
else a retransmission. The intention is to compensate for sluggish else a retransmission. The intention is to compensate for sluggish
RTO behavior in situations where the RTO greatly exceeds the RTT, RTO behavior in situations where the RTO greatly exceeds the RTT,
which, according to measurements reported in [TLP], is not uncommon. which, according to measurements reported in [TLP], is not uncommon.
The Probe timeout (PTO) is normally two RTTs, and a spurious PTO is The Probe timeout (PTO) is normally two RTTs, and a spurious PTO is
less risky than a spurious RTO because it would not have the same less risky than a spurious RTO because it would not have the same
negative effects (clearing the scoreboard and restarting with slow- negative effects (clearing the scoreboard and restarting with slow-
start). In contrast, RTO restart is trying to make the RTO more start). In contrast, RTOR is trying to make the RTO more appropriate
appropriate in cases where there is no need to be overly cautious. in cases where there is no need to be overly cautious.
TLP is applicable in situations where RTO restart does not apply, and TLP is applicable in situations where RTOR does not apply, and it
it could overrule (yielding a similar general behavior, but with a could overrule (yielding a similar general behavior, but with a lower
lower timeout) RTO restart in cases where the number of outstanding timeout) RTOR in cases where the number of outstanding segments is
segments is smaller than four and no new segments are available for smaller than four and no new segments are available for transmission.
transmission. The PTO has the same inherent problem of restarting The PTO has the same inherent problem of restarting the timer on an
the timer on an incoming ACK, and could be combined with the modified incoming ACK, and could be combined with a strategy similar to RTOR's
restart approach to offer more consistent timeouts. to offer more consistent timeouts.
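To illustrate why the RTO can greatly exceed the RTT, the following sketch compares the RTO computed per [RFC6298] with a simplified PTO. It is an illustration under stated assumptions, not part of the draft: the class and method names are hypothetical, the PTO is modeled as exactly two smoothed RTTs (the "normally two RTTs" mentioned above, ignoring any other adjustments in [TLP]), and the clock granularity value is an arbitrary choice.

```python
from dataclasses import dataclass

ALPHA = 1 / 8   # SRTT gain [RFC6298]
BETA = 1 / 4    # RTTVAR gain [RFC6298]
K = 4           # RTTVAR multiplier [RFC6298]
MIN_RTO = 1.0   # seconds; lower bound on the RTO [RFC6298]


@dataclass
class RttEstimator:
    """Hypothetical RTT estimator following the RFC 6298 update rules."""
    srtt: float = 0.0
    rttvar: float = 0.0
    initialized: bool = False

    def update(self, sample: float) -> None:
        if not self.initialized:
            # First RTT measurement.
            self.srtt = sample
            self.rttvar = sample / 2
            self.initialized = True
        else:
            # RTTVAR is updated before SRTT, using the old SRTT.
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - sample)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * sample

    def rto(self, granularity: float = 0.001) -> float:
        # granularity is an assumed clock granularity in seconds.
        return max(MIN_RTO, self.srtt + max(granularity, K * self.rttvar))

    def pto(self) -> float:
        # Simplifying assumption: PTO is two smoothed RTTs.
        return 2 * self.srtt


if __name__ == "__main__":
    est = RttEstimator()
    for _ in range(20):
        est.update(0.05)  # a stable 50 ms path
    print("RTO:", est.rto(), "PTO:", est.pto())
```

On a stable low-latency path the one-second minimum dominates the RTO, so the PTO is roughly an order of magnitude smaller in this sketch.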
6. Acknowledgements 7. Acknowledgements
The authors wish to thank Godred Fairhurst, Yuchung Cheng, Mark The authors wish to thank Godred Fairhurst, Yuchung Cheng, Mark
Allman, Anantha Ramaiah, Richard Scheffenegger, and Nicolas Kuhn for Allman, Anantha Ramaiah, Richard Scheffenegger, Nicolas Kuhn, and
commenting on the draft and the ideas behind it. Alexander Zimmermann for commenting on the draft and the ideas behind
it.
All the authors are supported by RITE (http://riteproject.eu/ ), a All the authors are supported by RITE (http://riteproject.eu/ ), a
research project (ICT-317700) funded by the European Community under research project (ICT-317700) funded by the European Community under
its Seventh Framework Program. The views expressed here are those of its Seventh Framework Program. The views expressed here are those of
the author(s) only. The European Commission is not liable for any the author(s) only. The European Commission is not liable for any
use that may be made of the information in this document. use that may be made of the information in this document.
7. IANA Considerations 8. IANA Considerations
This memo includes no request to IANA. This memo includes no request to IANA.
8. Security Considerations 9. Security Considerations
This document discusses a change in how to set the retransmission This document discusses a change in how to set the retransmission
timer's value when restarted. This change does not raise any new timer's value when restarted. This change does not raise any new
security issues with TCP or SCTP. security issues with TCP or SCTP.
9. Changes from Previous Versions 10. Changes from Previous Versions
9.1. Changes from draft-ietf-...-01 to -02 RFC-Editor note: please remove this section prior to publication.
10.1. Changes from draft-ietf-...-02 to -03
o Updated the document to use "RTOR" instead of "RTO Restart" when
referring to the modified algorithm.
o Moved document terminology to a section of its own.
o Introduced the rrthresh variable in the terminology section.
o Added a section to generalize the tracking of outstanding
segments.
o Updated the algorithm to work when the number of outstanding
segments is less than four and one segment is ready for
transmission, by restarting the timer when new data has been sent.
o Clarified the relationship between fast retransmit and RTOR.
o Improved the wording throughout the document.
10.2. Changes from draft-ietf-...-01 to -02
o Changed the algorithm description in Section 3 to use formal RFC o Changed the algorithm description in Section 3 to use formal RFC
2119 language. 2119 language.
o Changed last paragraph of Section 3 to clarify why the RTO restart o Changed last paragraph of Section 3 to clarify why the RTO restart
algorithm is active when less than four segments are outstanding. algorithm is active when less than four segments are outstanding.
o Added two paragraphs in Section 4.1 to clarify why the algorithm o Added two paragraphs in Section 4.1 to clarify why the algorithm
can be turned on for all TCP traffic without having any negative can be turned on for all TCP traffic without having any negative
effects on traffic patterns that do not benefit from a modified effects on traffic patterns that do not benefit from a modified
timer restart. timer restart.
o Improved the wording throughout the document. o Improved the wording throughout the document.
o Replaced and updated some references. o Replaced and updated some references.
9.2. Changes from draft-ietf-...-00 to -01 10.3. Changes from draft-ietf-...-00 to -01
o Improved the wording throughout the document. o Improved the wording throughout the document.
o Removed the possibility for a connection limited by the receiver's o Removed the possibility for a connection limited by the receiver's
advertised window to use RTO restart, decreasing the risk of advertised window to use RTO restart, decreasing the risk of
spurious retransmission timeouts. spurious retransmission timeouts.
o Added a section that discusses the applicability of and problems o Added a section that discusses the applicability of and problems
related to the RTO restart mechanism. related to the RTO restart mechanism.
o Updated the text describing the relationship to TLP to reflect o Updated the text describing the relationship to TLP to reflect
updates made in this draft. updates made in this draft.
o Added acknowledgments. o Added acknowledgments.
10. References 11. References
10.1. Normative References 11.1. Normative References
[RFC1122] Braden, R., "Requirements for Internet Hosts - [RFC1122] Braden, R., "Requirements for Internet Hosts -
Communication Layers", STD 3, RFC 1122, October 1989. Communication Layers", STD 3, RFC 1122, October 1989.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3042] Allman, M., Balakrishnan, H., and S. Floyd, "Enhancing [RFC3042] Allman, M., Balakrishnan, H., and S. Floyd, "Enhancing
TCP's Loss Recovery Using Limited Transmit", RFC 3042, TCP's Loss Recovery Using Limited Transmit", RFC 3042,
January 2001. January 2001.
skipping to change at page 10, line 43 skipping to change at page 11, line 33
September 2009. September 2009.
[RFC5827] Allman, M., Avrachenkov, K., Ayesta, U., Blanton, J., and [RFC5827] Allman, M., Avrachenkov, K., Ayesta, U., Blanton, J., and
P. Hurtig, "Early Retransmit for TCP and Stream Control P. Hurtig, "Early Retransmit for TCP and Stream Control
Transmission Protocol (SCTP)", RFC 5827, May 2010. Transmission Protocol (SCTP)", RFC 5827, May 2010.
[RFC6298] Paxson, V., Allman, M., Chu, J., and M. Sargent, [RFC6298] Paxson, V., Allman, M., Chu, J., and M. Sargent,
"Computing TCP's Retransmission Timer", RFC 6298, June "Computing TCP's Retransmission Timer", RFC 6298, June
2011. 2011.
10.2. Informative References 11.2. Informative References
[EL04] Ekstroem, H. and R. Ludwig, "The Peak-Hopper: A New End- [EL04] Ekstroem, H. and R. Ludwig, "The Peak-Hopper: A New End-
to-End Retransmission Timer for Reliable Unicast to-End Retransmission Timer for Reliable Unicast
Transport", IEEE INFOCOM 2004, March 2004. Transport", IEEE INFOCOM 2004, March 2004.
[FDT13] Flach, T., Dukkipati, N., Terzis, A., Raghavan, B., [FDT13] Flach, T., Dukkipati, N., Terzis, A., Raghavan, B.,
Cardwell, N., Cheng, Y., Jain, A., Hao, S., Katz-Bassett, Cardwell, N., Cheng, Y., Jain, A., Hao, S., Katz-Bassett,
E., and R. Govindan, "Reducing Web Latency: the Virtue of E., and R. Govindan, "Reducing Web Latency: the Virtue of
Gentle Aggression", Proc. ACM SIGCOMM Conf., August 2013. Gentle Aggression", Proc. ACM SIGCOMM Conf., August 2013.
 End of changes. 46 change blocks. 
157 lines changed or deleted 194 lines changed or added