draft-ietf-tcpm-frto-00.txt   draft-ietf-tcpm-frto-01.txt 
Internet Engineering Task Force P. Sarolahti Internet Engineering Task Force P. Sarolahti
INTERNET DRAFT Nokia Research Center INTERNET DRAFT Nokia Research Center
File: draft-ietf-tcpm-frto-00.txt M. Kojo File: draft-ietf-tcpm-frto-01.txt M. Kojo
University of Helsinki University of Helsinki
May, 2004 July, 2004
Expires: November, 2004 Expires: January, 2005
F-RTO: An Algorithm for Detecting F-RTO: An Algorithm for Detecting
Spurious Retransmission Timeouts with TCP and SCTP Spurious Retransmission Timeouts with TCP and SCTP
Status of this Memo Status of this Memo
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of [RFC2026]. all provisions of Section 10 of RFC 2026.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as
Drafts. Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This document may not be modified, and derivative works of it may not
be created, except to publish it as an RFC and to translate it into
languages other than English.
Copyright Notice Copyright Notice
Copyright (C) The Internet Society (2004). All Rights Reserved. Copyright (C) The Internet Society (2004). All Rights Reserved.
Abstract Abstract
Spurious retransmission timeouts cause suboptimal TCP performance, Spurious retransmission timeouts cause suboptimal TCP performance,
because they often result in unnecessary retransmission of the last because they often result in unnecessary retransmission of the last
window of data. This document describes the F-RTO detection algorithm window of data. This document describes the F-RTO detection algorithm
for detecting spurious TCP retransmission timeouts. F-RTO is a TCP for detecting spurious TCP retransmission timeouts. F-RTO is a TCP
sender only algorithm that does not require any TCP options to sender-only algorithm that does not require any TCP options to
operate. After retransmitting the first unacknowledged segment operate. After retransmitting the first unacknowledged segment
triggered by a timeout, the F-RTO algorithm at a TCP sender monitors triggered by a timeout, the F-RTO algorithm at a TCP sender monitors
the incoming acknowledgments to determine whether the timeout was the incoming acknowledgments to determine whether the timeout was
spurious and to decide whether to send new segments or retransmit spurious and to decide whether to send new segments or retransmit
unacknowledged segments. The algorithm effectively helps to avoid unacknowledged segments. The algorithm effectively helps to avoid
additional unnecessary retransmissions and thereby improves TCP additional unnecessary retransmissions and thereby improves TCP
performance in case of a spurious timeout. The F-RTO algorithm can performance in case of a spurious timeout. The F-RTO algorithm can
also be applied to SCTP. also be applied to SCTP.
Terminology Terminology
The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
document, are to be interpreted as described in [RFC2119]. document, are to be interpreted as described in [RFC2119].
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
2. F-RTO Algorithm . . . . . . . . . . . . . . . . . . . . . . . 5
2.1 The Algorithm . . . . . . . . . . . . . . . . . . . . . 5
2.2 Discussion . . . . . . . . . . . . . . . . . . . . . . 6
3. SACK-enhanced version of the F-RTO algorithm . . . . . . . . 8
4. Taking Actions after Detecting Spurious RTO . . . . . . . . . 10
5. SCTP Considerations . . . . . . . . . . . . . . . . . . . . . 10
6. Security Considerations . . . . . . . . . . . . . . . . . . . 11
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12
8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 12
9. References . . . . . . . . . . . . . . . . . . . . . . . . . 13
Appendix A: Scenarios . . . . . . . . . . . . . . . . . . . . . . 14
Appendix B: SACK-enhanced F-RTO and Fast Recovery . . . . . . . . 19
Appendix C: Discussion on Window Limited Cases . . . . . . . . . 20
1. Introduction 1. Introduction
The Transmission Control Protocol (TCP) [Pos81] has two methods for The Transmission Control Protocol (TCP) [Pos81] has two methods for
triggering retransmissions. First, the TCP sender relies on incoming triggering retransmissions. First, the TCP sender relies on incoming
duplicate ACKs, which indicate that the receiver is missing some of duplicate ACKs, which indicate that the receiver is missing some of
the data. After a required number of successive duplicate ACKs have the data. After a required number of successive duplicate ACKs have
arrived at the sender, it retransmits the first unacknowledged arrived at the sender, it retransmits the first unacknowledged
segment [APS99] and continues with a loss recovery algorithm such as segment [APS99] and continues with a loss recovery algorithm such as
NewReno [FHG04] or SACK-based loss recovery [BAFW03]. Second, the TCP NewReno [FHG04] or SACK-based loss recovery [BAFW03]. Second, the TCP
sender maintains a retransmission timer which triggers retransmission sender maintains a retransmission timer which triggers retransmission
of segments, if they have not been acknowledged before the of segments, if they have not been acknowledged before the
retransmission timeout (RTO) expires. When the retransmission timeout retransmission timeout (RTO) expires. When the retransmission timeout
occurs, the TCP sender enters the RTO recovery where congestion occurs, the TCP sender enters the RTO recovery where the congestion
window is initialized to one segment and unacknowledged segments are window is initialized to one segment and unacknowledged segments are
retransmitted using the slow-start algorithm. The retransmission retransmitted using the slow-start algorithm. The retransmission
timer is adjusted dynamically based on the measured round-trip times timer is adjusted dynamically based on the measured round-trip times
[PA00]. [PA00].
It has been pointed out that the retransmission timer can expire It has been pointed out that the retransmission timer can expire
spuriously and cause unnecessary retransmissions when no segments spuriously and cause unnecessary retransmissions when no segments
have been lost [LK00, GL02, LM03]. After a spurious retransmission have been lost [LK00, GL02, LM03]. After a spurious retransmission
timeout the late acknowledgments of the original segments arrive at timeout the late acknowledgments of the original segments arrive at
the sender, usually triggering unnecessary retransmissions of whole the sender, usually triggering unnecessary retransmissions of a whole
window of segments during the RTO recovery. Furthermore, after a window of segments during the RTO recovery. Furthermore, after a
spurious retransmission timeout a conventional TCP sender increases spurious retransmission timeout a conventional TCP sender increases
the congestion window on each late acknowledgment in slow start, the congestion window on each late acknowledgment in slow start,
injecting a large number of data segments to the network within one injecting a large number of data segments to the network within one
round-trip time, thus violating the packet conservation principle round-trip time, thus violating the packet conservation principle
[Jac88]. [Jac88].
There are a number of potential reasons for spurious retransmission There are a number of potential reasons for spurious retransmission
timeouts. First, some mobile networking technologies involve sudden timeouts. First, some mobile networking technologies involve sudden
delay spikes on transmission because of actions taken during a hand- delay spikes on transmission because of actions taken during a
off. Second, arrival of competing traffic, possibly with higher hand-off. Second, arrival of competing traffic, possibly with higher
priority, on a low-bandwidth link or some other change in available priority, on a low-bandwidth link or some other change in available
bandwidth involves a sudden increase of round-trip time which may bandwidth can cause a sudden increase of round-trip time which may
trigger a spurious retransmission timeout. A persistently reliable trigger a spurious retransmission timeout. A persistently reliable
link layer can also cause a sudden delay when a data frame and link layer can also cause a sudden delay when a data frame and
several retransmissions of it are lost for some reason. This document several retransmissions of it are lost for some reason. This document
does not distinguish the different causes of such a delay spike, but does not distinguish between the different causes of such a delay
discusses the spurious retransmission timeouts caused by a delay spike, but discusses the spurious retransmission timeouts caused by a
spike in general. delay spike in general.
This document describes the F-RTO detection algorithm. It is based on This document describes the F-RTO detection algorithm. It is based on
the detection mechanism of the "Forward RTO-Recovery" (F-RTO) the detection mechanism of the "Forward RTO-Recovery" (F-RTO)
algorithm [SKR03] that is used for detecting spurious retransmission algorithm [SKR03] that is used for detecting spurious retransmission
timeouts and thus avoiding unnecessary retransmissions following the timeouts and thus avoiding unnecessary retransmissions following the
retransmission timeout. When the timeout is not spurious, the F-RTO retransmission timeout. When the timeout is not spurious, the F-RTO
algorithm reverts back to the conventional RTO recovery algorithm and algorithm reverts back to the conventional RTO recovery algorithm and
therefore has similar behavior and performance. F-RTO does not therefore has similar behavior and performance. In contrast to
require any TCP options in its operation, and it can be implemented alternative algorithms proposed for detecting unnecessary
by modifying only the TCP sender. This is different from alternative retransmissions (Eifel [LK00], [LM03] and DSACK-based algorithms
algorithms (Eifel [LK00], [LM03] and DSACK-based algorithms [BA04]) [BA04]), F-RTO does not require any TCP options for its operation,
that have been suggested for detecting unnecessary retransmissions. and it can be implemented by modifying only the TCP sender. The
The Eifel algorithm uses TCP timestamps [BBJ92] for detecting a Eifel algorithm uses TCP timestamps [BBJ92] for detecting a spurious
spurious timeout upon arrival of the first acknowledgment after the timeout upon arrival of the first acknowledgment after the
retransmission. The DSACK-based algorithms require that the TCP retransmission. The DSACK-based algorithms require that the TCP
Selective Acknowledgment Option [MMFR96] with the DSACK extension Selective Acknowledgment Option [MMFR96] with the DSACK extension
[FMMP00] is in use. With DSACK, the TCP receiver can report if it has [FMMP00] is in use. With DSACK, the TCP receiver can report if it has
received a duplicate segment, making it possible for the sender to received a duplicate segment, making it possible for the sender to
detect afterwards whether it has retransmitted segments detect afterwards whether it has retransmitted segments
unnecessarily. The F-RTO algorithm only attempts to detect and avoid unnecessarily. The F-RTO algorithm only attempts to detect and avoid
unnecessary retransmissions after an RTO. Eifel and DSACK can also be unnecessary retransmissions after an RTO. Eifel and DSACK can also be
used for detecting unnecessary retransmissions caused by other used for detecting unnecessary retransmissions caused by other
events, for example packet reordering. events, for example packet reordering.
skipping to change at page 3, line 47 skipping to change at page 4, line 33
unacknowledged segment as usual [APS99]. Deviating from the normal unacknowledged segment as usual [APS99]. Deviating from the normal
operation after a timeout, it then tries to transmit new, previously operation after a timeout, it then tries to transmit new, previously
unsent data, for the first acknowledgment that arrives after the unsent data, for the first acknowledgment that arrives after the
timeout given that the acknowledgment advances the window. If the timeout given that the acknowledgment advances the window. If the
second acknowledgment that arrives after the timeout also advances second acknowledgment that arrives after the timeout also advances
the window, i.e., acknowledges data that was not retransmitted, the the window, i.e., acknowledges data that was not retransmitted, the
F-RTO sender declares the timeout spurious and exits the RTO F-RTO sender declares the timeout spurious and exits the RTO
recovery. However, if either of these two acknowledgments is a recovery. However, if either of these two acknowledgments is a
duplicate ACK, there is not sufficient evidence of a spurious timeout; duplicate ACK, there is not sufficient evidence of a spurious timeout;
therefore the F-RTO sender retransmits the unacknowledged segments in therefore the F-RTO sender retransmits the unacknowledged segments in
slow start similarly to the traditional algorithm. With a SACK- slow start similarly to the traditional algorithm. With a
enhanced version of the F-RTO algorithm, spurious timeouts may be SACK-enhanced version of the F-RTO algorithm, spurious timeouts may
detected even if duplicate ACKs arrive after an RTO retransmission. be detected even if duplicate ACKs arrive after an RTO
retransmission.
The F-RTO algorithm can also be applied with the Stream Control The F-RTO algorithm can also be applied to the Stream Control
Transmission Protocol (SCTP) [Ste00], because SCTP has similar Transmission Protocol (SCTP) [Ste00], because SCTP has similar
acknowledgment and packet retransmission concepts as TCP. For acknowledgment and packet retransmission concepts as TCP. For
convenience, this document mostly refers to TCP, but the algorithms convenience, this document mostly refers to TCP, but the algorithms
and other discussion are valid also with SCTP. and other discussion are valid for SCTP as well.
This document is organized as follows. Section 2 describes the basic This document is organized as follows. Section 2 describes the basic
F-RTO algorithm. Section 3 outlines an optional enhancement to the F- F-RTO algorithm. Section 3 outlines an optional enhancement to the
RTO algorithm that takes leverage on the TCP SACK option. Section 4 F-RTO algorithm that takes advantage of the TCP SACK option. Section
discusses the possible actions to be taken after detecting a spurious 4 discusses the possible actions to be taken after detecting a
RTO. Section 5 gives considerations on applying F-RTO with SCTP, and spurious RTO. Section 5 gives considerations on applying F-RTO with
Section 6 discusses the security considerations. SCTP, and Section 6 discusses the security considerations.
2. F-RTO Algorithm 2. F-RTO Algorithm
A spurious timeout is a timeout that would not have had to occur if A timeout is considered spurious if it would have been avoided had
the sender had waited longer for an acknowledgment to arrive [LM03]. the sender waited longer for an acknowledgment to arrive [LM03].
F-RTO affects the TCP sender behavior only after a retransmission F-RTO affects the TCP sender behavior only after a retransmission
timeout; otherwise the TCP behavior remains unmodified. When the RTO timeout; otherwise the TCP behavior remains the same. When the RTO
expires, the F-RTO algorithm monitors incoming acknowledgments and expires, the F-RTO algorithm monitors incoming acknowledgments and
declares a timeout spurious if the TCP sender gets an acknowledgment declares a timeout spurious if the TCP sender gets an acknowledgment
for a segment that was not retransmitted due to timeout. The actions for a segment that was not retransmitted due to timeout. The actions
taken in response to a spurious timeout are not specified in this taken in response to a spurious timeout are not specified in this
document, but we discuss the different alternatives in Section 4. document, but we discuss some alternatives in Section 4. This section
This section first describes the algorithm and then discusses the introduces the algorithm and then discusses the different steps of
different steps of the algorithm in more detail. the algorithm in more detail.
Following the practice used with the Eifel Detection algorithm Following the practice used with the Eifel Detection algorithm
[LM03], we use the "SpuriousRecovery" variable to indicate whether [LM03], we use the "SpuriousRecovery" variable to indicate whether
the retransmission is declared spurious by the sender. This variable the retransmission is declared spurious by the sender. This variable
can be used as an input for a corresponding response algorithm. With can be used as an input for a corresponding response algorithm. With
F-RTO, the value of SpuriousRecovery can either be SPUR_TO, F-RTO, the value of SpuriousRecovery can be either SPUR_TO,
indicating a spurious retransmission timeout; or FALSE, when the indicating a spurious retransmission timeout, or FALSE, when the
timeout is not declared spurious, and the TCP sender should follow timeout is not declared spurious, and the TCP sender should follow
the conventional RTO recovery algorithm. the conventional RTO recovery algorithm.
2.1. The Algorithm 2.1. The Algorithm
A TCP sender MAY implement the basic F-RTO algorithm, and if it A TCP sender MAY implement the basic F-RTO algorithm, and if it
chooses to apply the algorithm, the following steps MUST be taken chooses to apply the algorithm, the following steps MUST be taken
after the retransmission timer expires. If the sender implements some after the retransmission timer expires. If the sender implements some
other loss recovery algorithm than Reno or NewReno [FHG04], the F-RTO loss recovery algorithm other than Reno or NewReno [FHG04], the F-RTO
algorithm SHOULD NOT be entered when earlier fast recovery is algorithm SHOULD NOT be entered when earlier fast recovery is
underway. underway.
1) When RTO expires, the TCP sender SHOULD retransmit the first 1) When RTO expires, the TCP sender SHOULD retransmit the first
unacknowledged segment and set SpuriousRecovery to FALSE. Also, unacknowledged segment and set SpuriousRecovery to FALSE. Also,
the TCP SHOULD store the highest sequence number transmitted so the TCP SHOULD store the highest sequence number transmitted so
far in variable "recover". far in variable "recover".
2) When the first acknowledgment after the RTO retransmission arrives 2) When the first acknowledgment after the RTO retransmission arrives
at the sender, the sender chooses the following actions depending at the sender, the sender chooses the following actions depending
skipping to change at page 5, line 27 skipping to change at page 6, line 15
The TCP sender MUST NOT enter step 3 of this algorithm, and the The TCP sender MUST NOT enter step 3 of this algorithm, and the
SpuriousRecovery variable remains as FALSE. SpuriousRecovery variable remains as FALSE.
b) Else, if the acknowledgment advances the window AND it is below b) Else, if the acknowledgment advances the window AND it is below
the value of "recover", the TCP sender SHOULD transmit up to the value of "recover", the TCP sender SHOULD transmit up to
two new (previously unsent) segments and enter step 3 of this two new (previously unsent) segments and enter step 3 of this
algorithm. If the TCP sender does not have enough unsent data, algorithm. If the TCP sender does not have enough unsent data,
it SHOULD send only one segment. In addition, the TCP sender it SHOULD send only one segment. In addition, the TCP sender
MAY override the Nagle algorithm [Nag84] and immediately send a MAY override the Nagle algorithm [Nag84] and immediately send a
segment if needed. Note that sending two segments in this step segment if needed. Note that sending two segments in this step
is allowed by TCP congestion control requirements [APS99], but is allowed by TCP congestion control requirements [APS99]: An
F-RTO changes which segments are transmitted. F-RTO TCP sender simply chooses different segments to transmit.
If the TCP sender does not have any new data to send, or the If the TCP sender does not have any new data to send, or the
advertised window limits the transmission, the recommended advertised window prohibits new transmissions, the recommended
action is to not enter step 3 of this algorithm but continue action is to skip step 3 of this algorithm and continue with
with slow start retransmissions following the conventional RTO slow start retransmissions following the conventional RTO
recovery algorithm. However, alternative ways of handling the recovery algorithm. However, alternative ways of handling the
window limited cases that could result in better performance window limited cases that could result in better performance
are discussed in Appendix C. are discussed in Appendix C.
3) When the second acknowledgment after the RTO retransmission 3) When the second acknowledgment after the RTO retransmission
arrives at the sender, the TCP sender either declares the timeout arrives at the sender, the TCP sender either declares the timeout
spurious, or starts retransmitting the unacknowledged segments. spurious, or starts retransmitting the unacknowledged segments.
a) If the acknowledgment is a duplicate ACK, the TCP sender MUST a) If the acknowledgment is a duplicate ACK, the TCP sender MUST
set congestion window to no more than 3 * MSS, and continue set the congestion window to no more than 3 * MSS, and continue
with the slow start algorithm retransmitting unacknowledged with the slow start algorithm retransmitting unacknowledged
segments. The congestion window can be set to 3 * MSS, because two segments. The congestion window can be set to 3 * MSS, because two
round-trip times have elapsed since the RTO, and a conventional round-trip times have elapsed since the RTO, and a conventional
TCP sender would have increased cwnd to 3 * MSS during the same time. TCP sender would have increased cwnd to 3 * MSS during the same time.
The sender leaves SpuriousRecovery set to FALSE. The sender leaves SpuriousRecovery set to FALSE.
b) If the acknowledgment advances the window, i.e., it acknowledges b) If the acknowledgment advances the window, i.e., it acknowledges
data that was not retransmitted after the timeout, the TCP data that was not retransmitted after the timeout, the TCP
sender SHOULD declare the timeout spurious, set sender SHOULD declare the timeout spurious, set
SpuriousRecovery to SPUR_TO and set the value of "recover" SpuriousRecovery to SPUR_TO and set the value of "recover"
variable to SND.UNA, the oldest unacknowledged sequence number variable to SND.UNA, the oldest unacknowledged sequence number
[Pos81]. [Pos81].
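
The steps above can be sketched as a small state machine. This is only an illustration under stated assumptions, not a reference implementation: the class name FrtoSender, the returned action strings, the MSS value, and the fields "unsent_segments" and "window_open" are hypothetical; "recover" is kept as the next-sequence edge (SND.NXT at the time of the timeout) to simplify comparisons; and the condition used for step 2a is inferred from the discussion text (a duplicate ACK, or an ACK that covers "recover"), because the exact wording of that step is not shown here.

```python
from dataclasses import dataclass

MSS = 1000          # assumed segment size in bytes (illustrative value)
FALSE = "FALSE"     # SpuriousRecovery: timeout not declared spurious
SPUR_TO = "SPUR_TO" # SpuriousRecovery: spurious retransmission timeout


@dataclass
class FrtoSender:
    snd_una: int                 # SND.UNA, oldest unacknowledged sequence number
    snd_nxt: int                 # next sequence number for new data
    cwnd: int = 10 * MSS         # congestion window in bytes
    unsent_segments: int = 2     # hypothetical: new segments queued for sending
    window_open: bool = True     # hypothetical: advertised window permits new data
    recover: int = 0
    spurious_recovery: str = FALSE
    step: int = 0                # 0: F-RTO inactive; 2 or 3: waiting for that step's ACK

    def on_rto(self) -> str:
        # Step 1: retransmit the first unacknowledged segment.
        self.spurious_recovery = FALSE
        self.recover = self.snd_nxt  # assumption: edge just past the highest byte sent
        self.step = 2
        return "retransmit_first_unacked"

    def on_ack(self, ack: int) -> str:
        advances = ack > self.snd_una  # otherwise treated as a duplicate ACK
        if advances:
            self.snd_una = ack
        if self.step == 2:
            return self._step2(ack, advances)
        if self.step == 3:
            return self._step3(advances)
        return "normal_processing"

    def _conventional(self) -> str:
        # Leave F-RTO; SpuriousRecovery stays FALSE.
        self.step = 0
        return "conventional_rto_recovery"

    def _step2(self, ack: int, advances: bool) -> str:
        # 2a (inferred): duplicate ACK, or ACK covering "recover".
        if not advances or ack >= self.recover:
            return self._conventional()
        # 2b: no new data or window limited, so fall back as recommended.
        if self.unsent_segments == 0 or not self.window_open:
            return self._conventional()
        # 2b: send up to two new segments and wait for the second ACK.
        n = min(2, self.unsent_segments)
        self.unsent_segments -= n
        self.snd_nxt += n * MSS
        self.step = 3
        return f"send_{n}_new_segments"

    def _step3(self, advances: bool) -> str:
        self.step = 0
        if not advances:
            # 3a: cap cwnd at 3 * MSS and retransmit in slow start.
            self.cwnd = min(self.cwnd, 3 * MSS)
            return "slow_start_retransmit"
        # 3b: second ACK also advances the window, so the timeout is spurious.
        self.spurious_recovery = SPUR_TO
        self.recover = self.snd_una
        return "timeout_spurious"
```

In this sketch a sender would call on_rto() when the retransmission timer expires and on_ack() for each arriving acknowledgment. The response to a spurious timeout is deliberately left out, since it is outside the scope of the detection algorithm.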
2.2. Discussion 2.2. Discussion
The F-RTO sender takes cautious actions when it receives duplicate The F-RTO sender takes cautious actions when it receives duplicate
acknowledgments after a retransmission timeout. Since duplicate ACKs acknowledgments after a retransmission timeout. Since duplicate ACKs
may indicate that segments have been lost, reliably detecting a may indicate that segments have been lost, reliably detecting a
spurious timeout is difficult in the lack of additional information. spurious timeout is difficult due to the lack of additional
Therefore the safest alternative is to follow the conventional TCP information. Therefore, it is prudent to follow the conventional TCP
recovery in those cases. recovery in those cases.
If the first acknowledgment after the RTO retransmission covers the If the first acknowledgment after the RTO retransmission covers the
"recover" point at algorithm step (2a), there is not enough evidence "recover" point at algorithm step (2a), there is not enough evidence
that a non-retransmitted segment has arrived at the receiver after that a non-retransmitted segment has arrived at the receiver after
the timeout. This is a common case when a fast retransmission is the timeout. This is a common case when a fast retransmission is
lost and it has been retransmitted again after an RTO, while the rest lost and it has been retransmitted again after an RTO, while the rest
of the unacknowledged segments have successfully been delivered to of the unacknowledged segments have successfully been delivered to
the TCP receiver before the retransmission timeout. Therefore the the TCP receiver before the retransmission timeout. Therefore the
timeout cannot be declared spurious in this case. timeout cannot be declared spurious in this case.
skipping to change at page 6, line 45 skipping to change at page 7, line 33
segments when the first new ACK arrives after the RTO retransmission. segments when the first new ACK arrives after the RTO retransmission.
If sending new data is not possible in algorithm branch (2b), or the If sending new data is not possible in algorithm branch (2b), or the
receiver window limits the transmission, the TCP sender has to send receiver window limits the transmission, the TCP sender has to send
something in order to prevent the TCP transfer from stalling. If no something in order to prevent the TCP transfer from stalling. If no
segments were sent, the pipe between sender and receiver may run out segments were sent, the pipe between sender and receiver may run out
of segments, and no further acknowledgments would arrive. In this of segments, and no further acknowledgments would arrive. In this
case the recommendation is to revert to the conventional RTO recovery case the recommendation is to revert to the conventional RTO recovery
with slow start retransmissions, but Appendix C discusses some with slow start retransmissions, but Appendix C discusses some
alternative solutions for window limited situations. alternative solutions for window limited situations.
If the RTO is declared spurious, the TCP sender sets the value of the If the retransmission timeout is declared spurious, the TCP sender
"recover" variable to SND.UNA in order to allow fast retransmit sets the value of the "recover" variable to SND.UNA in order to allow
[FHG04]. The "recover" variable was proposed for avoiding unnecessary fast retransmit [FHG04]. The "recover" variable was proposed for
multiple fast retransmits when RTO expires during fast recovery with avoiding unnecessary multiple fast retransmits when RTO expires
NewReno TCP. As the sender does not retransmit other segments but the during fast recovery with NewReno TCP. As the sender does not
one that triggered timeout, the problem addressed by the RFC cannot retransmit other segments but the one that triggered the timeout, the
problem of unnecessary multiple fast retransmits [FHG04] cannot
occur. Therefore, if there are three duplicate ACKs arriving at the occur. Therefore, if there are three duplicate ACKs arriving at the
sender after the timeout, they are likely to indicate a packet loss, sender after the timeout, they are likely to indicate a packet loss,
hence fast retransmit should be used to allow efficient recovery. If hence fast retransmit should be used to allow efficient recovery. If
there are not enough duplicate ACKs arriving at the sender after a there are not enough duplicate ACKs arriving at the sender after a
packet loss, the retransmission timer expires another time and the packet loss, the retransmission timer expires another time and the
sender enters step 1 of this algorithm. sender enters step 1 of this algorithm.
When the timeout is declared spurious, the TCP sender cannot detect When the timeout is declared spurious, the TCP sender cannot detect
whether the unnecessary RTO retransmission was lost. In principle the whether the unnecessary RTO retransmission was lost. In principle the
loss of the RTO retransmission should be taken as a congestion loss of the RTO retransmission should be taken as a congestion
skipping to change at page 7, line 41 skipping to change at page 8, line 29
It is possible that the F-RTO algorithm does not always avoid It is possible that the F-RTO algorithm does not always avoid
unnecessary retransmissions after a spurious timeout. If packet unnecessary retransmissions after a spurious timeout. If packet
reordering or packet duplication occurs on the segment that triggered reordering or packet duplication occurs on the segment that triggered
the spurious timeout, the F-RTO algorithm may not detect the spurious the spurious timeout, the F-RTO algorithm may not detect the spurious
timeout due to incoming duplicate ACKs. Additionally, if a spurious timeout due to incoming duplicate ACKs. Additionally, if a spurious
timeout occurs during fast recovery, the F-RTO algorithm often cannot timeout occurs during fast recovery, the F-RTO algorithm often cannot
detect the spurious timeout, because the segments transmitted before detect the spurious timeout, because the segments transmitted before
the fast recovery trigger duplicate ACKs. However, we consider these the fast recovery trigger duplicate ACKs. However, we consider these
cases relatively rare, and note that in cases where F-RTO fails to cases relatively rare, and note that in cases where F-RTO fails to
detect the spurious timeout, it performs similarly to the regular RTO detect the spurious timeout, it retransmits the unacknowledged
segments in slow start and thus performs similarly to the regular RTO
recovery. recovery.
3. A SACK-enhanced version of the F-RTO algorithm 3. SACK-enhanced version of the F-RTO algorithm
This section describes an alternative version of the F-RTO algorithm This section describes an alternative version of the F-RTO algorithm
that makes use of TCP Selective Acknowledgment Option [MMFR96]. By that makes use of the TCP Selective Acknowledgment Option [MMFR96].
using the SACK option the TCP sender can detect spurious timeouts in By using the SACK option the TCP sender can detect spurious timeouts
most of the cases when packet reordering or packet duplication is in most of the cases when packet reordering or packet duplication is
present. The difference from the basic F-RTO algorithm is that the present. The difference from the basic F-RTO algorithm is that the
sender may declare the timeout spurious even when duplicate ACKs follow sender may declare the timeout spurious even when duplicate ACKs follow
the RTO, if the SACK blocks acknowledge new data that was not the RTO, if the SACK blocks acknowledge new data that was not
transmitted after the RTO retransmission. transmitted after the RTO retransmission.
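
The range test implied by this difference might look as follows. This is a rough sketch under assumptions, not the complete rule set, which is given by the numbered steps of this section: the function name and its parameters are hypothetical, SACK blocks are modeled as (left, right) pairs with an exclusive right edge, "new_blocks" is assumed to contain only ranges not already in the scoreboard, "rexmit_end" is the sequence number just past the segment retransmitted on the RTO, and "recover" is taken as the edge just past the highest data sent before the timeout.

```python
from typing import Iterable, Tuple

SackBlock = Tuple[int, int]  # (left edge, right edge), right edge exclusive


def sack_reports_pre_rto_data(new_blocks: Iterable[SackBlock],
                              rexmit_end: int,
                              recover: int) -> bool:
    """Return True if a newly reported SACK block overlaps data that was
    sent before the timeout and was not retransmitted after it, i.e. the
    sequence range [rexmit_end, recover)."""
    for left, right in new_blocks:
        # Non-empty intersection of [left, right) with [rexmit_end, recover).
        if max(left, rexmit_end) < min(right, recover):
            return True
    return False
```

A block that lies entirely inside the RTO retransmission, or entirely at or above "recover", does not count as evidence in this sketch, because such data was transmitted after the timeout.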
Given that the TCP Selective Acknowledgment Option [MMFR96] is Given that the TCP Selective Acknowledgment Option [MMFR96] is
enabled for a TCP connection, a TCP sender MAY implement the SACK- enabled for a TCP connection, a TCP sender MAY implement the
enhanced F-RTO algorithm. If the sender applies the SACK-enhanced F- SACK-enhanced F-RTO algorithm. If the sender applies the
RTO algorithm, it MUST follow the steps below. This algorithm SHOULD SACK-enhanced F-RTO algorithm, it MUST follow the steps below. This
NOT be applied, if the TCP sender is already in loss recovery when algorithm SHOULD NOT be applied, if the TCP sender is already in loss
retransmission timeout occurs. However, it should be possible to recovery when retransmission timeout occurs. However, it should be
apply the principle of F-RTO within certain limitations also when possible to apply the principle of F-RTO within certain limitations
retransmission timeout occurs during existing loss recovery. While also when retransmission timeout occurs during existing loss
this is a topic of further research, Appendix B briefly discusses the recovery. While this is a topic of further research, Appendix B
related issues. briefly discusses the related issues.
1) When the RTO expires, the TCP sender SHOULD retransmit the first 1) When the RTO expires, the TCP sender SHOULD retransmit the first
unacknowledged segment and set SpuriousRecovery to FALSE. Variable unacknowledged segment and set SpuriousRecovery to FALSE. Variable
"recover" is set to indicate the highest segment transmitted so "recover" is set to indicate the highest segment transmitted so
far. Following the recommendation in SACK specification [MMFR96], far. Following the recommendation in SACK specification [MMFR96],
the SACK scoreboard SHOULD be reset. the SACK scoreboard SHOULD be reset.
2) Wait until the acknowledgment for the data retransmitted due to 2) Wait until the acknowledgment for the data retransmitted due to
the timeout arrives at the sender. If duplicate ACKs arrive before the timeout arrives at the sender. If duplicate ACKs arrive before
the cumulative acknowledgment for retransmitted data, adjust the the cumulative acknowledgment for retransmitted data, adjust the
scoreboard according to the incoming SACK information but stay in scoreboard according to the incoming SACK information but stay in
step 2 waiting for the next new acknowledgment. If RTO expires step 2 waiting for the next new acknowledgment. If RTO expires
again, restart the algorithm. again, go to step 1 of the algorithm.
a) if a cumulative ACK acknowledges a sequence number equal to a) if a cumulative ACK acknowledges a sequence number equal to
"recover", the TCP sender SHOULD revert to the conventional RTO "recover", the TCP sender SHOULD revert to the conventional RTO
recovery and it MUST set congestion window to no more than 2 * recovery and it MUST set congestion window to no more than 2 *
MSS. The sender MUST NOT enter step 3 of this algorithm. MSS, like a regular TCP would do. The sender MUST NOT enter
step 3 of this algorithm.
b) else, if a cumulative ACK acknowledges a sequence number b) else, if a cumulative ACK acknowledges a sequence number
smaller than "recover" but larger than SND.UNA, the TCP sender smaller than "recover" but larger than SND.UNA, the TCP sender
SHOULD transmit up to two new (previously unsent) segments and SHOULD transmit up to two new (previously unsent) segments and
proceed to step 3. If the TCP sender is not able to transmit proceed to step 3. If the TCP sender is not able to transmit
any previously unsent data due to receiver window limitation or any previously unsent data due to receiver window limitation or
because it does not have any new data to send, the recommended because it does not have any new data to send, the recommended
action is to not enter step 3 of this algorithm but continue action is to not enter step 3 of this algorithm but continue
with slow start retransmissions following the conventional RTO with slow start retransmissions following the conventional RTO
recovery algorithm. recovery algorithm.
skipping to change at page 9, line 15 skipping to change at page 10, line 5
3) The next acknowledgment arrives at the sender. Either duplicate 3) The next acknowledgment arrives at the sender. Either duplicate
ACK or a new cumulative ACK advancing the window applies in this ACK or a new cumulative ACK advancing the window applies in this
step. step.
a) if the ACK acknowledges a sequence number above "recover", either a) if the ACK acknowledges a sequence number above "recover", either
in SACK blocks or as a cumulative ACK, the sender MUST set in SACK blocks or as a cumulative ACK, the sender MUST set
congestion window to no more than 3 * MSS and proceed with the congestion window to no more than 3 * MSS and proceed with the
conventional RTO recovery, retransmitting unacknowledged conventional RTO recovery, retransmitting unacknowledged
segments. The sender SHOULD take this branch also when the segments. The sender SHOULD take this branch also when the
acknowledgment is a duplicate ACK and it does not acknowledge acknowledgment is a duplicate ACK and it does not acknowledge
any new previously unacknowledged data below "recover" in the any new, previously unacknowledged data below "recover" in the
SACK blocks. The sender leaves SpuriousRecovery set to FALSE. SACK blocks. The sender leaves SpuriousRecovery set to FALSE.
b) if the ACK does not acknowledge sequence numbers above b) if the ACK does not acknowledge sequence numbers above
"recover" AND it acknowledges data that was not acknowledged "recover" AND it acknowledges data that was not acknowledged
earlier either with cumulative acknowledgment or using SACK earlier either with cumulative acknowledgment or using SACK
blocks, the TCP sender SHOULD declare the timeout spurious and blocks, the TCP sender SHOULD declare the timeout spurious and
set SpuriousRecovery to SPUR_TO. The retransmission timeout can set SpuriousRecovery to SPUR_TO. The retransmission timeout can
be declared spurious, because the segment acknowledged with be declared spurious, because the segment acknowledged with
this ACK was transmitted before the timeout. this ACK was transmitted before the timeout.
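
The steps above can be sketched as a small state machine. This is a
minimal illustration under stated assumptions, not a definitive
implementation: sequence space and congestion window are counted in whole
segments (MSS = 1), a cumulative ACK names the next segment the receiver
expects, SACK information is given as a plain collection of segment
numbers, and "up to two" new segments is simplified to exactly two. The
class name, method names, and returned strings are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

MSS = 1  # assumption: segments and cwnd are counted in whole segments


@dataclass
class SackFrtoSender:
    snd_una: int                      # first unacknowledged segment
    snd_nxt: int                      # next previously unsent segment
    step: int = 0                     # 0 = idle; 2 and 3 mirror the steps above
    recover: int = 0                  # highest segment sent when the RTO expired
    spurious: bool = False            # True stands in for SPUR_TO
    cwnd_limit: Optional[int] = None  # upper bound imposed on cwnd, if any
    scoreboard: set = field(default_factory=set)

    def on_rto(self) -> str:
        # Step 1: retransmit the first unacknowledged segment, clear the
        # flag, remember "recover", and reset the SACK scoreboard.
        self.spurious = False
        self.recover = self.snd_nxt - 1
        self.scoreboard.clear()
        self.step = 2
        return f"retransmit {self.snd_una}"

    def _conventional(self, cwnd_limit: Optional[int]) -> str:
        # Fall back to conventional RTO recovery with the given cwnd bound.
        self.cwnd_limit = cwnd_limit
        self.step = 0
        return "conventional"

    def on_ack(self, cum_ack: int, sack=(), can_send_new: bool = True) -> str:
        # Record SACKed segments that were not on the scoreboard before.
        new_sacked = {s for s in sack
                      if s >= cum_ack and s not in self.scoreboard}
        self.scoreboard |= new_sacked
        advanced = cum_ack > self.snd_una
        if self.step == 2:
            if not advanced:
                return "wait"                       # duplicate ACK: stay in step 2
            self.snd_una = cum_ack
            if cum_ack > self.recover:              # 2a: "recover" is covered
                return self._conventional(2 * MSS)
            if not can_send_new:                    # 2b, but nothing new to send
                return self._conventional(None)
            self.snd_nxt += 2                       # 2b: send two new segments
            self.step = 3
            return "send_two_new"
        if self.step == 3:
            above = (cum_ack - 1 > self.recover
                     or any(s > self.recover for s in sack))
            new_below = advanced or any(s <= self.recover for s in new_sacked)
            if advanced:
                self.snd_una = cum_ack
            if above or not new_below:              # 3a: conventional recovery
                return self._conventional(3 * MSS)
            self.spurious = True                    # 3b: timeout was spurious
            self.step = 0
            return "spurious"
        return "ignored"
```

With segments 6 to 11 outstanding, calling on_rto() and then feeding a
duplicate ACK 6 with SACK for segment 8, ACK 7, and ACK 9 walks through
"wait", "send_two_new", and "spurious" in this sketch. If the ACK in step
3 instead only SACKs one of the two new segments, branch 3a is taken.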
skipping to change at page 9, line 41 skipping to change at page 10, line 31
4. Taking Actions after Detecting Spurious RTO 4. Taking Actions after Detecting Spurious RTO
Upon retransmission timeout, a conventional TCP sender assumes that Upon retransmission timeout, a conventional TCP sender assumes that
outstanding segments are lost and starts retransmitting the outstanding segments are lost and starts retransmitting the
unacknowledged segments. When the retransmission timeout is detected unacknowledged segments. When the retransmission timeout is detected
to be spurious, the TCP sender should not continue retransmitting to be spurious, the TCP sender should not continue retransmitting
based on the timeout. For example, if the sender was in congestion based on the timeout. For example, if the sender was in congestion
avoidance phase transmitting new, previously unsent segments, it avoidance phase transmitting new, previously unsent segments, it
should continue transmitting previously unsent segments after should continue transmitting previously unsent segments after
detecting spurious RTO. This document does not describe the response detecting a spurious RTO. This document does not describe the
to spurious timeout, but a response algorithm is described in another response to spurious timeout, but a response algorithm is described
IETF document [LG04]. in another IETF document [LG04].
Additionally, different response variants to spurious retransmission Additionally, different response variants to spurious retransmission
timeout have been discussed in various research papers [SKR03, GL03, timeout have been discussed in various research papers [SKR03, GL03,
Sar03] and Internet-Drafts [SL03]. The different response Sar03] and Internet-Drafts [SL03]. The different response
alternatives vary in whether the spurious retransmission timeout alternatives vary in whether the spurious retransmission timeout
should be taken as a congestion signal, thus causing the congestion should be taken as a congestion signal, thus causing the congestion
window or slow start threshold to be reduced at the sender, or window or slow start threshold to be reduced at the sender, or
whether the congestion control state should be fully reverted to the whether the congestion control state should be fully reverted to the
state valid prior to the retransmission timeout. state valid prior to the retransmission timeout.
5. SCTP Considerations 5. SCTP Considerations
SCTP has similar retransmission algorithms and congestion control to SCTP has similar retransmission algorithms and congestion control to
TCP. The SCTP T3-rtx timer for one destination address is maintained TCP. The SCTP T3-rtx timer for one destination address is maintained
in the same way as the TCP retransmission timer, and after a T3-rtx in the same way as the TCP retransmission timer, and after a T3-rtx
expires, an SCTP sender retransmits unacknowledged data chunks in expires, an SCTP sender retransmits unacknowledged data chunks in
slow start like TCP does. Therefore, SCTP is vulnerable to the nega- slow start like TCP does. Therefore, SCTP is vulnerable to the
tive effects of the spurious retransmission timeouts similarly to negative effects of the spurious retransmission timeouts similarly to
TCP. Due to similar RTO recovery algorithms, F-RTO algorithm logic TCP. Due to similar RTO recovery algorithms, F-RTO algorithm logic
can be applied also to SCTP. Since SCTP uses selective acknowledg- can be applied also to SCTP. Since SCTP uses selective
ments, the SACK-based variant of the algorithm is recommended, acknowledgments, the SACK-based variant of the algorithm is
although the basic version can also be applied to SCTP. However, SCTP recommended, although the basic version can also be applied to SCTP.
contains features that are not present with TCP that need to be dis- However, SCTP contains features that are not present with TCP that
cussed when applying the F-RTO algorithm. need to be discussed when applying the F-RTO algorithm.
SCTP association can be multi-homed. The current retransmission pol- SCTP associations can be multi-homed. The current retransmission
icy states that retransmissions should go to alternative addresses. policy states that retransmissions should go to alternative
If the retransmission was due to spurious timeout caused by a delay addresses. If the retransmission was due to spurious timeout caused
spike, it is possible that the acknowledgment for the retransmission by a delay spike, it is possible that the acknowledgment for the
arrives back at the sender before the acknowledgments of the original retransmission arrives back at the sender before the acknowledgments
transmissions arrive. If this happens, a possible loss of the origi- of the original transmissions arrive. If this happens, a possible
nal transmission of the data chunk that was retransmitted due to the loss of the original transmission of the data chunk that was
spurious timeout may remain undetected when applying the F-RTO algo- retransmitted due to the spurious timeout may remain undetected when
rithm. Because the timeout was caused by a delay spike, and it was applying the F-RTO algorithm. Because the timeout was caused by a
spurious in that respect, a suitable response is to continue by send- delay spike, and it was spurious in that respect, a suitable response
ing new data. However, if the original transmission was lost, fully is to continue by sending new data. However, if the original
reverting the congestion control parameters is too aggressive. There- transmission was lost, fully reverting the congestion control
fore, taking conservative actions on congestion control is recom- parameters is too aggressive. Therefore, taking conservative actions
mended, if the SCTP association is multi-homed and retransmissions go on congestion control is recommended, if the SCTP association is
to alternative address. The information in duplicate TSNs can be then multi-homed and retransmissions go to alternative address. The
used for reverting congestion control, if desired [BA04]. information in duplicate TSNs can be then used for reverting
congestion control, if desired [BA04].
Note that the forward transmissions made in F-RTO algorithm step (2b) Note that the forward transmissions made in F-RTO algorithm step (2b)
should be destined to the primary address, since they are not should be destined to the primary address, since they are not
retransmissions. retransmissions.
When making a retransmission, an SCTP sender can bundle a number of When making a retransmission, an SCTP sender can bundle a number of
unacknowledged data chunks and include them in the same packet. This unacknowledged data chunks and include them in the same packet. This
needs to be considered when implementing F-RTO for SCTP. The basic needs to be considered when implementing F-RTO for SCTP. The basic
principle of F-RTO still holds: in order to declare the timeout spu- principle of F-RTO still holds: in order to declare the timeout
rious, the sender must get an acknowledgment for a data chunk that spurious, the sender must get an acknowledgment for a data chunk that
was not retransmitted after the retransmission timeout. In other was not retransmitted after the retransmission timeout. In other
words, acknowledgments of data chunks that were bundled in RTO words, acknowledgments of data chunks that were bundled in RTO
retransmission must not be used for declaring the timeout spurious. retransmission must not be used for declaring the timeout spurious.
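
The bundling rule above can be expressed as a simple check. This is a
minimal sketch, assuming the sender keeps the set of TSNs that were
bundled into the RTO retransmission; the function and parameter names are
hypothetical and are not taken from any SCTP implementation.

```python
def may_declare_spurious(newly_acked_tsns, rto_bundled_tsns):
    """Return True only if some newly acknowledged data chunk was NOT
    part of the RTO retransmission (assumed to be tracked as a set)."""
    return any(tsn not in rto_bundled_tsns for tsn in newly_acked_tsns)
```

For example, if chunks 10 and 11 were bundled in the RTO retransmission,
an acknowledgment covering only 10 and 11 gives no evidence of a spurious
timeout, while one that also covers chunk 12 does.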
6. Security Considerations 6. Security Considerations
The main security threat regarding F-RTO is the possibility of a The main security threat regarding F-RTO is the possibility of a
receiver misleading the sender to set too large a congestion window receiver misleading the sender to set too large a congestion window
after an RTO. There are two possible ways a malicious receiver could after an RTO. There are two possible ways a malicious receiver could
trigger a wrong output from the F-RTO algorithm. First, the receiver trigger a wrong output from the F-RTO algorithm. First, the receiver
can acknowledge data that it has not received. Second, it can delay can acknowledge data that it has not received. Second, it can delay
acknowledgment of a segment it has received earlier, and acknowledge acknowledgment of a segment it has received earlier, and acknowledge
the segment after the TCP sender has been deluded to enter algorithm the segment after the TCP sender has been deluded to enter algorithm
step 3. step 3.
If the receiver acknowledges a segment it has not really received, If the receiver acknowledges a segment it has not really received,
the sender can be lead to declare spurious timeout in F-RTO algorithm the sender can be led to declare spurious timeout in F-RTO algorithm
step 3. However, since this causes the sender to have incorrect step 3. However, since this causes the sender to have incorrect
state, it cannot retransmit the segment that has never reached the state, it cannot retransmit the segment that has never reached the
receiver. Therefore, this attack is unlikely to be useful for the receiver. Therefore, this attack is unlikely to be useful for the
receiver to maliciously gain a larger congestion window. receiver to maliciously gain a larger congestion window.
A common case for a retransmission timeout is that a fast retransmis- A common case for a retransmission timeout is that a fast
sion of a segment is lost. If all other segments have been received, retransmission of a segment is lost. If all other segments have been
the RTO retransmission causes the whole window to be acknowledged at received, the RTO retransmission causes the whole window to be
once. This case is recognized in F-RTO algorithm branch (2a). How- acknowledged at once. This case is recognized in F-RTO algorithm
ever, if the receiver only acknowledges one segment after receiving branch (2a). However, if the receiver only acknowledges one segment
the RTO retransmission, and then the rest of the segments, it could after receiving the RTO retransmission, and then the rest of the
cause the timeout to be declared spurious when it is not. Therefore, segments, it could cause the timeout to be declared spurious when it
it is suggested that when an RTO expires during fast recovery phase, is not. Therefore, it is suggested that when an RTO expires during
the sender would not fully revert the congestion window even if the fast recovery phase, the sender would not fully revert the congestion
timeout was declared spurious, but reduce the congestion window to 1. window even if the timeout was declared spurious, but reduce the
However, the sender can take actions to avoid unnecessary retransmis- congestion window to 1. However, the sender can take actions to avoid
sions normally. If a TCP sender implements a burst avoidance algo- unnecessary retransmissions normally. If a TCP sender implements a
rithm that limits the sending rate to be no higher than in slow burst avoidance algorithm that limits the sending rate to be no
start, this precaution is not needed, and the sender may apply F-RTO higher than in slow start, this precaution is not needed, and the
normally. sender may apply F-RTO normally.
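
The precaution suggested above can be sketched as follows. This is only
an illustration under assumptions: the congestion window is counted in
segments, and "fully reverting" to the window in use before the timeout
stands in for whichever response algorithm the sender applies (this
document does not specify one). All names are hypothetical.

```python
def cwnd_after_spurious_rto(cwnd_before_rto, rto_during_fast_recovery,
                            has_burst_avoidance):
    # RTO expired during fast recovery and nothing else limits the burst:
    # do not fully revert, reduce the congestion window to 1 instead.
    if rto_during_fast_recovery and not has_burst_avoidance:
        return 1
    # Otherwise the sender may apply its normal response (assumed here to
    # be a full revert to the window used before the timeout).
    return cwnd_before_rto
```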
If more than one segment is missing at the time when a If more than one segment is missing at the time when a
retransmission timeout occurs, the receiver does not benefit from retransmission timeout occurs, the receiver does not benefit from
misleading the sender to declare a spurious timeout, because the misleading the sender to declare a spurious timeout, because the
sender would then have to go through another recovery period to sender would then have to go through another recovery period to
retransmit the missing segments, usually after an RTO has elapsed. retransmit the missing segments, usually after an RTO has elapsed.
Acknowledgments 7. IANA Considerations
This document has no actions for IANA.
8. Acknowledgments
We are grateful to Reiner Ludwig, Andrei Gurtov, Josh Blanton, Mark We are grateful to Reiner Ludwig, Andrei Gurtov, Josh Blanton, Mark
Allman, Sally Floyd, Yogesh Swami, Mika Liljeberg, Ivan Arias Allman, Sally Floyd, Yogesh Swami, Mika Liljeberg, Ivan Arias
Rodriguez, Sourabh Ladha, Martin Duke, Motoharu Miyake, Ted Faber, Rodriguez, Sourabh Ladha, Martin Duke, Motoharu Miyake, Ted Faber,
and Samu Kontinen for the discussion and feedback contributed to this Samu Kontinen, and Kostas Pentikousis for the discussion and feedback
text. contributed to this text.
9. References
Normative References Normative References
[APS99] M. Allman, V. Paxson, and W. Stevens. TCP Congestion Con- [APS99] M. Allman, V. Paxson, and W. Stevens. TCP Congestion
trol. RFC 2581, April 1999. Control. RFC 2581, April 1999.
[BAFW03] E. Blanton, M. Allman, K. Fall, and L. Wang. A Conservative [BAFW03] E. Blanton, M. Allman, K. Fall, and L. Wang. A Conservative
Selective Acknowledgment (SACK)-based Loss Recovery Algo- Selective Acknowledgment (SACK)-based Loss Recovery
rithm for TCP. RFC 3517, April 2003. Algorithm for TCP. RFC 3517, April 2003.
[FHG04] S. Floyd, T. Henderson, and A. Gurtov. The NewReno Modifi- [RFC2119] S. Bradner. Key words for use in RFCs to Indicate
cation to TCP's Fast Recovery Algorithm. RFC 3782, April Requirement Levels. RFC 2119, March 1997.
2004.
[MMFR96] M. Mathis, J. Mahdavi, S. Floyd, and A. Romanow. TCP Selec- [FHG04] S. Floyd, T. Henderson, and A. Gurtov. The NewReno
tive Acknowledgment Options. RFC 2018, October 1996. Modification to TCP's Fast Recovery Algorithm. RFC 3782,
April 2004.
[MMFR96] M. Mathis, J. Mahdavi, S. Floyd, and A. Romanow. TCP
Selective Acknowledgment Options. RFC 2018, October 1996.
[PA00] V. Paxson and M. Allman. Computing TCP's Retransmission [PA00] V. Paxson and M. Allman. Computing TCP's Retransmission
Timer. RFC 2988, November 2000. Timer. RFC 2988, November 2000.
[Pos81] J. Postel. Transmission Control Protocol. RFC 793, Septem- [Pos81] J. Postel. Transmission Control Protocol. RFC 793,
ber 1981. September 1981.
[Ste00] R. Stewart, et al. Stream Control Transmission Protocol. [Ste00] R. Stewart, et al. Stream Control Transmission Protocol.
RFC 2960, October 2000. RFC 2960, October 2000.
Informative References Informative References
[ABF01] M. Allman, H. Balakrishnan, and S. Floyd. Enhancing TCP's [ABF01] M. Allman, H. Balakrishnan, and S. Floyd. Enhancing TCP's
Loss Recovery Using Limited Transmit. RFC 3042, January Loss Recovery Using Limited Transmit. RFC 3042, January
2001. 2001.
[BA04] E. Blanton and M. Allman. Using TCP Duplicate Selective [BA04] E. Blanton and M. Allman. Using TCP Duplicate Selective
Acknowledgment (DSACKs) and Stream Control Transmission Acknowledgment (DSACKs) and Stream Control Transmission
Protocol (SCTP) Duplicate Transmission Sequence Numbers Protocol (SCTP) Duplicate Transmission Sequence Numbers
(TSNs) to Detect Spurious Retransmissions. RFC 3708, Febru- (TSNs) to Detect Spurious Retransmissions. RFC 3708,
ary 2004. February 2004.
[BBJ92] D. Borman, R. Braden, and V. Jacobson. TCP Extensions for [BBJ92] D. Borman, R. Braden, and V. Jacobson. TCP Extensions for
High Performance. RFC 1323, May 1992. High Performance. RFC 1323, May 1992.
[FMMP00] S. Floyd, J. Mahdavi, M. Mathis, and M. Podolsky. An Exten- [FMMP00] S. Floyd, J. Mahdavi, M. Mathis, and M. Podolsky. An
sion to the Selective Acknowledgment (SACK) Option to TCP. Extension to the Selective Acknowledgment (SACK) Option to
RFC 2883, July 2000. TCP. RFC 2883, July 2000.
[GL02] A. Gurtov and R. Ludwig. Evaluating the Eifel Algorithm for [GL02] A. Gurtov and R. Ludwig. Evaluating the Eifel Algorithm for
TCP in a GPRS Network. In Proc. of European Wireless, Flo- TCP in a GPRS Network. In Proc. of European Wireless,
rence, Italy, February 2002 Florence, Italy, February 2002.
[GL03] A. Gurtov and R. Ludwig. Responding to Spurious Timeouts in [GL03] A. Gurtov and R. Ludwig. Responding to Spurious Timeouts in
TCP. In Proceedings of IEEE INFOCOM 03, San Francisco, CA, TCP. In Proceedings of IEEE INFOCOM 03, San Francisco, CA,
USA, March 2003. USA, March 2003.
[Jac88] V. Jacobson. Congestion Avoidance and Control. In Proceed- [Jac88] V. Jacobson. Congestion Avoidance and Control. In
ings of ACM SIGCOMM 88. Proceedings of ACM SIGCOMM 88.
[LG04] R. Ludwig and A. Gurtov. The Eifel Response Algorithm for [LG04] R. Ludwig and A. Gurtov. The Eifel Response Algorithm for
TCP. Internet draft "draft-ietf-tsvwg-tcp-eifel- TCP. Internet draft
response-05.txt". March 2004. Work in progress. "draft-ietf-tsvwg-tcp-eifel-response-05.txt". March 2004.
Work in progress.
[LK00] R. Ludwig and R.H. Katz. The Eifel Algorithm: Making TCP [LK00] R. Ludwig and R.H. Katz. The Eifel Algorithm: Making TCP
Robust Against Spurious Retransmissions. ACM SIGCOMM Com- Robust Against Spurious Retransmissions. ACM SIGCOMM
puter Communication Review, 30(1), January 2000. Computer Communication Review, 30(1), January 2000.
[LM03] R. Ludwig and M. Meyer. The Eifel Detection Algorithm for [LM03] R. Ludwig and M. Meyer. The Eifel Detection Algorithm for
TCP. RFC 3522, April 2003. TCP. RFC 3522, April 2003.
[Nag84] J. Nagle. Congestion Control in IP/TCP Internetworks. RFC [Nag84] J. Nagle. Congestion Control in IP/TCP Internetworks. RFC
896, January 1984. 896, January 1984.
[SKR03] P. Sarolahti, M. Kojo, and K. Raatikainen. F-RTO: An [SKR03] P. Sarolahti, M. Kojo, and K. Raatikainen. F-RTO: An
Enhanced Recovery Algorithm for TCP Retransmission Time- Enhanced Recovery Algorithm for TCP Retransmission
outs. ACM SIGCOMM Computer Communication Review, 33(2), Timeouts. ACM SIGCOMM Computer Communication Review, 33(2),
April 2003. April 2003.
[Sar03] P. Sarolahti. Congestion Control on Spurious TCP Retrans- [Sar03] P. Sarolahti. Congestion Control on Spurious TCP
mission Timeouts. In Proceedings of IEEE Globecom 2003, San Retransmission Timeouts. In Proceedings of IEEE Globecom
Francisco, CA, USA. December 2003. 2003, San Francisco, CA, USA. December 2003.
[SL03] Y. Swami and K. Le. DCLOR: De-correlated Loss Recovery [SL03] Y. Swami and K. Le. DCLOR: De-correlated Loss Recovery
using SACK option for spurious timeouts. Internet draft using SACK option for spurious timeouts. Internet draft
"draft-swami-tsvwg-tcp-dclor-02.txt". September 2003. Work "draft-swami-tsvwg-tcp-dclor-02.txt". September 2003. Work
in progress. in progress.
Appendix A: Scenarios Appendix A: Scenarios
This section discusses different scenarios where RTOs occur and how This section discusses different scenarios where RTOs occur and how
the basic F-RTO algorithm performs in those scenarios. The the basic F-RTO algorithm performs in those scenarios. The
interesting scenarios are a sudden delay triggering retransmission interesting scenarios are a sudden delay triggering retransmission
timeout, loss of a retransmitted packet during fast recovery, link timeout, loss of a retransmitted packet during fast recovery, link
outage causing the loss of several packets, and packet reordering. A outage causing the loss of several packets, and packet reordering. A
performance evaluation with a more thorough analysis on a real performance evaluation with a more thorough analysis on a real
implementation of F-RTO is given in [SKR03]. implementation of F-RTO is given in [SKR03].
A.1. Sudden delay A.1. Sudden delay
The main motivation of F-RTO algorithm is to improve TCP performance
when a delay spike triggers a spurious retransmission timeout. The The main motivation behind the F-RTO algorithm is to improve TCP
example below illustrates the segments and acknowledgments performance when a delay spike triggers a spurious retransmission
transmitted by the TCP end hosts when a spurious timeout occurs, but timeout. The example below illustrates the segments and
no packets are lost. For simplicity, delayed acknowledgments are not acknowledgments transmitted by the TCP end hosts when a spurious
used in the example. The example below applies the Eifel Response timeout occurs, but no packets are lost. For simplicity, delayed
Algorithm [LG04] after detecting a spurious timeout. acknowledgments are not used in the example. The example below
applies the Eifel Response Algorithm [LG04] after detecting a
spurious timeout.
... ...
(cwnd = 6, ssthresh < 6, FlightSize = 6) (cwnd = 6, ssthresh < 6, FlightSize = 6)
1. <---------------------------- ACK 5 1. <---------------------------- ACK 5
2. SEND 10 ----------------------------> 2. SEND 10 ---------------------------->
(cwnd = 6, ssthresh < 6, FlightSize = 6) (cwnd = 6, ssthresh < 6, FlightSize = 6)
3. <---------------------------- ACK 6 3. <---------------------------- ACK 6
4. SEND 11 ----------------------------> 4. SEND 11 ---------------------------->
(cwnd = 6, ssthresh < 6, FlightSize = 6) (cwnd = 6, ssthresh < 6, FlightSize = 6)
5. | 5. |
skipping to change at page 14, line 50 skipping to change at page 16, line 8
12. <---------------------------- ACK 9 12. <---------------------------- ACK 9
13. SEND 15 ----------------------------> 13. SEND 15 ---------------------------->
(cwnd = 7, ssthresh = 6, FlightSize = 7) (cwnd = 7, ssthresh = 6, FlightSize = 7)
14. <---------------------------- ACK 10 14. <---------------------------- ACK 10
15. SEND 16 ----------------------------> 15. SEND 16 ---------------------------->
(cwnd = 7, ssthresh = 6, FlightSize = 7) (cwnd = 7, ssthresh = 6, FlightSize = 7)
... ...
When a sudden delay long enough to trigger timeout occurs at step 5, When a sudden delay long enough to trigger timeout occurs at step 5,
the TCP sender retransmits the first unacknowledged segment (step 6). the TCP sender retransmits the first unacknowledged segment (step 6).
The next ACK covers the RTO retransmission because originally The next ACK covers the RTO retransmission because the originally
transmitted segment 6 arrives at the receiver, and the TCP sender transmitted segment 6 arrived at the receiver, and the TCP sender
continues by sending two new data segments (steps 8, 9). Note that on continues by sending two new data segments (steps 8, 9). Note that on
F-RTO steps (1) and (2b) congestion window and FlightSize are not yet F-RTO steps (1) and (2b) congestion window and FlightSize are not yet
reset, because in case of possible spurious timeout the segments sent reset, because in case of possible spurious timeout the segments sent
before the timeout are still in the network. However, the sender before the timeout are still in the network. However, the sender
should still be as aggressive as conventional TCP. Because the should still be as aggressive as conventional TCP. Because the
second acknowledgment arriving after the RTO retransmission second acknowledgment arriving after the RTO retransmission
acknowledges data that was not retransmitted due to timeout (step acknowledges data that was not retransmitted due to timeout (step
10), the TCP sender declares the timeout as spurious and continues by 10), the TCP sender declares the timeout as spurious and continues by
sending new data on next acknowledgments. Also the congestion control sending new data on next acknowledgments. Also the congestion control
state is reversed, as required by the Eifel Response Algorithm. state is reversed, as required by the Eifel Response Algorithm.
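
The decision made by the basic algorithm in this scenario can be sketched
as below. This is a minimal sketch under assumptions: ACK numbers name the
next segment the receiver expects, "recover" is the highest segment sent
before the timeout, and only the first two ACKs after the RTO
retransmission are examined. Because the middle steps of the trace are not
reproduced here, the ACK values 7 and 8 used in the example are assumed
values consistent with the description. The function name is hypothetical.

```python
def classify_timeout(snd_una, recover, acks):
    """Classify an RTO from the cumulative ACKs that follow the RTO
    retransmission, given in arrival order."""
    if not acks:
        return "undetermined"
    first = acks[0]
    # The first ACK must advance the window without covering everything
    # up to "recover"; otherwise conventional recovery is used.
    if first <= snd_una or first > recover:
        return "conventional"
    if len(acks) < 2:
        return "undetermined"
    # The second ACK must advance the window again; a duplicate ACK
    # means the sender proceeds with conventional recovery.
    return "spurious" if acks[1] > first else "conventional"
```

With segments 6 to 11 outstanding, classify_timeout(6, 11, [7, 8])
corresponds to the sudden delay case, and classify_timeout(6, 11, [7, 7])
to a case where the second acknowledgment is a duplicate ACK.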
skipping to change at page 17, line 31 skipping to change at page 18, line 37
Again, F-RTO sender transmits two new segments (steps 8 and 9) after Again, F-RTO sender transmits two new segments (steps 8 and 9) after
the RTO retransmission is acknowledged. Because the next ACK does not the RTO retransmission is acknowledged. Because the next ACK does not
acknowledge any data that was not retransmitted after the acknowledge any data that was not retransmitted after the
retransmission timeout (step 10), the F-RTO sender proceeds with retransmission timeout (step 10), the F-RTO sender proceeds with
conventional recovery and slow start retransmissions. conventional recovery and slow start retransmissions.
A.4. Packet reordering A.4. Packet reordering
Since F-RTO modifies the TCP sender behavior only after a Since F-RTO modifies the TCP sender behavior only after a
retransmission timeout and it is intended to avoid unnecessary retransmission timeout and it is intended to avoid unnecessary
retransmits only after spurious timeout, we limit the discussion on retransmissions only after spurious timeout, we limit the discussion
the effects of packet reordering in F-RTO behavior to the cases where on the effects of packet reordering in F-RTO behavior to the cases
packet reordering occurs immediately after the retransmission where packet reordering occurs immediately after the retransmission
timeout. When the TCP receiver gets an out-of-order segment, it timeout. When the TCP receiver gets an out-of-order segment, it
generates a duplicate ACK. If the TCP sender implements the basic F- generates a duplicate ACK. If the TCP sender implements the basic
RTO algorithm, this may prevent the sender from detecting a spurious F-RTO algorithm, this may prevent the sender from detecting a
timeout. spurious timeout.
However, if the TCP sender applies the SACK-enhanced F-RTO, it is However, if the TCP sender applies the SACK-enhanced F-RTO, it is
possible to detect a spurious timeout also when packet reordering possible to detect a spurious timeout also when packet reordering
occurs. We illustrate the behavior of SACK-enhanced F-RTO below when occurs. We illustrate the behavior of SACK-enhanced F-RTO below when
segment 8 arrives before segments 6 and 7, and segments starting from segment 8 arrives before segments 6 and 7, and segments starting from
segment 6 are delayed in the network. In this example the TCP sender segment 6 are delayed in the network. In this example the TCP sender
reduces the congestion window and slow start threshold in response to reduces the congestion window and slow start threshold in response to
spurious timeout. spurious timeout.
... ...
skipping to change at page 18, line 44 skipping to change at page 19, line 50
After RTO expires and the sender retransmits segment 6 (step 6), the After RTO expires and the sender retransmits segment 6 (step 6), the
receiver gets segment 8 and generates duplicate ACK with SACK for receiver gets segment 8 and generates duplicate ACK with SACK for
segment 8. In response to the acknowledgment the TCP sender does not segment 8. In response to the acknowledgment the TCP sender does not
send anything but stays in F-RTO step 2. Because the next send anything but stays in F-RTO step 2. Because the next
acknowledgment advances the cumulative ACK point (step 9), the sender acknowledgment advances the cumulative ACK point (step 9), the sender
can transmit two new segments according to SACK-enhanced F-RTO. The can transmit two new segments according to SACK-enhanced F-RTO. The
next segment acknowledges new data between 7 and 11 that was not next segment acknowledges new data between 7 and 11 that was not
acknowledged earlier (segment 7), so the F-RTO sender declares the acknowledged earlier (segment 7), so the F-RTO sender declares the
timeout spurious. timeout spurious.
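The reordering example above can be traced with a small sketch. It is a simplified illustration and not the normative algorithm: it models only the transitions named in this example, it omits the send_high checks, and it treats every non-qualifying ACK in step 3 as a reason to fall back to conventional recovery. The class name, method name, and return strings are hypothetical, and segments are counted in whole segment numbers, with the cumulative ACK giving the next expected segment.

```python
class SackFrtoSketch:
    """Toy model of the SACK-enhanced F-RTO transitions used in the example."""

    def __init__(self, snd_una, retransmitted):
        self.snd_una = snd_una                   # lowest unacknowledged segment
        self.retransmitted = set(retransmitted)  # segments resent after the RTO
        self.sacked = set()                      # segments already covered by SACK
        self.state = "step2"                     # the RTO retransmission was just sent

    def on_ack(self, cum_ack, sacked=()):
        # Segments this ACK acknowledges for the first time,
        # either cumulatively or through SACK information.
        newly_acked = (set(range(self.snd_una, cum_ack)) | set(sacked)) - self.sacked
        self.sacked |= set(sacked)
        advanced = cum_ack > self.snd_una
        self.snd_una = max(self.snd_una, cum_ack)

        if self.state == "step2":
            if not advanced:
                # Duplicate ACK: record the SACK information, send nothing.
                return "wait"
            self.state = "step3"
            return "send_two_new"

        if self.state == "step3":
            # Assumption: any ACK without newly acknowledged,
            # non-retransmitted data leads to conventional recovery.
            if any(seg not in self.retransmitted for seg in newly_acked):
                self.state = "spurious"
            else:
                self.state = "conventional"
            return self.state

        return self.state


sender = SackFrtoSketch(snd_una=6, retransmitted={6})
trace = [
    sender.on_ack(6, sacked=[8]),  # segment 8 arrived first: duplicate ACK with SACK
    sender.on_ack(7, sacked=[8]),  # delayed segment 6 arrived: cumulative ACK advances
    sender.on_ack(9),              # delayed segment 7 arrived: covers 7 and 8
]
print(trace)
```

In this trace the last ACK newly acknowledges only segment 7, which was never retransmitted, so the sketch declares the timeout spurious, as in the example. If segment 7 were added to the retransmitted set, the same trace would end in conventional recovery.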
Appendix B: Applying SACK-enhanced F-RTO when RTO occurs during loss Appendix B: SACK-enhanced F-RTO and Fast Recovery
recovery
We believe that slightly modified SACK-enhanced F-RTO algorithm can We believe that slightly modified SACK-enhanced F-RTO algorithm can
be used to detect spurious timeouts also when RTO expires while an be used to detect spurious timeouts also when RTO expires while an
earlier loss recovery is underway. However, there are issues that earlier loss recovery is underway. However, there are issues that
need to be considered if F-RTO is applied in this case. need to be considered if F-RTO is applied in this case.
The original SACK-based F-RTO requires in algorithm step 3 that an The original SACK-based F-RTO requires in algorithm step 3 that an
ACK acknowledges previously unacknowledged non-retransmitted data ACK acknowledges previously unacknowledged non-retransmitted data
between SND.UNA and send_high. If RTO expires during earlier (SACK- between SND.UNA and send_high. If RTO expires during earlier
based) loss recovery, the F-RTO sender must only use acknowledgments (SACK-based) loss recovery, the F-RTO sender must only use
for non-retransmitted segments transmitted before the SACK-based loss acknowledgments for non-retransmitted segments transmitted before the
recovery started. This means that in order to declare timeout SACK-based loss recovery started. This means that in order to declare
spurious the TCP sender must receive an acknowledgment for non- timeout spurious the TCP sender must receive an acknowledgment for
retransmitted segment between SND.UNA and RecoveryPoint in algorithm non-retransmitted segment between SND.UNA and RecoveryPoint in
step 3. RecoveryPoint is defined in conservative SACK-recovery algorithm step 3. RecoveryPoint is defined in conservative
algorithm [BAFW03], and it is set to indicate the highest segment SACK-recovery algorithm [BAFW03], and it is set to indicate the
transmitted so far when SACK-based loss recovery begins. In other highest segment transmitted so far when SACK-based loss recovery
words, if the TCP sender receives acknowledgment for segment that was begins. In other words, if the TCP sender receives acknowledgment for
transmitted more than one RTO ago, it can declare the timeout segment that was transmitted more than one RTO ago, it can declare
spurious. Defining an efficient algorithm for checking these the timeout spurious. Defining an efficient algorithm for checking
conditions remains as a future work item. these conditions remains as a future work item.
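As a starting point for that work item, the condition can be written down directly. The sketch below is a naive illustration and not an efficient algorithm: it keeps per-segment sets and treats RecoveryPoint as the highest segment number sent when SACK-based recovery began. The Scoreboard type, its field names, and the function name are hypothetical, and SACK blocks are given as half-open (start, end) ranges of segment numbers.

```python
from dataclasses import dataclass, field


@dataclass
class Scoreboard:
    snd_una: int          # lowest unacknowledged segment
    recovery_point: int   # highest segment sent when SACK recovery began
    retransmitted: set = field(default_factory=set)
    acked: set = field(default_factory=set)


def ack_shows_spurious_rto(sb, cum_ack, sack_blocks=()):
    """Return True if this ACK newly acknowledges a non-retransmitted
    segment between SND.UNA and RecoveryPoint (the step 3 condition)."""
    newly = set(range(sb.snd_una, cum_ack))
    for start, end in sack_blocks:
        newly.update(range(start, end))
    newly -= sb.acked

    # Check the condition against SND.UNA as it was before this ACK.
    evidence = any(
        sb.snd_una <= seg <= sb.recovery_point and seg not in sb.retransmitted
        for seg in newly
    )

    # Update the scoreboard only after the check.
    sb.acked |= newly
    sb.snd_una = max(sb.snd_una, cum_ack)
    return evidence


sb = Scoreboard(snd_una=6, recovery_point=10, retransmitted={6})
first = ack_shows_spurious_rto(sb, cum_ack=7)    # covers only retransmitted segment 6
second = ack_shows_spurious_rto(sb, cum_ack=8)   # covers original segment 7
print(first, second)
```

An acknowledgment for a segment above RecoveryPoint does not count as evidence in this sketch, because that segment was sent after the SACK-based loss recovery started.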
When spurious timeout is detected according to the rules given above, When spurious timeout is detected according to the rules given above,
it may be possible that the response algorithm needs to consider this it may be possible that the response algorithm needs to consider this
case separately, for example in terms of what segments to retransmit case separately, for example in terms of what segments to retransmit
after RTO expires, and whether it is safe to revert the congestion after RTO expires, and whether it is safe to revert the congestion
control parameters in this case. This is considered as a topic of control parameters in this case. This is considered as a topic of
future research. future research.
Appendix C: Discussion on Window Limited Cases Appendix C: Discussion on Window Limited Cases
skipping to change at page 21, line 47 skipping to change at page 22, line 49
Copies of IPR disclosures made to the IETF Secretariat and any Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr. http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at ietf- this standard. Please address the information to the IETF at
ipr@ietf.org. ietf-ipr@ietf.org.
Acknowledgement Acknowledgement
Funding for the RFC Editor function is currently provided by the Funding for the RFC Editor function is currently provided by the
Internet Society. Internet Society.
 End of changes. 
