draft-ietf-tcpm-tcp-dcr-07.txt   rfc4653.txt 
Internet Engineering Task Force Sumitha Bhandarkar Network Working Group S. Bhandarkar
INTERNET DRAFT A. L. Narasimha Reddy Request for Comments: 4653 A. L. N. Reddy
draft-ietf-tcpm-tcp-dcr-07.txt Texas A&M University Category: Experimental Texas A&M University
Expires: July 2006 Mark Allman M. Allman
ICIR/ICSI ICIR/ICSI
Ethan Blanton E. Blanton
Purdue University Purdue University
January 2006 August 2006
Improving the Robustness of TCP to Non-Congestion Events Improving the Robustness of TCP to Non-Congestion Events
Status of this Memo Status of This Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at This memo defines an Experimental Protocol for the Internet
http://www.ietf.org/shadow.html. community. It does not specify an Internet standard of any kind.
Discussion and suggestions for improvement are requested.
Distribution of this memo is unlimited.
Copyright Notice Copyright Notice
Copyright (C) The Internet Society (2006). Copyright (C) The Internet Society (2006).
Abstract Abstract
This document specifies Non-Congestion Robustness (NCR) for TCP. In This document specifies Non-Congestion Robustness (NCR) for TCP. In
the absence of explicit congestion notification from the network TCP the absence of explicit congestion notification from the network, TCP
uses loss as an indication of congestion. One of the ways TCP uses loss as an indication of congestion. One of the ways TCP
detects loss is using the arrival of three duplicate acknowledgments. detects loss is using the arrival of three duplicate acknowledgments.
However, this heuristic is not always correct, notably in the case However, this heuristic is not always correct, notably in the case
when network paths reorder segments (for whatever reason), resulting when network paths reorder segments (for whatever reason), resulting
in degraded performance. TCP-NCR is designed to mitigate this in degraded performance. TCP-NCR is designed to mitigate this
degraded performance by increasing the number of duplicate degraded performance by increasing the number of duplicate
acknowledgments required to trigger loss recovery, based on the acknowledgments required to trigger loss recovery, based on the
current state of the connection, in an effort to better disambiguate current state of the connection, in an effort to better disambiguate
true segment loss from segment reordering. This document specifies true segment loss from segment reordering. This document specifies
the changes to TCP, as well as the costs and benefits of these the changes to TCP, as well as the costs and benefits of these
modifications. modifications.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction ....................................................2
2. NCR Description . . . . . . . . . . . . . . . . . . . . 5 1.1. Terminology ................................................4
3. Algorithm . . . . . . . . . . . . . . . . . . . . . . . 6 2. NCR Description .................................................5
3.1 Initialization . . . . . . . . . . . . . . . . . . . 8 3. Algorithm .......................................................6
3.2 Terminating Extended Limited Transmit and 3.1. Initialization .............................................8
Preventing Bursts . . . . . . . . . . . . . . . . . . 9 3.2. Terminating Extended Limited Transmit and
3.3 Extended Limited Transmit . . . . . . . . . . . . . . 10 Preventing Bursts ..........................................9
3.4 Entering Loss Recovery . . . . . . . . . . . . . . . 11 3.3. Extended Limited Transmit .................................10
4. Advantages . . . . . . . . . . . . . . . . . . . . . . . 11 3.4. Entering Loss Recovery ....................................11
5. Disadvantages . . . . . . . . . . . . . . . . . . . . . 12 4. Advantages .....................................................12
6. Related Work . . . . . . . . . . . . . . . . . . . . . . 13 5. Disadvantages ..................................................12
7. Security Considerations . . . . . . . . . . . . . . . . 13 6. Related Work ...................................................13
8. Acknowledgements . . . . . . . . . . . . . . . . . . . . 14 7. Security Considerations ........................................14
9. IANA Considerations . . . . . . . . . . . . . . . . . . 14 8. Acknowledgments ................................................14
10. Normative References . . . . . . . . . . . . . . . . . . 14 9. IANA Considerations ............................................14
11. Informative References . . . . . . . . . . . . . . . . . 14 10. References ....................................................14
12. Author's Addresses . . . . . . . . . . . . . . . . . . . 16 10.1. Normative References .....................................14
10.2. Informative References ...................................15
Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described
in [RFC2119].
Readers should be familiar with the TCP terminology (e.g.,
FlightSize, Pipe, etc.) given in [RFC2581] and [RFC3517].
1. Introduction 1. Introduction
One strength of TCP [RFC793] lies in its ability to adjust its One strength of TCP [RFC793] lies in its ability to adjust its
sending rate according to the perceived congestion in the network sending rate according to the perceived congestion in the network
[Jac88,RFC2581]. In the absence of explicit notification of [Jac88,RFC2581]. In the absence of explicit notification of
congestion from the network, TCP uses segment loss as an indication congestion from the network, TCP uses segment loss as an indication
of congestion (i.e., assuming queue overflow). TCP receivers send of congestion (i.e., assuming queue overflow). TCP receivers send
cumulative acknowledgments (ACKs) indicating the next sequence number cumulative acknowledgments (ACKs) indicating the next sequence number
expected from the sender for arriving segments [RFC793]. When expected from the sender for arriving segments [RFC793]. When
segments arrive out-of-order, duplicate ACKs are generated. As segments arrive out of order, duplicate ACKs are generated. As
specified in [RFC2581], a TCP sender uses the arrival of three specified in [RFC2581], a TCP sender uses the arrival of three
duplicate ACKs as an indication of segment loss. The TCP sender duplicate ACKs as an indication of segment loss. The TCP sender
retransmits the lost segment and reduces the load imposed on the retransmits the lost segment and reduces the load imposed on the
network, assuming the segment loss was caused by resource contention network, assuming the segment loss was caused by resource contention
within the network path. The TCP sender does not assume loss on the within the network path. The TCP sender does not assume loss on the
first or second duplicate ACK, but waits for three duplicate ACKs to first or second duplicate ACK, but waits for three duplicate ACKs to
account for minor packet reordering. However, the use of this account for minor packet reordering. However, the use of this
constant threshold of duplicate ACKs has several problems that can be constant threshold of duplicate ACKs has several problems that can be
mitigated with a dynamic threshold. mitigated with a dynamic threshold.
The following is an example of TCP's behavior: The following is an example of TCP's behavior:
+ TCP A is the data sender and TCP B is the data receiver. + TCP A is the data sender, and TCP B is the data receiver.
+ TCP A sends 10 segments each consisting of a single data byte + TCP A sends 10 segments, each consisting of a single data byte
(i.e., transmits bytes 1-10 in segments 1-10). (i.e., transmits bytes 1-10 in segments 1-10).
+ Assume segment 3 is dropped in the network. + Assume segment 3 is dropped in the network.
+ TCP B cumulatively acknowledges segments 1 and 2, making the + TCP B cumulatively acknowledges segments 1 and 2, making the
cumulative ACK transmitted to the sender 3 (the next expected cumulative ACK transmitted to the sender 3 (the next expected
sequence number). (Note: TCP B may generate one or two ACKs, sequence number). (Note: TCP B may generate one or two ACKs,
depending on whether delayed ACKs [RFC1122,RFC2581] are depending on whether delayed ACKs [RFC1122,RFC2581] are
employed.) employed.)
skipping to change at page 3, line 50 skipping to change at page 3, line 36
The above scenario will play out in precisely the same manner The above scenario will play out in precisely the same manner
insomuch as a retransmission of segment 3 will be triggered. In insomuch as a retransmission of segment 3 will be triggered. In
other words, TCP is not capable of disambiguating this reordering other words, TCP is not capable of disambiguating this reordering
event from a segment loss, resulting in an unnecessary retransmission event from a segment loss, resulting in an unnecessary retransmission
and rate reduction. and rate reduction.
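The ambiguity described above can be illustrated with a small sketch. It is not part of the specification. It assumes a receiver that sends one cumulative ACK per arriving segment (no delayed ACKs), and the function names and the point at which the delayed segment 3 arrives (after segment 7) are hypothetical choices made for illustration only.

```python
def cumulative_acks(arrival_order):
    """Return the cumulative ACK (next expected segment) sent per arrival."""
    received = set()
    next_expected = 1
    acks = []
    for seg in arrival_order:
        received.add(seg)
        # Advance past every segment that is now contiguous.
        while next_expected in received:
            next_expected += 1
        acks.append(next_expected)
    return acks


def fast_retransmit_triggered(acks, dup_thresh=3):
    """True once the same ACK value has been repeated dup_thresh times."""
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count >= dup_thresh:
                return True
        else:
            last_ack, dup_count = ack, 0
    return False


# Segment 3 is dropped: segments 4-10 each elicit a duplicate ACK for 3.
loss_acks = cumulative_acks([1, 2, 4, 5, 6, 7, 8, 9, 10])

# Segment 3 is only delayed (assumed here to arrive after segment 7).
reorder_acks = cumulative_acks([1, 2, 4, 5, 6, 7, 3, 8, 9, 10])

# With the fixed threshold of three, both cases trigger a retransmission.
both_trigger = (fast_retransmit_triggered(loss_acks)
                and fast_retransmit_triggered(reorder_acks))
```

In this sketch the sender sees the same first three duplicate ACKs in both cases, so a threshold of three cannot tell them apart. A larger threshold (five, in this particular example) would be met only in the loss case.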
The following is the specific motivation behind making TCP robust to The following is the specific motivation behind making TCP robust to
reordered segments: reordered segments:
* A number of Internet measurement studies have shown that packet * A number of Internet measurement studies have shown that packet
reordering is not a rare phenomenon [Pax97,BPS99,JIDKT03,GPL04]. reordering is not a rare phenomenon [Pax97, BPS99, JIDKT03,
Further, the reordering can be well beyond that required for GPL04]. Further, the reordering can be well beyond that required
fast retransmit to be falsely triggered. for fast retransmit to be falsely triggered.
* [BA02,ZKFP03] show the negative performance implications that * [BA02,ZKFP03] show the negative performance implications that
packet reordering has on current TCP. packet reordering has on current TCP.
* The requirement imposed by TCP for almost in-order packet * The requirement imposed by TCP for almost in-order packet
delivery places a constraint on the design of future technology. delivery places a constraint on the design of future technology.
Novel routing algorithms, network components, link-layer Novel routing algorithms, network components, link-layer
retransmission mechanisms and applications could all be looked retransmission mechanisms, and applications could all be looked
at with a fresh perspective if TCP were to be more robust to at with a fresh perspective if TCP were to be more robust to
segment reordering. For instance, high speed packet switches segment reordering. For instance, high-speed packet switches
could cause resequencing of packets if TCP were more robust. could cause resequencing of packets if TCP were more robust.
There has been work proposed in the literature explicitly to There has been work proposed in the literature explicitly to
ensure that packet ordering is maintained in such switches ensure that packet ordering is maintained in such switches (e.g.,
(e.g., [KM02]). Also, link-layer mechanisms that attempt to [KM02]). Also, link-layer mechanisms that attempt to recover
recover from packet corruption by retransmitting could be from packet corruption by retransmitting could be allowed to
allowed to reorder packets and, hence, increase the chances of reorder packets, and thus increase the chances of local loss
local loss repair rather than relying on TCP to repair the loss repair rather than rely on TCP to repair the loss (and,
(and, needlessly reduce its sending rate). Additional examples needlessly reduce its sending rate). Additional examples include
include multi-path routing, high-delay satellite links and some multi-path routing, high-delay satellite links, and some of the
of the schemes proposed for a differentiated services schemes proposed for a differentiated services architecture. By
architecture. By making TCP more robust to non-congestion making TCP more robust to non-congestion events, TCP-NCR may open
events, TCP-NCR may open the design space of the future Internet the design space of the future Internet components.
components.
In this document we specify a set of TCP sender modifications to In this document, we specify a set of TCP sender modifications to
provide Non-Congestion Robustness (NCR) to TCP. In particular, these provide Non-Congestion Robustness (NCR) to TCP. In particular, these
changes are built on top of TCP with selective acknowledgments changes are built on top of TCP with selective acknowledgments
(SACKs) [RFC2018] and the SACK-based loss recovery scheme given in (SACKs) [RFC2018] and the SACK-based loss recovery scheme given in
[RFC3517], since SACK is widely deployed at this point ([MAF05] [RFC3517], since SACK is widely deployed at this point ([MAF05]
indicates that 68% of web servers and 88% of web clients utilize SACK indicates that 68% of web servers and 88% of web clients utilize SACK
as of spring, 2004). as of spring 2004).
We note that the TCP-NCR algorithm provided in this document could be Note that the TCP-NCR algorithm provided in this document could be
easily adapted to SCTP [RFC2960] since SCTP uses congestion control easily adapted to SCTP [RFC2960] since SCTP uses congestion control
algorithms similar to TCP's (and, hence, has the same reordering algorithms similar to TCP's (and thus has the same reordering
robustness issues). robustness issues).
As we note in several places in the remainder of this document, we As noted in several places in the remainder of this document, we
consider TCP-NCR to be experimental in that more experience with the consider TCP-NCR experimental in that more experience with the
techniques is required before TCP-NCR should be used on a large scale techniques is required before TCP-NCR should be used on a large scale
on the Internet. We encourage implementation and experimentation on the Internet. We encourage implementation and experimentation
with TCP-NCR in the hopes of gaining an understanding of its with TCP-NCR in the hopes of gaining an understanding of its
suitability for wide-scale deployment. suitability for wide-scale deployment.
The remainder of this document is organized as follows. Section 2 The remainder of this document is organized as follows. Section 2
provides a high-level description of the TCP-NCR mechanisms. In provides a high-level description of the TCP-NCR mechanisms. In
Section 3, we specify the TCP-NCR algorithm. Section 4 provides a Section 3, we specify the TCP-NCR algorithm. Section 4 provides a
brief overview of the benefits of TCP-NCR, while Section 5 discusses brief overview of the benefits of TCP-NCR, while Section 5 discusses
the drawbacks of TCP-NCR. Section 6 discusses related work. Section the drawbacks. Section 6 discusses related work. Section 7
7 discusses security concerns. discusses security concerns.
1.1. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
Readers should be familiar with the TCP terminology (e.g.,
FlightSize, Pipe) given in [RFC2581] and [RFC3517].
2. NCR Description 2. NCR Description
As discussed above, in the face of packet reordering, three duplicate As discussed above, in the face of packet reordering, three duplicate
ACKs may not be enough to disambiguate loss from reordering. In this ACKs may not be enough to disambiguate loss from reordering. In this
section we provide a non-normative sketch of TCP-NCR. The detailed section we provide a non-normative sketch of TCP-NCR. The detailed
algorithms for implementing Non-Congestion Robustness for TCP are algorithms for implementing Non-Congestion Robustness for TCP are
presented in the next section. presented in the next section.
The general idea behind TCP-NCR is to increase the threshold used to The general idea behind TCP-NCR is to increase the threshold used to
trigger a fast retransmission from the current fixed value of three trigger a fast retransmission from the current fixed value of three
duplicate ACKs [RFC2581] to approximately a congestion window of data duplicate ACKs [RFC2581] to approximately a congestion window of data
having left the network (but, not less than the currently having left the network (but not less than the currently standardized
standardized value of three duplicate ACKs). Since cwnd represents value of three duplicate ACKs). Since cwnd represents the amount of
the amount of data a TCP flow can transmit in one round-trip time data a TCP flow can transmit in one round-trip time (RTT), waiting to
(RTT), waiting to receive notice that cwnd bytes have left the receive notice that cwnd bytes have left the network before deciding
network before deciding whether the root cause is loss or reordering whether the root cause is loss or reordering imposes a delay of
imposes a delay of roughly one RTT on both the retransmission and the roughly one RTT on both the retransmission and the congestion control
congestion control response. The appropriate choice for a new value response. The appropriate choice for a new value of the threshold is
of the threshold is essentially a tradeoff between making the best essentially a trade-off between making the best decision regarding
decision regarding the cause of the duplicate ACKs and the cause of the duplicate ACKs and responsiveness. The choice to
responsiveness. The choice to trigger a retransmission only after a trigger a retransmission only after a cwnd's worth of data is known
cwnd's worth of data is known to have left the network represents to have left the network represents roughly the largest amount of
roughly the largest amount of time a TCP can wait before the (often time a TCP can wait before the (often costly) retransmission timeout
costly) retransmission timeout may be triggered. Therefore, the may be triggered. Therefore, the algorithm described in this
algorithm described in this document attempts to make the best document attempts to make the best decision possible at the expense
decision possible at the expense of timeliness. of timeliness.
Simply increasing the threshold before retransmitting a segment can Simply increasing the threshold before retransmitting a segment can
make TCP brittle to packet loss or ACK loss since such loss reduces make TCP brittle to packet loss or ACK loss since such loss reduces
the number of duplicate ACKs that will arrive at the sender from the the number of duplicate ACKs that will arrive at the sender from the
receiver. For instance, if the cwnd is 10 segments and one segment receiver. For instance, if the cwnd is 10 segments and one segment
is lost, a duplicate ACK threshold of 10 will never be met because is lost, a duplicate ACK threshold of 10 will never be met because
duplicate ACKs corresponding to at most 9 segments will arrive at the duplicate ACKs corresponding to at most 9 segments will arrive at the
sender. To offset the issue of loss, we extend TCP's Limited sender. To offset the issue of loss, we extend TCP's Limited
Transmit [RFC3042] scheme to allow for the sending of new data during Transmit [RFC3042] scheme to allow for the sending of new data during
the period when the TCP sender is disambiguating loss and reordering. the period when the TCP sender is disambiguating loss and reordering.
This new data serves to increase the likelihood of enough duplicate This new data serves to increase the likelihood that enough duplicate
ACKs arriving at the sender to trigger loss recovery if it is ACKs arrive at the sender to trigger loss recovery if it is
appropriate. appropriate.
At this point we note that TCP tightly couples reliability and Note that TCP tightly couples reliability and congestion control:
congestion control -- when a segment is declared lost, a when a segment is declared lost, a retransmission is triggered, and a
retransmission is triggered and a change to the sending rate is also change to the sending rate is also made on the assumption that the
made on the assumption that the drop is due to resource contention drop is due to resource contention [RFC2581]. Therefore, simply by
[RFC2581]. Therefore, by simply changing the retransmission trigger changing the retransmission trigger, the congestion control response
the congestion control response is also changed. However, we lack is also changed. However, we lack experience on the Internet as to
experience on the Internet as to whether delaying the point that a whether delaying the point that a rate reduction takes place is
rate reduction takes place is appropriate for wide-scale deployment. appropriate for wide-scale deployment. Therefore, the Extended
Therefore, the Extended Limited Transmit mechanism proposed in this Limited Transmit mechanism proposed in this document offers two
document offers two variants for experimentation. variants for experimentation.
The first Extended Limited Transmit variant, Careful Limited The first Extended Limited Transmit variant, Careful Limited
Transmit, calls for the transmission of one previously unsent Transmit, calls for the transmission of one previously unsent
segment, in response to duplicate acknowledgements, for every two segment, in response to duplicate acknowledgments, for every two
segments that are known to have left the network. This has the segments that are known to have left the network. This effectively
effect of halving the sending rate since normal TCP operation calls halves the sending rate, since normal TCP operation calls for the
for the sending of one segment for every segment that has left the sending of one segment for every segment that has left the network.
network. Further, the halving starts immediately and is not delayed Further, the halving starts immediately and is not delayed until a
until a retransmission is triggered. In the case of packet retransmission is triggered. In the case of packet reordering (i.e.,
reordering (i.e., not segment loss) the congestion control state is not segment loss), the congestion control state is restored to its
restored to its previous state when reordering is determined. previous state when reordering is determined.
The second variant, Aggressive Limited Transmit, calls for The second variant, Aggressive Limited Transmit, calls for
transmitting one previously unsent data segment, in response to transmitting one previously unsent data segment, in response to
duplicate acknowledgements, for every segment known to have left the duplicate acknowledgments, for every segment known to have left the
network. With this variant, while waiting to disambiguate the loss network. With this variant, while waiting to disambiguate the loss
from a reordering event, ACK-clocked transmission continues at from a reordering event, ACK-clocked transmission continues at
roughly the same rate as before the event started. Retransmission roughly the same rate as before the event started. Retransmission
and the sending rate reduction happen per [RFC2581,RFC3517], albeit and the sending rate reduction happen per [RFC2581,RFC3517], albeit
with the delayed threshold described above. While this approach with the delayed threshold described above. Although this approach
delays legitimate rate reductions (possibly slightly and temporarily delays legitimate rate reductions (possibly slightly and temporarily
aggravating overall congestion on the network) the scheme has the aggravating overall congestion on the network), the scheme has the
advantage of not reducing the transmission rate in the face of advantage of not reducing the transmission rate in the face of
segment reordering. segment reordering.
It is an open question which of the two Extended Limited Transmit Which of the two Extended Limited Transmit variants is best for use
variants is best for use on the Internet. on the Internet is an open question.
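As a rough illustration of the two variants sketched in this section, the following counts how many previously unsent segments each variant would release for a given number of segments known to have left the network. It is a simplification under stated assumptions: the function name and the string labels are hypothetical, and the sketch ignores receiver-window and application limits on sending new data.

```python
def new_segments_sent(segments_left_network, variant):
    """Previously unsent segments released during Extended Limited Transmit."""
    if variant == "careful":
        # One new segment for every two segments that left the network.
        return segments_left_network // 2
    if variant == "aggressive":
        # One new segment for every segment that left the network.
        return segments_left_network
    raise ValueError("variant must be 'careful' or 'aggressive'")


# With a cwnd of 10 segments, once 10 segments are known to have left:
careful_new = new_segments_sent(10, "careful")        # 5 new segments
aggressive_new = new_segments_sent(10, "aggressive")  # 10 new segments
```

The halved count for the careful variant corresponds to the immediate halving of the sending rate described above, while the aggressive variant keeps the ACK clock running at roughly the pre-event rate.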
3. Algorithm 3. Algorithm
The TCP-NCR modifications make two fundamental changes to the way The TCP-NCR modifications make two fundamental changes to the way
[RFC3517] currently operates, as follows. [RFC3517] currently operates, as follows.
First, the trigger for retransmitting a segment is changed from three First, the trigger for retransmitting a segment is changed from three
duplicate ACKs [RFC2581,RFC3517] to indications that a congestion duplicate ACKs [RFC2581,RFC3517] to indications that a congestion
window's worth of data has left the network. Second, TCP-NCR window's worth of data has left the network. Second, TCP-NCR
decouples initial congestion control decisions from retransmission decouples initial congestion control decisions from retransmission
decisions, in some cases delaying congestion control changes relative decisions, in some cases delaying congestion control changes relative
to TCP's current behavior defined in [RFC2581]. The algorithm to TCP's current behavior as defined in [RFC2581]. The algorithm
provides two alternatives for extending Limited Transmit. The two provides two alternatives for extending Limited Transmit. The two
variants of extended Limited Transmit are: variants of extended Limited Transmit are:
Careful Limited Transmit: Careful Limited Transmit
This variant calls for reducing the sending rate at This variant calls for reducing the sending rate at
approximately the same time [RFC2581] implementations reduce approximately the same time [RFC2581] implementations reduce
the congestion window, while at the same time withholding a the congestion window, while at the same time withholding a
retransmission (and the final congestion determination) for retransmission (and the final congestion determination) for
approximately one RTT. approximately one RTT.
Aggressive Limited Transmit: Aggressive Limited Transmit
This variant calls for maintaining the sending rate in the This variant calls for maintaining the sending rate in the
face of duplicate ACKs until TCP concludes a segment is lost face of duplicate ACKs until TCP concludes that a segment is
and needs to be retransmitted (which TCP-NCR delays by one lost and needs to be retransmitted (which TCP-NCR delays by
RTT when compared with current loss recovery schemes). one RTT when compared with current loss recovery schemes).
A TCP-NCR implementation MUST use either Careful Limited Transmit or A TCP-NCR implementation MUST use either Careful Limited Transmit or
Aggressive Limited Transmit. Aggressive Limited Transmit.
A constant MUST be set depending on which variant of extended Limited A constant MUST be set, depending on which variant of extended
Transmit is used, as follows: Limited Transmit is used, as follows:
Careful Limited Transmit: Careful Limited Transmit
LT_F = 2/3 LT_F = 2/3
Aggressive Limited Transmit: Aggressive Limited Transmit
LT_F = 1/2 LT_F = 1/2
This constant reflects the fraction of outstanding data (including This constant reflects the fraction of outstanding data (including
data sent during Extended Limited Transmit) that must be SACKed data sent during Extended Limited Transmit) that must be SACKed
before a retransmission is triggered. Since Aggressive Limited before a retransmission is triggered. Since Aggressive Limited
Transmit sends a new segment for every segment known to have left the Transmit sends a new segment for every segment known to have left the
network, a total of roughly cwnd segments will be sent during network, a total of roughly cwnd segments will be sent during
Aggressive Limited Transmit and therefore ideally a total of roughly Aggressive Limited Transmit, and therefore ideally a total of roughly
2*cwnd segments will be outstanding when a retransmission is 2*cwnd segments will be outstanding when a retransmission is
triggered. The duplicate ACK threshold is then set to LT_F = 1/2 of triggered. The duplicate ACK threshold is then set to LT_F = 1/2 of
2*cwnd (or about 1 RTT worth of data). The factor is different for 2*cwnd (or about 1 RTT worth of data). The factor is different for
Careful Limited Transmit because the sender only transmits one new Careful Limited Transmit because the sender only transmits one new
segment for every two segments that are SACKed and therefore will segment for every two segments that are SACKed and therefore will
ideally have a total of 1.5*cwnd segments outstanding when the ideally have a total of 1.5*cwnd segments outstanding when the
retransmission is to be triggered. Hence, the required threshold is retransmission is to be triggered. Hence, the required threshold is
LT_F=2/3 of 1.5*cwnd to delay the retransmission by roughly 1 RTT. LT_F=2/3 of 1.5*cwnd to delay the retransmission by roughly 1 RTT.
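The arithmetic behind the two LT_F values can be checked with a short sketch. It assumes the ideal case described above, in which new data can always be sent; the dictionary and function names are hypothetical and are not taken from the specification.

```python
from fractions import Fraction

# LT_F values given in the text for each variant.
LT_F = {"careful": Fraction(2, 3), "aggressive": Fraction(1, 2)}

# New segments sent per segment known to have left the network.
NEW_PER_DEPARTED = {"careful": Fraction(1, 2), "aggressive": Fraction(1, 1)}


def ideal_threshold(cwnd, variant):
    """Duplicate ACK threshold, in segments, in the ideal case."""
    # Original window plus the data sent during Extended Limited Transmit.
    outstanding = cwnd + NEW_PER_DEPARTED[variant] * cwnd
    return LT_F[variant] * outstanding


careful_threshold = ideal_threshold(10, "careful")        # 2/3 of 15 segments
aggressive_threshold = ideal_threshold(10, "aggressive")  # 1/2 of 20 segments
```

For a cwnd of 10 segments, both variants arrive at a threshold of 10 segments, that is, roughly one congestion window (about one RTT) of data, which is the delay the constants are chosen to produce.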
There are situations whereby the sender cannot transmit new data There are situations whereby the sender cannot transmit new data
skipping to change at page 8, line 15 skipping to change at page 8, line 15
Therefore, TCP-NCR adapts the duplicate ACK threshold on each SACK Therefore, TCP-NCR adapts the duplicate ACK threshold on each SACK
arrival to be as robust as possible given the actual amount of data arrival to be as robust as possible given the actual amount of data
that has been transmitted, or roughly LT_F times the number of that has been transmitted, or roughly LT_F times the number of
outstanding segments. outstanding segments.
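A minimal sketch of this per-SACK adaptation follows. It assumes the threshold is LT_F times the number of outstanding segments, bounded below by the standard value of three duplicate ACKs noted in Section 2. Rounding up with a ceiling is an assumption of this sketch, and the function and parameter names are hypothetical.

```python
import math
from fractions import Fraction


def dup_thresh(outstanding_segments, lt_f, floor=3):
    """Recompute the duplicate ACK threshold on a SACK arrival."""
    # Rounding up is an assumption of this sketch, not a stated rule.
    return max(math.ceil(lt_f * outstanding_segments), floor)


aggressive = Fraction(1, 2)

# Ideal case: cwnd of 10 segments and 10 new segments were sent.
ideal = dup_thresh(20, aggressive)

# The sender could not transmit any new data: only 10 are outstanding.
limited = dup_thresh(10, aggressive)

# Very small amount of outstanding data: the floor of three applies.
small = dup_thresh(2, aggressive)
```

Recomputing the threshold from the data actually outstanding keeps it reachable when less new data was sent than the ideal case assumes, while the floor prevents it from dropping below the standard three duplicate ACKs.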
The TCP-NCR modifications specified in this document lend themselves The TCP-NCR modifications specified in this document lend themselves
to incremental deployment. Only the TCP implementation on the sender to incremental deployment. Only the TCP implementation on the sender
side requires modification (assuming both hosts support SACK). The side requires modification (assuming both hosts support SACK). The
changes themselves are modest. However, as will be discussed below, changes themselves are modest. However, as will be discussed below,
availability of additional buffer space at the receiver will help availability of additional buffer space at the receiver will help
maximize the benefits of using TCP-NCR but are not strictly maximize the benefits of using TCP-NCR but is not strictly necessary.
necessary.
The following algorithms depend on the notions provided by [RFC3517] The following algorithms depend on the notions provided by [RFC3517],
and we assume the reader is familiar with the terminology given in and we assume the reader is familiar with the terminology given in
[RFC3517]. The TCP-NCR algorithm can be adapted to alternate SACK- [RFC3517]. The TCP-NCR algorithm can be adapted to alternate SACK-
based loss recovery schemes. [BR04,BSRV04] outline non-SACK-based based loss recovery schemes. [BR04,BSRV04] outline non-SACK-based
algorithms, however, we do not specify those algorithms in this algorithms; however, we do not specify those algorithms in this
document and do not recommend them due to both the complexity and document and do not recommend them due to both the complexity and
security implications of having only a gross understanding of the security implications of having only a gross understanding of the
number of outstanding segments in the network. number of outstanding segments in the network.
A TCP connection using the Nagle algorithm [RFC896,RFC1122] MAY A TCP connection using the Nagle algorithm [RFC896,RFC1122] MAY
employ the TCP-NCR algorithm. If a TCP implementation does implement employ the TCP-NCR algorithm. If a TCP implementation does implement
TCP-NCR the implementation MUST follow the various specifications TCP-NCR, the implementation MUST follow the various specifications
provided in sections 3.1 - 3.4. If the Nagle algorithm is not being provided in Sections 3.1 - 3.4. If the Nagle algorithm is not being
used there is no way to accurately calculate the number of used, there is no way to accurately calculate the number of
outstanding segments in the network (and, therefore, no good way to outstanding segments in the network (and, therefore, no good way to
derive an appropriate duplicate ACK threshold) without adding state derive an appropriate duplicate ACK threshold) without adding state
to the TCP sender. A TCP connection that does not employ the Nagle to the TCP sender. A TCP connection that does not employ the Nagle
algorithm SHOULD NOT use TCP-NCR. We envision that NCR could be algorithm SHOULD NOT use TCP-NCR. We envision that NCR could be
adapted to an implementation that carefully tracks the sequence adapted to an implementation that carefully tracks the sequence
numbers transmitted in each segment. However, we leave this as numbers transmitted in each segment. However, we leave this as
future work. future work.
3.1. Initialization 3.1. Initialization
When entering a period of loss / reordering detection and Extended When entering a period of loss / reordering detection and Extended
Limited Transmit a TCP-NCR MUST initialize several state variables. Limited Transmit, a TCP-NCR MUST initialize several state variables.
A TCP MUST enter Extended Limited Transmit upon receiving the first A TCP MUST enter Extended Limited Transmit upon receiving the first
ACK with a SACK block after the reception of an ACK that (a) did not ACK with a SACK block after the reception of an ACK that (a) did not
contain SACK information and (b) did increase the connection's contain SACK information and (b) did increase the connection's
cumulative ACK point. The initializations are: cumulative ACK point. The initializations are:
(I.1) The TCP MUST save the current FlightSize. (I.1) The TCP MUST save the current FlightSize.
FlightSizePrev = FlightSize FlightSizePrev = FlightSize
(I.2) The TCP MUST set a variable for tracking the number of (I.2) The TCP MUST set a variable for tracking the number of
segments for which an ACK does not trigger a transmission segments for which an ACK does not trigger a transmission
skipping to change at page 9, line 22 skipping to change at page 9, line 22
(I.3) The TCP MUST set DupThresh (from [RFC3517]) based on the (I.3) The TCP MUST set DupThresh (from [RFC3517]) based on the
current FlightSize. current FlightSize.
DupThresh = max (LT_F * (FlightSize / SMSS),3) DupThresh = max (LT_F * (FlightSize / SMSS),3)
Note: We keep the lower bound of DupThresh = 3 from Note: We keep the lower bound of DupThresh = 3 from
[RFC2581,RFC3517]. [RFC2581,RFC3517].
In addition to the above steps, the incoming ACK MUST be processed In addition to the above steps, the incoming ACK MUST be processed
with the E series of steps in section 3.3. with the E series of steps in Section 3.3.
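As a non-normative illustration, the initialization steps above might be sketched in Python as follows. The function and parameter names are ours and do not appear in the specification. The starting value of zero for the Skipped counter is an assumption of this sketch, and LT_F is passed in by the caller rather than fixed here. Exact fractions are used only to keep the arithmetic free of rounding; the specification does not prescribe a rounding rule.

```python
from fractions import Fraction


def init_extended_limited_transmit(flight_size, smss, lt_f):
    """Sketch of steps (I.1)-(I.3); sizes are in bytes.

    lt_f is the Limited Transmit fraction (LT_F) chosen by the caller.
    """
    flight_size_prev = flight_size            # (I.1) save current FlightSize
    skipped = 0                               # (I.2) assumed initial value
    # (I.3) DupThresh = max (LT_F * (FlightSize / SMSS),3)
    dup_thresh = max(lt_f * Fraction(flight_size, smss), 3)
    return flight_size_prev, skipped, dup_thresh
```

For example, with 30 full-sized segments outstanding and an LT_F of 2/3, the sketch yields a DupThresh of 20; with only 2 segments outstanding, the lower bound of 3 applies.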
3.2. Terminating Extended Limited Transmit and Preventing Bursts 3.2. Terminating Extended Limited Transmit and Preventing Bursts
Extended Limited Transmit MUST be terminated at the start of loss Extended Limited Transmit MUST be terminated at the start of loss
recovery as outlined in section 3.4. recovery as outlined in Section 3.4.
The arrival of an ACK that advances the cumulative ACK point while in The arrival of an ACK that advances the cumulative ACK point while in
Extended Limited Transmit, but before loss recovery is triggered Extended Limited Transmit, but before loss recovery is triggered,
signals that a series of duplicate ACKs were caused by reordering and signals that a series of duplicate ACKs was caused by reordering and
not congestion. Therefore, the receipt of an ACK that extends the not congestion. Therefore, the receipt of an ACK that extends the
cumulative ACK point MUST terminate Extended Limited Transmit. As cumulative ACK point MUST terminate Extended Limited Transmit. As
described below (in (T.4)), an ACK that extends the cumulative ACK described below (in (T.4)), an ACK that extends the cumulative ACK
point and *also* contains SACK information will also trigger the point and *also* contains SACK information will also trigger the
beginning of a new Extended Limited Transmit phase. beginning of a new Extended Limited Transmit phase.
Upon the termination of Extended Limited Transmit, and especially Upon the termination of Extended Limited Transmit, and especially
when using the Careful variant, TCP-NCR may be in a situation where when using the Careful variant, TCP-NCR may be in a situation where
the entire cwnd is not being utilized and therefore TCP-NCR will be the entire cwnd is not being utilized, and therefore TCP-NCR will be
prone to transmitting a burst of segments into the network. prone to transmitting a burst of segments into the network.
Therefore, to mitigate this bursting when a TCP-NCR in the Extended Therefore, to mitigate this bursting when a TCP-NCR in the Extended
Limited Transmit phase receives an ACK that updates the cumulative Limited Transmit phase receives an ACK that updates the cumulative
ACK point (regardless of whether the ACK contains SACK information), ACK point (regardless of whether the ACK contains SACK information),
the following steps MUST be taken: the following steps MUST be taken:
(T.1) A TCP MUST reset cwnd to: (T.1) A TCP MUST reset cwnd to:
cwnd = min (FlightSize + SMSS,FlightSizePrev) cwnd = min (FlightSize + SMSS,FlightSizePrev)
This step ensures that cwnd is not grossly larger than the This step ensures that cwnd is not grossly larger than the
amount of data outstanding --- a situation that would cause a amount of data outstanding, a situation that would cause a
line rate burst. line rate burst.
(T.2) A TCP MUST set ssthresh to: (T.2) A TCP MUST set ssthresh to:
ssthresh = FlightSizePrev ssthresh = FlightSizePrev
This step provides TCP-NCR with a sense of "history". If step This step provides TCP-NCR with a sense of "history". If step
(T.1) reduces cwnd below FlightSizePrev this step ensures that (T.1) reduces cwnd below FlightSizePrev, this step ensures that
TCP-NCR will slow start back to the operating point in effect TCP-NCR will slow start back to the operating point in effect
before Extended Limited Transmit. before Extended Limited Transmit.
(T.3) A TCP is now permitted to transmit previously unsent data as (T.3) A TCP is now permitted to transmit previously unsent data as
allowed by cwnd, FlightSize, application data availability and allowed by cwnd, FlightSize, application data availability, and
the receiver's advertised window. the receiver's advertised window.
(T.4) When an incoming ACK extends the cumulative ACK point and also (T.4) When an incoming ACK extends the cumulative ACK point and also
contains SACK information, the initializations in steps (I.2) contains SACK information, the initializations in steps (I.2)
and (I.3) from section 3.1 MUST be taken (but, step (I.1) MUST and (I.3) from Section 3.1 MUST be taken (but step (I.1) MUST
NOT be executed) to re-start Extended Limited Transmit. In NOT be executed) to re-start Extended Limited Transmit. In
addition, the series of steps in section 3.3 (the "E" steps) addition, the series of steps in Section 3.3 (the "E" steps)
MUST be taken. MUST be taken.
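The burst-prevention arithmetic in steps (T.1) and (T.2) can be illustrated with a short, non-normative Python sketch. The function name is hypothetical, all quantities are assumed to be in bytes, and the transmission permitted by (T.3) and the re-initialization in (T.4) are left to the caller.

```python
def terminate_extended_limited_transmit(flight_size, flight_size_prev, smss):
    """Sketch of steps (T.1) and (T.2); returns (cwnd, ssthresh) in bytes."""
    # (T.1) keep cwnd close to the data actually outstanding
    cwnd = min(flight_size + smss, flight_size_prev)
    # (T.2) remember the operating point held before Extended Limited Transmit
    ssthresh = flight_size_prev
    return cwnd, ssthresh
```

When FlightSize has shrunk well below FlightSizePrev, cwnd is clamped to FlightSize + SMSS and the connection slow starts back toward ssthresh; when it has not, cwnd is simply capped at FlightSizePrev.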
3.3. Extended Limited Transmit 3.3. Extended Limited Transmit
On each ACK containing SACK information that arrives after TCP-NCR On each ACK containing SACK information that arrives after TCP-NCR
has entered the Extended Limited Transmit phase (as outlined in has entered the Extended Limited Transmit phase (as outlined in
section 3.1) and before Extended Limited Transmit terminates, the Section 3.1) and before Extended Limited Transmit terminates, the
sender MUST use the following procedure. sender MUST use the following procedure.
(E.1) The SetPipe () procedure from [RFC3517] MUST be used to set (E.1) The SetPipe () procedure from [RFC3517] MUST be used to set
the "pipe" variable (which represents the number of bytes the "pipe" variable (which represents the number of bytes
still considered "in the network"). Note: the current value still considered "in the network"). Note: the current value
of DupThresh MUST be used by SetPipe () to produce an accurate of DupThresh MUST be used by SetPipe () to produce an accurate
assessment of the amount of data still considered in the assessment of the amount of data still considered in the
network. network.
(E.2) If the comparison in equation (1) below holds and there are (E.2) If the comparison in equation (1), below, holds and there are
SMSS bytes of previously unsent data available for SMSS bytes of previously unsent data available for
transmission then the sender MUST transmit one segment of SMSS transmission, then the sender MUST transmit one segment of SMSS
bytes. bytes.
(pipe + Skipped) <= (FlightSizePrev - SMSS) (1) (pipe + Skipped) <= (FlightSizePrev - SMSS) (1)
If the comparison in equation (1) does not hold or no new data If the comparison in equation (1) does not hold or no new data
can be transmitted (due to lack of data from the application can be transmitted (due to lack of data from the application
or the advertised window limit), skip to step (E.6). or the advertised window limit), skip to step (E.6).
(E.3) Pipe MUST be incremented by SMSS bytes. (E.3) Pipe MUST be incremented by SMSS bytes.
(E.4) If using Careful Limited Transmit, Skipped MUST be incremented (E.4) If using Careful Limited Transmit, Skipped MUST be incremented
by SMSS bytes to ensure that the next SMSS bytes of SACKed data by SMSS bytes to ensure that the next SMSS bytes of SACKed data
processed does not trigger a Limited Transmit transmission processed does not trigger a Limited Transmit transmission
(since the goal of Careful Limited Transmit is to send upon (since the goal of Careful Limited Transmit is to send upon
the reception of every second duplicate ACK). receipt of every second duplicate ACK).
(E.5) A TCP MUST return to step (E.2) to ensure that as many bytes (E.5) A TCP MUST return to step (E.2) to ensure that as many bytes
as appropriate are transmitted. This provides robustness to as are appropriate are transmitted. This provides robustness
ACK loss that can be (largely) compensated for using SACK to ACK loss that can be (largely) compensated for using SACK
information. information.
(E.6) DupThresh MUST be reset via: (E.6) DupThresh MUST be reset via:
DupThresh = max (LT_F * (FlightSize / SMSS),3) DupThresh = max (LT_F * (FlightSize / SMSS),3)
where FlightSize is the total number of bytes that have not where FlightSize is the total number of bytes that have not
been cumulatively acknowledged (which is different from been cumulatively acknowledged (which is different from
"pipe"). "pipe").
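The per-ACK procedure in steps (E.2) through (E.6) might be sketched as follows. This is a non-normative illustration under several assumptions: "pipe" is taken as already computed by SetPipe () per step (E.1); the hypothetical parameter "sendable" stands for the number of previously unsent bytes that both the application and the advertised window currently permit; and FlightSize is increased by SMSS for each new segment sent, since that data is then outstanding. None of the Python names come from the specification.

```python
from fractions import Fraction


def extended_limited_transmit(pipe, skipped, flight_size, flight_size_prev,
                              smss, lt_f, sendable, careful):
    """Sketch of steps (E.2)-(E.6) for one ACK carrying SACK information."""
    segments_sent = 0
    # (E.2) test equation (1); (E.5) repeat until it no longer holds
    while pipe + skipped <= flight_size_prev - smss and sendable >= smss:
        segments_sent += 1
        sendable -= smss
        flight_size += smss       # newly sent data is now outstanding
        pipe += smss              # (E.3)
        if careful:
            skipped += smss       # (E.4) Careful variant only
    # (E.6) recompute DupThresh from the updated FlightSize
    dup_thresh = max(lt_f * Fraction(flight_size, smss), 3)
    return segments_sent, pipe, skipped, flight_size, dup_thresh
```

With pipe at 8000 bytes, FlightSizePrev at 10000 bytes, and an SMSS of 1000 bytes, the sketch sends two segments when Skipped is never incremented and one segment when the Careful increment in (E.4) is applied.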
3.4 Entering Loss Recovery 3.4. Entering Loss Recovery
When a segment is deemed lost via the algorithms in [RFC3517], When a segment is deemed lost via the algorithms in [RFC3517],
Extended Limited Transmit MUST be terminated, leaving the Extended Limited Transmit MUST be terminated, leaving the algorithms
algorithms in [RFC3517] to govern TCP's behavior. One slight in [RFC3517] to govern TCP's behavior. One slight change to
change to [RFC3517] MUST be made, however. In section 5, step [RFC3517] MUST be made, however. In Section 5, step (2) of [RFC3517]
(2) of [RFC3517] MUST be changed to: MUST be changed to:
(2) ssthresh = cwnd = (FlightSizePrev / 2) (2) ssthresh = cwnd = (FlightSizePrev / 2)
This ensures that the congestion control modifications are made This ensures that the congestion control modifications are made with
with respect to the amount of data in the network before respect to the amount of data in the network before FlightSize was
FlightSize was increased by Extended Limited Transmit. increased by Extended Limited Transmit.
Note: Once the algorithm in [RFC3517] takes over from Extended Note: Once the algorithm in [RFC3517] takes over from Extended
Limited Transmit the DupThresh value MUST be held constant until Limited Transmit, the DupThresh value MUST be held constant until the
the loss recovery phase is terminated. loss recovery phase is terminated.
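The modified step (2) can be illustrated with a small, non-normative Python sketch. The function name is hypothetical, and integer division is an assumption made here because the quantities are byte counts; the specification does not state a rounding rule.

```python
def enter_loss_recovery(flight_size_prev):
    """Sketch of the modified step (2): ssthresh = cwnd = FlightSizePrev / 2."""
    ssthresh = cwnd = flight_size_prev // 2   # integer bytes (assumption)
    return cwnd, ssthresh
```

For instance, if FlightSizePrev was 20000 bytes and Extended Limited Transmit has since raised FlightSize to 30000 bytes, the reduction is still taken from 20000, giving a cwnd and ssthresh of 10000 bytes rather than 15000.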
4. Advantages 4. Advantages
The major advantages of TCP-NCR are two-fold. As discussed in The major advantages of TCP-NCR are twofold. As discussed in Section
section 1, TCP-NCR will open up the design space for network 1, TCP-NCR will open up the design space for network applications and
applications and components that are currently constrained by TCP's components that are currently constrained by TCP's lack of robustness
lack of robustness to packet reordering. The second advantage is in to packet reordering. The second advantage is in terms of an
terms of an increase in TCP performance. increase in TCP performance.
[BR04] presents ns-2 [NS-2] simulations of a pre-cursor to the TCP- [BR04] presents ns-2 [NS-2] simulations of a pre-cursor to the TCP-
NCR algorithm specified in this document, called TCP-DCR (Delayed NCR algorithm specified in this document, called TCP-DCR (Delayed
Congestion Response). The paper shows that TCP-DCR aids performance Congestion Response). The paper shows that TCP-DCR aids performance
in comparison to unmodified TCP in the presence of packet reordering. in comparison to unmodified TCP in the presence of packet reordering.
In addition, the extended version of [BR04] presents results based on In addition, the extended version of [BR04] presents results based on
emulations involving Linux (kernel 2.4.24). These results show that emulations involving Linux (kernel 2.4.24). These results show that
the performance of TCP-DCR is similar to Linux's native the performance of TCP-DCR is similar to Linux's native
implementation that seeks to "undo" wrong decisions based on DSACK implementation that seeks to "undo" wrong decisions according to
[RFC2883] feedback (similar to the schemes outlined in [ZKFP03]), duplicate-SACK (DSACK) [RFC2883] feedback (similar to the schemes
when packets are reordered by less than one RTT. The advantage of outlined in [ZKFP03]), when packets are reordered by less than one
using TCP-DCR over the DSACK-based scheme is that the DSACK-based RTT. The advantage of using TCP-DCR over the DSACK-based scheme is
scheme tries to estimate the exact amount of reordering in the that the DSACK-based scheme tries to estimate the exact amount of
network using fairly complex algorithms, whereas TCP-DCR achieves reordering in the network using fairly complex algorithms, whereas
similar results with less complicated modifications. TCP-DCR achieves similar results with less complicated modifications.
In addition, [BR04,BSRV04] illustrate the ability of TCP-DCR to allow In addition, [BR04,BSRV04] illustrate the ability of TCP-DCR to allow
for the improvement of other parts of the system. For example, these for the improvement of other parts of the system. For example, these
papers show that increasing TCP's robustness to packet reordering papers show that increasing TCP's robustness to packet reordering
allows for a novel wireless ARQ mechanism to be added at the link- allows a novel wireless ARQ mechanism to be added at the link-layer.
layer. The added robustness of the link-layer to channel errors, in The added robustness of the link-layer to channel errors, in turn,
turn, increases TCP performance by not requiring TCP to retransmit increases TCP performance by not requiring TCP to retransmit packets
packets that were dropped due to corruption (and, hence, also that were dropped due to corruption (and thus also prevents TCP from
prevents TCP from needlessly reducing the sending rate when needlessly reducing the sending rate when retransmitting these
retransmitting these segments). segments).
5. Disadvantages 5. Disadvantages
While we note that all of the changes outlined above are implemented Although all the changes outlined above are implemented in the
in the sender, the receiver also potentially has a part to play. In sender, the receiver also potentially has a part to play. In
particular, TCP-NCR increases the receiver's buffering requirement by particular, TCP-NCR increases the receiver's buffering requirement by
up to an extra cwnd -- in the case of the TCP sender using Aggressive up to an extra cwnd -- in the case of the TCP sender using Aggressive
Limited Transmit and actual loss occurring in the network. Limited Transmit and actual loss occurring in the network.
Therefore, to maximize the benefits from TCP-NCR receivers should Therefore, to maximize the benefits from TCP-NCR, receivers should
advertise a large window to absorb the extra out-of-order traffic. In advertise a large window to absorb the extra out-of-order traffic.
the case that the additional buffer requirements are not met, the use In the case that the additional buffer requirements are not met, the
of the above algorithm takes into account the reduced advertised use of the above algorithm takes into account the reduced advertised
window---with a corresponding loss in robustness to packet window -- with a corresponding loss in robustness to packet
reordering. reordering.
In addition, using TCP-NCR could delay the delivery of data to the In addition, using TCP-NCR could delay the delivery of data to the
application by up to one RTT because the fast retransmission point is application by up to one RTT because the fast retransmission point is
delayed by roughly one RTT in TCP-NCR. Applications that are delayed by roughly one RTT in TCP-NCR. Applications that are
sensitive to such delays should turn off the TCP-NCR option. For sensitive to such delays should turn off the TCP-NCR option. For
instance, a socket option could be introduced to allow applications instance, a socket option could be introduced to allow applications
to control whether NCR would be used for a particular connection. to control whether NCR would be used for a particular connection.
Finally, the use of TCP-NCR makes the recovery from congestion events Finally, the use of TCP-NCR makes the recovery from congestion events
sluggish in comparison to the standard reaction in [RFC2581]. [BR04, sluggish in comparison to the standard reaction in [RFC2581]. [BR04,
BSRV04] show (via simulation) that the delay in congestion response BSRV04] show (via simulation) that the delay in congestion response
has minimal impact on the connection itself and the traffic sharing a has minimal impact on the connection itself and the traffic sharing a
bottleneck. [BBFS01] also indicates (again, via simulation) that bottleneck. [BBFS01] also indicates (again, via simulation) that
"slowly responsive" congestion control may be safe for deployment in "slowly responsive" congestion control may be safe for deployment in
the Internet. These studies suggest that schemes that slightly delay the Internet. These studies suggest that schemes that slightly delay
congestion control decisions may be reasonable, however, further congestion control decisions may be reasonable; however, further
experimentation on the Internet is required to verify these results. experimentation on the Internet is required to verify these results.
6. Related Work 6. Related Work
Over the past few years, several solutions have been proposed to Over the past few years, several solutions have been proposed to
improve the performance of TCP in the face of segment reordering. improve the performance of TCP in the face of segment reordering.
These schemes generally fall into one of two categories (with some These schemes generally fall into one of two categories (with some
overlap): mechanisms that try to prevent spurious retransmits from overlap): mechanisms that try to prevent spurious retransmits from
happening and mechanisms that try to detect spurious retransmits and happening and mechanisms that try to detect spurious retransmits and
"undo" the needless congestion control state changes that have been "undo" the needless congestion control state changes that have been
taken. taken.
[BA02,ZKFP03] attempt to prevent segment reordering from triggering [BA02,ZKFP03] attempt to prevent segment reordering from triggering
spurious retransmits by using various algorithms to approximate the spurious retransmits by using various algorithms to approximate the
duplicate ACK threshold required to disambiguate loss and reordering duplicate ACK threshold required to disambiguate loss and reordering
over a given network path at a given time. TCP-NCR similarly tries over a given network path at a given time. TCP-NCR similarly tries
to prevent spurious retransmits. However, TCP-NCR takes a simplified to prevent spurious retransmits. However, TCP-NCR takes a simplified
approach compared to those in [BA02,ZKFP03] in that TCP-NCR simply approach compared to those in [BA02, ZKFP03], in that TCP-NCR simply
delays retransmission by an amount based on the current cwnd (in delays retransmission by an amount based on the current cwnd (in
comparison to standard TCP), while the other schemes use relatively comparison to standard TCP), while the other schemes use relatively
complex algorithms in an attempt to derive a more precise value for complex algorithms in an attempt to derive a more precise value for
DupThresh that depends on the current patterns of packet reordering. DupThresh that depends on the current patterns of packet reordering.
While TCP-NCR offers simplicity the other schemes may offer more While TCP-NCR offers simplicity, the other schemes may offer more
precision such that applications would not be forced to wait as long precision such that applications would not be forced to wait as long
for their retransmissions. Future work could be undertaken to for their retransmissions. Future work could be undertaken to
achieve robustness without needless delay. achieve robustness without needless delay.
On the other hand, several schemes have been developed to detect and On the other hand, several schemes have been developed to detect and
mitigate needless retransmissions after the fact. mitigate needless retransmissions after the fact. [RFC3522, RFC3708,
[RFC3522,RFC3708,BA02,RFC4015,SK04] present algorithms to detect BA02, RFC4015, RFC4138] present algorithms to detect spurious
spurious retransmits and mitigate the changes these events made to retransmits and mitigate the changes these events made to the
the congestion control state. TCP-NCR could be used in conjunction congestion control state. TCP-NCR could be used in conjunction with
with these algorithms, with TCP-NCR attempting to prevent spurious these algorithms, with TCP-NCR attempting to prevent spurious
retransmits and some other scheme kicking in if the prevention retransmits and some other scheme kicking in if the prevention
failed. In addition, we note that TCP-NCR is concentrated on failed. In addition, note that TCP-NCR is concentrated on preventing
preventing spurious fast retransmits and some of the above algorithms spurious fast retransmits; some of the above algorithms also attempt
also attempt to detect and mitigate spurious timeout-based to detect and mitigate spurious timeout-based retransmits.
retransmits.
7. Security Considerations 7. Security Considerations
We do not believe there are security implications involved with TCP- General attacks against the congestion control of TCP are described
NCR over and above those for general TCP congestion control in [RFC2581]. SACK-based loss recovery for TCP [RFC3517] mitigates
[RFC2581]. In particular, the Extended Limited Transmit algorithms some of the duplicate ACK attacks against TCP's congestion control.
specified in this document have been specifically designed not to be This document builds upon that work, and the Extended Limited
susceptible to the sorts of ACK splitting attacks TCP's general TCP Transmit algorithms specified in this document have been designed to
congestion control is vulnerable to (as discussed in [RFC3465]). thwart the ACK division problems that are described in [RFC3465].
8. Acknowledgements 8. Acknowledgments
Feedback from Lars Eggert, Ted Faber, Wesley Eddy, Gorry Fairhurst, Feedback from Lars Eggert, Ted Faber, Wesley Eddy, Gorry Fairhurst,
Sally Floyd, Sara Landstrom, Nauzad Sadry, Pasi Sarolahti, Joe Touch Sally Floyd, Sara Landstrom, Nauzad Sadry, Pasi Sarolahti, Joe Touch,
and Nitin Vaidya and the TCPM working group have contributed Nitin Vaidya, and the TCPM working group have contributed
significantly to this document. Our thanks to all! significantly to this document. Our thanks to all!
9. IANA Considerations 9. References
This document requires no IANA assignments. The RFC Editor can
safely remove this section.
10. Normative References 9.1. Normative References
[RFC793] J. Postel, "Transmission Control Protocol", RFC 793, [RFC793] Postel, J., "Transmission Control Protocol", STD 7, RFC
September 1981. 793, September 1981.
[RFC2018] M. Mathis, J. Mahdavi, S. Floyd and A. Romanow, "TCP [RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
selective acknowledgment options," Internet RFC 2018. Selective Acknowledgement Options", RFC 2018, October 1996.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2581] M. Allman, V. Paxson, and W. Stevens, "TCP Congestion [RFC2581] Allman, M., Paxson, V., and W. Stevens, "TCP Congestion
Control", RFC 2581, April 1999. Control", RFC 2581, April 1999.
[RFC3042] M. Allman, H. Balakrishnan and S. Floyd, "Enhancing TCP's [RFC3042] Allman, M., Balakrishnan, H., and S. Floyd, "Enhancing
Loss Recovery Using Limited Transmit", RFC 3042, January 2001. TCP's Loss Recovery Using Limited Transmit", RFC 3042,
January 2001.
[RFC3517] E. Blanton, M. Allman, K. Fall and L. Wang, "A Conservative [RFC3517] Blanton, E., Allman, M., Fall, K., and L. Wang, "A
Selective Acknowledgment (SACK)-based Loss Recovery Algorithm for Conservative Selective Acknowledgment (SACK)-based Loss
TCP", RFC 3517, April 2003. Recovery Algorithm for TCP", RFC 3517, April 2003.
11. Informative References 9.2. Informative References
[BA02] E. Blanton and M. Allman, "On Making TCP More Robust to Packet [BA02] E. Blanton and M. Allman, "On Making TCP More Robust to
Reordering," ACM Computer Communication Review, January 2002. Packet Reordering," ACM Computer Communication Review,
January 2002.
[BBFS01] D. Bansal, H. Balakrishnan, S. Floyd and S. Shenker, [BBFS01] D. Bansal, H. Balakrishnan, S. Floyd and S. Shenker,
"Dynamic Behavior of Slowly Responsive Congestion Control "Dynamic Behavior of Slowly Responsive Congestion Control
Algorithms", Proceedings of ACM SIGCOMM, Sep. 2001. Algorithms", Proceedings of ACM SIGCOMM, Sep. 2001.
[BPS99] J. Bennett, C. Partridge, and N. Shectman, "Packet reordering [BPS99] J. Bennett, C. Partridge, and N. Shectman, "Packet
is not pathological network behavior," IEEE/ACM Transactions on reordering is not pathological network behavior," IEEE/ACM
Networking, December 1999. Transactions on Networking, December 1999.
[BR04] Sumitha Bhandarkar and A. L. Narasimha Reddy, "TCP-DCR: Making [BR04] Sumitha Bhandarkar and A. L. Narasimha Reddy, "TCP-DCR:
TCP Robust to Non-Congestion Events", In the Proceedings of Making TCP Robust to Non-Congestion Events", In the
Networking 2004 conference, May 2004. Extended version available as Proceedings of Networking 2004 conference, May 2004.
tech report TAMU-ECE-2003-04. Extended version available as tech report TAMU-ECE-2003-04.
[BSRV04] Sumitha Bhandarkar, Nauzad Sadry, A. L. Narasimha Reddy and [BSRV04] Sumitha Bhandarkar, Nauzad Sadry, A. L. Narasimha Reddy and
Nitin Vaidya, "TCP-DCR: A Novel Protocol for Tolerating Wireless Nitin Vaidya, "TCP-DCR: A Novel Protocol for Tolerating
Channel Errors", To appear in IEEE Transactions on Mobile Computing Wireless Channel Errors", to appear in IEEE Transactions on
Mobile Computing.
[GPL04] Ladan Gharai, Colin Perkins and Tom Lehman, "Packet [GPL04] Ladan Gharai, Colin Perkins and Tom Lehman, "Packet
Reordering, High Speed Networks and Transport Protocol Performance", Reordering, High Speed Networks and Transport Protocol
ICCCN 2004, October 2004. Performance", ICCCN 2004, October 2004.
[Jac88] V. Jacobson, "Congestion Avoidance and Control", Computer [Jac88] V. Jacobson, "Congestion Avoidance and Control", Computer
Communication Review, vol. 18, no. 4, pp. 314-329, Aug. 1988. Communication Review, vol. 18, no. 4, pp. 314-329, Aug.
ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z. 1988. ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z.
[JIDKT03] S. Jaiswal, G. Iannaccone, C. Diot, J. Kurose, and D. [JIDKT03] S. Jaiswal, G. Iannaccone, C. Diot, J. Kurose, and D.
Towsley, "Measurement and Classification of Out-of-Sequence Packets Towsley, "Measurement and Classification of Out-of-Sequence
in a Tier-1 IP Backbone," Proceedings of IEEE INFOCOM, 2003. Packets in a Tier-1 IP Backbone," Proceedings of IEEE
INFOCOM, 2003.
[KM02] I. Keslassy and N. McKeown, "Maintaining packet order in [KM02] I. Keslassy and N. McKeown, "Maintaining packet order in
two-stage switches," Proceedings of the IEEE Infocom, June
2002
[MAF05] A. Medina, M. Allman, S. Floyd. Measuring the Evolution of [MAF05] A. Medina, M. Allman, S. Floyd. Measuring the Evolution of
Transport Protocols in the Internet. ACM Computer Communication Transport Protocols in the Internet. ACM Computer
Review, 35(2), April 2005. Communication Review, 35(2), April 2005.
[NS-2] ns-2 Network Simulator. http://www.isi.edu/nsnam/ [NS-2] ns-2 Network Simulator. http://www.isi.edu/nsnam/
[Pax97] V. Paxson, "End-to-End Internet Packet Dynamics,"
Proceedings of ACM SIGCOMM, September 1997.
[Pax97] V. Paxson, "End-to-End Internet Packet Dynamics," Proceedings [RFC896] Nagle, J., "Congestion control in IP/TCP internetworks",
of ACM SIGCOMM, September 1997. RFC 896, January 1984.
[RFC896] J. Nagle, "Congestion Control in IP/TCP Internetworks", RFC
896, January 1984.
[RFC1122] R. Braden, "Requirements for Internet Hosts - Communication [RFC1122] Braden, R., "Requirements for Internet Hosts -
Layers", RFC 1122, October 1989. Communication Layers", STD 3, RFC 1122, October 1989.
[RFC2883] Sally Floyd, Jamshid Mahdavi, Matt Mathis and Matt [RFC2883] Floyd, S., Mahdavi, J., Mathis, M., and M. Podolsky, "An
Podolsky, "An Extension to the Selective Acknowledgement (SACK) Extension to the Selective Acknowledgement (SACK) Option
Option for TCP," RFC 2883, July 2000. for TCP", RFC 2883, July 2000.
[RFC2960] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H. [RFC2960] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H.
Schwarzbauer, T. Taylor, I. Rytina, M. Kalla, L. Zhang, V. Paxson. Schwarzbauer, T. Taylor, I. Rytina, M. Kalla, L. Zhang, V.
Paxson. Stream Control Transmission Protocol. October
Stream Control Transmission Protocol. October 2000. 2000.
[RFC2988] V. Paxson and M. Allman, "Computing TCP's Retransmission
Timer", RFC 2988, November 2000.
[RFC3465] M. Allman. TCP Congestion Control with Appropriate Byte [RFC3465] Allman, M., "TCP Congestion Control with Appropriate Byte
Counting (ABC), February 2003. RFC 3465. Counting (ABC)", RFC 3465, February 2003.
[RFC3522] R. Ludwig and M. Meyer, "The Eifel Detection Algorithm for [RFC3522] Ludwig, R. and M. Meyer, "The Eifel Detection Algorithm for
TCP," RFC 3522, April 2003. TCP", RFC 3522, April 2003.
[RFC3708] E. Blanton and M. Allman, "Using TCP Duplicate Selective [RFC3708] Blanton, E. and M. Allman, "Using TCP Duplicate Selective
Acknowledgement (DSACKs) and Stream Control Transmission Protocol Acknowledgement (DSACKs) and Stream Control Transmission
(SCTP) Duplicate Transmission Sequence Numbers (TSNs) to Detect Protocol (SCTP) Duplicate Transmission Sequence Numbers
Spurious Retransmissions", RFC 3708, February 2004. (TSNs) to Detect Spurious Retransmissions", RFC 3708,
February 2004.
[RFC4015] R. Ludwig, A. Gurtov, "The Eifel Response Algorithm for [RFC4015] Ludwig, R. and A. Gurtov, "The Eifel Response Algorithm for
TCP", RFC 4015, February 2005. TCP", RFC 4015, February 2005.
[SK04] P. Sarolahti, M. Kojo, "Forward RTO-Recovery (F-RTO): An [RFC4138] Sarolahti, P. and M. Kojo, "Forward RTO-Recovery (F-RTO):
Algorithm for Detecting Spurious Retransmission Timeouts with TCP and An Algorithm for Detecting Spurious Retransmission Timeouts
SCTP", Internet-Draft draft-ietf-tcpm-frto-02.txt (work in progress). with TCP and the Stream Control Transmission Protocol
November 2004. (SCTP)", RFC 4138, August 2005.
[ZKFP03] M. Zhang, B. Karp, S. Floyd, L. Peterson, "RR-TCP: A [ZKFP03] M. Zhang, B. Karp, S. Floyd, L. Peterson, "RR-TCP: A
Reordering-Robust TCP with DSACK", in Proceedings of the Eleventh Reordering-Robust TCP with DSACK", in Proceedings of the
IEEE International Conference on Networking Protocols (ICNP 2003), Eleventh IEEE International Conference on Networking
Atlanta, GA, November, 2003. Protocols (ICNP 2003), Atlanta, GA, November, 2003.
12. Author's Addresses Authors' Addresses
Sumitha Bhandarkar Sumitha Bhandarkar
Dept. of Elec. Engg. Dept. of Elec. Engg.
214 ZACH 214 ZACH
College Station, TX 77843-3128 College Station, TX 77843-3128
Phone: (512) 468-8078 Phone: (512) 468-8078
Email: sumitha@tamu.edu EMail: sumitha@tamu.edu
URL : http://students.cs.tamu.edu/sumitha/ URL : http://students.cs.tamu.edu/sumitha/
A. L. Narasimha Reddy A. L. Narasimha Reddy
Professor Professor
Dept. of Elec. Engg. Dept. of Elec. Engg.
315C WERC 315C WERC
College Station, TX 77843-3128 College Station, TX 77843-3128
Phone : (979) 845-7598 Phone : (979) 845-7598
Email : reddy@ee.tamu.edu EMail: reddy@ee.tamu.edu
URL : http://ee.tamu.edu/~reddy/ URL : http://ee.tamu.edu/~reddy/
Mark Allman Mark Allman
ICSI Center for Internet Research ICSI Center for Internet Research
1947 Center Street, Suite 600 1947 Center Street, Suite 600
Berkeley, CA 94704-1198 Berkeley, CA 94704-1198
Phone: (216) 243-7361
Email: mallman@icir.org Phone: (440) 235-1792
EMail: mallman@icir.org
URL: http://www.icir.org/mallman/ URL: http://www.icir.org/mallman/
Ethan Blanton Ethan Blanton
Purdue University Computer Science Purdue University Computer Science
250 North University Street 305 North University Street
West Lafayette, IN 47907 West Lafayette, IN 47907
Email: eblanton@cs.purdue.edu
Intellectual Property Statement EMail: eblanton@cs.purdue.edu
Full Copyright Statement
Copyright (C) The Internet Society (2006).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79. found in BCP 78 and BCP 79.
skipping to change at page 17, line 42 skipping to change at page 18, line 45
such proprietary rights by implementers or users of this such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr. http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at this standard. Please address the information to the IETF at
ietf-ipr@ietf.org. ietf-ipr@ietf.org.
Disclaimer of Validity Acknowledgement
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Copyright Statement
Copyright (C) The Internet Society (2006). This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.
Acknowledgment
Funding for the RFC Editor function is currently provided by the Funding for the RFC Editor function is provided by the IETF
Internet Society. Administrative Support Activity (IASA).
 End of changes. 113 change blocks. 
306 lines changed or deleted 292 lines changed or added
