draft-ietf-tcpm-rfc3782-bis-03.txt   draft-ietf-tcpm-rfc3782-bis-04.txt 
TCP Maintenance and Minor T. Henderson TCP Maintenance and Minor T. Henderson
Extensions Working Group Boeing Extensions Working Group Boeing
Internet-Draft S. Floyd Internet-Draft S. Floyd
Obsoletes: 3782 (if approved) ICSI Obsoletes: 3782 (if approved) ICSI
Intended status: Standards Track A. Gurtov Intended status: Standards Track A. Gurtov
Expires: April 22, 2012 HIIT Expires: June 5, 2012 HIIT
Y. Nishida Y. Nishida
WIDE Project WIDE Project
October 22, 2011 December 5, 2011
The NewReno Modification to TCP's Fast Recovery Algorithm The NewReno Modification to TCP's Fast Recovery Algorithm
draft-ietf-tcpm-rfc3782-bis-03.txt draft-ietf-tcpm-rfc3782-bis-04.txt
Abstract Abstract
RFC 5681 documents the following four intertwined TCP RFC 5681 documents the following four intertwined TCP
congestion control algorithms: slow start, congestion avoidance, fast congestion control algorithms: slow start, congestion avoidance, fast
retransmit, and fast recovery. RFC 5681 explicitly allows retransmit, and fast recovery. RFC 5681 explicitly allows
certain modifications of these algorithms, including modifications certain modifications of these algorithms, including modifications
that use the TCP Selective Acknowledgement (SACK) option (RFC 2883), that use the TCP Selective Acknowledgement (SACK) option (RFC 2883),
and modifications that respond to "partial acknowledgments" (ACKs and modifications that respond to "partial acknowledgments" (ACKs
which cover new data, but not all the data outstanding when loss was which cover new data, but not all the data outstanding when loss was
skipping to change at page 1, line 45 skipping to change at line 43
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 22, 2012. This Internet-Draft will expire on June 5, 2012.
Copyright Notice Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as Copyright (c) 2011 IETF Trust and the persons identified as
the document authors. All rights reserved. the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 4, line 31 skipping to change at line 147
NS simulator [NS] and with numerous implementations of NewReno, we NS simulator [NS] and with numerous implementations of NewReno, we
believe that this modification improves the performance of the Fast believe that this modification improves the performance of the Fast
Retransmit and Fast Recovery algorithms in a wide variety of Retransmit and Fast Recovery algorithms in a wide variety of
scenarios. Previous versions of this RFC [RFC2582, RFC3782] provide scenarios. Previous versions of this RFC [RFC2582, RFC3782] provide
simulation-based evidence of the possible performance gains. simulation-based evidence of the possible performance gains.
2. Terminology and Definitions 2. Terminology and Definitions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in "OPTIONAL" in this document are to be interpreted as described in
RFC 2119 [RFC2119]. RFC 2119 [RFC2119].
This document assumes that the reader is familiar with the terms This document assumes that the reader is familiar with the terms
SENDER MAXIMUM SEGMENT SIZE (SMSS), CONGESTION WINDOW (cwnd), and SENDER MAXIMUM SEGMENT SIZE (SMSS), CONGESTION WINDOW (cwnd), and
FLIGHT SIZE (FlightSize) defined in [RFC5681]. FLIGHT SIZE is FLIGHT SIZE (FlightSize) defined in [RFC5681]. FLIGHT SIZE is
defined as in [RFC5681] as follows: defined as in [RFC5681] as follows:
FLIGHT SIZE: FLIGHT SIZE:
The amount of data that has been sent but not yet cumulatively The amount of data that has been sent but not yet cumulatively
acknowledged. acknowledged.
skipping to change at page 5, line 38 skipping to change at line 200
1) Initialization of TCP protocol control block: 1) Initialization of TCP protocol control block:
When the TCP protocol control block is initialized, Recover is When the TCP protocol control block is initialized, Recover is
set to the initial send sequence number. set to the initial send sequence number.
2) Three duplicate ACKs: 2) Three duplicate ACKs:
When the third duplicate ACK is received, the TCP sender first When the third duplicate ACK is received, the TCP sender first
checks the value of Recover to see if the Cumulative checks the value of Recover to see if the Cumulative
Acknowledgment field covers more than Recover. If so, the value Acknowledgment field covers more than Recover. If so, the value
of Recover is incremented to the value of the highest sequence of Recover is incremented to the value of the highest sequence
number transmitted by the TCP so far. The TCP then enters Fast number transmitted by the TCP so far. The TCP then enters Fast
Retransmit (step 2 of Section 3.2 of [RFC5681]). If not, the Retransmit (step 2 of Section 3.2 of [RFC5681]). If not, the TCP
TCP does not enter fast retransmit and does not reset ssthresh. does not enter fast retransmit and does not reset ssthresh.
3) Response to newly acknowledged data: 3) Response to newly acknowledged data:
Step 6 of [RFC5681] specifies the response to the next ACK that Step 6 of [RFC5681] specifies the response to the next ACK that
acknowledges previously unacknowledged data. When an ACK acknowledges previously unacknowledged data. When an ACK
arrives that acknowledges new data, this ACK could be the arrives that acknowledges new data, this ACK could be the
acknowledgment elicited by the retransmission from step 2, or acknowledgment elicited by the retransmission from step 2, or
elicited by a later retransmission. There are two cases. elicited by a later retransmission. There are two cases.
Full acknowledgments: Full acknowledgments:
If this ACK acknowledges all of the data up to and including If this ACK acknowledges all of the data up to and including
Recover, then the ACK acknowledges all the intermediate Recover, then the ACK acknowledges all the intermediate
segments sent between the original transmission of the lost segments sent between the original transmission of the lost
segment and the receipt of the third duplicate ACK. Set cwnd to segment and the receipt of the third duplicate ACK. Set cwnd to
either (1) min (ssthresh, max(FlightSize, SMSS) + SMSS) or either (1) min (ssthresh, max(FlightSize, SMSS) + SMSS) or
(2) ssthresh, where ssthresh is the value set when Fast (2) ssthresh, where ssthresh is the value set when Fast Retransmit
Retransmit was entered, and where FlightSize in (1) is the amount was entered, and where FlightSize in (1) is the amount of data
of data presently outstanding. This is termed "deflating" the presently outstanding. This is termed "deflating" the window.
window. If the second option is selected, the implementation If the second option is selected, the implementation
is encouraged to take measures to avoid a possible burst of is encouraged to take measures to avoid a possible burst of
data, in case the amount of data outstanding in the network is data, in case the amount of data outstanding in the network is
much less than the new congestion window allows. A simple much less than the new congestion window allows. A simple
mechanism is to limit the number of data packets that can be sent mechanism is to limit the number of data packets that can be sent
in response to a single acknowledgment. Exit the Fast Recovery in response to a single acknowledgment. Exit the Fast Recovery
procedure. procedure.
Partial acknowledgments: Partial acknowledgments:
If this ACK does *not* acknowledge all of the data up to and If this ACK does *not* acknowledge all of the data up to and
including Recover, then this is a partial ACK. In this case, including Recover, then this is a partial ACK. In this case,
skipping to change at page 7, line 19 skipping to change at line 278
pattern of packet losses, the partial acknowledgment might pattern of packet losses, the partial acknowledgment might
acknowledge nearly a window of data. In this case, if the congestion acknowledge nearly a window of data. In this case, if the congestion
window was not deflated, the data sender might be able to send nearly window was not deflated, the data sender might be able to send nearly
a window of data back-to-back. a window of data back-to-back.
This document does not specify the sender's response to duplicate This document does not specify the sender's response to duplicate
ACKs when the Fast Retransmit/Fast Recovery algorithm is not ACKs when the Fast Retransmit/Fast Recovery algorithm is not
invoked. This is addressed in other documents, such as those invoked. This is addressed in other documents, such as those
describing the Limited Transmit procedure [RFC3042]. This document describing the Limited Transmit procedure [RFC3042]. This document
also does not address issues of adjusting the duplicate also does not address issues of adjusting the duplicate
acknowledgment threshold, but assumes the threshold specified in the acknowledgment threshold, but assumes the threshold specified in
IETF standards; the current standard is [RFC5681], which specifies the IETF standards; the current standard is [RFC5681], which
a threshold of three duplicate acknowledgments. specifies a threshold of three duplicate acknowledgments.
As a final note, we would observe that in the absence of the SACK As a final note, we would observe that in the absence of the SACK
option, the data sender is working from limited information. When option, the data sender is working from limited information. When
the issue of recovery from multiple dropped packets from a single the issue of recovery from multiple dropped packets from a single
window of data is of particular importance, the best alternative window of data is of particular importance, the best alternative
would be to use the SACK option. would be to use the SACK option.
4. Handling Duplicate Acknowledgments After A Timeout 4. Handling Duplicate Acknowledgments After A Timeout
After each retransmit timeout, the highest sequence number After each retransmit timeout, the highest sequence number
skipping to change at page 7, line 45 skipping to change at line 304
receiver, then the TCP data sender will receive three duplicate receiver, then the TCP data sender will receive three duplicate
acknowledgments that do not cover more than "recover". In this acknowledgments that do not cover more than "recover". In this
case, the duplicate acknowledgments are not an indication of a new case, the duplicate acknowledgments are not an indication of a new
instance of congestion. They are simply an indication that the instance of congestion. They are simply an indication that the
sender has unnecessarily retransmitted at least three packets. sender has unnecessarily retransmitted at least three packets.
However, when a retransmitted packet is itself dropped, the sender However, when a retransmitted packet is itself dropped, the sender
can also receive three duplicate acknowledgments that do not cover can also receive three duplicate acknowledgments that do not cover
more than "recover". In this case, the sender would have been more than "recover". In this case, the sender would have been
better off if it had initiated Fast Retransmit. For a TCP that better off if it had initiated Fast Retransmit. For a TCP that
implements the algorithm specified in Section 3 of this document, the implements the algorithm specified in Section 3.2 of this document, the
sender does not infer a packet drop from duplicate acknowledgments sender does not infer a packet drop from duplicate acknowledgments
in this scenario. As always, the retransmit timer is the backup in this scenario. As always, the retransmit timer is the backup
mechanism for inferring packet loss in this case. mechanism for inferring packet loss in this case.
There are several heuristics, based on timestamps or on the amount of There are several heuristics, based on timestamps or on the amount of
advancement of the cumulative acknowledgment field, that allow the advancement of the cumulative acknowledgment field, that allow the
sender to distinguish, in some cases, between three duplicate sender to distinguish, in some cases, between three duplicate
acknowledgments following a retransmitted packet that was dropped, acknowledgments following a retransmitted packet that was dropped,
and three duplicate acknowledgments from the unnecessary and three duplicate acknowledgments from the unnecessary
retransmission of three packets [Gur03, GF04]. The TCP sender MAY retransmission of three packets [Gur03, GF04]. The TCP sender MAY
skipping to change at page 8, line 31 skipping to change at line 339
distinguish between a retransmitted packet that was dropped and distinguish between a retransmitted packet that was dropped and
three duplicate acknowledgments from the unnecessary three duplicate acknowledgments from the unnecessary
retransmission of three packets. retransmission of three packets.
4.1. ACK Heuristic 4.1. ACK Heuristic
If the ACK-based heuristic is used, then following the advancement of If the ACK-based heuristic is used, then following the advancement of
the cumulative acknowledgment field, the sender stores the value of the cumulative acknowledgment field, the sender stores the value of
the previous cumulative acknowledgment as prev_highest_ack, and the previous cumulative acknowledgment as prev_highest_ack, and
stores the latest cumulative ACK as highest_ack. In addition, the stores the latest cumulative ACK as highest_ack. In addition, the
following step is performed if Step 1 in Section 3 fails, before following check is performed if, in Step 2 of Section 3.2, the
proceeding to Step 1B. Cumulative Acknowledgment field does not cover more than "recover".
1*) If the Cumulative Acknowledgment field didn't cover more than 1*) If the Cumulative Acknowledgment field didn't cover more than
"recover", check to see if the congestion window is greater "recover", check to see if the congestion window is greater
than SMSS bytes and the difference between highest_ack and than SMSS bytes and the difference between highest_ack and
prev_highest_ack is at most 4*SMSS bytes. If true, duplicate prev_highest_ack is at most 4*SMSS bytes. If true, duplicate
ACKs indicate a lost segment (proceed to Step 1A in Section ACKs indicate a lost segment (enter Fast Retransmit). Otherwise,
3). Otherwise, duplicate ACKs likely result from unnecessary duplicate ACKs likely result from unnecessary retransmissions
retransmissions (proceed to Step 1B in Section 3). (do not enter Fast Retransmit).
The congestion window check serves to protect against fast retransmit The congestion window check serves to protect against fast retransmit
immediately after a retransmit timeout. immediately after a retransmit timeout.
If several ACKs are lost, the sender can see a jump in the cumulative If several ACKs are lost, the sender can see a jump in the cumulative
ACK of more than three segments, and the heuristic can fail. ACK of more than three segments, and the heuristic can fail.
[RFC5681] recommends that a receiver should [RFC5681] recommends that a receiver should
send duplicate ACKs for every out-of-order data packet, such as a send duplicate ACKs for every out-of-order data packet, such as a
data packet received during Fast Recovery. The ACK heuristic is more data packet received during Fast Recovery. The ACK heuristic is more
likely to fail if the receiver does not follow this advice, because likely to fail if the receiver does not follow this advice, because
then a smaller number of ACK losses are needed to produce a then a smaller number of ACK losses are needed to produce a
sufficient jump in the cumulative ACK. sufficient jump in the cumulative ACK.
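
The check in step 1*) can be sketched in Python as follows. This is a minimal illustration, not text from either draft: the function and parameter names are hypothetical, all quantities are assumed to be in bytes, and sequence-number wraparound is ignored for simplicity.

```python
def ack_heuristic_indicates_loss(cwnd, smss, highest_ack, prev_highest_ack):
    """Return True if three duplicate ACKs that do not cover more than
    "recover" should still be treated as a sign of a lost segment.

    All names are illustrative assumptions; values are in bytes and
    sequence-number wraparound is not handled.
    """
    # The congestion window check protects against entering Fast
    # Retransmit immediately after a retransmit timeout.
    if cwnd <= smss:
        return False
    # A small advance of the cumulative ACK (at most 4*SMSS) suggests a
    # dropped retransmission rather than unnecessary retransmissions.
    return (highest_ack - prev_highest_ack) <= 4 * smss


if __name__ == "__main__":
    SMSS = 1460
    # Cumulative ACK advanced by one segment, cwnd is three segments.
    print(ack_heuristic_indicates_loss(3 * SMSS, SMSS, 10000 + SMSS, 10000))
```

As the text above notes, a jump in the cumulative ACK caused by lost ACKs can push the difference past 4*SMSS, in which case this sketch returns False even though a segment was lost.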
4.2. Timestamp Heuristic 4.2. Timestamp Heuristic
If this heuristic is used, the sender stores the timestamp of the If this heuristic is used, the sender stores the timestamp of the
last acknowledged segment. In addition, the second paragraph of step last acknowledged segment. In addition, the last sentence of step
1 in Section 3 is replaced as follows: 2 in Section 3.2 is replaced as follows:
1**) If the Cumulative Acknowledgment field didn't cover more than 1**) If the Cumulative Acknowledgment field didn't cover more than
"recover", check to see if the echoed timestamp in the last "recover", check to see if the echoed timestamp in the last
non-duplicate acknowledgment equals the non-duplicate acknowledgment equals the
stored timestamp. If true, duplicate ACKs indicate a lost stored timestamp. If true, duplicate ACKs indicate a lost
segment (proceed to Step 1A in Section 3). Otherwise, duplicate segment (enter Fast Retransmit). Otherwise, duplicate
ACKs likely result from unnecessary retransmissions (proceed ACKs likely result from unnecessary retransmissions (do not enter
to Step 1B in Section 3). Fast Retransmit).
The timestamp heuristic works correctly, both when the receiver The timestamp heuristic works correctly, both when the receiver
echoes timestamps as specified by [RFC1323], and by its revision echoes timestamps as specified by [RFC1323], and by its revision
attempts. However, if the receiver arbitrarily echoes timestamps, attempts. However, if the receiver arbitrarily echoes timestamps,
the heuristic can fail. The heuristic can also fail if a timeout was the heuristic can fail. The heuristic can also fail if a timeout was
spurious and returning ACKs are not from retransmitted segments. spurious and returning ACKs are not from retransmitted segments.
This can be prevented by detection algorithms such as [RFC3522]. This can be prevented by detection algorithms such as [RFC3522].
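
The decision described in step 1**) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, the test for whether the Cumulative Acknowledgment field covers more than "recover" is assumed to be made elsewhere and passed in as a boolean, and timestamps are treated as opaque integers.

```python
def enter_fast_retransmit_with_timestamps(covers_more_than_recover,
                                          echoed_ts_last_new_ack,
                                          stored_ts):
    """Decide whether to enter Fast Retransmit on the third duplicate ACK.

    covers_more_than_recover: result of the Recover check (assumed to be
        computed by the caller).
    echoed_ts_last_new_ack: timestamp echoed in the last non-duplicate ACK.
    stored_ts: timestamp the sender stored for the last acknowledged segment.
    All names are illustrative assumptions.
    """
    if covers_more_than_recover:
        # Normal case: the duplicate ACKs signal a new loss event.
        return True
    # Timestamp heuristic: equal timestamps indicate a lost segment;
    # otherwise the duplicate ACKs likely stem from unnecessary
    # retransmissions and Fast Retransmit is not entered.
    return echoed_ts_last_new_ack == stored_ts


if __name__ == "__main__":
    print(enter_fast_retransmit_with_timestamps(False, 1000, 1000))
    print(enter_fast_retransmit_with_timestamps(False, 1000, 1005))
```

The sketch does not model how or when the sender records stored_ts. That bookkeeping, and the failure cases mentioned above (arbitrary timestamp echoing, spurious timeouts), are outside its scope.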
5. Implementation Issues for the Data Receiver 5. Implementation Issues for the Data Receiver
skipping to change at page 12, line 4 skipping to change at line 503
feedback on this document or on its precursor, RFC 2582. Jeffrey feedback on this document or on its precursor, RFC 2582. Jeffrey
Hsu provided clarifications on the handling of the recover variable Hsu provided clarifications on the handling of the recover variable
that were applied to RFC 3782 as errata, and now are in Section 8 that were applied to RFC 3782 as errata, and now are in Section 8
of this document. Yoshifumi Nishida contributed a modification of this document. Yoshifumi Nishida contributed a modification
to the fast recovery algorithm to account for the case in which to the fast recovery algorithm to account for the case in which
flightsize is 0 when the TCP sender leaves fast recovery, and the flightsize is 0 when the TCP sender leaves fast recovery, and the
TCP receiver uses delayed acknowledgments. Alexander Zimmermann TCP receiver uses delayed acknowledgments. Alexander Zimmermann
provided several suggestions to improve the clarity of the document. provided several suggestions to improve the clarity of the document.
11. References 11. References
11.1. Normative References 11.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC5681] Allman, M., Paxson, V. and E. Blanton, "TCP Congestion [RFC5681] Allman, M., Paxson, V. and E. Blanton, "TCP Congestion
Control", RFC 5681, September 2009. Control", RFC 5681, September 2009.
[RFC6298] Paxson, V., Allman, M., Chu, J., and Sargent, M., [RFC6298] Paxson, V., M. Allman, J. Chu, and M. Sargent, "Computing
"Computing TCP's Retransmission Timer", RFC 6298, TCP's Retransmission Timer", RFC 6298, June 2011.
June 2011.
11.2. Informative References 11.2. Informative References
[C98] Cardwell, N., "delayed ACKs for retransmitted packets: [C98] Cardwell, N., "delayed ACKs for retransmitted packets:
ouch!". November 1998, Email to the tcpimpl mailing list, ouch!". November 1998, Email to the tcpimpl mailing list,
Message-ID "Pine.LNX.4.02A.9811021421340.26785-100000@ Message-ID
sake.cs.washington.edu", "Pine.LNX.4.02A.9811021421340.26785-100000@sake.cs.washington.edu",
archived at "http://tcp-impl.lerc.nasa.gov/tcp-impl". archived at "http://tcp-impl.lerc.nasa.gov/tcp-impl".
[F98] Floyd, S., Revisions to RFC 2001, "Presentation to the [F98] Floyd, S., Revisions to RFC 2001, "Presentation to the
TCPIMPL Working Group", August 1998. URLs TCPIMPL Working Group", August 1998. URLs
"ftp://ftp.ee.lbl.gov/talks/sf-tcpimpl-aug98.ps" and "ftp://ftp.ee.lbl.gov/talks/sf-tcpimpl-aug98.ps" and
"ftp://ftp.ee.lbl.gov/talks/sf-tcpimpl-aug98.pdf". "ftp://ftp.ee.lbl.gov/talks/sf-tcpimpl-aug98.pdf".
[F03] Floyd, S., "Moving NewReno from Experimental to Proposed [F03] Floyd, S., "Moving NewReno from Experimental to Proposed
Standard? Presentation to the TSVWG Working Group", March Standard? Presentation to the TSVWG Working Group", March 2003.
2003. URLs URLs "http://www.icir.org/floyd/talks/newreno-Mar03.ps" and
"http://www.icir.org/floyd/talks/newreno-Mar03.ps" and
"http://www.icir.org/floyd/talks/newreno-Mar03.pdf". "http://www.icir.org/floyd/talks/newreno-Mar03.pdf".
[FF96] Fall, K. and S. Floyd, "Simulation-based Comparisons of [FF96] Fall, K. and S. Floyd, "Simulation-based Comparisons of
Tahoe, Reno and SACK TCP", Computer Communication Review, Tahoe, Reno and SACK TCP", Computer Communication Review, July 1996.
July 1996. URL "ftp://ftp.ee.lbl.gov/papers/sacks.ps.Z". URL "ftp://ftp.ee.lbl.gov/papers/sacks.ps.Z".
[F94] Floyd, S., "TCP and Successive Fast Retransmits", Technical [F94] Floyd, S., "TCP and Successive Fast Retransmits", Technical
report, October 1994. URL report, October 1994. URL
"ftp://ftp.ee.lbl.gov/papers/fastretrans.ps". "ftp://ftp.ee.lbl.gov/papers/fastretrans.ps".
[GF04] Gurtov, A. and S. Floyd, "Resolving Acknowledgment [GF04] Gurtov, A. and S. Floyd, "Resolving Acknowledgment
Ambiguity in non-SACK TCP", Next Generation Teletraffic and Ambiguity in non-SACK TCP", Next Generation Teletraffic and
Wired/Wireless Advanced Networking (NEW2AN'04), February Wired/Wireless Advanced Networking (NEW2AN'04), February
2004. URL "http://www.cs.helsinki.fi/u/gurtov/papers/ 2004. URL "http://www.cs.helsinki.fi/u/gurtov/papers/
heuristics.html". heuristics.html".
[Gur03] Gurtov, A., "[Tsvwg] resolving the problem of unnecessary [Gur03] Gurtov, A., "[Tsvwg] resolving the problem of unnecessary
fast retransmits in go-back-N", email to the tsvwg mailing fast retransmits in go-back-N", email to the tsvwg mailing list,
list, message ID <3F25B467.9020609@cs.helsinki.fi>, July message ID <3F25B467.9020609@cs.helsinki.fi>, July 28, 2003. URL
28, 2003. URL "http://www1.ietf.org/mail-archive/ "http://www1.ietf.org/mail-archive/working-groups/tsvwg/current/
working-groups/ tsvwg/current/msg04334.html". msg04334.html".
[Hen98] Henderson, T., Re: NewReno and the 2001 Revision. September [Hen98] Henderson, T., Re: NewReno and the 2001 Revision. September
1998. Email to the tcpimpl mailing list, Message ID 1998. Email to the tcpimpl mailing list, Message ID
"Pine.BSI.3.95.980923224136.26134A-100000@raptor. "Pine.BSI.3.95.980923224136.26134A-100000@raptor.CS.Berkeley.EDU",
CS.Berkeley.EDU", archived at archived at "http://tcp-impl.lerc.nasa.gov/tcp-impl".
"http://tcp-impl.lerc.nasa.gov/tcp-impl".
[Hoe95] Hoe, J., "Startup Dynamics of TCP's Congestion Control and [Hoe95] Hoe, J., "Startup Dynamics of TCP's Congestion Control and
Avoidance Schemes", Master's Thesis, MIT, 1995. Avoidance Schemes", Master's Thesis, MIT, 1995.
[Hoe96] Hoe, J., "Improving the Start-up Behavior of a Congestion [Hoe96] Hoe, J., "Improving the Start-up Behavior of a Congestion
Control Scheme for TCP", ACM SIGCOMM, August 1996. URL Control Scheme for TCP", ACM SIGCOMM, August 1996. URL
"http://www.acm.org/sigcomm/sigcomm96/program.html". "http://www.acm.org/sigcomm/sigcomm96/program.html".
[LM97] Lin, D. and R. Morris, "Dynamics of Random Early [LM97] Lin, D. and R. Morris, "Dynamics of Random Early
Detection", SIGCOMM 97, September 1997. URL Detection", SIGCOMM 97, September 1997. URL
skipping to change at page 13, line 35 skipping to change at line 582
[PF01] Padhye, J. and S. Floyd, "Identifying the TCP Behavior of [PF01] Padhye, J. and S. Floyd, "Identifying the TCP Behavior of
Web Servers", June 2001, SIGCOMM 2001. Web Servers", June 2001, SIGCOMM 2001.
[RFC1323] Jacobson, V., Braden, R. and D. Borman, "TCP Extensions for [RFC1323] Jacobson, V., Braden, R. and D. Borman, "TCP Extensions for
High Performance", RFC 1323, May 1992. High Performance", RFC 1323, May 1992.
[RFC2582] Floyd, S. and T. Henderson, "The NewReno Modification to [RFC2582] Floyd, S. and T. Henderson, "The NewReno Modification to
TCP's Fast Recovery Algorithm", RFC 2582, April 1999. TCP's Fast Recovery Algorithm", RFC 2582, April 1999.
[RFC2883] Floyd, S., J. Mahdavi, M. Mathis, and M. Podolsky, "The [RFC2883] Floyd, S., J. Mahdavi, M. Mathis, and M. Podolsky, "The
Selective Acknowledgment (SACK) Option for TCP", RFC 2883, Selective Acknowledgment (SACK) Option for TCP", RFC 2883, July 2000.
July 2000.
[RFC3042] Allman, M., Balakrishnan, H. and S. Floyd, "Enhancing TCP's [RFC3042] Allman, M., Balakrishnan, H. and S. Floyd, "Enhancing TCP's
Loss Recovery Using Limited Transmit", RFC 3042, January Loss Recovery Using Limited Transmit", RFC 3042, January 2001.
2001.
[RFC3522] Ludwig, R. and M. Meyer, "The Eifel Detection Algorithm for [RFC3522] Ludwig, R. and M. Meyer, "The Eifel Detection Algorithm for
TCP", RFC 3522, April 2003. TCP", RFC 3522, April 2003.
[RFC3782] Floyd, S., T. Henderson, and A. Gurtov, "The NewReno [RFC3782] Floyd, S., T. Henderson, and A. Gurtov, "The NewReno
Modification to TCP's Fast Recovery Algorithm", RFC 3782, Modification to TCP's Fast Recovery Algorithm", RFC 3782, April 2004.
April 2004.
Appendix A. Additional Information Appendix A. Additional Information
Previous versions of this RFC ([RFC2582], [RFC3782]) contained Previous versions of this RFC ([RFC2582], [RFC3782]) contained
additional informative material on the following subjects, and additional informative material on the following subjects, and
may be consulted by readers who may want more information about may be consulted by readers who may want more information about
possible variants to the algorithm and who may want references possible variants to the algorithm and who may want references
to specific [NS] simulations that provide NewReno test cases. to specific [NS] simulations that provide NewReno test cases.
Section 4 of [RFC3782] discusses some alternative behaviors for Section 4 of [RFC3782] discusses some alternative behaviors for
skipping to change at page 14, line 27 skipping to change at line 623
Section 10 of [RFC3782] provides a comparison of Reno and Section 10 of [RFC3782] provides a comparison of Reno and
NewReno TCP. NewReno TCP.
Section 11 of [RFC3782] listed changes relative to [RFC2582]. Section 11 of [RFC3782] listed changes relative to [RFC2582].
Appendix B. Changes Relative to RFC 3782 Appendix B. Changes Relative to RFC 3782
In [RFC3782], the cwnd after Full ACK reception will be set to In [RFC3782], the cwnd after Full ACK reception will be set to
(1) min (ssthresh, FlightSize + SMSS) or (2) ssthresh. However, (1) min (ssthresh, FlightSize + SMSS) or (2) ssthresh. However,
there is a risk in the first logic which results in performance there is a risk in the first option which results in performance
degradation. With the first logic, if FlightSize is zero, the degradation. With the first option, if FlightSize is zero, the
result will be 1 SMSS. This means TCP can transmit only 1 segment result will be 1 SMSS. This means TCP can transmit only 1 segment
at this moment, which can cause delay in ACK transmission at receiver at this moment, which can cause delay in ACK transmission at receiver
due to delayed ACK algorithm. due to delayed ACK algorithm.
The FlightSize on Full ACK reception can be zero in some situations. The FlightSize on Full ACK reception can be zero in some situations.
A typical example is where sending window size during fast recovery A typical example is where sending window size during fast recovery
is small. In this case, the retransmitted packet and new data packets is small. In this case, the retransmitted packet and new data packets
can be transmitted within a short interval. If all these packets can be transmitted within a short interval. If all these packets
successfully arrive, the receiver may generate a Full ACK that successfully arrive, the receiver may generate a Full ACK that
acknowledges all outstanding data. Even if window size is not small, acknowledges all outstanding data. Even if window size is not small,
loss of ACK packets or receive buffer shortage during fast recovery loss of ACK packets or receive buffer shortage during fast recovery
can also increase the possibility to fall into this situation. can also increase the possibility of falling into this situation.
The proposed fix in this document ensures that sender TCP transmits The proposed fix in this document, which sets cwnd to at least 2*SMSS
at least two segments on Full ACK reception. if the implementation uses option 1 in the Full ACK case (Section 3.2,
step 3, option 1), ensures that the sender TCP transmits at least two
segments on Full ACK reception.
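
The difference between the two option (1) formulas can be checked with a short Python sketch. The function names and the example values of SMSS and ssthresh are assumptions chosen for illustration. Only the two min/max expressions come from the text.

```python
def deflate_rfc3782(ssthresh, flight_size, smss):
    """Option (1) as given in RFC 3782: min(ssthresh, FlightSize + SMSS)."""
    return min(ssthresh, flight_size + smss)


def deflate_bis(ssthresh, flight_size, smss):
    """Option (1) in this document:
    min(ssthresh, max(FlightSize, SMSS) + SMSS)."""
    return min(ssthresh, max(flight_size, smss) + smss)


if __name__ == "__main__":
    SMSS = 1460           # example segment size in bytes (assumed)
    SSTHRESH = 8 * SMSS   # example ssthresh set at Fast Retransmit (assumed)
    for flight_size in (0, 3 * SMSS, 10 * SMSS):
        old = deflate_rfc3782(SSTHRESH, flight_size, SMSS)
        new = deflate_bis(SSTHRESH, flight_size, SMSS)
        print(flight_size, old, new)
```

With FlightSize equal to zero, the RFC 3782 formula yields 1 SMSS while the revised formula yields 2*SMSS. For FlightSize of at least SMSS the two formulas give the same result.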
In addition, errata for RFC3782 (editorial clarification to Section 8 In addition, errata for RFC3782 (editorial clarification to Section 8
of RFC2582, which is now Section 6 of this document) has been of RFC2582, which is now Section 6 of this document) has been
applied. applied.
The specification text (Section 3.2 herein) was rewritten to more The specification text (Section 3.2 herein) was rewritten to more
closely track Section 3.2 of [RFC5681]. closely track Section 3.2 of [RFC5681].
Sections 4, 5, 9-11 of [RFC3782] were removed, and instead Appendix Sections 4, 5, 9-11 of [RFC3782] were removed, and instead Appendix
A of this document was added to back-reference this informative A of this document was added to back-reference this informative
 End of changes. 27 change blocks. 
54 lines changed or deleted 50 lines changed or added
