draft-ietf-tcpm-early-rexmt-03.txt   draft-ietf-tcpm-early-rexmt-04.txt 
Internet Engineering Task Force Mark Allman Internet Engineering Task Force Mark Allman
INTERNET DRAFT ICSI INTERNET DRAFT ICSI
File: draft-ietf-tcpm-early-rexmt-03.txt Konstantin Avrachenkov File: draft-ietf-tcpm-early-rexmt-04.txt Konstantin Avrachenkov
Intended Status: Experimental INRIA Intended Status: Experimental INRIA
Urtzi Ayesta Urtzi Ayesta
LAAS-CNRS BCAM-IKERBASQUE and LAAS-CNRS
Josh Blanton Josh Blanton
Ohio University Ohio University
Per Hurtig Per Hurtig
Karlstad University Karlstad University
November 2009 January 2010
Expires: May 2010 Expires: July 2010
Early Retransmit for TCP and SCTP Early Retransmit for TCP and SCTP
Status of this Memo Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with This Internet-Draft is submitted to IETF in full conformance with
the provisions of BCP 78 and BCP 79. the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
skipping to change at page 1, line 39 skipping to change at page 1, line 39
months and may be updated, replaced, or obsoleted by other documents months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress." reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on May 18, 2010. This Internet-Draft will expire on July 27, 2010.
Copyright Notice Copyright Notice
Copyright (c) 2009 IETF Trust and the persons identified as the Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 24 skipping to change at page 2, line 24
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119]. document are to be interpreted as described in RFC 2119 [RFC2119].
The reader is expected to be familiar with the definitions given in The reader is expected to be familiar with the definitions given in
[RFC5681]. [RFC5681].
1 Introduction 1 Introduction
Many researchers have studied problems with TCP [RFC793,RFC5681] Many researchers have studied problems with TCP's loss recovery
when the congestion window is small and have outlined possible [RFC793,RFC5681] when the congestion window is small and have
mechanisms to mitigate these problems outlined possible mechanisms to mitigate these problems
[Mor97,BPS+98,Bal98,LK98,RFC3150,AA02]. SCTP's [RFC4960] loss [Mor97,BPS+98,Bal98,LK98,RFC3150,AA02]. SCTP's [RFC4960] loss
recovery and congestion control mechanisms are based on TCP and recovery and congestion control mechanisms are based on TCP and
therefore the same problems impact the performance of SCTP therefore the same problems impact the performance of SCTP
connections. When the transport detects a missing segment, the connections. When the transport detects a missing segment, the
connection enters a loss recovery phase. There are several variants connection enters a loss recovery phase. There are several variants
of the loss recovery phase depending on the TCP implementation. TCP of the loss recovery phase depending on the TCP implementation. TCP
can use slow start based recovery or Fast Recovery [RFC5681], can use slow start based recovery or Fast Recovery [RFC5681],
NewReno [RFC3782], and loss recovery based on selective NewReno [RFC3782], and loss recovery based on selective
acknowledgments (SACKs) [RFC2018,FF96,RFC3517]. SCTP's loss acknowledgments (SACKs) [RFC2018,FF96,RFC3517]. SCTP's loss
recovery is not as varied due to the built-in selective recovery is not as varied due to the built-in selective
skipping to change at page 4, line 15 skipping to change at page 4, line 15
use "Limited Transmit" to include both TCP and SCTP mechanisms for use "Limited Transmit" to include both TCP and SCTP mechanisms for
sending in response to the first two duplicate ACKs. By sending sending in response to the first two duplicate ACKs. By sending
these two new segments the sender is attempting to induce additional these two new segments the sender is attempting to induce additional
duplicate ACKs (if appropriate) so that Fast Retransmit will be duplicate ACKs (if appropriate) so that Fast Retransmit will be
triggered before the retransmission timeout expires. The triggered before the retransmission timeout expires. The
sender-side "Early Retransmit" mechanism outlined in this document sender-side "Early Retransmit" mechanism outlined in this document
covers the case when previously unsent data is not available for covers the case when previously unsent data is not available for
transmission (case (2) above) or cannot be transmitted due to an transmission (case (2) above) or cannot be transmitted due to an
advertised window limitation (case (3) above). advertised window limitation (case (3) above).
Note: This document is being published as an experimental RFC as
part of the process for the TCPM WG and the IETF to assess whether
the proposed change is useful and safe in the heterogeneous
environments, including which variants of the mechanism are the most
effective. In the future, this specification may be updated and put
on the standards track if the safeness and efficacy can be
demonstrated.
2 Early Retransmit Algorithm 2 Early Retransmit Algorithm
The Early Retransmit algorithm calls for lowering the threshold for The Early Retransmit algorithm calls for lowering the threshold for
triggering Fast Retransmit when the amount of outstanding data is triggering Fast Retransmit when the amount of outstanding data is
small and when no previously unsent data can be transmitted (such small and when no previously unsent data can be transmitted (such
that Limited Transmit could be used). Duplicate ACKs are triggered that Limited Transmit could be used). Duplicate ACKs are triggered
by each arriving out-of-order segment. Therefore, Fast Retransmit by each arriving out-of-order segment. Therefore, Fast Retransmit
will not be invoked when there are less than four outstanding will not be invoked when there are less than four outstanding
segments (assuming only one segment loss in the window). However, segments (assuming only one segment loss in the window). However,
TCP and SCTP are not required to track the number of outstanding TCP and SCTP are not required to track the number of outstanding
skipping to change at page 5, line 25 skipping to change at page 5, line 33
ER_thresh = ceiling (ownd/SMSS) - 1 (1) ER_thresh = ceiling (ownd/SMSS) - 1 (1)
duplicate ACKs, where ownd is in terms of bytes. We call this duplicate ACKs, where ownd is in terms of bytes. We call this
reduced ACK threshold enabling "Early Retransmission". reduced ACK threshold enabling "Early Retransmission".
When conditions (2.a) and (2.b) hold and a TCP connection does When conditions (2.a) and (2.b) hold and a TCP connection does
support SACK or SCTP is in use, Early Retransmit MUST be used only support SACK or SCTP is in use, Early Retransmit MUST be used only
when "ownd - SMSS" bytes have been SACKed. when "ownd - SMSS" bytes have been SACKed.
When conditions (2.a) and (2.b) do not hold, the transport MUST NOT If either (or both) condition (2.a) or (2.b) does not hold, the
use Early Retransmit, but rather prefer the standard mechanisms, transport MUST NOT use Early Retransmit, but rather prefer the
including Fast Retransmit and Limited Transmit. standard mechanisms, including Fast Retransmit and Limited Transmit.
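
The byte-based rules above can be sketched as follows. This is a minimal illustration and not part of the specification. Conditions (2.a) and (2.b) are not reproduced in this excerpt, so the sketch assumes they are "ownd is less than 4*SMSS" and "no previously unsent data can be transmitted", following the description at the start of Section 2. All function and parameter names are hypothetical, and DUPTHRESH = 3 is the standard Fast Retransmit threshold from [RFC5681].

```python
DUPTHRESH = 3  # standard Fast Retransmit duplicate ACK threshold [RFC5681]


def conditions_hold(ownd: int, smss: int, can_send_new_data: bool) -> bool:
    """Assumed reading of (2.a) and (2.b); the draft text is elided here.

    (2.a) assumed: outstanding data is small (ownd < 4*SMSS).
    (2.b) assumed: no previously unsent data can be transmitted.
    The "ownd > 0" check is an extra guard added for this sketch.
    """
    return 0 < ownd < 4 * smss and not can_send_new_data


def dupack_threshold(ownd: int, smss: int, can_send_new_data: bool) -> int:
    """Duplicate ACK threshold for a non-SACK sender (byte-based variant)."""
    if not conditions_hold(ownd, smss, can_send_new_data):
        return DUPTHRESH  # fall back to the standard mechanisms
    # Equation (1): ER_thresh = ceiling(ownd/SMSS) - 1, in integer arithmetic.
    return -(-ownd // smss) - 1


def sack_early_retransmit_allowed(ownd: int, smss: int,
                                  can_send_new_data: bool,
                                  sacked_bytes: int) -> bool:
    """SACK TCP or SCTP: require "ownd - SMSS" bytes to have been SACKed."""
    if not conditions_hold(ownd, smss, can_send_new_data):
        return False
    return sacked_bytes >= ownd - smss
```

With two full-sized segments outstanding and nothing new to send, dupack_threshold(2920, 1460, False) yields 1, so a single duplicate ACK would suffice in this sketch. Once four full-sized segments are outstanding, the function returns the standard threshold of 3.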
As noted above, the drawback of this byte-based variant is precision As noted above, the drawback of this byte-based variant is precision
[HB08]. We illustrate this with two examples: [HB08]. We illustrate this with two examples:
+ Consider a non-SACK TCP sender that uses an SMSS of 1460 bytes + Consider a non-SACK TCP sender that uses an SMSS of 1460 bytes
and transmits three segments each with 400 bytes of payload. and transmits three segments each with 400 bytes of payload.
This is a case where Early Retransmit could aid loss recovery if This is a case where Early Retransmit could aid loss recovery if
one segment is lost. However, in this case ER_thresh will one segment is lost. However, in this case ER_thresh will
become zero, per equation (1), because the number of outstanding become zero, per equation (1), because the number of outstanding
bytes is a poor estimate of the number of outstanding segments. bytes is a poor estimate of the number of outstanding segments.
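
The arithmetic behind this first example can be checked directly. The snippet below is only a worked illustration of equation (1); the helper name er_thresh is hypothetical. It compares the three 400-byte segments from the example with three full-sized segments.

```python
SMSS = 1460  # sender maximum segment size from the example, in bytes


def er_thresh(ownd: int, smss: int) -> int:
    """Equation (1): ceiling(ownd/SMSS) - 1, using integer arithmetic."""
    return -(-ownd // smss) - 1


# Three segments of 400 bytes each: 1200 outstanding bytes.
small_segments = [400, 400, 400]
ownd_small = sum(small_segments)

# Three full-sized segments: 4380 outstanding bytes.
full_segments = [SMSS, SMSS, SMSS]
ownd_full = sum(full_segments)

print(er_thresh(ownd_small, SMSS))  # 0: ceiling(1200/1460) = 1, minus 1
print(er_thresh(ownd_full, SMSS))   # 2: ceiling(4380/1460) = 3, minus 1
```

Both cases have three segments outstanding, yet the byte count alone produces different thresholds. That gap is the precision problem described above.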
skipping to change at page 6, line 26 skipping to change at page 6, line 34
segments. (We discuss tracking the number of outstanding segments segments. (We discuss tracking the number of outstanding segments
below.) We call this reduced ACK threshold enabling "Early below.) We call this reduced ACK threshold enabling "Early
Retransmission". Retransmission".
When conditions (3.a) and (3.b) hold and a TCP connection does When conditions (3.a) and (3.b) hold and a TCP connection does
support SACK or SCTP is in use, Early Retransmit MUST be used only support SACK or SCTP is in use, Early Retransmit MUST be used only
when "oseg - 1" segments have been SACKed. A segment is considered when "oseg - 1" segments have been SACKed. A segment is considered
to be SACKed when all its data bytes (TCP) or data chunks (SCTP) to be SACKed when all its data bytes (TCP) or data chunks (SCTP)
have been indicated as arrived by the receiver. have been indicated as arrived by the receiver.
When conditions (3.a) and (3.b) do not hold, the transport MUST NOT If either (or both) conditions (3.a) or (3.b) does not hold, the
use Early Retransmit, but rather prefer the standard mechanisms, transport MUST NOT use Early Retransmit, but rather prefer the
including Fast Retransmit and Limited Transmit. standard mechanisms, including Fast Retransmit and Limited Transmit.
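
The segment-based SACK rule above can be illustrated with a short sketch. It is an assumption-laden example rather than a definitive implementation: segments and SACK blocks are modelled as half-open byte ranges (start, end), sequence number wraparound is ignored, and the caller is assumed to have already verified conditions (3.a) and (3.b). All names are hypothetical.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open [start, end) range in sequence space


def merge_ranges(blocks: List[Range]) -> List[Range]:
    """Merge overlapping or adjacent SACK blocks into disjoint ranges."""
    merged: List[Range] = []
    for start, end in sorted(blocks):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def count_sacked_segments(segments: List[Range],
                          sack_blocks: List[Range]) -> int:
    """Count segments whose data bytes have all been SACKed."""
    merged = merge_ranges(sack_blocks)
    return sum(
        1
        for seg_start, seg_end in segments
        if any(lo <= seg_start and seg_end <= hi for lo, hi in merged)
    )


def sack_permits_early_retransmit(segments: List[Range],
                                  sack_blocks: List[Range]) -> bool:
    """True when "oseg - 1" outstanding segments have been SACKed."""
    oseg = len(segments)
    if oseg == 0:
        return False  # nothing outstanding, nothing to retransmit
    return count_sacked_segments(segments, sack_blocks) >= oseg - 1


# Three 400-byte segments outstanding; the first one is lost.
outstanding = [(0, 400), (400, 800), (800, 1200)]
print(sack_permits_early_retransmit(outstanding, [(400, 1200)]))
print(sack_permits_early_retransmit(outstanding, [(400, 800)]))
```

A partially SACKed segment does not count toward "oseg - 1" in this sketch, matching the statement that a segment is SACKed only when all of its data has been reported as arrived.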
This version of Early Retransmit solves the precision issues This version of Early Retransmit solves the precision issues
discussed in the previous section. As noted previously, the cost is discussed in the previous section. As noted previously, the cost is
that the implementation will have to track segment boundaries to that the implementation will have to track segment boundaries to
form an understanding as to how many actual segments have been form an understanding as to how many actual segments have been
transmitted, but not acknowledged. This can be done by the sender transmitted, but not acknowledged. This can be done by the sender
tracking the boundaries of the three segments on the right side of tracking the boundaries of the three segments on the right side of
the current window (which involves tracking four sequence numbers in the current window (which involves tracking four sequence numbers in
TCP). This could be done by keeping a circular list of the segment TCP). This could be done by keeping a circular list of the segment
boundaries, for instance. Cumulative ACKs that do not fall within boundaries, for instance. Cumulative ACKs that do not fall within
 End of changes. 8 change blocks. 
14 lines changed or deleted 22 lines changed or added
