draft-ietf-tcpm-tcp-dcr-04.txt   draft-ietf-tcpm-tcp-dcr-05.txt 
Internet Engineering Task Force Sumitha Bhandarkar Internet Engineering Task Force Sumitha Bhandarkar
INTERNET DRAFT A. L. Narasimha Reddy INTERNET DRAFT A. L. Narasimha Reddy
draft-ietf-tcpm-tcp-dcr-04.txt Texas A&M University draft-ietf-tcpm-tcp-dcr-05.txt Texas A&M University
Expires : November 2005 Mark Allman Expires: April 2005 Mark Allman
ICIR ICIR/ICSI
Ethan Blanton Ethan Blanton
Purdue University Purdue University
May 2005 September 2005
Improving the Robustness of TCP to Non-Congestion Events Improving the Robustness of TCP to Non-Congestion Events
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79. aware will be disclosed, in accordance with Section 6 of BCP 79.
skipping to change at page 2, line 6 skipping to change at page 2, line 6
This document specifies Non-Congestion Robustness (NCR) for TCP. In This document specifies Non-Congestion Robustness (NCR) for TCP. In
the absence of explicit congestion notification from the network, the absence of explicit congestion notification from the network,
TCP's loss recovery algorithms treat the receipt of three duplicate TCP's loss recovery algorithms treat the receipt of three duplicate
acknowledgments as an implicit indication of congestion in the acknowledgments as an implicit indication of congestion in the
network. This is not always correct, notably in the case when network. This is not always correct, notably in the case when
network paths reorder segments (for whatever reason), resulting in network paths reorder segments (for whatever reason), resulting in
degraded performance. TCP-NCR is designed to mitigate this degraded degraded performance. TCP-NCR is designed to mitigate this degraded
performance by increasing the number of duplicate acknowledgments performance by increasing the number of duplicate acknowledgments
required to trigger loss recovery, based on the current state of the required to trigger loss recovery, based on the current state of the
connection, in an effort to disambiguate true segment loss from connection, in an effort to better disambiguate true segment loss
segment reordering. In addition, we specify an option, Aggressive from segment reordering. This document specifies the changes to TCP,
Limited Transmit, where the TCP sender does not reduce its sending as well as the costs and benefits of these modifications.
rate until a segment is actually retransmitted; this would delay the
reduction of the sending rate by roughly one round-trip time compared
to current TCP implementations. This document specifies the changes
to TCP, as well as the costs and benefits of these modifications.
Terminology Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described "OPTIONAL" in this document are to be interpreted as described
in [RFC2119]. in [RFC2119].
Readers should be familiar with the TCP terminology given in Readers should be familiar with the TCP terminology given in
[RFC2581] and [RFC3517]. [RFC2581] and [RFC3517].
skipping to change at page 5, line 24 skipping to change at page 5, line 21
duplicate ACKs corresponding to at most 9 segments will arrive at the duplicate ACKs corresponding to at most 9 segments will arrive at the
sender. To offset the issue of loss, we extend TCP's Limited sender. To offset the issue of loss, we extend TCP's Limited
Transmit [RFC3042] scheme to allow for the sending of new data during Transmit [RFC3042] scheme to allow for the sending of new data during
the period when the TCP sender is disambiguating loss and reordering. the period when the TCP sender is disambiguating loss and reordering.
This new data serves to increase the likelihood of enough duplicate This new data serves to increase the likelihood of enough duplicate
ACKs arriving at the sender to trigger loss recovery if it is ACKs arriving at the sender to trigger loss recovery if it is
appropriate. appropriate.
At this point we note that TCP tightly couples reliability and At this point we note that TCP tightly couples reliability and
congestion control -- when a segment is declared lost, a congestion control -- when a segment is declared lost, a
retransmission is triggered and a change to sending rate is also made retransmission is triggered and a change to the sending rate is also
on the assumption that the drop is due to resource contention made on the assumption that the drop is due to resource contention
[RFC2581]. Therefore, by simply changing the retransmission trigger [RFC2581]. Therefore, by simply changing the retransmission trigger
the congestion control response is also changed. However, we lack the congestion control response is also changed. However, we lack
experience on the Internet as to whether delaying the point that a experience on the Internet as to whether delaying the point that a
rate reduction takes place is appropriate for wide-scale deployment. rate reduction takes place is appropriate for wide-scale deployment.
Therefore, the extended Limited Transmit mechanism proposed in this Therefore, the Extended Limited Transmit mechanism proposed in this
document offers two variants for experimentation. document offers two variants for experimentation.
The first Extended Limited Transmit variant, Careful Limited The first Extended Limited Transmit variant, Careful Limited
Transmit, calls for the transmission of a previously unsent segment Transmit, calls for the transmission of one previously unsent
for every two segments that are known to have left the network. This segment, in response to duplicate acknowledgements, for every two
has the effect of halving the sending rate since normal TCP operation segments that are known to have left the network. This has the
calls for the sending of one segment for every segment that has left effect of halving the sending rate since normal TCP operation calls
the network. Further, the halving starts immediately and is not for the sending of one segment for every segment that has left the
delayed until a retransmission is triggered. In the case of packet network. Further, the halving starts immediately and is not delayed
until a retransmission is triggered. In the case of packet
reordering (i.e., not segment loss) the congestion control state is reordering (i.e., not segment loss) the congestion control state is
restored to its previous state when reordering is determined. restored to its previous state when reordering is determined.
The second variant, Aggressive Limited Transmit, calls for The second variant, Aggressive Limited Transmit, calls for
transmitting a previously unsent data segment for every segment known transmitting one previously unsent data segment, in response to
to have left the network. With this variant, while waiting to duplicate acknowledgements, for every segment known to have left the
disambiguate the loss from a reordering event, ACK-clocked network. With this variant, while waiting to disambiguate the loss
transmission continues at rougly the same rate as before the event from a reordering event, ACK-clocked transmission continues at
started. Retransmission and the sending rate reduction happen per roughly the same rate as before the event started. Retransmission
[RFC2581,RFC3517], albeit with the delayed threshold described above. and the sending rate reduction happen per [RFC2581,RFC3517], albeit
While this approach delays legitimate rate reductions (possibly with the delayed threshold described above. While this approach
slightly and temporarily aggravating overall congestion on the delays legitimate rate reductions (possibly slightly and temporarily
network) the scheme has the advantage of not reducing the aggravating overall congestion on the network) the scheme has the
transmission rate in the face of segment reordering. advantage of not reducing the transmission rate in the face of
segment reordering.
Which of the two Extended Limited Transmit variants is best for use It is an open question which of the two Extended Limited Transmit
on the Internet is an open question. variants is best for use on the Internet.
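As an illustration only, the difference between the two Extended Limited Transmit variants described above can be sketched as a small function. The function name, the variant labels, and the use of whole-segment counts are assumptions made for this sketch; they do not appear in the draft.

```python
def new_segments_allowed(segments_departed: int, variant: str) -> int:
    """Return how many previously unsent segments may be sent.

    segments_departed: count of segments known (via duplicate ACKs)
    to have left the network. Hypothetical parameter name.
    variant: "careful" or "aggressive". Hypothetical labels.
    """
    if segments_departed < 0:
        raise ValueError("segments_departed must be non-negative")
    if variant == "careful":
        # Careful Limited Transmit: one new segment for every two
        # segments that have left the network (halves the sending rate).
        return segments_departed // 2
    if variant == "aggressive":
        # Aggressive Limited Transmit: one new segment for every segment
        # that has left the network (rate stays roughly unchanged).
        return segments_departed
    raise ValueError(f"unknown variant: {variant!r}")
```

With ten segments known to have left the network, the Careful variant would permit five new segments and the Aggressive variant ten, which matches the "halving" versus "roughly the same rate" behavior described in the text.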
3. Algorithm 3. Algorithm
The TCP-NCR modifications make two fundamental changes to the way The TCP-NCR modifications make two fundamental changes to the way
[RFC3517] currently operates, as follows. [RFC3517] currently operates, as follows.
First, the trigger for retransmitting a segment is changed from three First, the trigger for retransmitting a segment is changed from three
duplicate ACKs [RFC2581,RFC3517] to indications that a congestion duplicate ACKs [RFC2581,RFC3517] to indications that a congestion
window's worth of data has left the network. Second, TCP-NCR window's worth of data has left the network. Second, TCP-NCR
decouples initial congestion control decisions from retransmission decouples initial congestion control decisions from retransmission
skipping to change at page 9, line 12 skipping to change at page 9, line 12
not congestion. Therefore, the receipt of an ACK that extends the not congestion. Therefore, the receipt of an ACK that extends the
cumulative ACK point MUST terminate Extended Limited Transmit. As cumulative ACK point MUST terminate Extended Limited Transmit. As
described below (in (T.4)), an ACK that extends the cumulative ACK described below (in (T.4)), an ACK that extends the cumulative ACK
point and *also* contains SACK information will also trigger the point and *also* contains SACK information will also trigger the
beginning of a new Extended Limited Transmit phase. beginning of a new Extended Limited Transmit phase.
Upon the termination of Extended Limited Transmit, and especially Upon the termination of Extended Limited Transmit, and especially
when using the Careful variant, TCP-NCR may be in a situation where when using the Careful variant, TCP-NCR may be in a situation where
the entire cwnd is not being utilized and therefore TCP-NCR will be the entire cwnd is not being utilized and therefore TCP-NCR will be
prone to transmitting a burst of segments into the network. prone to transmitting a burst of segments into the network.
Therefore, upon exiting Extended Limited Transmit the following steps Therefore, when a TCP-NCR in the Extended Limited Transmit phase
MUST be taken. receives an ACK that updates the cumulative ACK point (regardless of
whether the ACK contains SACK information), the following steps MUST
When a TCP-NCR in the Extended Limited Transmit phase receives an ACK be taken:
that updates the cumulative ACK point (regardless of whether the ACK
contains SACK information), the following steps MUST be taken:
(T.1) cwnd = min (FlightSize + SMSS,FlightSizePrev) (T.1) cwnd = min (FlightSize + SMSS,FlightSizePrev)
This step ensures that cwnd is not grossly larger than the This step ensures that cwnd is not grossly larger than the
amount of data outstanding --- a situation that would cause a amount of data outstanding --- a situation that would cause a
line rate burst. line rate burst.
(T.2) ssthresh = FlightSizePrev (T.2) ssthresh = FlightSizePrev
This step provides TCP-NCR with a sense of "history". If step This step provides TCP-NCR with a sense of "history". If step
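Steps (T.1) and (T.2), as far as they are visible in this excerpt, might be sketched as follows. This is a minimal sketch, not the draft's reference code: the function name and the byte-based units are assumptions, and any later termination steps, which the diff does not show here, are omitted.

```python
def terminate_extended_limited_transmit(flight_size: int,
                                        flight_size_prev: int,
                                        smss: int) -> tuple[int, int]:
    """Apply steps (T.1) and (T.2) when the cumulative ACK point advances.

    flight_size:      data currently outstanding (FlightSize), in bytes
    flight_size_prev: FlightSize saved when Extended Limited Transmit
                      began (FlightSizePrev), in bytes
    smss:             sender maximum segment size (SMSS), in bytes
    Returns the new (cwnd, ssthresh) pair.
    """
    # (T.1) Keep cwnd close to the amount of data outstanding so that
    # the sender does not emit a line-rate burst.
    cwnd = min(flight_size + smss, flight_size_prev)
    # (T.2) Remember the previous operating point as "history".
    ssthresh = flight_size_prev
    return cwnd, ssthresh
```

For example, with FlightSize = 4000, FlightSizePrev = 10000, and SMSS = 1000, the sketch yields cwnd = 5000 and ssthresh = 10000.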
skipping to change at page 11, line 45 skipping to change at page 11, line 44
While we note that all of the changes outlined above are implemented While we note that all of the changes outlined above are implemented
in the sender, the receiver also potentially has a part to play. In in the sender, the receiver also potentially has a part to play. In
particular, TCP-NCR increases the receiver's buffering requirement by particular, TCP-NCR increases the receiver's buffering requirement by
up to an extra cwnd -- in the case of the TCP sender using Aggressive up to an extra cwnd -- in the case of the TCP sender using Aggressive
Limited Transmit and actual loss occurring in the network. Limited Transmit and actual loss occurring in the network.
Therefore, to maximize the benefits from TCP-NCR receivers should Therefore, to maximize the benefits from TCP-NCR receivers should
advertise a large window to absorb the extra out-of-order traffic. In advertise a large window to absorb the extra out-of-order traffic. In
the case that the additional buffer requirements are not met, the use the case that the additional buffer requirements are not met, the use
of the above algorithm takes into account the reduced advertised of the above algorithm takes into account the reduced advertised
window, resulting in slightly reduced robustness to reordering. window.
In addition, using TCP-NCR could delay the delivery of data to the In addition, using TCP-NCR could delay the delivery of data to the
application by up to one RTT because the fast retransmission point is application by up to one RTT because the fast retransmission point is
delayed by roughly one RTT in TCP-NCR. Applications that are delayed by roughly one RTT in TCP-NCR. Applications that are
sensitive to such delays should turn off the TCP-NCR option. For sensitive to such delays should turn off the TCP-NCR option. For
instance, a socket option could be introduced to allow applications instance, a socket option could be introduced to allow applications
to control whether NCR would be used for a particular connection. to control whether NCR would be used for a particular connection.
Finally, the use of TCP-NCR makes the recovery from congestion events Finally, the use of TCP-NCR makes the recovery from congestion events
sluggish in comparison to the standard reaction in [RFC2581]. [BR04, sluggish in comparison to the standard reaction in [RFC2581]. [BR04,
skipping to change at page 12, line 42 skipping to change at page 12, line 41
delays retransmission by a fixed amount (in comparison to standard delays retransmission by a fixed amount (in comparison to standard
TCP), while the other schemes use relatively complex algorithms in an TCP), while the other schemes use relatively complex algorithms in an
attempt to derive a more precise value for DupThresh that depends on attempt to derive a more precise value for DupThresh that depends on
the network conditions. While TCP-NCR offers simplicity the other the network conditions. While TCP-NCR offers simplicity the other
schemes may offer more precision such that applications would not be schemes may offer more precision such that applications would not be
forced to wait as long for their retransmissions. Future work could forced to wait as long for their retransmissions. Future work could
be undertaken to achieve robustness without needless delay. be undertaken to achieve robustness without needless delay.
On the other hand, several schemes have been developed to detect and On the other hand, several schemes have been developed to detect and
mitigate needless retransmissions after the fact. mitigate needless retransmissions after the fact.
[RFC3522,RFC3708,BA02,LG04,SK04] present algorithms to detect [RFC3522,RFC3708,BA02,RFC4015,SK04] present algorithms to detect
spurious retransmits and mitigate the changes these events made to spurious retransmits and mitigate the changes these events made to
the congestion control state. TCP-NCR could be used in conjunction the congestion control state. TCP-NCR could be used in conjunction
with these algorithms, with TCP-NCR attempting to prevent spurious with these algorithms, with TCP-NCR attempting to prevent spurious
retransmits and some other scheme kicking in if the prevention retransmits and some other scheme kicking in if the prevention
failed. In addition, we note that TCP-NCR is concentrated on failed. In addition, we note that TCP-NCR is concentrated on
preventing spurious fast retransmits and some of the above algorithms preventing spurious fast retransmits and some of the above algorithms
also attempt to detect and mitigate spurious timeout-based also attempt to detect and mitigate spurious timeout-based
retransmits. retransmits.
7. Security Considerations 7. Security Considerations
We do not believe there are security implications involved with TCP- We do not believe there are security implications involved with TCP-
NCR over and above those for general TCP congestion control NCR over and above those for general TCP congestion control
[RFC2581]. In particular, the Extended Limited Transmit algorithms [RFC2581]. In particular, the Extended Limited Transmit algorithms
specified in this document have been specifically designed not to be specified in this document have been specifically designed not to be
susceptible to the sorts of ACK splitting attacks TCP's general TCP susceptible to the sorts of ACK splitting attacks TCP's general TCP
congestion control is vulnerable to (as discussed in [RFC3465]. congestion control is vulnerable to (as discussed in [RFC3465]).
8. Acknowledgements 8. Acknowledgements
Ted Faber, Sally Floyd, Nauzad Sadry, Pasi Sarolahti and Nitin Vaidya Ted Faber, Wesley Eddy, Gorry Fairhurst, Sally Floyd, Nauzad Sadry,
as well as feedback from from the TCPM working group have contributed Pasi Sarolahti, Joe Touch and Nitin Vaidya as well as feedback from
significantly to this document. Our thanks to all! the TCPM working group have contributed significantly to this
document. Our thanks to all!
9. Normative References 9. Normative References
[RFC793] J. Postel, "Transmission Control Protocol", RFC 793, [RFC793] J. Postel, "Transmission Control Protocol", RFC 793,
September 1981. September 1981.
[RFC2018] M. Mathis, J. Mahdavi, S. Floyd and A. Romanow, "TCP [RFC2018] M. Mathis, J. Mahdavi, S. Floyd and A. Romanow, "TCP
selective acknowledgment options," Internet RFC 2018. selective acknowledgment options," Internet RFC 2018.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
skipping to change at page 14, line 29 skipping to change at page 14, line 29
Communication Review, vol. 18, no. 4, pp. 314-329, Aug. 1988. Communication Review, vol. 18, no. 4, pp. 314-329, Aug. 1988.
ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z. ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z.
[JIDKT03] S. Jaiswal, G. Iannaccone, C. Diot, J. Kurose, and D. [JIDKT03] S. Jaiswal, G. Iannaccone, C. Diot, J. Kurose, and D.
Towsley, "Measurement and Classification of Out-of-Sequence Packets Towsley, "Measurement and Classification of Out-of-Sequence Packets
in a Tier-1 IP Backbone," Proceedings of IEEE INFOCOM, 2003. in a Tier-1 IP Backbone," Proceedings of IEEE INFOCOM, 2003.
[KM02] I. Keslassy and N. McKeown, "Maintaining packet order in [KM02] I. Keslassy and N. McKeown, "Maintaining packet order in
two-stage switches," Proceedings of the IEEE Infocom, June 2002 two-stage switches," Proceedings of the IEEE Infocom, June 2002
[LG04] R. Ludwig, A. Gurtov, "The Eifel Response Algorithm for TCP",
Internet-Draft draft-ietf-tsvwg-tcp-eifel-response-06.txt (work in
progress). September 2004.
[MAF05] A. Medina, M. Allman, S. Floyd. Measuring the Evolution of [MAF05] A. Medina, M. Allman, S. Floyd. Measuring the Evolution of
Transport Protocols in the Internet. ACM Computer Communication Transport Protocols in the Internet. ACM Computer Communication
Review, 35(2), April 2005. Review, 35(2), April 2005.
[NS-2] ns-2 Network Simulator. http://www.isi.edu/nsnam/ [NS-2] ns-2 Network Simulator. http://www.isi.edu/nsnam/
[Pax97] V. Paxson, "End-to-End Internet Packet Dynamics," Proceedings [Pax97] V. Paxson, "End-to-End Internet Packet Dynamics," Proceedings
of ACM SIGCOMM, September 1997. of ACM SIGCOMM, September 1997.
[RFC896] J. Nagle, "Congestion Control in IP/TCP Internetworks", RFC [RFC896] J. Nagle, "Congestion Control in IP/TCP Internetworks", RFC
skipping to change at page 15, line 22 skipping to change at page 15, line 17
Counting (ABC), February 2003. RFC 3465. Counting (ABC), February 2003. RFC 3465.
[RFC3522] R. Ludwig and M. Meyer, "The Eifel Detection Algorithm for [RFC3522] R. Ludwig and M. Meyer, "The Eifel Detection Algorithm for
TCP," RFC 3522, April 2003. TCP," RFC 3522, April 2003.
[RFC3708] E. Blanton and M. Allman, "Using TCP Duplicate Selective [RFC3708] E. Blanton and M. Allman, "Using TCP Duplicate Selective
Acknowledgement (DSACKs) and Stream Control Transmission Protocol Acknowledgement (DSACKs) and Stream Control Transmission Protocol
(SCTP) Duplicate Transmission Sequence Numbers (TSNs) to Detect (SCTP) Duplicate Transmission Sequence Numbers (TSNs) to Detect
Spurious Retransmissions", RFC 3708, February 2004. Spurious Retransmissions", RFC 3708, February 2004.
[RFC4015] R. Ludwig, A. Gurtov, "The Eifel Response Algorithm for
TCP", RFC 4015, February 2005.
[SK04] P. Sarolahti, M. Kojo, "Forward RTO-Recovery (F-RTO): An [SK04] P. Sarolahti, M. Kojo, "Forward RTO-Recovery (F-RTO): An
Algorithm for Detecting Spurious Retransmission Timeouts with TCP and Algorithm for Detecting Spurious Retransmission Timeouts with TCP and
SCTP", Internet-Draft draft-ietf-tcpm-frto-02.txt (work in progress). SCTP", Internet-Draft draft-ietf-tcpm-frto-02.txt (work in progress).
November 2004. November 2004.
[ZKFP03] M. Zhang, B. Karp, S. Floyd, L. Peterson, "RR-TCP: A [ZKFP03] M. Zhang, B. Karp, S. Floyd, L. Peterson, "RR-TCP: A
Reordering-Robust TCP with DSACK", in Proceedings of the Eleventh Reordering-Robust TCP with DSACK", in Proceedings of the Eleventh
IEEE International Conference on Networking Protocols (ICNP 2003), IEEE International Conference on Networking Protocols (ICNP 2003),
Atlanta, GA, November, 2003. Atlanta, GA, November, 2003.
skipping to change at page 16, line 11 skipping to change at page 16, line 10
Mark Allman Mark Allman
ICSI Center for Internet Research ICSI Center for Internet Research
1947 Center Street, Suite 600 1947 Center Street, Suite 600
Berkeley, CA 94704-1198 Berkeley, CA 94704-1198
Phone: (216) 243-7361 Phone: (216) 243-7361
Email: mallman@icir.org Email: mallman@icir.org
URL: http://www.icir.org/mallman/ URL: http://www.icir.org/mallman/
Ethan Blanton Ethan Blanton
Purdue University Computer Sciences Purdue University Computer Science
250 North University Street 250 North University Street
West Lafayette, IN 47907 West Lafayette, IN 47907
Email: eblanton@cs.purdue.edu Email: eblanton@cs.purdue.edu
Intellectual Property Statement Intellectual Property Statement
The IETF takes no position regarding the validity or scope of any The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights this document or the extent to which any license under such rights
 End of changes. 

This html diff was produced by rfcdiff 1.25, available from http://www.levkowetz.com/ietf/tools/rfcdiff/