draft-ietf-tcpm-rto-consider-02.txt   draft-ietf-tcpm-rto-consider-03.txt 
Internet Engineering Task Force M. Allman Internet Engineering Task Force M. Allman
INTERNET-DRAFT ICSI INTERNET-DRAFT ICSI
File: draft-ietf-tcpm-rto-consider-02.txt April 5, 2016 File: draft-ietf-tcpm-rto-consider-03.txt April 15, 2016
Intended Status: Best Current Practice Intended Status: Best Current Practice
Expires: October 5, 2016 Expires: October 15, 2016
Retransmission Timeout Considerations Retransmission Timeout Considerations
Status of this Memo Status of this Memo
This document may not be modified, and derivative works of it may This document may not be modified, and derivative works of it may
not be created, except to format it for publication as an RFC or to not be created, except to format it for publication as an RFC or to
translate it into languages other than English. translate it into languages other than English.
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
skipping to change at page 1, line 34 skipping to change at page 1, line 34
months and may be updated, replaced, or obsoleted by other documents months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress." reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html http://www.ietf.org/1id-abstracts.html
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html http://www.ietf.org/shadow.html
This Internet-Draft will expire on October 5, 2016. This Internet-Draft will expire on October 15, 2016.
Copyright Notice Copyright Notice
Copyright (c) 2016 IETF Trust and the persons identified as the Copyright (c) 2016 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this respect to this document. Code Components extracted from this
document must include Simplified BSD License text as described in document must include Simplified BSD License text as described in
Section 4.e of the Trust Legal Provisions and are provided without Section 4.e of the Trust Legal Provisions and are provided without
warranty as described in the Simplified BSD License. warranty as described in the Simplified BSD License.
Abstract Abstract
Each implementation of a retransmission timeout mechanism must Each implementation of a retransmission timeout mechanism represents
balance correctness and timeliness and therefore no implementation a balance between correctness and timeliness and therefore no
suits all situations. This document provides high-level guidance implementation suits all situations. This document provides
for retransmission timeout schemes appropriate for general use in high-level requirements for retransmission timeout schemes
the Internet. Within the guidelines, implementations have latitude appropriate for general use in the Internet. Within the
to define particulars that best address each situation. requirements, implementations have latitude to define particulars
that best address each situation.
Terminology Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in BCP 14, RFC 2119 document are to be interpreted as described in BCP 14, RFC 2119
[RFC2119]. [RFC2119].
1 Introduction 1 Introduction
skipping to change at page 2, line 36 skipping to change at page 2, line 37
acknowledgment schemes exist that do not depend on continuous acknowledgment schemes exist that do not depend on continuous
feedback to trigger retransmissions (e.g., [RFC3940]). However, feedback to trigger retransmissions (e.g., [RFC3940]). However,
regardless of these useful alternatives, the only thing we can truly regardless of these useful alternatives, the only thing we can truly
depend on is the passage of time and therefore our ultimate backstop depend on is the passage of time and therefore our ultimate backstop
to ensuring reliability is a timeout. (Note: There is a case when to ensuring reliability is a timeout. (Note: There is a case when
we cannot count on the passage of time, but in this case we believe we cannot count on the passage of time, but in this case we believe
repairing loss will be a moot point and hence we do not further repairing loss will be a moot point and hence we do not further
consider this case in this document.) consider this case in this document.)
Various protocols have defined their own timeout mechanisms (e.g., Various protocols have defined their own timeout mechanisms (e.g.,
TCP [RFC6298], SCTP [RFC4960]). Ideally, if we know a segment will TCP [RFC6298], SCTP [RFC4960], SIP [RFC3261]). Ideally, if we know
be lost before reaching the destination, a second copy of it would a segment will be lost before reaching the destination, a second
be sent immediately after the first transmission. However, in copy of it would be sent immediately after the first transmission.
reality the specifics of retransmission However, in reality the specifics of retransmission timeouts often
timeouts often represent a particular tradeoff between correctness represent a particular tradeoff between correctness and
and responsiveness [AP99]. In other words we want to responsiveness [AP99]. In other words we want to simultaneously:
simultaneously:
- Wait long enough to ensure the decision to retransmit is - Wait long enough to ensure the decision to retransmit is
correct. correct.
- Bound the delay we impose on applications before - Bound the delay we impose on applications before
retransmitting. retransmitting.
However, serving both of these goals is difficult as they pull us in However, serving both of these goals is difficult as they pull in
opposite directions. I.e., towards either (a) withholding needed opposite directions. I.e., towards either (a) withholding needed
retransmissions too long or (b) not waiting long enough and sending retransmissions too long to ensure the retransmissions are truly
spurious retransmissions. Given this fundamental tradeoff [AP99], needed or (b) not waiting long enough to help application
we have found that even though the retransmission timeout (RTO) responsiveness and sending spurious retransmissions. Given this
procedures are standardized, implementations also often add their fundamental tradeoff [AP99], we have found that even though the
own subtle imprint on the specifics of the process to tilt the retransmission timeout (RTO) procedures are standardized,
tradeoff between correctness and responsiveness in some particular implementations often add their own subtle imprint on the specifics
way. of the process to tilt the tradeoff between correctness and
responsiveness in some particular way.
At this point we recognize that often these specific tweaks are not At this point we recognize that often these specific tweaks are not
crucial for network safety. Hence, in this document we outline the crucial for network safety. Hence, in this document we outline the
high-level principles that are crucial for any retransmission high-level requirements that are crucial for any retransmission
timeout scheme to follow. The intent is to then allow timeout scheme to follow. The intent is to then allow
implementations of protocols and applications to instantiate implementations to instantiate mechanisms that best realize their
mechanisms that best realize their specific goals within this specific goals within this framework. These specific mechanisms
framework. These specific mechanisms could be standardized by the could be standardized by the IETF or ad-hoc, but as long as they
IETF or ad-hoc, but as long as they adhere to the guidelines given adhere to the requirements given in this document they would be
in this document they would be considered consistent with the considered consistent with the standards.
standards.
A non-goal of this document is to in any way specify individual Finally, we note the requirements in this document are applicable to
deviations from current IETF standardized RTO specifications that any protocol that uses a retransmission timeout mechanism. The
any particular implementation may exhibit. Rather, we provide a set examples and discussion are framed in terms of TCP; however, that is
of over-arching guidelines that all RTO mechanisms should follow. an artifact of where much of our experience with RTOs comes from and
should not be read as narrowing the scope of the requirements.
Finally, we note the guidelines in this document are applicable to 2 Scope
any protocol that uses an RTO mechanism. The examples and
discussion are framed in terms of TCP; however, that is an artifact
of where much of our experience with RTOs comes from and should not
be read as narrowing the scope of the guidelines.
2 Guidelines This document offers high-level requirements based on experience
with retransmission timer algorithms. However, this document
explicitly does not update or obsolete currently standardized
algorithms nor limit future standardization of specific RTO
mechanisms. Specifically:
We now list the four guidelines that apply when utilizing a (a) RTO mechanisms that are currently standardized are not updated
retransmission timeout (RTO). or obsoleted by this document. This holds even in cases where
the existing specification differs from the requirements in this
document (e.g., [RFC3261] uses a smaller initial RTO than this
document specifies). Existing standard specifications enjoy
their own consensus which this document does not change.
(b) Future standardization efforts that specify RTO mechanisms
SHOULD follow the requirements in this document. This follows
the definition of "SHOULD" [RFC2119] and is explicitly not a
"MUST". That is, the requirements in this document hold unless
the community has consensus that specific deviations in a
particular context are warranted.
(c) RTO mechanisms that are not standardized but adhere to the
requirements in the following section are deemed consistent with
the standards. This includes RTO mechanisms that are deviations
from a specific standardized algorithm, but are still within the
requirements below.
More colloquially we note that each RTO implementation can be placed
into one of the following four categories:
- The implementation precisely follows a standard RTO mechanism
(e.g., [RFC6298]), as well as adhering to the requirements in this
document.
This document represents no change for this situation as such an
implementation is clearly standards compliant.
- The implementation does not precisely follow a standard RTO
mechanism and does not adhere to the requirements in this
document.
This document makes no change to this situation as such an
implementation is clearly not standards compliant.
- The implementation precisely follows a standard RTO mechanism
(e.g., [RFC3261]), but does not precisely adhere to the
requirements in this document.
This document represents no change for this situation as such an
implementation is considered standards compliant by virtue of
precisely implementing a standard mechanism that has community
consensus as a reasonable approach. That is, this document's
stance is to not limit the community's ability to make exceptions
to the requirements herein for particular cases.
- The implementation does not precisely follow a standard RTO
mechanism, yet does adhere to the requirements in this document.
This document represents a change for these implementations and
considers them to be consistent with the standards by virtue of
following the requirements herein that provide for an RTO safe for
operation in the Internet.
In other words, the requirements in this document can be viewed as
specifying the default properties of an RTO mechanism.
Specifications can more concretely nail down specifics within these
defaults or work outside the defaults as necessary. However,
implementations that fall within the defaults do not require
explicit specifications to be considered consistent with the
standards.
3 Requirements
We now list the requirements that SHOULD apply when designing
retransmission timeout (RTO) mechanisms.
(1) In the absence of any knowledge about the latency of a path, the (1) In the absence of any knowledge about the latency of a path, the
RTO MUST be conservatively set to no less than 1 second, per RTO MUST be conservatively set to no less than 1 second.
TCP's current default RTO [RFC6298].
This guideline ensures two important aspects of the RTO. First, This requirement ensures two important aspects of the RTO.
when transmitting into an unknown network, retransmissions will First, when transmitting into an unknown network,
not be sent before an ACK would reasonably be expected to arrive retransmissions will not be sent before an ACK would reasonably
and hence possibly waste scarce network resources. Second, as be expected to arrive and hence possibly waste scarce network
noted below, sometimes retransmissions can lead to ambiguities resources. Second, as noted below, sometimes retransmissions
in assessing the latency of a network path. Therefore, it is can lead to ambiguities in assessing the latency of a network
especially important for the first latency sample to be free of path. Therefore, it is especially important for the first
ambiguities such that there is a baseline for the remainder of latency sample to be free of ambiguities such that there is a
the communication. baseline for the remainder of the communication.
(2) We specify three guidelines that pertain to the sampling of the The specific constant (1 second) comes from the analysis of
latency across a path. Internet RTTs found in Appendix A of [RFC6298].
(2) We specify three requirements that pertain to the sampling of
the latency across a path.
Often measuring the latency is framed as assessing the Often measuring the latency is framed as assessing the
round-trip time (RTT)---e.g., in TCP's RTO computation round-trip time (RTT)---e.g., in TCP's RTO computation
specification [RFC6298]. This is somewhat misleading as the specification [RFC6298]. This is somewhat misleading as the
latency is better framed as the "feedback time" (FT). In other latency is better framed as the "feedback time" (FT). In other
words, it is not simply a network property, but the length of words, it is not simply a network property, but the length of
time before we expect an acknowledgment for a given segment. time before a sender should reasonably expect a response to a
For instance, this includes any time an ACK is delayed by the query.
recipient [RFC5681].
For instance, consider a DNS request from a client to a
resolver. When the request can be served from the resolver's
cache the FT likely well approximates the network RTT between
the client and resolver. However, on a cache miss the resolver
will have to request the needed information from authoritative
DNS servers, which will non-trivially increase the FT and
therefore the FT between the client and resolver does not well
match the network-based RTT between the two hosts.
(a) In steady state the RTO MUST be set based on recent (a) In steady state the RTO MUST be set based on recent
observations of both the FT and the variance of the FT. observations of both the FT and the variance of the FT.
In other words, the RTO should be based on a reasonable In other words, the RTO should be based on a reasonable
amount of time that the sender should wait for an amount of time that the sender should wait for an
acknowledgment of the data before retransmitting the given acknowledgment of the data before retransmitting the given
data. data.
(b) FT observations MUST be taken regularly. (b) FT observations MUST be taken regularly.
The exact definition of "regularly" is deliberately left The exact definition of "regularly" is deliberately left
vague. TCP takes an FT sample roughly once per RTT, or if vague. TCP takes an FT sample roughly once per RTT, or if
using the timestamp option [RFC7323] on each acknowledgment using the timestamp option [RFC7323] on each acknowledgment
arrival. [AP99] shows that both these approaches result in arrival. [AP99] shows that both these approaches result in
roughly equivalent performance for the RTO estimator. roughly equivalent performance for the RTO estimator.
Additionally, [AP99] shows that taking only a single FT Additionally, [AP99] shows that taking only a single FT
sample per TCP connection is suboptimal and hence the sample per TCP connection is suboptimal and hence the
requirement that the FT be sampled continuously throughout requirement that the FT be sampled continuously throughout
the lifetime of a connection. For the purpose of this the lifetime of a connection. For the purpose of this
guideline, we state that FT samples SHOULD be taken at least requirement, we state that FT samples SHOULD be taken at
once per RTT or as frequently as data is exchanged and ACKed least once per RTT or as frequently as data is exchanged and
if that happens less frequently than every RTT. However, we ACKed if that happens less frequently than every RTT.
also recognize that it may not always be practical to take an However, we also recognize that it may not always be
FT sample this often in all cases and hence this requirement practical to take an FT sample this often in all cases.
is explicitly a "SHOULD" and not a "MUST". Hence, this once-per-RTT sampling requirement is explicitly
a "SHOULD" and not a "MUST".
(c) FT samples used in the computation of the RTO MUST NOT be (c) FT samples used in the computation of the RTO MUST NOT be
ambiguous. ambiguous.
Assume two copies of some segment X are transmitted at times Assume two copies of some segment X are transmitted at times
t0 and t1 and then segment X is acknowledged at time t2. In t0 and t1 and then segment X is acknowledged at time t2. In
some cases, it is not clear which copy of X triggered the some cases, it is not clear which copy of X triggered the
ACK and hence the actual FT is either t2-t1 or t2-t0, but ACK and hence the actual FT is either t2-t1 or t2-t0, but
which is a mystery. Therefore, in this situation an which is a mystery. Therefore, in this situation an
implementation MUST use Karn's algorithm [KP87,RFC6298] and implementation MUST use Karn's algorithm [KP87,RFC6298] and
skipping to change at page 5, line 22 skipping to change at page 6, line 47
This ensures network safety. This ensures network safety.
An exception is made to this rule if an IETF standardized An exception is made to this rule if an IETF standardized
mechanism is used to determine that a particular loss is due to mechanism is used to determine that a particular loss is due to
a non-congestion event (e.g., packet corruption). In such a a non-congestion event (e.g., packet corruption). In such a
case a congestion control action is not required. Additionally, case a congestion control action is not required. Additionally,
RTO-triggered congestion control actions may be reversed when a RTO-triggered congestion control actions may be reversed when a
standard mechanism determines that the cause of the loss was not standard mechanism determines that the cause of the loss was not
congestion after all. congestion after all.
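As an illustration only, requirements (1), (2)(a) and (2)(c) above could be realized along the following lines. This is a minimal sketch, not a mechanism specified by this document. The class and field names are hypothetical, the gains (1/8, 1/4) and the variation multiplier (4) are borrowed from [RFC6298], the clock granularity term of [RFC6298] is omitted for brevity, and the min_rto floor is an assumed local policy knob.

```python
from dataclasses import dataclass
from typing import Optional

INITIAL_RTO = 1.0  # seconds; requirement (1) when nothing is known about the path
ALPHA = 1 / 8      # gain for the smoothed FT (the value RFC 6298 uses for SRTT)
BETA = 1 / 4       # gain for the FT variation (the value RFC 6298 uses for RTTVAR)
K = 4              # variation multiplier (the value RFC 6298 uses)


@dataclass
class RtoEstimator:
    """Hypothetical estimator; all names here are illustrative."""
    min_rto: float = 1.0           # assumed floor once samples exist; a local choice
    sft: Optional[float] = None    # smoothed feedback time, in seconds
    ftvar: Optional[float] = None  # smoothed FT variation, in seconds
    rto: float = INITIAL_RTO

    def on_ack(self, sent_at: float, acked_at: float, transmissions: int) -> float:
        """Update the RTO from one acknowledgment and return the current RTO."""
        if transmissions > 1:
            # Requirement (2)(c): the ACK may match any copy of the segment,
            # so the sample is ambiguous and is not used.
            return self.rto
        ft = acked_at - sent_at
        if self.sft is None:
            # First unambiguous sample seeds both state variables.
            self.sft = ft
            self.ftvar = ft / 2
        else:
            # Requirement (2)(a): track both the FT and its variation.
            self.ftvar = (1 - BETA) * self.ftvar + BETA * abs(self.sft - ft)
            self.sft = (1 - ALPHA) * self.sft + ALPHA * ft
        self.rto = max(self.min_rto, self.sft + K * self.ftvar)
        return self.rto
```

Calling on_ack() for every acknowledged segment would satisfy the "regularly" sampling of requirement (2)(b) in this sketch. Until the first unambiguous sample arrives the RTO stays at the conservative 1 second.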
3 Discussion 4 Discussion
We note that research has shown the tension between responsiveness We note that research has shown the tension between the
and correctness of TCP's RTO seems to be a fundamental tradeoff responsiveness and correctness of retransmission timeouts seems to
[AP99]. That is, making TCP's RTO more aggressive (via the EWMA be a fundamental tradeoff [AP99]. That is, making the RTO more
gains, lowering the minimum RTO, etc.) can reduce the time spent aggressive (e.g., via changing TCP's EWMA gains, lowering the
waiting on needed retransmissions. However, at the same time such minimum RTO, etc.) can reduce the time spent waiting on needed
aggressiveness leads to more needless retransmissions, as well. retransmissions. However, at the same time, such aggressiveness
Therefore, being as aggressive as the guidelines sketched in the leads to more needless retransmissions. Therefore, being as
last section allow in any particular situation may not be the best aggressive as the requirements given in the previous section allow
course of action (e.g., because an RTO expiration carries a in any particular situation may not be the best course of action
requirement to slow down). because an RTO expiration carries a requirement to slow down.
While the tradeoff between responsiveness and correctness seems While the tradeoff between responsiveness and correctness seems
fundamental, the tradeoff can be made less relevant if the sender fundamental, the tradeoff can be made less relevant if the sender
can detect and recover from spurious RTOs. Several mechanisms have can detect and recover from spurious RTOs. Several mechanisms have
been proposed for this purpose, such as Eifel [RFC3522], F-RTO been proposed for this purpose, such as Eifel [RFC3522], F-RTO
[RFC5682] and DSACK [RFC2883,RFC3708]. Using such mechanisms may [RFC5682] and DSACK [RFC2883,RFC3708]. Using such mechanisms may
allow a data originator to tip towards being more responsive without allow a data originator to tip towards being more responsive without
incurring (as much of) the attendant costs of needless retransmits. incurring (as much of) the attendant costs of needless retransmits.
Also note that, in addition to the experiments discussed in [AP99], Also note that, in addition to the experiments discussed in [AP99],
skipping to change at page 5, line 55 skipping to change at page 7, line 25
mechanisms for many years seemingly without large scale problems mechanisms for many years seemingly without large scale problems
(e.g., using different EWMA gains). Further, a number of (e.g., using different EWMA gains). Further, a number of
implementations use minimum RTOs that are less than the 1 second implementations use minimum RTOs that are less than the 1 second
specified in [RFC6298]. While the implication of these deviations specified in [RFC6298]. While the implication of these deviations
from the standard may be more spurious retransmits (per [AP99]), we from the standard may be more spurious retransmits (per [AP99]), we
are aware of no large scale problems caused by this change to the are aware of no large scale problems caused by this change to the
minimum RTO. minimum RTO.
Finally, we note that while allowing implementations to be more Finally, we note that while allowing implementations to be more
aggressive may in fact increase the number of needless aggressive may in fact increase the number of needless
retransmissions the above guidelines fail safe in that they insist retransmissions the above requirements fail safe in that they insist
on exponential backoff of the RTO and a transmission rate reduction. on exponential backoff of the RTO and a transmission rate reduction.
Therefore, allowing implementers latitude in their instantiations of Therefore, allowing implementers latitude in their instantiations of
an RTO mechanism does not somehow open the flood gates to aggressive an RTO mechanism does not somehow open the flood gates to aggressive
behavior. Since there is a downside to being aggressive the behavior. Since there is a downside to being aggressive the
incentives for proper behavior are retained in the mechanism. incentives for proper behavior are retained in the mechanism.
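The fail-safe property noted above can be seen in a small sketch of exponential backoff. The function name is hypothetical, and both the doubling factor and the 60 second cap are assumptions taken from common TCP practice per [RFC6298], not values set by this document. The accompanying transmission rate reduction is not modeled here.

```python
def backed_off_rto(base_rto: float, expirations: int, max_rto: float = 60.0) -> float:
    """Return the RTO to use after `expirations` consecutive timer expirations.

    Each expiration doubles the timer (assumed factor), up to `max_rto`.
    """
    return min(max_rto, base_rto * 2 ** expirations)


# Timer values across eight successive expirations, starting from 1 second.
schedule = [backed_off_rto(1.0, n) for n in range(8)]
```

Even if an implementation starts from an aggressive base RTO, each further expiration spaces retransmissions out geometrically, which bounds the load a misjudged timer can place on the network.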
4 Security Considerations 5 Security Considerations
This document does not alter the security properties of This document does not alter the security properties of
retransmission timeout mechanisms. See [RFC6298] for a discussion retransmission timeout mechanisms. See [RFC6298] for a discussion
of these within the context of TCP. of these within the context of TCP.
Acknowledgments Acknowledgments
This document benefits from years of discussions with Ethan Blanton, This document benefits from years of discussions with Ethan Blanton,
Sally Floyd, Jana Iyengar, Shawn Ostermann, Vern Paxson and the Sally Floyd, Jana Iyengar, Shawn Ostermann, Vern Paxson, and the
members of the TCPM and TCP-IMPL working groups. Ran Atkinson, members of the TCPM and TCP-IMPL working groups. Ran Atkinson,
Yuchung Cheng, Jonathan Looney and Michael Scharf provided useful Yuchung Cheng, Jonathan Looney and Michael Scharf provided useful
comments on a previous version of this draft. comments on a previous version of this draft.
Normative References Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
Informative References Informative References
skipping to change at page 6, line 43 skipping to change at page 8, line 13
[KP87] Karn, P. and C. Partridge, "Improving Round-Trip Time [KP87] Karn, P. and C. Partridge, "Improving Round-Trip Time
Estimates in Reliable Transport Protocols", SIGCOMM 87. Estimates in Reliable Transport Protocols", SIGCOMM 87.
[RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP [RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
Selective Acknowledgment Options", RFC 2018, October 1996. Selective Acknowledgment Options", RFC 2018, October 1996.
[RFC2883] Floyd, S., Mahdavi, J., Mathis, M., and M. Podolsky, "An [RFC2883] Floyd, S., Mahdavi, J., Mathis, M., and M. Podolsky, "An
Extension to the Selective Acknowledgement (SACK) Option for Extension to the Selective Acknowledgement (SACK) Option for
TCP", RFC 2883, July 2000. TCP", RFC 2883, July 2000.
[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
A., Peterson, J., Sparks, R., Handley, M., and E. Schooler,
"SIP: Session Initiation Protocol", RFC 3261, June 2002.
[RFC3522] Ludwig, R., M. Meyer, "The Eifel Detection Algorithm for [RFC3522] Ludwig, R., M. Meyer, "The Eifel Detection Algorithm for
TCP", RFC 3522, April 2003. TCP", RFC 3522, April 2003.
[RFC3708] Blanton, E., M. Allman, "Using TCP Duplicate Selective [RFC3708] Blanton, E., M. Allman, "Using TCP Duplicate Selective
Acknowledgement (DSACKs) and Stream Control Transmission Acknowledgement (DSACKs) and Stream Control Transmission
Protocol (SCTP) Duplicate Transmission Sequence Numbers (TSNs) Protocol (SCTP) Duplicate Transmission Sequence Numbers (TSNs)
to Detect Spurious Retransmissions", RFC 3708, February 2004. to Detect Spurious Retransmissions", RFC 3708, February 2004.
[RFC3940] Adamson, B., C. Bormann, M. Handley, J. Macker, [RFC3940] Adamson, B., C. Bormann, M. Handley, J. Macker,
"Negative-acknowledgment (NACK)-Oriented Reliable Multicast "Negative-acknowledgment (NACK)-Oriented Reliable Multicast
 End of changes. 24 change blocks. 
79 lines changed or deleted 161 lines changed or added
