draft-ietf-tcpm-2140bis-01.txt   draft-ietf-tcpm-2140bis-02.txt 
TCPM WG J. Touch TCPM WG J. Touch
Internet Draft Independent Internet Draft Independent
Intended status: Informational M. Welzl Intended status: Informational M. Welzl
Obsoletes: 2140 S. Islam Obsoletes: 2140 S. Islam
Expires: May 2020 University of Oslo Expires: August 2020 University of Oslo
November 19, 2019 February 28, 2020
TCP Control Block Interdependence TCP Control Block Interdependence
draft-ietf-tcpm-2140bis-01.txt draft-ietf-tcpm-2140bis-02.txt
Status of this Memo Status of this Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
This document may contain material from IETF Documents or IETF This document may contain material from IETF Documents or IETF
Contributions published or made publicly available before November Contributions published or made publicly available before November
10, 2008. The person(s) controlling the copyright in some of this 10, 2008. The person(s) controlling the copyright in some of this
material may not have granted the IETF Trust the right to allow material may not have granted the IETF Trust the right to allow
skipping to change at page 1, line 45 skipping to change at page 1, line 45
months and may be updated, replaced, or obsoleted by other documents months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress." reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html http://www.ietf.org/shadow.html
This Internet-Draft will expire on May 19, 2020. This Internet-Draft will expire on August 28, 2020.
Copyright Notice Copyright Notice
Copyright (c) 2019 IETF Trust and the persons identified as the Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this respect to this document. Code Components extracted from this
document must include Simplified BSD License text as described in document must include Simplified BSD License text as described in
Section 4.e of the Trust Legal Provisions and are provided Section 4.e of the Trust Legal Provisions and are provided
skipping to change at page 2, line 42 skipping to change at page 2, line 42
across connections to the same host. Such sharing is intended to across connections to the same host. Such sharing is intended to
improve overall transient transport performance, while maintaining improve overall transient transport performance, while maintaining
backward-compatibility with existing implementations. The sharing backward-compatibility with existing implementations. The sharing
described herein is limited to only the TCB initialization and so described herein is limited to only the TCB initialization and so
has no effect on the long-term behavior of TCP after a connection has no effect on the long-term behavior of TCP after a connection
has been established. has been established.
Table of Contents Table of Contents
1. Introduction...................................................3 1. Introduction...................................................3
2. Conventions used in this document..............................4 2. Conventions Used in This Document..............................4
3. Terminology....................................................4 3. Terminology....................................................4
4. The TCP Control Block (TCB)....................................4 4. The TCP Control Block (TCB)....................................4
5. TCB Interdependence............................................5 5. TCB Interdependence............................................5
6. An Example of Temporal Sharing.................................6 6. Temporal Sharing...............................................6
7. An Example of Ensemble Sharing.................................9 6.1. Initialization of the new TCB................................6
8. Compatibility Issues..........................................11 6.2. Updates to the new TCB.......................................7
9. Implications..................................................13 6.3. Discussion...................................................8
10. Implementation Observations..................................14 7. Ensemble Sharing...............................................9
11. Updates to RFC 2140..........................................15 7.1. Initialization of a new TCB..................................9
12. Security Considerations......................................16 7.2. Updates to the new TCB......................................10
13. IANA Considerations..........................................16 7.3. Discussion..................................................11
14. References...................................................16 8. Compatibility Issues..........................................12
14.1. Normative References....................................16 8.1. Traversing the same network path............................13
14.2. Informative References..................................17 8.2. State dependence............................................13
15. Acknowledgments..............................................19 8.3. Problems with IP sharing....................................14
16. Change log...................................................20 9. Implications..................................................14
Appendix A : TCB sharing history.................................22 9.1. Layering....................................................14
Appendix B : TCP Option Sharing and Caching......................22 9.2. Other possibilities.........................................15
10. Implementation Observations..................................15
11. Updates to RFC 2140..........................................16
12. Security Considerations......................................17
13. IANA Considerations..........................................17
14. References...................................................18
14.1. Normative References....................................18
14.2. Informative References..................................18
15. Acknowledgments..............................................21
16. Change log...................................................21
Appendix A : TCB Sharing History.................................24
Appendix B : TCP Option Sharing and Caching......................25
Appendix C : Automating the Initial Window in TCP over Long Appendix C : Automating the Initial Window in TCP over Long
Timescales.......................................................25 Timescales.......................................................27
C.1. Introduction.............................................25 C.1. Introduction.............................................27
C.2. Design Considerations....................................25 C.2. Design Considerations....................................27
C.3. Proposed IW Algorithm....................................26 C.3. Proposed IW Algorithm....................................28
C.4. Discussion...............................................29 C.4. Discussion...............................................31
C.5. Observations.............................................30 C.5. Observations.............................................32
1. Introduction 1. Introduction
TCP is a connection-oriented reliable transport protocol layered TCP is a connection-oriented reliable transport protocol layered
over IP [RFC793]. Each TCP connection maintains state, usually in a over IP [RFC793]. Each TCP connection maintains state, usually in a
data structure called the TCP Control Block (TCB). The TCB contains data structure called the TCP Control Block (TCB). The TCB contains
information about the connection state, its associated local information about the connection state, its associated local
process, and feedback parameters about the connection's transmission process, and feedback parameters about the connection's transmission
properties. As originally specified and usually implemented, most properties. As originally specified and usually implemented, most
TCB information is maintained on a per-connection basis. Some TCB information is maintained on a per-connection basis. Some
implementations can (and now do) share certain TCB information implementations can (and now do) share certain TCB information
across connections to the same host [RFC2140]. Such sharing is across connections to the same host [RFC2140]. Such sharing is
intended to lead to better overall transient performance, especially intended to lead to better overall transient performance, especially
for numerous short-lived and simultaneous connections, as often used for numerous short-lived and simultaneous connections, as often used
in the World-Wide Web [Be94],[Br02]. This sharing of state is in the World-Wide Web [Be94][Br02]. This sharing of state is
intended to help TCP connections converge to steady-state behavior intended to help TCP connections converge to steady-state behavior
more quickly without affecting TCP interoperability. more quickly without affecting TCP interoperability.
This document updates RFC 2140's discussion of TCB state sharing and This document updates RFC 2140's discussion of TCB state sharing and
provides a complete replacement for that document. This state provides a complete replacement for that document. This state
sharing affects only TCB initialization [RFC2140] and thus has no sharing affects only TCB initialization [RFC2140] and thus has no
effect on the long-term behavior of TCP after a connection has been effect on the long-term behavior of TCP after a connection has been
established nor on interoperability. Path information shared across established nor on interoperability. Path information shared across
SYN destination port numbers assumes that TCP segments having the SYN destination port numbers assumes that TCP segments having the
same host-pair experience the same path properties, irrespective of same host-pair experience the same path properties, irrespective of
TCP port numbers. The observations about TCB sharing in this TCP port numbers. The observations about TCB sharing in this
document apply similarly to any protocol with congestion state, document apply similarly to any protocol with congestion state,
including SCTP [RFC4960] and DCCP [RFC4340], as well as for including SCTP [RFC4960] and DCCP [RFC4340], as well as for
individual subflows in Multipath TCP [RFC6824]. individual subflows in Multipath TCP [RFC6824].
2. Conventions used in this document 2. Conventions Used in This Document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in "OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here. capitals, as shown here.
However, this document is intended to describe behavior that is However, this document is intended to describe behavior that is
already permitted by TCP implementers. As a result, it provides already permitted by TCP standards. As a result, it provides
informative guidance but does not use such normative language, informative guidance but does not use such normative language,
except when quoting other documents. except when quoting other documents.
3. Terminology 3. Terminology
Host - a source or sink of TCP segments associated with a single IP Host - a source or sink of TCP segments associated with a single IP
address address
Host-pair - a pair of hosts and their corresponding IP addresses Host-pair - a pair of hosts and their corresponding IP addresses
skipping to change at page 5, line 18 skipping to change at page 5, line 18
pointers to Internet Protocol (IP) PCB pointers to Internet Protocol (IP) PCB
Per-connection shared state Per-connection shared state
macro-state macro-state
connection state connection state
timers timers
flags flags
local and remote host numbers and ports local and remote host numbers and ports
TCP option state TCP option state
micro-state micro-state
send and receive window state (size*, current number) send and receive window state (size*, current number)
round-trip time and variance
cong. window size (snd_cwnd)* cong. window size (snd_cwnd)*
cong. window size threshold (ssthresh)* cong. window size threshold (ssthresh)*
max window size seen* max window size seen*
sendMSS# sendMSS#
MMS_S# MMS_S#
MMS_R# MMS_R#
PMTU# PMTU#
round-trip time and variance# round-trip time and variance#
The per-connection information is shown as split into macro-state The per-connection information is shown as split into macro-state
skipping to change at page 6, line 5 skipping to change at page 6, line 5
5. TCB Interdependence 5. TCB Interdependence
There are two cases of TCB interdependence. Temporal sharing occurs There are two cases of TCB interdependence. Temporal sharing occurs
when the TCB of an earlier (now CLOSED) connection to a host is used when the TCB of an earlier (now CLOSED) connection to a host is used
to initialize some parameters of a new connection to that same host, to initialize some parameters of a new connection to that same host,
i.e., in sequence. Ensemble sharing occurs when a currently active i.e., in sequence. Ensemble sharing occurs when a currently active
connection to a host is used to initialize another (concurrent) connection to a host is used to initialize another (concurrent)
connection to that host. connection to that host.
6. An Example of Temporal Sharing 6. Temporal Sharing
The TCB data cache is accessed in two ways: it is read to initialize The TCB data cache is accessed in two ways: it is read to initialize
new TCBs and written when more current per-host state is available. new TCBs and written when more current per-host state is available.
New TCBs can be initialized using context from past connections as
follows:
TEMPORAL SHARING - TCB Initialization 6.1. Initialization of the new TCB
TCBs for new connections can be initialized using context from past
connections as follows:
TEMPORAL SHARING - TCB Initialization
Cached TCB New TCB Cached TCB New TCB
-------------------------------------- --------------------------------------
old_MMS_S old_MMS_S or not cached old_MMS_S old_MMS_S or not cached
old_MMS_R old_MMS_R or not cached old_MMS_R old_MMS_R or not cached
old_sendMSS old_sendMSS old_sendMSS old_sendMSS
old_PMTU old_PMTU old_PMTU old_PMTU
skipping to change at page 6, line 34 skipping to change at page 6, line 37
old_RTT old_RTT old_RTT old_RTT
old_RTTvar old_RTTvar old_RTTvar old_RTTvar
old_option (option specific) old_option (option specific)
old_ssthresh old_ssthresh old_ssthresh old_ssthresh
old_snd_cwnd old_snd_cwnd old_snd_cwnd old_snd_cwnd
Sections 8 and 9 discuss compatibility issues and implications of
sharing the specific information listed above. Section 10 gives an
overview of known implementations.
Most cached TCB values are updated when a connection closes. The
exceptions are MMS_R and MMS_S, which are reported by IP [RFC1122],
PMTU which is updated after Path MTU Discovery
[RFC1191][RFC4821][RFC8201], and sendMSS, which is updated if the
MSS option is received in the TCP SYN header.
Sharing sendMSS information affects only data in the SYN of the next
connection, because sendMSS information is typically included in
most TCP SYN segments. Caching PMTU can accelerate the efficiency of
PMTUD, but can also result in black-holing until corrected if in
error. Caching MMS_R and MMS_S may be of little direct value as they
are reported by the local IP stack anyway.
The way in which other TCP option state can be shared depends on the
details of that option. E.g., TFO state includes the TCP Fast Open
Cookie [RFC7413] or, in case TFO fails, a negative TCP Fast Open
response. RFC 7413 states, "The client MUST cache negative responses
from the server in order to avoid potential connection failures.
Negative responses include the server not acknowledging the data in
the SYN, ICMP error messages, and (most importantly) no response
(SYN-ACK) from the server at all, i.e., connection timeout."
[RFC7413]. TFOinfo is cached when a connection is established.
Other TCP option state might not be as readily cached. E.g., TCP-AO
[RFC5925] success or failure between a host pair for a single SYN
destination port might be usefully cached. TCP-AO success or failure
to other SYN destination ports on that host pair is never useful to
cache because TCP-AO security parameters can vary per service.
The table below gives an overview of option-specific information The table below gives an overview of option-specific information
that can be shared. Additional information on TCP options and that can be shared. Additional information on some specific TCP
sharing is provided in Appendix B. options and sharing is provided in Appendix B.
TEMPORAL SHARING - Option info TEMPORAL SHARING - Option Info Initialization
Cached New Cached New
---------------------------------------- ----------------------------------------
old_TFO_Cookie old_TFO_Cookie old_TFO_Cookie old_TFO_Cookie
old_TFO_Failure old_TFO_Failure old_TFO_Failure old_TFO_Failure
6.2. Updates to the new TCB
During the connection, the associated TCB can be updated based on
particular events, as shown below:
TEMPORAL SHARING - Cache Updates TEMPORAL SHARING - Cache Updates
Cached TCB Current TCB when? New Cached TCB Cached TCB Current TCB when? New Cached TCB
------------------------------------------------------ ------------------------------------------------------
old_MMS_S curr_MMS_S OPEN curr_MMS_S old_MMS_S curr_MMS_S OPEN curr_MMS_S
old_MMS_R curr_MMS_R OPEN curr_MMS_R old_MMS_R curr_MMS_R OPEN curr_MMS_R
old_sendMSS curr_sendMSS MSSopt curr_sendMSS old_sendMSS curr_sendMSS MSSopt curr_sendMSS
skipping to change at page 8, line 26 skipping to change at page 7, line 40
old_RTT curr_RTT CLOSE merge(curr,old) old_RTT curr_RTT CLOSE merge(curr,old)
old_RTTvar curr_RTTvar CLOSE merge(curr,old) old_RTTvar curr_RTTvar CLOSE merge(curr,old)
old_option curr option ESTAB (depends on option) old_option curr option ESTAB (depends on option)
old_ssthresh curr_ssthresh CLOSE merge(curr,old) old_ssthresh curr_ssthresh CLOSE merge(curr,old)
old_snd_cwnd curr_snd_cwnd CLOSE merge(curr,old) old_snd_cwnd curr_snd_cwnd CLOSE merge(curr,old)
Caching PMTU and sendMSS is trivial; reported values are cached, and The table below gives an overview of option-specific information
the most recent values are used. The cache is updated when the MSS that can be similarly shared.
option is received in a SYN or after PMTUD (i.e., when an ICMPv4
Fragmentation Needed [RFC1191] or ICMPv6 Packet Too Big message is
received [RFC8201] or the equivalent is inferred, e.g. as from
PLPMTUD [RFC4821]), respectively, so the cache always has the most
recent values from any connection. For sendMSS, the cache is
consulted only at connection establishment and not otherwise
updated, which means that MSS options do not affect current
connections. The default sendMSS is never saved; only reported MSS
values update the cache, so an explicit override is required to
reduce the sendMSS. There is no particular benefit to caching MMS_S
and MMS_R as these are reported by the local IP stack.
TCP options are copied or merged depending on the details of each TEMPORAL SHARING - Option Info Updates
option, where "merge" is some function that combines the values of
"curr" and "old". E.g., TFO state is updated when a connection is Cached Current when? New Cached
established and read before establishing a new connection. ----------------------------------------------------------------
old_TFO_Cookie old_TFO_Cookie ESTAB old_TFO_Cookie
old_TFO_Failure old_TFO_Failure ESTAB old_TFO_Failure
6.3. Discussion
There is no particular benefit to caching MMS_S and MMS_R as these
are reported by the local IP stack. Caching sendMSS and PMTU is
trivial; reported values are cached, and the most recent values are
used. The cache is updated when the MSS option is received in a SYN
or after PMTUD (i.e., when an ICMPv4 Fragmentation Needed [RFC1191]
or ICMPv6 Packet Too Big message is received [RFC8201] or the
equivalent is inferred, e.g. as from PLPMTUD [RFC4821]),
respectively, so the cache always has the most recent values from
any connection. For sendMSS, the cache is consulted only at
connection establishment and not otherwise updated, which means that
MSS options do not affect current connections. The default sendMSS
is never saved; only reported MSS values update the cache, so an
explicit override is required to reduce the sendMSS.
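The sendMSS caching rule described above can be sketched as follows. This is a minimal illustration, not the draft's specification: the class name MssCache, its method names, and keying the cache by a host address string are assumptions made for the example. The fallback of 536 bytes is the usual IPv4 default when no MSS option is received.

```python
from typing import Dict

DEFAULT_SEND_MSS = 536  # IPv4 default when no MSS option is received


class MssCache:
    """Per-host cache of reported MSS values (hypothetical structure)."""

    def __init__(self) -> None:
        self._reported: Dict[str, int] = {}

    def on_mss_option(self, host: str, mss: int) -> None:
        """Record an MSS option seen in a SYN; the most recent value wins."""
        self._reported[host] = mss

    def initial_send_mss(self, host: str) -> int:
        """Consulted only at connection establishment.

        The default is returned but never stored, so only reported
        values ever populate the cache.
        """
        return self._reported.get(host, DEFAULT_SEND_MSS)
```

Because the default is never written back, a host that stops sending the MSS option keeps its last reported value until something explicitly overrides it.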
RTT values are updated by formulae that merge the old and new RTT values are updated by formulae that merge the old and new
values. Dynamic RTT estimation requires a sequence of RTT values. Dynamic RTT estimation requires a sequence of RTT
measurements. As a result, the cached RTT (and its variance) is an measurements. As a result, the cached RTT (and its variance) is an
average of its previous value with the contents of the currently average of its previous value with the contents of the currently
active TCB for that host, when a TCB is closed. RTT values are active TCB for that host, when a TCB is closed. RTT values are
updated only when a connection is closed. The method for merging old updated only when a connection is closed. The method for merging old
and current values needs to attempt to reduce the transient for new and current values needs to attempt to reduce the transient effects
connections. of the new connections.
The updates for RTT, RTTvar and ssthresh rely on existing The updates for RTT, RTTvar and ssthresh rely on existing
information, i.e., old values. Should no such values exist, the information, i.e., old values. Should no such values exist, the
current values are cached instead. current values are cached instead.
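As a rough illustration of the CLOSE-time updates, the sketch below merges a closing connection's values into a per-host cache entry and falls back to the current values when nothing has been cached yet. The equal-weight average used for merge() and the names HostCache and update_on_close are assumptions made for this example; the draft does not fix a particular merge function.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HostCache:
    """Per-host cached values; None means nothing has been cached yet."""
    rtt: Optional[float] = None
    rttvar: Optional[float] = None
    ssthresh: Optional[float] = None
    snd_cwnd: Optional[float] = None


def merge(curr: float, old: Optional[float]) -> float:
    """Hypothetical merge(): equal-weight average, or curr if no old value."""
    if old is None:
        return curr
    return (curr + old) / 2.0


def update_on_close(cache: HostCache, curr: HostCache) -> None:
    """Apply the CLOSE rows of the cache-update table to one host entry."""
    for name in ("rtt", "rttvar", "ssthresh", "snd_cwnd"):
        value = getattr(curr, name)
        if value is not None:
            setattr(cache, name, merge(value, getattr(cache, name)))
```

A different weighting (for example, favoring the older value to damp transients) would fit the same structure by changing only merge().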
TEMPORAL SHARING - Option info Updates TCP options are copied or merged depending on the details of each
option, where "merge" is some function that combines the values of
"curr" and "old". E.g., TFO state is updated when a connection is
established and read before establishing a new connection.
Cached Current when? New Cached Sections 8 and 9 discuss compatibility issues and implications of
---------------------------------------------------------------- sharing the specific information listed above. Section 10 gives an
old_TFO_Cookie old_TFO_Cookie ESTAB old_TFO_Cookie overview of known implementations.
old_TFO_Failure old_TFO_Failure ESTAB old_TFO_Failure Most cached TCB values are updated when a connection closes. The
exceptions are MMS_R and MMS_S, which are reported by IP [RFC1122],
PMTU which is updated after Path MTU Discovery
[RFC1191][RFC4821][RFC8201], and sendMSS, which is updated if the
MSS option is received in the TCP SYN header.
7. An Example of Ensemble Sharing Sharing sendMSS information affects only data in the SYN of the next
connection, because sendMSS information is typically included in
most TCP SYN segments. Caching PMTU can accelerate the efficiency of
PMTUD, but can also result in black-holing until corrected if in
error. Caching MMS_R and MMS_S may be of little direct value as they
are reported by the local IP stack anyway.
The way in which other TCP option state can be shared depends on the
details of that option. E.g., TFO state includes the TCP Fast Open
Cookie [RFC7413] or, in case TFO fails, a negative TCP Fast Open
response. RFC 7413 states, "The client MUST cache negative responses
from the server in order to avoid potential connection failures.
Negative responses include the server not acknowledging the data in
the SYN, ICMP error messages, and (most importantly) no response
(SYN-ACK) from the server at all, i.e., connection timeout."
[RFC7413]. TFOinfo is cached when a connection is established.
Other TCP option state might not be as readily cached. E.g., TCP-AO
[RFC5925] success or failure between a host pair for a single SYN
destination port might be usefully cached. TCP-AO success or failure
to other SYN destination ports on that host pair is never useful to
cache because TCP-AO security parameters can vary per service.
7. Ensemble Sharing
Sharing cached TCB data across concurrent connections requires Sharing cached TCB data across concurrent connections requires
attention to the aggregate nature of some of the shared state. For attention to the aggregate nature of some of the shared state. For
example, although MSS and RTT values can be shared by copying, it example, although MSS and RTT values can be shared by copying, it
may not be appropriate to simply copy congestion window or ssthresh may not be appropriate to simply copy congestion window or ssthresh
information; instead, the new values can be a function (f) of the information; instead, the new values can be a function (f) of the
cumulative values and the number of connections (N). cumulative values and the number of connections (N).
7.1. Initialization of a new TCB
TCBs for new connections can be initialized using context from
concurrent connections as follows:
ENSEMBLE SHARING - TCB Initialization ENSEMBLE SHARING - TCB Initialization
Cached TCB New TCB Cached TCB New TCB
-------------------------------- --------------------------------
old_MMS_S old_MMS_S old_MMS_S old_MMS_S
old_MMS_R old_MMS_R old_MMS_R old_MMS_R
old_sendMSS old_sendMSS old_sendMSS old_sendMSS
skipping to change at page 10, line 5 skipping to change at page 10, line 27
old_RTT old_RTT old_RTT old_RTT
old_RTTvar old_RTTvar old_RTTvar old_RTTvar
old ssthresh sum f(old ssthresh sum, N) old ssthresh sum f(old ssthresh sum, N)
old snd_cwnd sum f(old snd_cwnd sum, N) old snd_cwnd sum f(old snd_cwnd sum, N)
old_option (option-specific) old_option (option-specific)
Sections 8 and 9 discuss compatibility issues and implications of
sharing the specific information listed above.
The table below gives an overview of option-specific information The table below gives an overview of option-specific information
that can be shared. that can be similarly shared.
ENSEMBLE SHARING Option info ENSEMBLE SHARING - Option Info Initialization
Cached New Cached New
---------------------------------------- ----------------------------------------
old_TFO_Cookie old_TFO_Cookie old_TFO_Cookie old_TFO_Cookie
old_TFO_Failure old_TFO_Failure old_TFO_Failure old_TFO_Failure
7.2. Updates to the new TCB
During the connection, the associated TCB can be updated based on
changes to concurrent connections, as shown below:
ENSEMBLE SHARING - Cache Updates ENSEMBLE SHARING - Cache Updates
Cached TCB Current TCB when? New Cached TCB Cached TCB Current TCB when? New Cached TCB
----------------------------------------------------- -----------------------------------------------------
old_MMS_S curr_MMS_S OPEN curr_MMS_S old_MMS_S curr_MMS_S OPEN curr_MMS_S
old_MMS_R curr_MMS_R OPEN curr_MMS_R old_MMS_R curr_MMS_R OPEN curr_MMS_R
old_sendMSS curr_sendMSS MSSopt curr_sendMSS old_sendMSS curr_sendMSS MSSopt curr_sendMSS
skipping to change at page 10, line 42 skipping to change at page 11, line 28
old_RTT curr_RTT update rtt_update(old,curr) old_RTT curr_RTT update rtt_update(old,curr)
old_RTTvar curr_RTTvar update rtt_update(old,curr) old_RTTvar curr_RTTvar update rtt_update(old,curr)
old ssthresh curr ssthresh update adjust sum as appropriate old ssthresh curr ssthresh update adjust sum as appropriate
old snd_cwnd curr snd_cwnd update adjust sum as appropriate old snd_cwnd curr snd_cwnd update adjust sum as appropriate
old_option curr option (depends) (option specific) old_option curr option (depends) (option specific)
The table below gives an overview of option-specific information
that can be similarly shared.
ENSEMBLE SHARING - Option Info Updates
Cached Current when? New Cached
----------------------------------------------------------------
old_TFO_Cookie old_TFO_Cookie ESTAB old_TFO_Cookie
old_TFO_Failure old_TFO_Failure ESTAB old_TFO_Failure
7.3. Discussion
For ensemble sharing, TCB information should be cached as early as For ensemble sharing, TCB information should be cached as early as
possible, sometimes before a connection is closed. Otherwise, possible, sometimes before a connection is closed. Otherwise,
opening multiple concurrent connections may not result in TCB data opening multiple concurrent connections may not result in TCB data
sharing if no connection closes before others open. The amount of sharing if no connection closes before others open. The amount of
work involved in updating the aggregate average should be minimized, work involved in updating the aggregate average should be minimized,
but the resulting value should be equivalent to having all values but the resulting value should be equivalent to having all values
measured within a single connection. The function "rtt_update" in measured within a single connection. The function "rtt_update" in
the ensemble sharing table indicates this operation, which occurs the ensemble sharing table indicates this operation, which occurs
whenever the RTT would have been updated in the individual TCP whenever the RTT would have been updated in the individual TCP
connection. As a result, the cache contains the shared RTT connection. As a result, the cache contains the shared RTT
variables, which no longer need to reside in the TCB. variables, which no longer need to reside in the TCB.
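The rtt_update operation could be sketched as a single estimator shared by all connections to a host, so that samples from any connection feed one set of RTT variables. The smoothing gains and the first-sample rule below are borrowed from the standard TCP retransmission timer computation (RFC 6298) as an assumption; the draft does not specify them, and the class name SharedRtt is hypothetical.

```python
from typing import Optional

ALPHA = 1 / 8  # SRTT gain (assumed, as in RFC 6298)
BETA = 1 / 4   # RTTVAR gain (assumed, as in RFC 6298)


class SharedRtt:
    """RTT state kept in the per-host cache rather than in each TCB."""

    def __init__(self) -> None:
        self.srtt: Optional[float] = None
        self.rttvar: Optional[float] = None

    def rtt_update(self, sample: float) -> None:
        """Fold one RTT sample, from any connection, into the shared state."""
        if self.srtt is None:
            # First measurement for this host.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # RTTVAR is updated before SRTT, using the previous SRTT.
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - sample)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * sample
```

Each connection would call rtt_update at the point where it would otherwise have updated its own RTT variables, which yields the same result as if all samples had been measured within one connection.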
Congestion window size and ssthresh aggregation are more complicated Congestion window size and ssthresh aggregation are more complicated
in the concurrent case. When there is an ensemble of connections, we in the concurrent case. When there is an ensemble of connections, we
need to decide how that ensemble would have shared these variables, need to decide how that ensemble would have shared these variables,
in order to derive initial values for new TCBs. in order to derive initial values for new TCBs.
ENSEMBLE SHARING - Option info Updates Sections 8 and 9 discuss compatibility issues and implications of
sharing the specific information listed above.
Cached Current when? New Cached
----------------------------------------------------------------
old_TFO_Cookie old_TFO_Cookie ESTAB old_TFO_Cookie
old_TFO_Failure old_TFO_Failure ESTAB old_TFO_Failure
Any assumption of this sharing can be incorrect because identical Any assumption of TCB information sharing can be incorrect because
endpoint address pairs may not share network paths. In current identical endpoint address pairs may not share network paths. In
implementations, new congestion windows are set at an initial value current implementations, new congestion windows are set at an
of 4-10 segments [RFC3390][RFC6928], so that the sum of the current initial value of 4-10 segments [RFC3390][RFC6928], so that the sum
windows is increased for any new connection. This can have of the current windows is increased for any new connection. This can
detrimental consequences where several connections share a highly have detrimental consequences where several connections share a
congested link. highly congested link.
There are several ways to initialize the congestion window in a new There are several ways to initialize the congestion window in a new
TCB among an ensemble of current connections to a host. Current TCP TCB among an ensemble of current connections to a host. Current TCP
implementations initialize it to four segments as standard [RFC3390] implementations initialize it to four segments as standard [RFC3390]
and 10 segments experimentally [RFC6928]. These approaches assume and 10 segments experimentally [RFC6928]. These approaches assume
that new connections should behave as conservatively as possible. that new connections should behave as conservatively as possible.
The algorithm described in [Ba12] adjusts the initial cwnd depending The algorithm described in [Ba12] adjusts the initial cwnd depending
on the cwnd values of ongoing connections. There have also been on the cwnd values of ongoing connections. There have also been
suggestions to use the kind of sharing mechanisms described in this suggestions to use the kind of sharing mechanisms described in this
document over long timescales to adapt TCP's initial window document over long timescales to adapt TCP's initial window
automatically, as described further in Appendix A [To12]. automatically, as described further in Appendix A [To12].
8. Compatibility Issues 8. Compatibility Issues
Here, we discuss various types of problems that may arise with TCB
information sharing.
For the congestion and current window information, the initial For the congestion and current window information, the initial
values computed by TCB interdependence may not be consistent with values computed by TCB interdependence may not be consistent with
the long-term aggregate behavior of a set of concurrent connections the long-term aggregate behavior of a set of concurrent connections
between the same endpoints. Under conventional TCP congestion between the same endpoints. Under conventional TCP congestion
control, if a single existing connection has converged to a control, if a single existing connection has converged to a
congestion window of 40 segments, two newly joining concurrent congestion window of 40 segments, two newly joining concurrent
connections assume initial windows of 10 segments [RFC6928], and the connections assume initial windows of 10 segments [RFC6928], and the
current connection's window doesn't decrease to accommodate this current connection's window doesn't decrease to accommodate this
additional load, so connections can mutually interfere. One example additional load, so connections can mutually interfere. One example
of this is seen on low-bandwidth, high-delay links, where concurrent of this is seen on low-bandwidth, high-delay links, where concurrent
connections supporting Web traffic can collide because their initial connections supporting Web traffic can collide because their initial
windows were too large, even when set at one segment. windows were too large, even when set at one segment.
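The arithmetic behind the 40-segment example can be written out as a small sketch. The helper name aggregate_window is hypothetical. The calculation only sums window sizes at the instant the new connections start and does not model how the windows evolve afterwards.

```python
def aggregate_window(current_windows, new_connections, initial_window):
    """Sum of congestion windows (in segments) right after new connections start."""
    return sum(current_windows) + new_connections * initial_window


# One converged connection at 40 segments, before and after two new
# connections join with an initial window of 10 segments each.
before = aggregate_window([40], 0, 10)
after = aggregate_window([40], 2, 10)
relative_increase = (after - before) / before
```

Under these assumptions the aggregate grows from 40 to 60 segments, an increase of 50 percent that the existing connection does nothing to offset.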
The authors of [Hu12] recommend caching ssthresh for temporal The authors of [Hu12] recommend caching ssthresh for temporal
sharing only when flows are long. Some studies suggest that sharing sharing only when flows are long. Some studies suggest that sharing
ssthresh between short flows can deteriorate the performance of ssthresh between short flows can deteriorate the performance of
individual connections [Hu12, Du16], although this may benefit individual connections [Hu12, Du16], although this may benefit
aggregate network performance. aggregate network performance.
Due to mechanisms like ECMP and LAG [RFC7424], TCP connections 8.1. Traversing the same network path
sharing the same host-pair may not always share the same path. This
does not matter for host-specific information such as RWIN and TCP TCP is sometimes used in situations where packets of the same host-
option state, such as TFOinfo. When TCB information is shared across pair do not always take the same path. Multipath routing that relies
different SYN destination ports, path-related information can be on examining transport headers, such as ECMP and LAG [RFC7424], may
incorrect; however, the impact of this error is potentially not result in repeatable path selection when TCP segments are
diminished if (as discussed here) TCB sharing affects only the encapsulated, encrypted, or altered - for example, in some Virtual
transient event of a connection start or if TCB information is Private Network (VPN) tunnels that rely on proprietary
shared only within connections to the same SYN destination port. In encapsulation. Similarly, such approaches cannot operate
case of Temporal Sharing, TCB information could also become invalid deterministically when the TCP header is encrypted, e.g., when using
over time. Because this is similar to the case when a connection IPsec ESP (although TCB interdependence among the entire set sharing
becomes idle, mechanisms that address idle TCP connections (e.g., the same endpoint IP addresses should work without problems when the
[RFC7661]) could also be applied to TCB cache management, especially TCP header is encrypted). Measures to increase the probability that
when TCP Fast Open is used [RFC7413]. connections use the same path could be applied: e.g., the
connections could be given the same IPv6 flow label. TCB
interdependence can also be extended to sets of host IP address
pairs that share the same network path conditions, such as when a
group of addresses is on the same LAN (see Section 9).
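The effect of the fields used for path selection can be illustrated with a minimal sketch. It assumes a forwarding device that picks one of several equal-cost paths by hashing selected header fields. The function pick_path, the use of SHA-256, and the addresses, ports, and flow label value are all hypothetical, and real devices use their own hash functions and field sets.

```python
import hashlib


def pick_path(fields, n_paths):
    """Hypothetical ECMP-style selector: hash header fields to a path index."""
    key = "|".join(str(f) for f in fields).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_paths


src, dst = "2001:db8::1", "2001:db8::2"
n_paths = 4

# Hashing on transport headers: each connection between the same host-pair
# uses a different source port, so the connections can be spread over
# several paths.
five_tuple_paths = {
    pick_path((src, dst, "tcp", sport, 443), n_paths)
    for sport in range(50000, 50016)
}

# Hashing on addresses plus a flow label that all connections share:
# the hash input is identical, so every connection maps to the same path.
flow_label = 0x12345
label_paths = {pick_path((src, dst, flow_label), n_paths) for _ in range(16)}
```

In this sketch label_paths always contains exactly one path, whereas five_tuple_paths typically contains several. Path-related TCB information is therefore more likely to be valid across connections in the shared-label case.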
Traversing the same path is not important for host-specific
information such as RWIN and TCP option state, such as TFOinfo. When
TCB information is shared across different SYN destination ports,
path-related information can be incorrect; however, the impact of
this error is potentially diminished if (as discussed here) TCB
sharing affects only the transient event of a connection start or if
TCB information is shared only within connections to the same SYN
destination port. In case of Temporal Sharing, TCB information could
also become invalid over time. Because this is similar to the case
when a connection becomes idle, mechanisms that address idle TCP
connections (e.g., [RFC7661]) could also be applied to TCB cache
management, especially when TCP Fast Open is used [RFC7413].
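One simple way to keep stale information out of new TCBs is to age the cache entries, as sketched below. This is an illustration under stated assumptions only: it uses plain expiry after a fixed period, which is not the mechanism of [RFC7661], and the class name TcbCache, the field choice, and the 300-second limit are hypothetical.

```python
import time

MAX_AGE_S = 300.0  # hypothetical validity period for cached entries


class TcbCache:
    """Hypothetical temporal-sharing cache that discards old entries."""

    def __init__(self, max_age=MAX_AGE_S, clock=time.monotonic):
        self._entries = {}
        self._max_age = max_age
        self._clock = clock

    def store(self, host_pair, cwnd, ssthresh):
        """Record state from a closing connection, with the current time."""
        self._entries[host_pair] = (self._clock(), cwnd, ssthresh)

    def lookup(self, host_pair):
        """Return (cwnd, ssthresh) for a new TCB, or None if absent or stale."""
        entry = self._entries.get(host_pair)
        if entry is None:
            return None
        stored_at, cwnd, ssthresh = entry
        if self._clock() - stored_at > self._max_age:
            # Too old to trust: drop it so the new connection uses defaults.
            del self._entries[host_pair]
            return None
        return cwnd, ssthresh
```

A new connection that gets None from lookup falls back to its normal initial values, which matches the behavior of an implementation without TCB sharing.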
8.2. State dependence
There may be additional considerations to the way in which TCB There may be additional considerations to the way in which TCB
interdependence rebalances congestion feedback among the current interdependence rebalances congestion feedback among the current
connections, e.g., it may be appropriate to consider the impact of a connections, e.g., it may be appropriate to consider the impact of a
connection being in Fast Recovery [RFC5681] or some other similar connection being in Fast Recovery [RFC5681] or some other similar
unusual feedback state, e.g., as inhibiting or affecting the unusual feedback state, e.g., as inhibiting or affecting the
calculations described herein. calculations described herein.
TCP is sometimes used in situations where packets of the same host- 8.3. Problems with IP sharing
pair do not always take the same path. Multipath routing that relies
on examining transport headers, such as ECMP and LAG, may not result
in repeatable path selection when TCP segments are encapsulated,
encrypted, or altered - for example, in some Virtual Private Network
(VPN) tunnels that rely on proprietary encapsulation. Similarly,
such approaches cannot operate deterministically when the TCP header
is encrypted, e.g., when using IPsec ESP. TCB interdependence among
the entire set sharing the same endpoint IP addresses should work
without problems under these circumstances. Moreover, measures to
increase the probability that connections use the same path could be
applied: e.g., the connections could be given the same IPv6 flow
label. TCB interdependence can also be extended to sets of host IP
address pairs that share the same network path conditions, such as
when a group of addresses is on the same LAN (see Section 9).
It can be wrong to share TCB information between TCP connections on It can be wrong to share TCB information between TCP connections on
the same host as identified by the IP address if an IP address is the same host as identified by the IP address if an IP address is
assigned to a new host (e.g., IP address spinning, as is used by assigned to a new host (e.g., IP address spinning, as is used by
ISPs to inhibit running servers). It can be wrong if Network Address ISPs to inhibit running servers). It can be wrong if Network Address
(and Port) Translation (NA(P)T) [RFC2663] or any other IP sharing (and Port) Translation (NA(P)T) [RFC2663] or any other IP sharing
mechanism is used. Such mechanisms are less likely to be used with mechanism is used. Such mechanisms are less likely to be used with
IPv6. Other methods to identify a host could also be considered to IPv6. Other methods to identify a host could also be considered to
make correct TCB sharing more likely. Moreover, some TCB information make correct TCB sharing more likely. Moreover, some TCB information
is about dominant path properties rather than the specific host. IP is about dominant path properties rather than the specific host. IP
skipping to change at page 13, line 34 skipping to change at page 14, line 34
[RFC7231]. Protocols like HTTP/2 [RFC7540] avoid connection [RFC7231]. Protocols like HTTP/2 [RFC7540] avoid connection
reestablishment costs by serializing or multiplexing a set of per- reestablishment costs by serializing or multiplexing a set of per-
host connections across a single TCP connection. This avoids TCP's host connections across a single TCP connection. This avoids TCP's
per-connection OPEN handshake and also avoids recomputing the MSS, per-connection OPEN handshake and also avoids recomputing the MSS,
RTT, and congestion window values. By avoiding the so-called "slow- RTT, and congestion window values. By avoiding the so-called "slow-
start restart," performance can be optimized [Hu01]. TCB start restart," performance can be optimized [Hu01]. TCB
interdependence can provide the "slow-start restart avoidance" of interdependence can provide the "slow-start restart avoidance" of
multiplexing, without requiring a multiplexing mechanism at the multiplexing, without requiring a multiplexing mechanism at the
application layer. application layer.
Like the initial version of this document [RFC2140], this update's
approach to TCB interdependence focuses on sharing a set of TCBs by
updating the TCB state to reduce the impact of transients when
connections begin or end. Other mechanisms have since been proposed
to continuously share information between all ongoing communication
(including connectionless protocols), updating the congestion state
during any congestion-related event (e.g., timeout, loss
confirmation, etc.) [RFC3124]. By dealing exclusively with
transients, TCB interdependence is more likely to exhibit the same
behavior as unmodified, independent TCP connections.
9.1. Layering
TCB interdependence pushes some of the TCP implementation from the TCB interdependence pushes some of the TCP implementation from the
traditional transport layer (in the ISO model), to the network traditional transport layer (in the ISO model), to the network
layer. This acknowledges that some state is in fact per-host-pair or layer. This acknowledges that some state is in fact per-host-pair or
can be per-path as indicated solely by that host-pair. Transport can be per-path as indicated solely by that host-pair. Transport
protocols typically manage per-application-pair associations (per protocols typically manage per-application-pair associations (per
stream), and network protocols manage per-host-pair and path stream), and network protocols manage per-host-pair and path
associations (routing). Round-trip time, MSS, and congestion associations (routing). Round-trip time, MSS, and congestion
information could be more appropriately handled in a network-layer information could be more appropriately handled in a network-layer
fashion, aggregated among concurrent connections, and shared across fashion, aggregated among concurrent connections, and shared across
connection instances [RFC3124]. connection instances [RFC3124].
skipping to change at page 14, line 8 skipping to change at page 15, line 20
An earlier version of RTT sharing suggested implementing RTT state An earlier version of RTT sharing suggested implementing RTT state
at the IP layer, rather than at the TCP layer. Our observations at the IP layer, rather than at the TCP layer. Our observations
describe sharing state among TCP connections, which avoids some of describe sharing state among TCP connections, which avoids some of
the difficulties in an IP-layer solution. One such problem of an IP the difficulties in an IP-layer solution. One such problem of an IP
layer solution is determining the correspondence between packet layer solution is determining the correspondence between packet
exchanges using IP header information alone, where such exchanges using IP header information alone, where such
correspondence is needed to compute RTT. Because TCB sharing correspondence is needed to compute RTT. Because TCB sharing
computes RTTs inside the TCP layer using TCP header information, it computes RTTs inside the TCP layer using TCP header information, it
can be implemented more directly and simply than at the IP layer. can be implemented more directly and simply than at the IP layer.
This is a case where information should be computed at the transport This is a case where information should be computed at the transport
layer, but could be shared at the network layer. layer but could be shared at the network layer.
9.2. Other possibilities
Per-host-pair associations are not the limit of these techniques. It Per-host-pair associations are not the limit of these techniques. It
is possible that TCBs could be similarly shared between hosts on a is possible that TCBs could be similarly shared between hosts on a
subnet or within a cluster, because the predominant path can be subnet or within a cluster, because the predominant path can be
subnet-subnet, rather than host-host. Additionally, TCB subnet-subnet, rather than host-host. Additionally, TCB
interdependence can be applied to any protocol with congestion interdependence can be applied to any protocol with congestion
state, including SCTP [RFC4960] and DCCP [RFC4340], as well as for state, including SCTP [RFC4960] and DCCP [RFC4340], as well as for
individual subflows in Multipath TCP [RFC6824]. individual subflows in Multipath TCP [RFC6824].
There may be other information that can be shared between concurrent There may be other information that can be shared between concurrent
connections. For example, knowing that another connection has just connections. For example, knowing that another connection has just
tried to expand its window size and failed, a connection may not tried to expand its window size and failed, a connection may not
attempt to do the same for some period. The idea is that existing attempt to do the same for some period. The idea is that existing
TCP implementations infer the behavior of all competing connections, TCP implementations infer the behavior of all competing connections,
including those within the same host or subnet. One possible including those within the same host or subnet. One possible
optimization is to make that implicit feedback explicit, via optimization is to make that implicit feedback explicit, via
extended information associated with the endpoint IP address and its extended information associated with the endpoint IP address and its
TCP implementation, rather than per-connection state in the TCB. TCP implementation, rather than per-connection state in the TCB.
Like the initial version of this document [RFC2140], this update's
approach to TCB interdependence focuses on sharing a set of TCBs by
updating the TCB state to reduce the impact of transients when
connections begin or end. Other mechanisms have since been proposed
to continuously share information between all ongoing communication
(including connectionless protocols), updating the congestion state
during any congestion-related event (e.g., timeout, loss
confirmation, etc.) [RFC3124]. By dealing exclusively with
transients, TCB interdependence is more likely to exhibit the same
behavior as unmodified, independent TCP connections.
10. Implementation Observations 10. Implementation Observations
The observation that some TCB state is host-pair specific rather The observation that some TCB state is host-pair specific rather
than application-pair dependent is not new and is a common than application-pair dependent is not new and is a common
engineering decision in layered protocol implementations. Although engineering decision in layered protocol implementations. Although
now deprecated, T/TCP [RFC1644] was the first to propose using now deprecated, T/TCP [RFC1644] was the first to propose using
caches in order to maintain TCB states (see Appendix A for more caches in order to maintain TCB states (see Appendix A for more
information). information).
The table below describes the current implementation status for some The table below describes the current implementation status for some
TCB information in Linux kernel version 4.6, FreeBSD 10 and Windows TCB information in Linux kernel version 4.6, FreeBSD 10 and Windows
(as of October 2016). In the table, "shared" only refers to temporal (as of October 2016). In the table, "shared" only refers to temporal
sharing. sharing.
CURRENT IMPLEMENTATION STATUS (as of 2016)
TCB data Status TCB data Status
----------------------------------------------------------- -----------------------------------------------------------
old MMS_S Not shared old MMS_S Not shared
old MMS_R Not shared old MMS_R Not shared
old_sendMSS Cached and shared in Linux (MSS) old_sendMSS Cached and shared in Linux (MSS)
old PMTU Cached and shared in FreeBSD and Windows (PMTU) old PMTU Cached and shared in FreeBSD and Windows (PMTU)
skipping to change at page 16, line 18 skipping to change at page 17, line 21
sharing over long timescales to adapt TCP's initial window sharing over long timescales to adapt TCP's initial window
automatically, largely imported from [To12]. automatically, largely imported from [To12].
Finally, this document updates and significantly expands the Finally, this document updates and significantly expands the
referenced literature. referenced literature.
12. Security Considerations 12. Security Considerations
The implementation methods presented here do not have additional The implementation methods presented here do not have additional
ramifications for explicit attacks. They may be susceptible to ramifications for explicit attacks. They may be susceptible to
denial-of-service attacks if not otherwise secured. For example, an denial-of-service attacks if not otherwise secured.
application can open a connection and set its window size to zero,
denying service to any other subsequent connection between those
hosts.
TCB sharing may be susceptible to denial-of-service attacks, TCB sharing may be susceptible to denial-of-service attacks,
wherever the TCB is shared, between connections in a single host, or wherever the TCB is shared, between connections in a single host, or
between hosts if TCB sharing is implemented within a subnet (see between hosts if TCB sharing is implemented within a subnet (see
Implications section). Some shared TCB parameters are used only to Implications section). Some shared TCB parameters are used only to
create new TCBs, others are shared among the TCBs of ongoing create new TCBs, others are shared among the TCBs of ongoing
connections. New connections can join the ongoing set, e.g., to connections. New connections can join the ongoing set, e.g., to
optimize send window size among a set of connections to the same optimize send window size among a set of connections to the same
host. host.
Attacks on parameters used only for initialization affect only the Attacks on parameters used only for initialization affect only the
transient performance of a TCP connection. For short connections, transient performance of a TCP connection. For short connections,
the performance ramification can approach that of a denial-of- the performance ramification can approach that of a denial-of-
service attack. E.g., if an application changes its TCB to have a service attack. E.g., if an application changes its TCB to have a
false and small window size, subsequent connections would experience false and small window size, subsequent connections would experience
performance degradation until their window grew appropriately. performance degradation until their window grew appropriately.
TCB sharing reuses and mixes information from past and current
connections. Although reusing information could create a potential
for fingerprinting to identify hosts, the mixing reduces that
potential. There has been no evidence of fingerprinting based on
this technique and it is currently considered safe in that regard.
13. IANA Considerations 13. IANA Considerations
There are no IANA implications or requests in this document. There are no IANA implications or requests in this document.
This section should be removed upon final publication as an RFC. This section should be removed upon final publication as an RFC.
14. References 14. References
14.1. Normative References 14.1. Normative References
[RFC793] Postel, Jon, "Transmission Control Protocol," Network
Working Group RFC-793/STD-7, ISI, Sept. 1981.
[RFC1122] Braden, R. (ed), "Requirements for Internet Hosts --
Communication Layers", RFC-1122, Oct. 1989.
[RFC1191] Mogul, J., Deering, S., "Path MTU Discovery," RFC 1191,
Nov. 1990.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC4821] Mathis, M., Heffner, J., "Packetization Layer Path MTU
Discovery," RFC 4821, Mar. 2007.
[RFC5681] Allman, M., Paxson, V., Blanton, E., "TCP Congestion
Control," RFC 5681 (Standards Track), Sep. 2009.
[RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., Jain, A., "TCP Fast
Open", RFC 7413, Dec. 2014.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", RFC 8174, May 2017. 2119 Key Words", RFC 8174, May 2017.
[RFC8201] McCann, J., Deering, S., Mogul, J., Hinden, R. (Ed.),
"Path MTU Discovery for IP version 6," RFC 8201, Jul.
2017.
14.2. Informative References 14.2. Informative References
[Al10] Allman, M., "Initial Congestion Window Specification", [Al10] Allman, M., "Initial Congestion Window Specification",
(work in progress), draft-allman-tcpm-bump-initcwnd-00, (work in progress), draft-allman-tcpm-bump-initcwnd-00,
Nov. 2010. Nov. 2010.
[Ba12] Barik, R., Welzl, M., Ferlin, S., Alay, O., "LISA: A [Ba12] Barik, R., Welzl, M., Ferlin, S., Alay, O., "LISA: A
Linked Slow-Start Algorithm for MPTCP", IEEE ICC, Kuala Linked Slow-Start Algorithm for MPTCP", IEEE ICC, Kuala
Lumpur, Malaysia, May 23-27 2016. Lumpur, Malaysia, May 23-27 2016.
skipping to change at page 17, line 48 skipping to change at page 19, line 29
Start Restart After Idle", draft-hughes-restart-00 Start Restart After Idle", draft-hughes-restart-00
(expired), Dec. 2001. (expired), Dec. 2001.
[Hu12] Hurtig, P., Brunstrom, A., "Enhanced metric caching for [Hu12] Hurtig, P., Brunstrom, A., "Enhanced metric caching for
short TCP flows," 2012 IEEE International Conference on short TCP flows," 2012 IEEE International Conference on
Communications (ICC), Ottawa, ON, 2012, pp. 1209-1213. Communications (ICC), Ottawa, ON, 2012, pp. 1209-1213.
[Ja88] Jacobson, V., M. Karels, "Congestion Avoidance and [Ja88] Jacobson, V., M. Karels, "Congestion Avoidance and
Control", Proc. Sigcomm 1988. Control", Proc. Sigcomm 1988.
[RFC793] Postel, Jon, "Transmission Control Protocol," Network
Working Group RFC-793/STD-7, ISI, Sept. 1981.
[RFC1122] Braden, R. (ed), "Requirements for Internet Hosts --
Communication Layers", RFC-1122, Oct. 1989.
[RFC1191] Mogul, J., Deering, S., "Path MTU Discovery," RFC 1191,
Nov. 1990.
[RFC1644] Braden, R., "T/TCP -- TCP Extensions for Transactions [RFC1644] Braden, R., "T/TCP -- TCP Extensions for Transactions
Functional Specification," RFC-1644, July 1994. Functional Specification," RFC-1644, July 1994.
[RFC1379] Braden, R., "Transaction TCP -- Concepts," RFC-1379, [RFC1379] Braden, R., "Transaction TCP -- Concepts," RFC-1379,
September 1992. September 1992.
[RFC2001] Stevens, W., "TCP Slow Start, Congestion Avoidance, Fast [RFC2001] Stevens, W., "TCP Slow Start, Congestion Avoidance, Fast
Retransmit, and Fast Recovery Algorithms", RFC2001 Retransmit, and Fast Recovery Algorithms", RFC2001
(Standards Track), Jan. 1997. (Standards Track), Jan. 1997.
skipping to change at page 18, line 46 skipping to change at page 20, line 17
[RFC3390] Allman, M., Floyd, S., Partridge, C., "Increasing TCP's [RFC3390] Allman, M., Floyd, S., Partridge, C., "Increasing TCP's
Initial Window," RFC 3390, Oct. 2002. Initial Window," RFC 3390, Oct. 2002.
[RFC3124] Balakrishnan, H., Seshan, S., "The Congestion Manager," [RFC3124] Balakrishnan, H., Seshan, S., "The Congestion Manager,"
RFC 3124, June 2001. RFC 3124, June 2001.
[RFC4340] Kohler, E., Handley, M., Floyd, S., "Datagram Congestion [RFC4340] Kohler, E., Handley, M., Floyd, S., "Datagram Congestion
Control Protocol (DCCP)," RFC 4340, Mar. 2006. Control Protocol (DCCP)," RFC 4340, Mar. 2006.
[RFC4821] Mathis, M., Heffner, J., "Packetization Layer Path MTU
Discovery," RFC 4821, Mar. 2007.
[RFC4960] Stewart, R., (Ed.), "Stream Control Transmission [RFC4960] Stewart, R., (Ed.), "Stream Control Transmission
Protocol," RFC4960, Sept. 2007. Protocol," RFC4960, Sept. 2007.
[RFC5681] Allman, M., Paxson, V., Blanton, E., "TCP Congestion
Control," RFC 5681 (Standards Track), Sep. 2009.
[RFC5925] Touch, J., Mankin, A., Bonica, R., "The TCP Authentication [RFC5925] Touch, J., Mankin, A., Bonica, R., "The TCP Authentication
Option," RFC 5925, June 2010. Option," RFC 5925, June 2010.
[RFC6824] Ford, A., Raiciu, C., Handley, M., Bonaventure, O., "TCP [RFC6824] Ford, A., Raiciu, C., Handley, M., Bonaventure, O., "TCP
Extensions for Multipath Operation with Multiple Extensions for Multipath Operation with Multiple
Addresses," RFC 6824, Jan. 2013. Addresses," RFC 6824, Jan. 2013.
[RFC6928] Chu, J., Dukkipati, N., Cheng, Y., Mathis, M., "Increasing [RFC6928] Chu, J., Dukkipati, N., Cheng, Y., Mathis, M., "Increasing
TCP's Initial Window," RFC 6928, Apr. 2013. TCP's Initial Window," RFC 6928, Apr. 2013.
[RFC7231] Fielding, R., J. Reschke, Eds., "HTTP/1.1 Semantics and [RFC7231] Fielding, R., J. Reschke, Eds., "HTTP/1.1 Semantics and
Content," RFC-7231, June 2014. Content," RFC-7231, June 2014.
[RFC7323] Borman, D., B. Braden, V. Jacobson, R. Scheffenegger [RFC7323] Borman, D., B. Braden, V. Jacobson, R. Scheffenegger
(Ed.), "TCP Extensions for High Performance," RFC 7323, (Ed.), "TCP Extensions for High Performance," RFC 7323,
Sept. 2014. Sept. 2014.
[RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., Jain, A., "TCP Fast
Open", RFC 7413, Dec. 2014.
[RFC7424] Krishnan, R., Yong, L., Ghanwani, A., So, N., Khasnabish, [RFC7424] Krishnan, R., Yong, L., Ghanwani, A., So, N., Khasnabish,
B., "Mechanisms for Optimizing Link Aggregation Group B., "Mechanisms for Optimizing Link Aggregation Group
(LAG) and Equal-Cost Multipath (ECMP) Component Link (LAG) and Equal-Cost Multipath (ECMP) Component Link
Utilization in Networks", RFC 7424, Jan. 2015. Utilization in Networks", RFC 7424, Jan. 2015.
[RFC7540] Belshe, M., Peon, R., Thomson, M., "Hypertext Transfer [RFC7540] Belshe, M., Peon, R., Thomson, M., "Hypertext Transfer
Protocol Version 2 (HTTP/2)", RFC 7540, May 2015. Protocol Version 2 (HTTP/2)", RFC 7540, May 2015.
[RFC7661] Fairhurst, G., Sathiaseelan, A., Secchi, R., "Updating TCP [RFC7661] Fairhurst, G., Sathiaseelan, A., Secchi, R., "Updating TCP
to Support Rate-Limited Traffic", RFC 7661, Oct. 2015. to Support Rate-Limited Traffic", RFC 7661, Oct. 2015.
[RFC8201] McCann, J., Deering, S., Mogul, J., Hinden, R. (Ed.),
"Path MTU Discovery for IP version 6," RFC 8201, Jul.
2017.
[To12] Touch, J., "Automating the Initial Window in TCP," draft- [To12] Touch, J., "Automating the Initial Window in TCP," draft-
touch-tcpm-automatic-iw-03 (expired), July 2012. touch-tcpm-automatic-iw-03 (expired), July 2012.
15. Acknowledgments 15. Acknowledgments
The authors would like to thank Praveen Balasubramanian for The authors would like to thank Praveen Balasubramanian for
information regarding TCB sharing in Windows, and Yuchung Cheng, information regarding TCB sharing in Windows, and Yuchung Cheng,
Lars Eggert, Ilpo Jarvinen and Michael Scharf for comments on Lars Eggert, Ilpo Jarvinen and Michael Scharf for comments on
earlier versions of the draft. Earlier revisions of this work earlier versions of the draft. Earlier revisions of this work
received funding from a collaborative research project between the received funding from a collaborative research project between the
University of Oslo and Huawei Technologies Co., Ltd. and were partly University of Oslo and Huawei Technologies Co., Ltd. and were partly
supported by USC/ISI's Postel Center. supported by USC/ISI's Postel Center.
This document was prepared using 2-Word-v2.0.template.dot. This document was prepared using 2-Word-v2.0.template.dot.
16. Change log 16. Change log
This section should be removed upon final publication as an RFC. This section should be removed upon final publication as an RFC.
ietf-02:
- Minor reorganization and correction of typographic errors
- Added text to address fingerprinting in Security section
- Now retains Appendix B and body option tables upon publication
ietf-01: ietf-01:
- Added Appendix C to address long-timescale temporal adaptation. - Added Appendix C to address long-timescale temporal adaptation.
ietf-00: ietf-00:
- Re-issued as draft-ietf-tcpm-2140bis due to WG adoption. - Re-issued as draft-ietf-tcpm-2140bis due to WG adoption.
- Cleaned orphan references to T/TCP, removed incomplete refs - Cleaned orphan references to T/TCP, removed incomplete refs
- Moved references to informative section and updated Sec 2 - Moved references to informative section and updated Sec 2
- Updated to clarify no impact to interoperability - Updated to clarify no impact to interoperability
skipping to change at page 21, line 14 skipping to change at page 22, line 26
- Marked entries that are considered safe to share with an - Marked entries that are considered safe to share with an
asterisk (suggestion was to split the table) asterisk (suggestion was to split the table)
- Discussed correct host identification: NATs may make IP - Discussed correct host identification: NATs may make IP
addresses the wrong input, could e.g. use HTTP cookie. addresses the wrong input, could e.g. use HTTP cookie.
- Included MMS_S and MMS_R from RFC1122; fixed the use of MSS and - Included MMS_S and MMS_R from RFC1122; fixed the use of MSS and
MTU MTU
- Added information about option sharing, listed options in - Added information about option sharing, listed options in 0
Appendix B
Authors' Addresses Authors' Addresses
Joe Touch Joe Touch
Manhattan Beach, CA 90266 Manhattan Beach, CA 90266
USA USA
Phone: +1 (310) 560-0334 Phone: +1 (310) 560-0334
Email: touch@strayalpha.com Email: touch@strayalpha.com
Michael Welzl Michael Welzl
University of Oslo University of Oslo
PO Box 1080 Blindern PO Box 1080 Blindern
Oslo N-0316 Oslo N-0316
Norway Norway
Phone: +47 22 85 24 20 Phone: +47 22 85 24 20
Email: michawe@ifi.uio.no Email: michawe@ifi.uio.no
Safiqul Islam Safiqul Islam
University of Oslo University of Oslo
PO Box 1080 Blindern PO Box 1080 Blindern
Oslo N-0316 Oslo N-0316
Norway Norway
Phone: +47 22 84 08 37 Phone: +47 22 84 08 37
Email: safiquli@ifi.uio.no Email: safiquli@ifi.uio.no
Appendix A: TCB sharing history Appendix A: TCB Sharing History
T/TCP proposed using caches to maintain TCB information across T/TCP proposed using caches to maintain TCB information across
instances (temporal sharing), e.g., smoothed RTT, RTT variance, instances (temporal sharing), e.g., smoothed RTT, RTT variance,
congestion avoidance threshold, and MSS [RFC1644]. These values were congestion avoidance threshold, and MSS [RFC1644]. These values were
in addition to connection counts used by T/TCP to accelerate data in addition to connection counts used by T/TCP to accelerate data
delivery prior to the full three-way handshake during an OPEN. The delivery prior to the full three-way handshake during an OPEN. The
goal was to aggregate TCB components where they reflect one goal was to aggregate TCB components where they reflect one
association - that of the host-pair, rather than artificially association - that of the host-pair, rather than artificially
separating those components by connection. separating those components by connection.
At least one T/TCP implementation saved the MSS and aggregated the At least one T/TCP implementation saved the MSS and aggregated the
RTT parameters across multiple connections, but omitted caching the RTT parameters across multiple connections but omitted caching the
congestion window information [Br94], as originally specified in congestion window information [Br94], as originally specified in
[RFC1379]. Some T/TCP implementations immediately updated MSS when [RFC1379]. Some T/TCP implementations immediately updated MSS when
the TCP MSS header option was received [Br94], although this was not the TCP MSS header option was received [Br94], although this was not
addressed specifically in the concepts or functional specification addressed specifically in the concepts or functional specification
[RFC1379][RFC1644]. In later T/TCP implementations, RTT values were [RFC1379][RFC1644]. In later T/TCP implementations, RTT values were
updated only after a CLOSE, which does not benefit concurrent updated only after a CLOSE, which does not benefit concurrent
sessions. sessions.
Temporal sharing of cached TCB data was originally implemented in Temporal sharing of cached TCB data was originally implemented in
the SunOS 4.1.3 T/TCP extensions [Br94] and the FreeBSD port of same the SunOS 4.1.3 T/TCP extensions [Br94] and the FreeBSD port of same
[FreeBSD]. As mentioned before, only the MSS and RTT parameters were [FreeBSD]. As mentioned before, only the MSS and RTT parameters were
cached, as originally specified in [RFC1379]. Later discussion of cached, as originally specified in [RFC1379]. Later discussion of
T/TCP suggested including congestion control parameters in this T/TCP suggested including congestion control parameters in this
cache; for example, [RFC1644] (Section 3.1) hints at initializing cache; for example, [RFC1644] (Section 3.1) hints at initializing
the congestion window to the old window size. the congestion window to the old window size.
Appendix B: TCP Option Sharing and Caching Appendix B: TCP Option Sharing and Caching
In addition to the options that can be cached and shared, this memo In addition to the options that can be cached and shared, this memo
also lists known options for which state is unsafe to be kept. This also lists known options for which state is unsafe to be kept. This
list is meant to avoid work duplication and should be removed upon list is not intended to be authoritative or exhaustive.
publication.
Obsolete (unsafe to keep state): Obsolete (unsafe to keep state):
ECHO ECHO
ECHO REPLY ECHO REPLY
PO Conn permitted PO Conn permitted
PO service profile PO service profile