draft-ietf-tcpm-2140bis-04.txt   draft-ietf-tcpm-2140bis-05.txt 
TCPM WG J. Touch TCPM WG J. Touch
Internet Draft Independent Internet Draft Independent
Intended status: Informational M. Welzl Intended status: Informational M. Welzl
Obsoletes: 2140 S. Islam Obsoletes: 2140 S. Islam
Expires: October 2020 University of Oslo Expires: October 2020 University of Oslo
April 29, 2020 April 29, 2020
TCP Control Block Interdependence TCP Control Block Interdependence
draft-ietf-tcpm-2140bis-04.txt draft-ietf-tcpm-2140bis-05.txt
Status of this Memo Status of this Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
This document may contain material from IETF Documents or IETF This document may contain material from IETF Documents or IETF
Contributions published or made publicly available before November Contributions published or made publicly available before November
10, 2008. The person(s) controlling the copyright in some of this 10, 2008. The person(s) controlling the copyright in some of this
material may not have granted the IETF Trust the right to allow material may not have granted the IETF Trust the right to allow
skipping to change at page 2, line 44 skipping to change at page 2, line 44
backward-compatibility with existing implementations. The sharing backward-compatibility with existing implementations. The sharing
described herein is limited to only the TCB initialization and so described herein is limited to only the TCB initialization and so
has no effect on the long-term behavior of TCP after a connection has no effect on the long-term behavior of TCP after a connection
has been established. has been established.
Table of Contents Table of Contents
1. Introduction...................................................3 1. Introduction...................................................3
2. Conventions Used in This Document..............................4 2. Conventions Used in This Document..............................4
3. Terminology....................................................4 3. Terminology....................................................4
4. The TCP Control Block (TCB)....................................4 4. The TCP Control Block (TCB)....................................6
5. TCB Interdependence............................................5 5. TCB Interdependence............................................6
6. Temporal Sharing...............................................6 6. Temporal Sharing...............................................7
6.1. Initialization of the new TCB................................6 6.1. Initialization of the new TCB................................7
6.2. Updates to the new TCB.......................................7 6.2. Updates to the new TCB.......................................8
6.3. Discussion...................................................7 6.3. Discussion...................................................9
7. Ensemble Sharing...............................................9 7. Ensemble Sharing..............................................10
7.1. Initialization of a new TCB..................................9 7.1. Initialization of a new TCB.................................10
7.2. Updates to the new TCB......................................10 7.2. Updates to the new TCB......................................11
7.3. Discussion..................................................11 7.3. Discussion..................................................12
8. Compatibility Issues..........................................12 8. Compatibility Issues..........................................13
8.1. Traversing the same network path............................12 8.1. Traversing the same network path............................14
8.2. State dependence............................................13 8.2. State dependence............................................14
8.3. Problems with IP sharing....................................13 8.3. Problems with IP sharing....................................15
9. Implications..................................................13 9. Implications..................................................15
9.1. Layering....................................................14 9.1. Layering....................................................15
9.2. Other possibilities.........................................14 9.2. Other possibilities.........................................16
10. Implementation Observations..................................15 10. Implementation Observations..................................16
11. Updates to RFC 2140..........................................16 11. Updates to RFC 2140..........................................17
12. Security Considerations......................................16 12. Security Considerations......................................18
13. IANA Considerations..........................................17 13. IANA Considerations..........................................18
14. References...................................................17 14. References...................................................19
14.1. Normative References....................................17 14.1. Normative References....................................19
14.2. Informative References..................................18 14.2. Informative References..................................19
15. Acknowledgments..............................................20 15. Acknowledgments..............................................21
16. Change log...................................................20 16. Change log...................................................22
Appendix A : TCB Sharing History.................................23 Appendix A : TCB Sharing History.................................25
Appendix B : TCP Option Sharing and Caching......................24 Appendix B : TCP Option Sharing and Caching......................26
Appendix C : Automating the Initial Window in TCP over Long Appendix C : Automating the Initial Window in TCP over Long
Timescales.......................................................26 Timescales.......................................................28
C.1. Introduction.............................................26 C.1. Introduction.............................................28
C.2. Design Considerations....................................26 C.2. Design Considerations....................................28
C.3. Proposed IW Algorithm....................................27 C.3. Proposed IW Algorithm....................................29
C.4. Discussion...............................................30 C.4. Discussion...............................................32
C.5. Observations.............................................31 C.5. Observations.............................................33
1. Introduction 1. Introduction
TCP is a connection-oriented reliable transport protocol layered TCP is a connection-oriented reliable transport protocol layered
over IP [RFC793]. Each TCP connection maintains state, usually in a over IP [RFC793]. Each TCP connection maintains state, usually in a
data structure called the TCP Control Block (TCB). The TCB contains data structure called the TCP Control Block (TCB). The TCB contains
information about the connection state, its associated local information about the connection state, its associated local
process, and feedback parameters about the connection's transmission process, and feedback parameters about the connection's transmission
properties. As originally specified and usually implemented, most properties. As originally specified and usually implemented, most
TCB information is maintained on a per-connection basis. Some TCB information is maintained on a per-connection basis. Some
skipping to change at page 4, line 32 skipping to change at page 4, line 32
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here. capitals, as shown here.
However, this document is intended to describe behavior that is However, this document is intended to describe behavior that is
already permitted by TCP standards. As a result, it provides already permitted by TCP standards. As a result, it provides
informative guidance but does not use such normative language, informative guidance but does not use such normative language,
except when quoting other documents. except when quoting other documents.
3. Terminology 3. Terminology
The following terminology is used frequently in this document. Items
preceded with a "+" may be part of the state maintained as TCP
connection state in the associated connection's TCB and are the focus
of sharing as described in this document.
+cwnd - the TCP congestion window size [RFC5681]
Host - a source or sink of TCP segments associated with a single IP Host - a source or sink of TCP segments associated with a single IP
address address
Host-pair - a pair of hosts and their corresponding IP addresses Host-pair - a pair of hosts and their corresponding IP addresses
+MMS_R - the maximum message size that can be received, the largest
received transport payload of an IP datagram [RFC1122]
+MMS_S - the maximum message size that can be sent, the largest
transmitted transport payload of an IP datagram [RFC1122]
Path - an Internet path between the IP addresses of two hosts Path - an Internet path between the IP addresses of two hosts
PCB - protocol control block, the data associated with a protocol as
maintained by an endpoint; a TCP PCB is called a TCB
PLPMTUD - packetization-layer path MTU discovery, a mechanism that
uses transport packets to discover the PMTU [RFC4821]
+PMTU - the largest IP datagram that can traverse a path
[RFC1191][RFC8201]
PMTUD - path-layer MTU discovery, a mechanism that relies on ICMP
error messages to discover the PMTU [RFC1191][RFC8201]
+RTT - the round-trip time of a TCP packet exchange [RFC793]
+RTTvar - the variance of the round-trip times of a TCP packet
exchange [RFC6298]
+RWIN - the TCP receive window size [RFC793]
+sendcwnd - the TCP send-side congestion window (cwnd) size
[RFC5681]
+sendMSS - the TCP maximum segment size, a value transmitted in a
TCP option that represents the largest TCP user data payload that
can be received [RFC793]
+ssthresh - the TCP slow-start threshold [RFC5681]
TCB - TCP Control Block, the data associated with a TCP connection
as maintained by an endpoint
TCP-AO - the TCP Authentication Option [RFC5925]
TFO - TCP Fast Open option [RFC7413]
+TFO_cookie - the TCP Fast Open cookie, state that is used as part
of the TFO mechanism, when TFO is supported [RFC7413]
+TFO_failure - an indication of when TFO option negotiation failed,
when TFO is supported
+TFOinfo - information cached when a TFO connection is established,
which includes the TFO_cookie [RFC7413]
4. The TCP Control Block (TCB) 4. The TCP Control Block (TCB)
A TCB describes the data associated with each connection, i.e., with A TCB describes the data associated with each connection, i.e., with
each association of a pair of applications across the network. The each association of a pair of applications across the network. The
TCB contains at least the following information [RFC793]: TCB contains at least the following information [RFC793]:
Local process state Local process state
pointers to send and receive buffers pointers to send and receive buffers
pointers to retransmission queue and current segment pointers to retransmission queue and current segment
skipping to change at page 6, line 35 skipping to change at page 7, line 39
old_PMTU old_PMTU old_PMTU old_PMTU
old_RTT old_RTT old_RTT old_RTT
old_RTTvar old_RTTvar old_RTTvar old_RTTvar
old_option (option specific) old_option (option specific)
old_ssthresh old_ssthresh old_ssthresh old_ssthresh
old_snd_cwnd old_snd_cwnd old_sendcwnd old_sendcwnd
The table below gives an overview of option-specific information The table below gives an overview of option-specific information
that can be shared. Additional information on some specific TCP that can be shared. Additional information on some specific TCP
options and sharing is provided in Appendix B. options and sharing is provided in Appendix B.
TEMPORAL SHARING - Option Info Initialization TEMPORAL SHARING - Option Info Initialization
Cached New Cached New
------------------------------------ ------------------------------------
old_TFO_Cookie old_TFO_Cookie old_TFO_cookie old_TFO_cookie
old_TFO_Failure old_TFO_Failure old_TFO_failure old_TFO_failure
6.2. Updates to the new TCB 6.2. Updates to the new TCB
During the connection, the associated TCB can be updated based on During the connection, the associated TCB can be updated based on
particular events, as shown below: particular events, as shown below:
TEMPORAL SHARING - Cache Updates TEMPORAL SHARING - Cache Updates
Cached TCB Current TCB when? New Cached TCB Cached TCB Current TCB when? New Cached TCB
---------------------------------------------------------- ----------------------------------------------------------
skipping to change at page 7, line 30 skipping to change at page 8, line 38
old_PMTU curr_PMTU PMTUD curr_PMTU old_PMTU curr_PMTU PMTUD curr_PMTU
old_RTT curr_RTT CLOSE merge(curr,old) old_RTT curr_RTT CLOSE merge(curr,old)
old_RTTvar curr_RTTvar CLOSE merge(curr,old) old_RTTvar curr_RTTvar CLOSE merge(curr,old)
old_option curr_option ESTAB (depends on option) old_option curr_option ESTAB (depends on option)
old_ssthresh curr_ssthresh CLOSE merge(curr,old) old_ssthresh curr_ssthresh CLOSE merge(curr,old)
old_snd_cwnd curr_snd_cwnd CLOSE merge(curr,old) old_sendcwnd curr_sendcwnd CLOSE merge(curr,old)
The table below gives an overview of option-specific information The table below gives an overview of option-specific information
that can be similarly shared. that can be similarly shared.
TEMPORAL SHARING - Option Info Updates TEMPORAL SHARING - Option Info Updates
Cached Current when? New Cached Cached Current when? New Cached
--------------------------------------------------------- ---------------------------------------------------------
old_TFO_Cookie old_TFO_Cookie ESTAB old_TFO_Cookie old_TFO_cookie old_TFO_cookie ESTAB old_TFO_cookie
old_TFO_Failure old_TFO_Failure ESTAB old_TFO_Failure old_TFO_failure old_TFO_failure ESTAB old_TFO_failure
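The cache-update rows above can be sketched in Python. This is an illustration only: the TcbState record, the on_pmtud and on_close helpers, and in particular the equal-weight mean used for merge(curr,old) are assumptions made for this sketch, since the tables do not define merge().

```python
from dataclasses import dataclass


@dataclass
class TcbState:
    """Hypothetical subset of the TCB fields listed in the table."""
    pmtu: int       # bytes
    rtt: float      # seconds
    rttvar: float   # seconds
    ssthresh: int   # bytes
    sendcwnd: int   # bytes


def merge(curr, old):
    # Assumption: merge(curr, old) is an equal-weight mean.
    # Other weightings are equally compatible with the table.
    return (curr + old) / 2


def on_pmtud(cache: TcbState, curr_pmtu: int) -> None:
    # PMTU row: the most recent value replaces the cached one.
    cache.pmtu = curr_pmtu


def on_close(cache: TcbState, curr: TcbState) -> None:
    # CLOSE rows: fold the closing connection's values into the cache.
    cache.rtt = merge(curr.rtt, cache.rtt)
    cache.rttvar = merge(curr.rttvar, cache.rttvar)
    cache.ssthresh = int(merge(curr.ssthresh, cache.ssthresh))
    cache.sendcwnd = int(merge(curr.sendcwnd, cache.sendcwnd))
```

A later connection to the same host would then copy the cached values into its new TCB, as in the initialization table of Section 6.1.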
6.3. Discussion 6.3. Discussion
There is no particular benefit to caching MMS_S and MMS_R as these There is no particular benefit to caching MMS_S and MMS_R as these
are reported by the local IP stack. Caching sendMSS and PMTU is are reported by the local IP stack. Caching sendMSS and PMTU is
trivial; reported values are cached, and the most recent values are trivial; reported values are cached, and the most recent values are
used. The cache is updated when the MSS option is received in a SYN used. The cache is updated when the MSS option is received in a SYN
or after PMTUD (i.e., when an ICMPv4 Fragmentation Needed [RFC1191] or after PMTUD (i.e., when an ICMPv4 Fragmentation Needed [RFC1191]
or ICMPv6 Packet Too Big message is received [RFC8201] or the or ICMPv6 Packet Too Big message is received [RFC8201] or the
equivalent is inferred, e.g. as from PLPMTUD [RFC4821]), equivalent is inferred, e.g. as from PLPMTUD [RFC4821]),
skipping to change at page 9, line 48 skipping to change at page 11, line 23
old_sendMSS old_sendMSS old_sendMSS old_sendMSS
old_PMTU old_PMTU old_PMTU old_PMTU
old_RTT old_RTT old_RTT old_RTT
old_RTTvar old_RTTvar old_RTTvar old_RTTvar
sum(old_ssthresh) f(sum(old_ssthresh), N) sum(old_ssthresh) f(sum(old_ssthresh), N)
sum(old_snd_cwnd) f(sum(old_snd_cwnd), N) sum(old_sendcwnd) f(sum(old_sendcwnd), N)
old_option (option specific) old_option (option specific)
The table below gives an overview of option-specific information The table below gives an overview of option-specific information
that can be similarly shared. that can be similarly shared.
ENSEMBLE SHARING - Option Info Initialization ENSEMBLE SHARING - Option Info Initialization
Cached New Cached New
------------------------------------ ------------------------------------
old_TFO_Cookie old_TFO_Cookie old_TFO_cookie old_TFO_cookie
old_TFO_Failure old_TFO_Failure old_TFO_failure old_TFO_failure
7.2. Updates to the new TCB 7.2. Updates to the new TCB
During the connection, the associated TCB can be updated based on During the connection, the associated TCB can be updated based on
changes to concurrent connections, as shown below: changes to concurrent connections, as shown below:
ENSEMBLE SHARING - Cache Updates ENSEMBLE SHARING - Cache Updates
Cached TCB Current TCB when? New Cached TCB Cached TCB Current TCB when? New Cached TCB
--------------------------------------------------------------- ---------------------------------------------------------------
old_MMS_S curr_MMS_S OPEN curr_MMS_S old_MMS_S curr_MMS_S OPEN curr_MMS_S
old_MMS_R curr_MMS_R OPEN curr_MMS_R old_MMS_R curr_MMS_R OPEN curr_MMS_R
old_sendMSS curr_sendMSS MSSopt curr_sendMSS old_sendMSS curr_sendMSS MSSopt curr_sendMSS
old_PMTU curr_PMTU PMTUD curr_PMTU / PLPMTUD old_PMTU curr_PMTU PMTUD / curr_PMTU
PLPMTUD
old_RTT curr_RTT update rtt_update(old,curr) old_RTT curr_RTT update rtt_update(old,curr)
old_RTTvar curr_RTTvar update rtt_update(old,curr) old_RTTvar curr_RTTvar update rtt_update(old,curr)
old_ssthresh curr_ssthresh update adjust sum as appropriate old_ssthresh curr_ssthresh update adjust sum as appropriate
old_snd_cwnd curr_snd_cwnd update adjust sum as appropriate old_sendcwnd curr_sendcwnd update adjust sum as appropriate
old_option curr_option (depends) (option specific) old_option curr_option (depends) (option specific)
The table below gives an overview of option-specific information The table below gives an overview of option-specific information
that can be similarly shared. that can be similarly shared.
ENSEMBLE SHARING - Option Info Updates ENSEMBLE SHARING - Option Info Updates
Cached Current when? New Cached Cached Current when? New Cached
---------------------------------------------------------- ----------------------------------------------------------
old_TFO_Cookie old_TFO_Cookie ESTAB old_TFO_Cookie old_TFO_cookie old_TFO_cookie ESTAB old_TFO_cookie
old_TFO_Failure old_TFO_Failure ESTAB old_TFO_Failure old_TFO_failure old_TFO_failure ESTAB old_TFO_failure
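The ensemble rows for sendcwnd can be sketched as follows. The EnsembleCache class and its method names are hypothetical, and the choice of f(sum, N) as an equal share among the N existing connections plus the new one is an assumption made for this sketch; the tables leave f() unspecified.

```python
def f(total: int, n: int) -> int:
    # Assumption: the new connection receives an equal share of the
    # aggregate window, counting itself alongside the N existing ones.
    return total // (n + 1)


class EnsembleCache:
    """Hypothetical per-host cache of concurrent connections' sendcwnd."""

    def __init__(self):
        self.sendcwnd = {}  # connection id -> current sendcwnd in bytes

    def update(self, conn_id, curr_sendcwnd: int) -> None:
        # "adjust sum as appropriate": replace this connection's
        # contribution to the aggregate.
        self.sendcwnd[conn_id] = curr_sendcwnd

    def initial_sendcwnd(self, default: int) -> int:
        # With no concurrent connections there is nothing to share,
        # so fall back to the caller's default initial window.
        n = len(self.sendcwnd)
        if n == 0:
            return default
        return f(sum(self.sendcwnd.values()), n)
```

The same pattern would apply to ssthresh, which the tables treat identically.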
7.3. Discussion 7.3. Discussion
For ensemble sharing, TCB information should be cached as early as For ensemble sharing, TCB information should be cached as early as
possible, sometimes before a connection is closed. Otherwise, possible, sometimes before a connection is closed. Otherwise,
opening multiple concurrent connections may not result in TCB data opening multiple concurrent connections may not result in TCB data
sharing if no connection closes before others open. The amount of sharing if no connection closes before others open. The amount of
work involved in updating the aggregate average should be minimized, work involved in updating the aggregate average should be minimized,
but the resulting value should be equivalent to having all values but the resulting value should be equivalent to having all values
measured within a single connection. The function "rtt_update" in measured within a single connection. The function "rtt_update" in
skipping to change at page 15, line 47 skipping to change at page 17, line 27
old_sendMSS Cached and shared in Linux (MSS) old_sendMSS Cached and shared in Linux (MSS)
old_PMTU Cached and shared in FreeBSD and Windows (PMTU) old_PMTU Cached and shared in FreeBSD and Windows (PMTU)
old_RTT Cached and shared in FreeBSD and Linux old_RTT Cached and shared in FreeBSD and Linux
old_RTTvar Cached and shared in FreeBSD old_RTTvar Cached and shared in FreeBSD
old_TFOinfo Cached and shared in Linux and Windows old_TFOinfo Cached and shared in Linux and Windows
old_snd_cwnd Not shared old_sendcwnd Not shared
old_ssthresh Cached and shared in FreeBSD and Linux* old_ssthresh Cached and shared in FreeBSD and Linux*
*Note: In FreeBSD, new ssthresh is the mean of curr_ssthresh and *Note: In FreeBSD, new ssthresh is the mean of curr_ssthresh and
previous value if a previous value exists; in Linux, the calculation previous value if a previous value exists; in Linux, the calculation
depends on state and is max(curr_cwnd/2, old_ssthresh) in most depends on state and is max(curr_cwnd/2, old_ssthresh) in most
cases. cases.
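The two ssthresh calculations in the note can be written out as a small Python sketch. The function names are hypothetical and this is not code from either kernel; it only restates the arithmetic in the note, and the Linux case covers only the "most cases" branch.

```python
from typing import Optional


def freebsd_new_ssthresh(curr_ssthresh: int,
                         old_ssthresh: Optional[int] = None) -> int:
    # Mean of the current and previous values if a previous value
    # exists; otherwise the current value is cached as-is.
    if old_ssthresh is None:
        return curr_ssthresh
    return (curr_ssthresh + old_ssthresh) // 2


def linux_new_ssthresh(curr_cwnd: int, old_ssthresh: int) -> int:
    # The common case from the note: max(curr_cwnd/2, old_ssthresh).
    # Other connection states are not modeled here.
    return max(curr_cwnd // 2, old_ssthresh)
```

Note that the Linux rule as stated never lowers the cached ssthresh, whereas the FreeBSD mean can move it in either direction.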
11. Updates to RFC 2140 11. Updates to RFC 2140
skipping to change at page 17, line 46 skipping to change at page 19, line 27
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC4821] Mathis, M., Heffner, J., "Packetization Layer Path MTU [RFC4821] Mathis, M., Heffner, J., "Packetization Layer Path MTU
Discovery," RFC 4821, Mar. 2007. Discovery," RFC 4821, Mar. 2007.
[RFC5681] Allman, M., Paxson, V., Blanton, E., "TCP Congestion [RFC5681] Allman, M., Paxson, V., Blanton, E., "TCP Congestion
Control," RFC 5681 (Standards Track), Sep. 2009. Control," RFC 5681 (Standards Track), Sep. 2009.
[RFC6298] Paxson, V., Allman, M., Chu, J., Sargent, M., "Computing
TCP's Retransmission Timer," RFC 6298, June 2011.
[RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., Jain, A., "TCP Fast [RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., Jain, A., "TCP Fast
Open", RFC 7413, Dec. 2014. Open", RFC 7413, Dec. 2014.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", RFC 8174, May 2017. 2119 Key Words", RFC 8174, May 2017.
[RFC8201] McCann, J., Deering, S., Mogul, J., Hinden, R. (Ed.), [RFC8201] McCann, J., Deering, S., Mogul, J., Hinden, R. (Ed.),
"Path MTU Discovery for IP version 6," RFC 8201, Jul. "Path MTU Discovery for IP version 6," RFC 8201, Jul.
2017. 2017.
skipping to change at page 19, line 18 skipping to change at page 20, line 48
[RFC2001] Stevens, W., "TCP Slow Start, Congestion Avoidance, Fast [RFC2001] Stevens, W., "TCP Slow Start, Congestion Avoidance, Fast
Retransmit, and Fast Recovery Algorithms", RFC2001 Retransmit, and Fast Recovery Algorithms", RFC2001
(Standards Track), Jan. 1997. (Standards Track), Jan. 1997.
[RFC2140] Touch, J., "TCP Control Block Interdependence", RFC 2140, [RFC2140] Touch, J., "TCP Control Block Interdependence", RFC 2140,
April 1997. April 1997.
[RFC2414] Allman, M., Floyd, S., Partridge, C., "Increasing TCP's [RFC2414] Allman, M., Floyd, S., Partridge, C., "Increasing TCP's
Initial Window", RFC 2414 (Experimental), Sept. 1998. Initial Window", RFC 2414 (Experimental), Sept. 1998.
[RFC2581] Allman, M., Paxson, V., Stevens, W., "TCP Congestion
Control," RFC2581 (Standards Track), Apr. 1999.
[RFC2663] Srisuresh, P., Holdrege, M., "IP Network Address [RFC2663] Srisuresh, P., Holdrege, M., "IP Network Address
Translator (NAT) Terminology and Considerations", RFC- Translator (NAT) Terminology and Considerations", RFC-
2663, August 1999. 2663, August 1999.
[RFC2861] Handley, M., Padhye, J., Floyd, S., "TCP Congestion Window
Validation", RFC2861 (Experimental), June 2000.
[RFC3390] Allman, M., Floyd, S., Partridge, C., "Increasing TCP's [RFC3390] Allman, M., Floyd, S., Partridge, C., "Increasing TCP's
Initial Window," RFC 3390, Oct. 2002. Initial Window," RFC 3390, Oct. 2002.
[RFC3124] Balakrishnan, H., Seshan, S., "The Congestion Manager," [RFC3124] Balakrishnan, H., Seshan, S., "The Congestion Manager,"
RFC 3124, June 2001. RFC 3124, June 2001.
[RFC4340] Kohler, E., Handley, M., Floyd, S., "Datagram Congestion [RFC4340] Kohler, E., Handley, M., Floyd, S., "Datagram Congestion
Control Protocol (DCCP)," RFC 4340, Mar. 2006. Control Protocol (DCCP)," RFC 4340, Mar. 2006.
[RFC4960] Stewart, R., (Ed.), "Stream Control Transmission [RFC4960] Stewart, R., (Ed.), "Stream Control Transmission
skipping to change at page 26, line 14 skipping to change at page 28, line 14
Appendix C: Automating the Initial Window in TCP over Long Timescales Appendix C: Automating the Initial Window in TCP over Long Timescales
Note: this section is imported from [To12], updated only to refer to Note: this section is imported from [To12], updated only to refer to
itself as an appendix. itself as an appendix.
C.1. Introduction C.1. Introduction
TCP's congestion control algorithm uses an initial window value TCP's congestion control algorithm uses an initial window value
(IW), both as a starting point for new connections and after one RTO (IW), both as a starting point for new connections and after one RTO
or more [RFC2581][RFC2861]. This value has evolved over time, or more [RFC5681][RFC7661]. This value has evolved over time,
originally one maximum segment size (MSS), and increased to the originally one maximum segment size (MSS), and increased to the
lesser of four MSS or 4,380 bytes [RFC3390][RFC5681]. For typical lesser of four MSS or 4,380 bytes [RFC3390][RFC5681]. For typical
Internet connections with a maximum transmission unit (MTU) of Internet connections with a maximum transmission unit (MTU) of
1500 bytes, this permits three segments of 1,460 bytes each. 1500 bytes, this permits three segments of 1,460 bytes each.
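The arithmetic behind the three-segment figure can be checked with a short Python sketch. RFC 3390 states the upper bound as min(4*MSS, max(2*MSS, 4380 bytes)); the function names below are hypothetical, and the 40-byte header overhead assumes IPv4 and TCP headers without options.

```python
def mss_from_mtu(mtu: int, ip_header: int = 20, tcp_header: int = 20) -> int:
    # Assumption: IPv4 and TCP headers carry no options.
    return mtu - ip_header - tcp_header


def initial_window_bytes(mss: int) -> int:
    # Upper bound on the initial window as given in RFC 3390.
    return min(4 * mss, max(2 * mss, 4380))


mss = mss_from_mtu(1500)              # 1460 bytes
iw = initial_window_bytes(mss)        # 4380 bytes
segments = iw // mss                  # 3 full-sized segments
```

For small segments the 4*MSS term governs (for example, an MSS of 536 bytes gives 2,144 bytes), and for segments larger than 2,190 bytes the 2*MSS term governs.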
The IW value was originally implied in the original TCP congestion The IW value was originally implied in the original TCP congestion
control description, and documented as a standard in 1997 control description, and documented as a standard in 1997
[RFC2001][Ja88]. The value was last updated in 1998 experimentally, [RFC2001][Ja88]. The value was last updated in 1998 experimentally,
and moved to the standards track in 2002 [RFC2414][RFC3390]. There and moved to the standards track in 2002 [RFC2414][RFC3390]. There
have been recent proposals to update the IW based on further have been recent proposals to update the IW based on further
 End of changes. 24 change blocks. 
56 lines changed or deleted 110 lines changed or added
