draft-ietf-mptcp-rfc6824bis-13.txt | draft-ietf-mptcp-rfc6824bis-14.txt | |||
---|---|---|---|---|
Internet Engineering Task Force A. Ford | Internet Engineering Task Force A. Ford | |||
Internet-Draft Pexip | Internet-Draft Pexip | |||
Obsoletes: 6824 (if approved) C. Raiciu | Obsoletes: 6824 (if approved) C. Raiciu | |||
Intended status: Standards Track U. Politechnica of Bucharest | Intended status: Standards Track U. Politechnica of Bucharest | |||
Expires: August 21, 2019 M. Handley | Expires: November 4, 2019 M. Handley | |||
U. College London | U. College London | |||
O. Bonaventure | O. Bonaventure | |||
U. catholique de Louvain | U. catholique de Louvain | |||
C. Paasch | C. Paasch | |||
Apple, Inc. | Apple, Inc. | |||
February 17, 2019 | May 3, 2019 | |||
TCP Extensions for Multipath Operation with Multiple Addresses | TCP Extensions for Multipath Operation with Multiple Addresses | |||
draft-ietf-mptcp-rfc6824bis-13 | draft-ietf-mptcp-rfc6824bis-14 | |||
Abstract | Abstract | |||
TCP/IP communication is currently restricted to a single path per | TCP/IP communication is currently restricted to a single path per | |||
connection, yet multiple paths often exist between peers. The | connection, yet multiple paths often exist between peers. The | |||
simultaneous use of these multiple paths for a TCP/IP session would | simultaneous use of these multiple paths for a TCP/IP session would | |||
improve resource usage within the network and, thus, improve user | improve resource usage within the network and, thus, improve user | |||
experience through higher throughput and improved resilience to | experience through higher throughput and improved resilience to | |||
network failure. | network failure. | |||
skipping to change at page 2, line 7 | skipping to change at page 2, line 7 | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on August 21, 2019. | This Internet-Draft will expire on November 4, 2019. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2019 IETF Trust and the persons identified as the | Copyright (c) 2019 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
skipping to change at page 3, line 5 | skipping to change at page 3, line 5 | |||
3.2. Starting a New Subflow . . . . . . . . . . . . . . . . . 23 | 3.2. Starting a New Subflow . . . . . . . . . . . . . . . . . 23 | |||
3.3. General MPTCP Operation . . . . . . . . . . . . . . . . . 28 | 3.3. General MPTCP Operation . . . . . . . . . . . . . . . . . 28 | |||
3.3.1. Data Sequence Mapping . . . . . . . . . . . . . . . . 30 | 3.3.1. Data Sequence Mapping . . . . . . . . . . . . . . . . 30 | |||
3.3.2. Data Acknowledgments . . . . . . . . . . . . . . . . 33 | 3.3.2. Data Acknowledgments . . . . . . . . . . . . . . . . 33 | |||
3.3.3. Closing a Connection . . . . . . . . . . . . . . . . 34 | 3.3.3. Closing a Connection . . . . . . . . . . . . . . . . 34 | |||
3.3.4. Receiver Considerations . . . . . . . . . . . . . . . 35 | 3.3.4. Receiver Considerations . . . . . . . . . . . . . . . 35 | |||
3.3.5. Sender Considerations . . . . . . . . . . . . . . . . 37 | 3.3.5. Sender Considerations . . . . . . . . . . . . . . . . 37 | |||
3.3.6. Reliability and Retransmissions . . . . . . . . . . . 37 | 3.3.6. Reliability and Retransmissions . . . . . . . . . . . 37 | |||
3.3.7. Congestion Control Considerations . . . . . . . . . . 39 | 3.3.7. Congestion Control Considerations . . . . . . . . . . 39 | |||
3.3.8. Subflow Policy . . . . . . . . . . . . . . . . . . . 39 | 3.3.8. Subflow Policy . . . . . . . . . . . . . . . . . . . 39 | |||
3.4. Address Knowledge Exchange (Path Management) . . . . . . 41 | 3.4. Address Knowledge Exchange (Path Management) . . . . . . 40 | |||
3.4.1. Address Advertisement . . . . . . . . . . . . . . . . 42 | 3.4.1. Address Advertisement . . . . . . . . . . . . . . . . 42 | |||
3.4.2. Remove Address . . . . . . . . . . . . . . . . . . . 45 | 3.4.2. Remove Address . . . . . . . . . . . . . . . . . . . 45 | |||
3.5. Fast Close . . . . . . . . . . . . . . . . . . . . . . . 46 | 3.5. Fast Close . . . . . . . . . . . . . . . . . . . . . . . 46 | |||
3.6. Subflow Reset . . . . . . . . . . . . . . . . . . . . . . 48 | 3.6. Subflow Reset . . . . . . . . . . . . . . . . . . . . . . 47 | |||
3.7. Fallback . . . . . . . . . . . . . . . . . . . . . . . . 49 | 3.7. Fallback . . . . . . . . . . . . . . . . . . . . . . . . 49 | |||
3.8. Error Handling . . . . . . . . . . . . . . . . . . . . . 53 | 3.8. Error Handling . . . . . . . . . . . . . . . . . . . . . 53 | |||
3.9. Heuristics . . . . . . . . . . . . . . . . . . . . . . . 53 | 3.9. Heuristics . . . . . . . . . . . . . . . . . . . . . . . 53 | |||
3.9.1. Port Usage . . . . . . . . . . . . . . . . . . . . . 54 | 3.9.1. Port Usage . . . . . . . . . . . . . . . . . . . . . 53 | |||
3.9.2. Delayed Subflow Start and Subflow Symmetry . . . . . 54 | 3.9.2. Delayed Subflow Start and Subflow Symmetry . . . . . 54 | |||
3.9.3. Failure Handling . . . . . . . . . . . . . . . . . . 55 | 3.9.3. Failure Handling . . . . . . . . . . . . . . . . . . 55 | |||
4. Semantic Issues . . . . . . . . . . . . . . . . . . . . . . . 56 | 4. Semantic Issues . . . . . . . . . . . . . . . . . . . . . . . 55 | |||
5. Security Considerations . . . . . . . . . . . . . . . . . . . 57 | 5. Security Considerations . . . . . . . . . . . . . . . . . . . 57 | |||
6. Interactions with Middleboxes . . . . . . . . . . . . . . . . 60 | 6. Interactions with Middleboxes . . . . . . . . . . . . . . . . 60 | |||
7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 63 | 7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 63 | |||
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 64 | 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 63 | |||
8.1. MPTCP Option Subtypes . . . . . . . . . . . . . . . . . . 64 | 8.1. MPTCP Option Subtypes . . . . . . . . . . . . . . . . . . 64 | |||
8.2. MPTCP Handshake Algorithms . . . . . . . . . . . . . . . 65 | 8.2. MPTCP Handshake Algorithms . . . . . . . . . . . . . . . 65 | |||
8.3. MP_TCPRST Reason Codes . . . . . . . . . . . . . . . . . 66 | 8.3. MP_TCPRST Reason Codes . . . . . . . . . . . . . . . . . 66 | |||
9. References . . . . . . . . . . . . . . . . . . . . . . . . . 67 | 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 66 | |||
9.1. Normative References . . . . . . . . . . . . . . . . . . 67 | 9.1. Normative References . . . . . . . . . . . . . . . . . . 66 | |||
9.2. Informative References . . . . . . . . . . . . . . . . . 67 | 9.2. Informative References . . . . . . . . . . . . . . . . . 67 | |||
Appendix A. Notes on Use of TCP Options . . . . . . . . . . . . 71 | Appendix A. Notes on Use of TCP Options . . . . . . . . . . . . 70 | |||
Appendix B. TCP Fast Open and MPTCP . . . . . . . . . . . . . . 72 | Appendix B. TCP Fast Open and MPTCP . . . . . . . . . . . . . . 71 | |||
B.1. TFO cookie request with MPTCP . . . . . . . . . . . . . . 72 | B.1. TFO cookie request with MPTCP . . . . . . . . . . . . . . 71 | |||
B.2. Data sequence mapping under TFO . . . . . . . . . . . . . 73 | B.2. Data sequence mapping under TFO . . . . . . . . . . . . . 72 | |||
B.3. Connection establishment examples . . . . . . . . . . . . 74 | B.3. Connection establishment examples . . . . . . . . . . . . 73 | |||
Appendix C. Control Blocks . . . . . . . . . . . . . . . . . . . 76 | Appendix C. Control Blocks . . . . . . . . . . . . . . . . . . . 75 | |||
C.1. MPTCP Control Block . . . . . . . . . . . . . . . . . . . 76 | C.1. MPTCP Control Block . . . . . . . . . . . . . . . . . . . 75 | |||
C.1.1. Authentication and Metadata . . . . . . . . . . . . . 76 | C.1.1. Authentication and Metadata . . . . . . . . . . . . . 75 | |||
C.1.2. Sending Side . . . . . . . . . . . . . . . . . . . . 77 | C.1.2. Sending Side . . . . . . . . . . . . . . . . . . . . 76 | |||
C.1.3. Receiving Side . . . . . . . . . . . . . . . . . . . 77 | C.1.3. Receiving Side . . . . . . . . . . . . . . . . . . . 76 | |||
C.2. TCP Control Blocks . . . . . . . . . . . . . . . . . . . 77 | C.2. TCP Control Blocks . . . . . . . . . . . . . . . . . . . 76 | |||
C.2.1. Sending Side . . . . . . . . . . . . . . . . . . . . 78 | C.2.1. Sending Side . . . . . . . . . . . . . . . . . . . . 77 | |||
C.2.2. Receiving Side . . . . . . . . . . . . . . . . . . . 78 | C.2.2. Receiving Side . . . . . . . . . . . . . . . . . . . 77 | |||
Appendix D. Finite State Machine . . . . . . . . . . . . . . . . 78 | Appendix D. Finite State Machine . . . . . . . . . . . . . . . . 77 | |||
Appendix E. Changes from RFC6184 . . . . . . . . . . . . . . . . 79 | Appendix E. Changes from RFC6184 . . . . . . . . . . . . . . . . 78 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 81 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 80 | |||
1. Introduction | 1. Introduction | |||
Multipath TCP (MPTCP) is a set of extensions to regular TCP [RFC0793] | Multipath TCP (MPTCP) is a set of extensions to regular TCP [RFC0793] | |||
to provide a Multipath TCP [RFC6182] service, which enables a | to provide a Multipath TCP [RFC6182] service, which enables a | |||
transport connection to operate across multiple paths simultaneously. | transport connection to operate across multiple paths simultaneously. | |||
This document presents the protocol changes required to add multipath | This document presents the protocol changes required to add multipath | |||
capability to TCP; specifically, those for signaling and setting up | capability to TCP; specifically, those for signaling and setting up | |||
multiple paths ("subflows"), managing these subflows, reassembly of | multiple paths ("subflows"), managing these subflows, reassembly of | |||
data, and termination of sessions. This is not the only information | data, and termination of sessions. This is not the only information | |||
skipping to change at page 4, line 24 | skipping to change at page 4, line 24 | |||
o Congestion control [RFC6356] presents a safe congestion control | o Congestion control [RFC6356] presents a safe congestion control | |||
algorithm for coupling the behavior of the multiple paths in order | algorithm for coupling the behavior of the multiple paths in order | |||
to "do no harm" to other network users. | to "do no harm" to other network users. | |||
o Application considerations [RFC6897] discusses what impact MPTCP | o Application considerations [RFC6897] discusses what impact MPTCP | |||
will have on applications, what applications will want to do with | will have on applications, what applications will want to do with | |||
MPTCP, and as a consequence of these factors, what API extensions | MPTCP, and as a consequence of these factors, what API extensions | |||
an MPTCP implementation should present. | an MPTCP implementation should present. | |||
This document is an update to, and obsoletes, the v0 specification of | This document is an update to, and obsoletes, the v0 specification of | |||
Multipath TCP [RFC6824]. This document specifies MPTCP v1, which is | Multipath TCP (RFC6824). This document specifies MPTCP v1, which is | |||
not backward compatible with MPTCP v0. This document additionally | not backward compatible with MPTCP v0. This document additionally | |||
defines version negotiation procedures for implementations that | defines version negotiation procedures for implementations that | |||
support both versions. | support both versions. | |||
1.1. Design Assumptions | 1.1. Design Assumptions | |||
In order to limit the potentially huge design space, the working | In order to limit the potentially huge design space, the mptcp | |||
group imposed two key constraints on the Multipath TCP design | working group imposed two key constraints on the Multipath TCP design | |||
presented in this document: | presented in this document: | |||
o It must be backwards-compatible with current, regular TCP, to | o It must be backwards-compatible with current, regular TCP, to | |||
increase its chances of deployment. | increase its chances of deployment. | |||
o It can be assumed that one or both hosts are multihomed and | o It can be assumed that one or both hosts are multihomed and | |||
multiaddressed. | multiaddressed. | |||
To simplify the design, we assume that the presence of multiple | To simplify the design, we assume that the presence of multiple | |||
addresses at a host is sufficient to indicate the existence of | addresses at a host is sufficient to indicate the existence of | |||
skipping to change at page 7, line 34 | skipping to change at page 7, line 34 | |||
with the existing session, which continues to appear as a single | with the existing session, which continues to appear as a single | |||
connection to the applications at both ends. The creation of the | connection to the applications at both ends. The creation of the | |||
additional TCP session is illustrated between Address A2 on Host A | additional TCP session is illustrated between Address A2 on Host A | |||
and Address B1 on Host B. | and Address B1 on Host B. | |||
o MPTCP identifies multiple paths by the presence of multiple | o MPTCP identifies multiple paths by the presence of multiple | |||
addresses at hosts. Combinations of these multiple addresses | addresses at hosts. Combinations of these multiple addresses | |||
equate to the additional paths. In the example, other potential | equate to the additional paths. In the example, other potential | |||
paths that could be set up are A1<->B2 and A2<->B2. Although this | paths that could be set up are A1<->B2 and A2<->B2. Although this | |||
additional session is shown as being initiated from A2, it could | additional session is shown as being initiated from A2, it could | |||
equally have been initiated from B1. | equally have been initiated from B1 or B2. | |||
o The discovery and setup of additional subflows will be achieved | o The discovery and setup of additional subflows will be achieved | |||
through a path management method; this document describes a | through a path management method; this document describes a | |||
mechanism by which a host can initiate new subflows by using its | mechanism by which a host can initiate new subflows by using its | |||
own additional addresses, or by signaling its available addresses | own additional addresses, or by signaling its available addresses | |||
to the other host. | to the other host. | |||
o MPTCP adds connection-level sequence numbers to allow the | o MPTCP adds connection-level sequence numbers to allow the | |||
reassembly of segments arriving on multiple subflows with | reassembly of segments arriving on multiple subflows with | |||
differing network delays. | differing network delays. | |||
skipping to change at page 14, line 44 | skipping to change at page 14, line 44 | |||
o MPTCP falls back to ordinary TCP if MPTCP operation is not | o MPTCP falls back to ordinary TCP if MPTCP operation is not | |||
possible, for example, if one host is not MPTCP capable or if a | possible, for example, if one host is not MPTCP capable or if a | |||
middlebox alters the payload. This is discussed in Section 3.7. | middlebox alters the payload. This is discussed in Section 3.7. | |||
o To address the threats identified in [RFC6181], the following | o To address the threats identified in [RFC6181], the following | |||
steps are taken: keys are sent in the clear in the MP_CAPABLE | steps are taken: keys are sent in the clear in the MP_CAPABLE | |||
messages; MP_JOIN messages are secured with HMAC-SHA256 | messages; MP_JOIN messages are secured with HMAC-SHA256 | |||
([RFC2104], [SHS]) using those keys; and standard TCP validity | ([RFC2104], [SHS]) using those keys; and standard TCP validity | |||
checks are made on the other messages (ensuring sequence numbers | checks are made on the other messages (ensuring sequence numbers | |||
are in-window [RFC5961]). Residual threats to MPTCP v0 [RFC6824] | are in-window [RFC5961]). Residual threats to MPTCP v0 were | |||
were identified in [RFC7430], and those affecting the protocol | identified in [RFC7430], and those affecting the protocol (i.e. | |||
(i.e. modification to ADD_ADDR) have been incorporated in this | modification to ADD_ADDR) have been incorporated in this document. | |||
document. Further discussion of security can be found in | Further discussion of security can be found in Section 5. | |||
Section 5. | ||||
3. MPTCP Protocol | 3. MPTCP Protocol | |||
This section describes the operation of the MPTCP protocol, and is | This section describes the operation of the MPTCP protocol, and is | |||
subdivided into sections for each key part of the protocol operation. | subdivided into sections for each key part of the protocol operation. | |||
All MPTCP operations are signaled using optional TCP header fields. | All MPTCP operations are signaled using optional TCP header fields. | |||
A single TCP option number ("Kind") has been assigned by IANA for | A single TCP option number ("Kind") has been assigned by IANA for | |||
MPTCP (see Section 8), and then individual messages will be | MPTCP (see Section 8), and then individual messages will be | |||
determined by a "subtype", the values of which are also stored in an | determined by a "subtype", the values of which are also stored in an | |||
skipping to change at page 16, line 23 | skipping to change at page 16, line 23 | |||
3.1. Connection Initiation | 3.1. Connection Initiation | |||
Connection initiation begins with a SYN, SYN/ACK, ACK exchange on a | Connection initiation begins with a SYN, SYN/ACK, ACK exchange on a | |||
single path. Each packet contains the Multipath Capable (MP_CAPABLE) | single path. Each packet contains the Multipath Capable (MP_CAPABLE) | |||
MPTCP option (Figure 4). This option declares its sender is capable | MPTCP option (Figure 4). This option declares its sender is capable | |||
of performing Multipath TCP and wishes to do so on this particular | of performing Multipath TCP and wishes to do so on this particular | |||
connection. | connection. | |||
The MP_CAPABLE exchange in this specification (v1) is different to | The MP_CAPABLE exchange in this specification (v1) is different to | |||
that specified in v0 [RFC6824]. If a host supports multiple versions | that specified in v0. If a host supports multiple versions of MPTCP, | |||
of MPTCP, the sender of the MP_CAPABLE option SHOULD signal the | the sender of the MP_CAPABLE option SHOULD signal the highest version | |||
highest version number it supports. In return, in its MP_CAPABLE | number it supports. In return, in its MP_CAPABLE option, the | |||
option, the receiver will signal the version number it wishes to use, | receiver will signal the version number it wishes to use, which MUST | |||
which MUST be equal to or lower than the version number indicated in | be equal to or lower than the version number indicated in the initial | |||
the initial MP_CAPABLE. There is a caveat though with respect to | MP_CAPABLE. There is a caveat though with respect to this version | |||
this version negotiation with old listeners that only support v0. A | negotiation with old listeners that only support v0. A listener that | |||
listener that supports v0 expects that the MP_CAPABLE option in the | supports v0 expects that the MP_CAPABLE option in the SYN-segment | |||
SYN-segment includes the initiator's key. If the initiator however | includes the initiator's key. If the initiator however already | |||
already upgraded to v1, it won't include the key in the SYN-segment. | upgraded to v1, it won't include the key in the SYN-segment. Thus, | |||
Thus, the listener will ignore the MP_CAPABLE of this SYN-segment and | the listener will ignore the MP_CAPABLE of this SYN-segment and reply | |||
reply with a SYN/ACK that does not include an MP_CAPABLE, thus | with a SYN/ACK that does not include an MP_CAPABLE. The initiator | |||
leading to a fallback to regular TCP. An initiator MAY cache this | MAY choose to immediately fall back to TCP or MAY choose to attempt a | |||
information about a peer and for future connections, MAY choose to | connection using MPTCP v0 (if the initiator supports v0), in order to | |||
attempt using MPTCP v0, if supported, before recording the host as | discover whether the listener supports the earlier version of MPTCP. | |||
not supporting MPTCP. | In general a MPTCP v0 connection is likely to be preferred to a TCP | |||
| | one, however in a particular deployment scenario it may be known that | |||
| | the listener is unlikely to support MPTCPv0 and so the initiator may | |||
| | prefer not to attempt a v0 connection. An initiator MAY cache | |||
| | information for a peer about what version of MPTCP it supports if | |||
| | any, and use this information for future connection attempts. | |||
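
The version negotiation described above (as worded in the -14 column) can be sketched as follows. This is a minimal illustration, not part of either draft: the function names are hypothetical, and picking the highest mutually supported version is an assumption, since the text only requires the reply to be equal to or lower than the offered version.

```python
from typing import Iterable, Optional


def choose_version(offered: int, supported: Iterable[int]) -> Optional[int]:
    """Listener side (hypothetical helper).

    Return the version to signal in the SYN/ACK MP_CAPABLE. The result never
    exceeds the version offered by the initiator. None means "reply without
    MP_CAPABLE", which leads the initiator to fall back.
    """
    candidates = [v for v in supported if v <= offered]
    # Assumption: prefer the highest version both sides can speak.
    return max(candidates) if candidates else None


def next_attempt(synack_has_mp_capable: bool,
                 initiator_supports_v0: bool,
                 retry_with_v0: bool) -> str:
    """Initiator side (hypothetical helper) after sending a v1 SYN.

    If the SYN/ACK carries no MP_CAPABLE, the initiator either falls back to
    TCP or tries a v0 connection to probe for an older listener.
    """
    if synack_has_mp_capable:
        return "mptcp"
    if initiator_supports_v0 and retry_with_v0:
        return "retry-v0"
    return "tcp"
```

The `retry_with_v0` flag stands in for the deployment-specific preference the text mentions, for example a cached record that the peer is unlikely to support v0.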
The MP_CAPABLE option is variable-length, with different fields | The MP_CAPABLE option is variable-length, with different fields | |||
included depending on which packet the option is used on. The full | included depending on which packet the option is used on. The full | |||
MP_CAPABLE option is shown in Figure 4. | MP_CAPABLE option is shown in Figure 4. | |||
1 2 3 | 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+---------------+---------------+-------+-------+---------------+ | +---------------+---------------+-------+-------+---------------+ | |||
| Kind | Length |Subtype|Version|A|B|C|D|E|F|G|H| | | Kind | Length |Subtype|Version|A|B|C|D|E|F|G|H| | |||
+---------------+---------------+-------+-------+---------------+ | +---------------+---------------+-------+-------+---------------+ | |||
skipping to change at page 22, line 6 | skipping to change at page 22, line 6 | |||
For crypto negotiation, the responder has the choice. The initiator | For crypto negotiation, the responder has the choice. The initiator | |||
creates a proposal setting a bit for each algorithm it supports to 1 | creates a proposal setting a bit for each algorithm it supports to 1 | |||
(in this version of the specification, there is only one proposal, so | (in this version of the specification, there is only one proposal, so | |||
bit "H" will be always set to 1). The responder responds with only 1 | bit "H" will be always set to 1). The responder responds with only 1 | |||
bit set -- this is the chosen algorithm. The rationale for this | bit set -- this is the chosen algorithm. The rationale for this | |||
behavior is that the responder will typically be a server with | behavior is that the responder will typically be a server with | |||
potentially many thousands of connections, so it may wish to choose | potentially many thousands of connections, so it may wish to choose | |||
an algorithm with minimal computational complexity, depending on the | an algorithm with minimal computational complexity, depending on the | |||
load. If a responder does not support (or does not want to support) | load. If a responder does not support (or does not want to support) | |||
any of the initiator's proposals, it can respond without an | any of the initiator's proposals, it MUST respond without an | |||
MP_CAPABLE option, thus forcing a fallback to regular TCP. | MP_CAPABLE option, thus forcing a fallback to regular TCP. | |||
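
As a rough sketch of the crypto negotiation above: the initiator's proposal and the responder's capabilities can each be modeled as a bit mask with one bit per algorithm, and the responder answers with exactly one bit or with no MP_CAPABLE at all. The function name, the bit value chosen for flag "H", and the "lowest set bit wins" policy are all assumptions made for illustration; a real responder may choose by load or computational cost.

```python
from typing import Optional

# Assumed position of flag "H" within the algorithm bit mask.
HMAC_SHA256_BIT = 0x01


def choose_algorithm(proposal: int, supported: int) -> Optional[int]:
    """Responder side (hypothetical helper).

    proposal:  bit mask of algorithms offered by the initiator.
    supported: bit mask of algorithms the responder is willing to use.
    Returns a mask with exactly one bit set, or None when there is no
    overlap and the responder must reply without MP_CAPABLE.
    """
    common = proposal & supported
    if common == 0:
        return None
    # Illustrative policy only: isolate the lowest set bit.
    return common & -common
```

With only one algorithm defined, `choose_algorithm(HMAC_SHA256_BIT, HMAC_SHA256_BIT)` simply echoes the single proposed bit.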
The MP_CAPABLE option is only used in the first subflow of a | The MP_CAPABLE option is only used in the first subflow of a | |||
connection, in order to identify the connection; all following | connection, in order to identify the connection; all following | |||
subflows will use the "Join" option (see Section 3.2) to join the | subflows will use the "Join" option (see Section 3.2) to join the | |||
existing connection. | existing connection. | |||
If a SYN contains an MP_CAPABLE option but the SYN/ACK does not, it | If a SYN contains an MP_CAPABLE option but the SYN/ACK does not, it | |||
is assumed that sender of the SYN/ACK is not multipath capable; thus, | is assumed that sender of the SYN/ACK is not multipath capable; thus, | |||
the MPTCP session MUST operate as a regular, single-path TCP. If a | the MPTCP session MUST operate as a regular, single-path TCP. If a | |||
skipping to change at page 24, line 34 | skipping to change at page 24, line 34 | |||
path (B=1) in the event of failure of other paths, or whether it | path (B=1) in the event of failure of other paths, or whether it | |||
wants it to be used as part of the connection immediately. By | wants it to be used as part of the connection immediately. By | |||
setting B=1, the sender of the option is requesting the other host to | setting B=1, the sender of the option is requesting the other host to | |||
only send data on this subflow if there are no available subflows | only send data on this subflow if there are no available subflows | |||
where B=0. Subflow policy is discussed in more detail in | where B=0. Subflow policy is discussed in more detail in | |||
Section 3.3.8. | Section 3.3.8. | |||
1 2 3 | 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+---------------+---------------+-------+-----+-+---------------+ | +---------------+---------------+-------+-----+-+---------------+ | |||
| Kind | Length = 12 |Subtype| |B| Address ID | | | Kind | Length = 12 |Subtype|(rsv)|B| Address ID | | |||
+---------------+---------------+-------+-----+-+---------------+ | +---------------+---------------+-------+-----+-+---------------+ | |||
| Receiver's Token (32 bits) | | | Receiver's Token (32 bits) | | |||
+---------------------------------------------------------------+ | +---------------------------------------------------------------+ | |||
| Sender's Random Number (32 bits) | | | Sender's Random Number (32 bits) | | |||
+---------------------------------------------------------------+ | +---------------------------------------------------------------+ | |||
Figure 5: Join Connection (MP_JOIN) Option (for Initial SYN) | Figure 5: Join Connection (MP_JOIN) Option (for Initial SYN) | |||
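
The layout in Figure 5 can be serialized as in the sketch below. The helper name is hypothetical, the Kind and Subtype values are passed in by the caller because this excerpt does not state them, and setting the three reserved bits to zero is an assumption.

```python
import struct


def pack_mp_join_syn(kind: int, subtype: int, backup: bool,
                     address_id: int, token: int, nonce: int) -> bytes:
    """Build the 12-byte MP_JOIN option carried on the initial SYN.

    Byte 2 holds the 4-bit subtype, 3 reserved bits (assumed zero) and the
    B (backup) flag in the least significant bit.
    """
    if not 0 <= subtype <= 0xF:
        raise ValueError("subtype must fit in 4 bits")
    flags = (subtype << 4) | (1 if backup else 0)
    # Network byte order: Kind, Length, Subtype/rsv/B, Address ID,
    # Receiver's Token (32 bits), Sender's Random Number (32 bits).
    return struct.pack("!BBBBII", kind, 12, flags, address_id, token, nonce)
```

For example, `pack_mp_join_syn(30, 1, True, 2, 0x11223344, 0xAABBCCDD)` yields a 12-byte option whose third byte is `0x11`; the values 30 and 1 are illustrative inputs here.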
When receiving a SYN with an MP_JOIN option that contains a valid | When receiving a SYN with an MP_JOIN option that contains a valid | |||
token for an existing MPTCP connection, the recipient SHOULD respond | token for an existing MPTCP connection, the recipient SHOULD respond | |||
skipping to change at page 26, line 8 | skipping to change at page 26, line 8 | |||
transmitted by Host A, will be Key-A followed by Key-B, and in the | transmitted by Host A, will be Key-A followed by Key-B, and in the | |||
case of Host B, Key-B followed by Key-A. These are the keys that | case of Host B, Key-B followed by Key-A. These are the keys that | |||
were exchanged in the original MP_CAPABLE handshake. The "message" | were exchanged in the original MP_CAPABLE handshake. The "message" | |||
for the HMAC algorithm in each case is the concatenations of random | for the HMAC algorithm in each case is the concatenations of random | |||
number for each host (denoted by R): for Host A, R-A followed by R-B; | number for each host (denoted by R): for Host A, R-A followed by R-B; | |||
and for Host B, R-B followed by R-A. | and for Host B, R-B followed by R-A. | |||
1 2 3 | 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+---------------+---------------+-------+-----+-+---------------+ | +---------------+---------------+-------+-----+-+---------------+ | |||
| Kind | Length = 16 |Subtype| |B| Address ID | | | Kind | Length = 16 |Subtype|(rsv)|B| Address ID | | |||
+---------------+---------------+-------+-----+-+---------------+ | +---------------+---------------+-------+-----+-+---------------+ | |||
| | | | | | |||
| Sender's Truncated HMAC (64 bits) | | | Sender's Truncated HMAC (64 bits) | | |||
| | | | | | |||
+---------------------------------------------------------------+ | +---------------------------------------------------------------+ | |||
| Sender's Random Number (32 bits) | | | Sender's Random Number (32 bits) | | |||
+---------------------------------------------------------------+ | +---------------------------------------------------------------+ | |||
Figure 6: Join Connection (MP_JOIN) Option (for Responding SYN/ACK) | Figure 6: Join Connection (MP_JOIN) Option (for Responding SYN/ACK) | |||
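
A minimal sketch of the HMAC construction described above, using HMAC-SHA256 from the Python standard library. The function names are hypothetical. Each host puts its own key and random number first; taking the leftmost 64 bits for the truncated value in Figure 6 is an assumption, since the excerpt only says the HMAC is truncated.

```python
import hashlib
import hmac


def join_hmac(local_key: bytes, remote_key: bytes,
              local_nonce: bytes, remote_nonce: bytes) -> bytes:
    """HMAC as computed by the sending host.

    Key:     sender's key followed by the peer's key.
    Message: sender's random number followed by the peer's random number.
    """
    key = local_key + remote_key
    message = local_nonce + remote_nonce
    return hmac.new(key, message, hashlib.sha256).digest()


def truncated_join_hmac(local_key: bytes, remote_key: bytes,
                        local_nonce: bytes, remote_nonce: bytes) -> bytes:
    """64-bit value for the SYN/ACK (assumed: leftmost 8 bytes)."""
    return join_hmac(local_key, remote_key, local_nonce, remote_nonce)[:8]
```

Host A would call `join_hmac(key_a, key_b, r_a, r_b)` and Host B `join_hmac(key_b, key_a, r_b, r_a)`; because the ordering differs, the two values differ and one cannot be replayed as the other.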
skipping to change at page 30, line 43 | skipping to change at page 30, line 43 | |||
the data sequence number after the mapping has been processed. A | the data sequence number after the mapping has been processed. A | |||
sender MUST NOT change this mapping after it has been declared; | sender MUST NOT change this mapping after it has been declared; | |||
however, the same data sequence number can be mapped to by different | however, the same data sequence number can be mapped to by different | |||
subflows for retransmission purposes (see Section 3.3.6). This would | subflows for retransmission purposes (see Section 3.3.6). This would | |||
also permit the same data to be sent simultaneously on multiple | also permit the same data to be sent simultaneously on multiple | |||
subflows for resilience or efficiency purposes, especially in the | subflows for resilience or efficiency purposes, especially in the | |||
case of lossy links. Although the detailed specification of such | case of lossy links. Although the detailed specification of such | |||
operation is outside the scope of this document, an implementation | operation is outside the scope of this document, an implementation | |||
SHOULD treat the first data that is received at a subflow for the | SHOULD treat the first data that is received at a subflow for the | |||
data sequence space as that which should be delivered to the | data sequence space as that which should be delivered to the | |||
application, and any later data for that sequence space should be | application, and any later data for that sequence space SHOULD be | |||
ignored. | ignored. | |||
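
The "first data received wins" behavior can be illustrated with a toy receive buffer. This is a sketch under simplifying assumptions: the class name is hypothetical, data is tracked per byte in a dictionary keyed by data sequence number, and sequence number wraparound, windows and checksums are ignored.

```python
class FirstArrivalBuffer:
    """Toy connection-level buffer keyed by data sequence number (DSN)."""

    def __init__(self) -> None:
        self._bytes = {}  # DSN -> byte value, first arrival only

    def receive(self, dsn: int, payload: bytes) -> int:
        """Store payload starting at dsn; return how many bytes were new.

        Bytes whose DSN is already present are ignored, whichever subflow
        delivered them first.
        """
        accepted = 0
        for offset, value in enumerate(payload):
            if dsn + offset not in self._bytes:
                self._bytes[dsn + offset] = value
                accepted += 1
        return accepted

    def read(self, start: int, length: int) -> bytes:
        """Return contiguous data from start, stopping at the first hole."""
        out = bytearray()
        for seq in range(start, start + length):
            if seq not in self._bytes:
                break
            out.append(self._bytes[seq])
        return bytes(out)
```

If one subflow delivers `b"abcd"` at DSN 100 and another later delivers `b"XYZ!"` at DSN 102, only the last two bytes of the second segment are kept.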
The data sequence number is specified as an absolute value, whereas | The data sequence number is specified as an absolute value, whereas | |||
the subflow sequence numbering is relative (the SYN at the start of | the subflow sequence numbering is relative (the SYN at the start of | |||
the subflow has relative subflow sequence number 0). This is to | the subflow has relative subflow sequence number 0). This is to | |||
allow middleboxes to change the initial sequence number of a subflow, | allow middleboxes to change the initial sequence number of a subflow, | |||
such as firewalls that undertake Initial Sequence Number (ISN) | such as firewalls that undertake Initial Sequence Number (ISN) | |||
randomization. | randomization. | |||
The data sequence mapping also contains a checksum of the data that | The data sequence mapping also contains a checksum of the data that | |||
skipping to change at page 33, line 32 ¶ | skipping to change at page 33, line 32 ¶ | |||
standard TCP cumulative ACK -- indicating how much data has been | standard TCP cumulative ACK -- indicating how much data has been | |||
successfully received (with no holes). This is in comparison to the | successfully received (with no holes). This is in comparison to the | |||
subflow-level ACK, which acts analogously to TCP SACK, given that there | subflow-level ACK, which acts analogously to TCP SACK, given that there | |||
may still be holes in the data stream at the connection level. The | may still be holes in the data stream at the connection level. The | |||
Data ACK specifies the next data sequence number it expects to | Data ACK specifies the next data sequence number it expects to | |||
receive. | receive. | |||
The Data ACK, as for the DSN, can be sent as the full 64-bit value, | The Data ACK, as for the DSN, can be sent as the full 64-bit value, | |||
or as the lower 32 bits. If data is received with a 64-bit DSN, it | or as the lower 32 bits. If data is received with a 64-bit DSN, it | |||
MUST be acknowledged with a 64-bit Data ACK. If the DSN received is | MUST be acknowledged with a 64-bit Data ACK. If the DSN received is | |||
32 bits, it is valid for the implementation to choose whether to send | 32 bits, an implementation can choose whether to send a 32-bit or | |||
a 32-bit or 64-bit Data ACK. | 64-bit Data ACK, and an implementation MUST accept either in this | |||
situation. | ||||
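The Data ACK width rule above can be sketched as a small Python function. The function name and the prefer_64 parameter are assumptions made for illustration. The only behavior drawn from the text is that a 64-bit DSN requires a 64-bit Data ACK, while a 32-bit DSN leaves the choice to the implementation.

```python
def data_ack_width(dsn_bits, prefer_64=False):
    """Return the Data ACK width (32 or 64) to use for a received DSN.

    dsn_bits is the width of the DSN that arrived; prefer_64 is a
    hypothetical local-policy knob used only when the choice is free.
    """
    if dsn_bits == 64:
        # A 64-bit DSN MUST be acknowledged with a 64-bit Data ACK.
        return 64
    if dsn_bits == 32:
        # Either width is valid here; local policy decides.
        return 64 if prefer_64 else 32
    raise ValueError("DSN width must be 32 or 64 bits")
```

In the newer revision shown in the right-hand column, the peer is also required to accept either width in the 32-bit case, so both return values are safe to send.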
The Data ACK proves that the data, and all required MPTCP signaling, | The Data ACK proves that the data, and all required MPTCP signaling, | |||
has been received and accepted by the remote end. One key use of the | has been received and accepted by the remote end. One key use of the | |||
Data ACK signal is that it is used to indicate the left edge of the | Data ACK signal is that it is used to indicate the left edge of the | |||
advertised receive window. As explained in Section 3.3.4, the | advertised receive window. As explained in Section 3.3.4, the | |||
receive window is shared by all subflows and is relative to the Data | receive window is shared by all subflows and is relative to the Data | |||
ACK. Because of this, an implementation MUST NOT use the RCV.WND | ACK. Because of this, an implementation MUST NOT use the RCV.WND | |||
field of a TCP segment at the connection level if it does not also | field of a TCP segment at the connection level if it does not also | |||
carry a DSS option with a Data ACK field. Furthermore, separating | carry a DSS option with a Data ACK field. Furthermore, separating | |||
the connection-level acknowledgments from the subflow level allows | the connection-level acknowledgments from the subflow level allows | |||
skipping to change at page 38, line 8 ¶ | skipping to change at page 38, line 8 ¶ | |||
The data sequence mapping allows senders to resend data with the same | The data sequence mapping allows senders to resend data with the same | |||
data sequence number on a different subflow. When doing this, a host | data sequence number on a different subflow. When doing this, a host | |||
MUST still retransmit the original data on the original subflow, in | MUST still retransmit the original data on the original subflow, in | |||
order to preserve the subflow integrity (middleboxes could replay old | order to preserve the subflow integrity (middleboxes could replay old | |||
data, and/or could reject holes in subflows), and a receiver will | data, and/or could reject holes in subflows), and a receiver will | |||
ignore these retransmissions. While this is clearly suboptimal, for | ignore these retransmissions. While this is clearly suboptimal, for | |||
compatibility reasons this is sensible behavior. Optimizations could | compatibility reasons this is sensible behavior. Optimizations could | |||
be negotiated in future versions of this protocol. Note also that | be negotiated in future versions of this protocol. Note also that | |||
this property would also permit a sender to always send the same | this property would also permit a sender to always send the same | |||
data, with the same data sequence number, on multiple subflows, if it | data, with the same data sequence number, on multiple subflows, if | |||
so desired for reliability reasons. | desired for reliability reasons. | |||
This protocol specification does not mandate any mechanisms for | This protocol specification does not mandate any mechanisms for | |||
handling retransmissions, and much will be dependent upon local | handling retransmissions, and much will be dependent upon local | |||
policy (as discussed in Section 3.3.8). One can imagine aggressive | policy (as discussed in Section 3.3.8). One can imagine aggressive | |||
connection-level retransmission policies where every packet lost at | connection-level retransmission policies where every packet lost at | |||
subflow level is retransmitted on a different subflow (hence, wasting | subflow level is retransmitted on a different subflow (hence, wasting | |||
bandwidth but possibly reducing application-to-application delays), | bandwidth but possibly reducing application-to-application delays), | |||
or conservative retransmission policies where connection-level | or conservative retransmission policies where connection-level | |||
retransmits are only used after a few subflow-level retransmission | retransmits are only used after a few subflow-level retransmission | |||
timeouts occur. | timeouts occur. | |||
skipping to change at page 38, line 41 ¶ | skipping to change at page 38, line 41 ¶ | |||
which it has been sent. In this way, the sender can always | which it has been sent. In this way, the sender can always | |||
retransmit the data if needed, on the same subflow or on a different | retransmit the data if needed, on the same subflow or on a different | |||
one. A special case is when a subflow fails: the sender will | one. A special case is when a subflow fails: the sender will | |||
typically resend the data on other working subflows after a timeout, | typically resend the data on other working subflows after a timeout, | |||
and will keep trying to retransmit the data on the failed subflow | and will keep trying to retransmit the data on the failed subflow | |||
too. The sender will declare the subflow failed after a predefined | too. The sender will declare the subflow failed after a predefined | |||
upper bound on retransmissions is reached (which MAY be lower than | upper bound on retransmissions is reached (which MAY be lower than | |||
the usual TCP limits of the Maximum Segment Life), or on the receipt | the usual TCP limits of the Maximum Segment Life), or on the receipt | |||
of an ICMP error, and only then delete the outstanding data segments. | of an ICMP error, and only then delete the outstanding data segments. | |||
Multiple retransmissions are triggers that will indicate that a | If multiple retransmissions are triggered that indicate that a | |||
subflow performs badly and could lead to a host resetting the subflow | subflow performs badly, this MAY lead to a host resetting the subflow | |||
with a RST. However, additional research is required to understand | with a RST. However, additional research is required to understand | |||
the heuristics of how and when to reset underperforming subflows. | the heuristics of how and when to reset underperforming subflows. | |||
For example, a highly asymmetric path may be misdiagnosed as | For example, a highly asymmetric path may be misdiagnosed as | |||
underperforming. A RST for this purpose SHOULD be accompanied by | underperforming. A RST for this purpose SHOULD be accompanied by | |||
an "Unacceptable performance" MP_TCPRST option (Section 3.6). | an "Unacceptable performance" MP_TCPRST option (Section 3.6). | |||
3.3.7. Congestion Control Considerations | 3.3.7. Congestion Control Considerations | |||
Different subflows in an MPTCP connection have different congestion | Different subflows in an MPTCP connection have different congestion | |||
windows. To achieve fairness at bottlenecks and resource pooling, it | windows. To achieve fairness at bottlenecks and resource pooling, it | |||
skipping to change at page 40, line 7 ¶ | skipping to change at page 40, line 7 ¶ | |||
where stability (of delay or bandwidth) is more important than | where stability (of delay or bandwidth) is more important than | |||
throughput. Application requirements such as these are discussed in | throughput. Application requirements such as these are discussed in | |||
detail in [RFC6897]. | detail in [RFC6897]. | |||
The ability to make effective choices at the sender requires full | The ability to make effective choices at the sender requires full | |||
knowledge of the path "cost", which is unlikely to be the case. It | knowledge of the path "cost", which is unlikely to be the case. It | |||
would be desirable for a receiver to be able to signal their own | would be desirable for a receiver to be able to signal their own | |||
preferences for paths, since they will often be the multihomed party, | preferences for paths, since they will often be the multihomed party, | |||
and may have to pay for metered incoming bandwidth. | and may have to pay for metered incoming bandwidth. | |||
Whilst fine-grained control may be the most powerful solution, that | To enable this, the MP_JOIN option (see Section 3.2) contains the 'B' | |||
would require some mechanism such as overloading the Explicit | bit, which allows a host to indicate to its peer that this path | |||
Congestion Notification (ECN) signal [RFC3168], which is undesirable, | should be treated as a backup path to use only in the event of | |||
and it is felt that there would not be sufficient benefit to justify | failure of other working subflows (i.e., a subflow where the receiver | |||
an entirely new signal. Therefore, the MP_JOIN option (see | has indicated B=1 SHOULD NOT be used to send data unless there are no | |||
Section 3.2) contains the 'B' bit, which allows a host to indicate to | usable subflows where B=0). | |||
its peer that this path should be treated as a backup path to use | ||||
only in the event of failure of other working subflows (i.e., a | ||||
subflow where the receiver has indicated B=1 SHOULD NOT be used to | ||||
send data unless there are no usable subflows where B=0). | ||||
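One way a sender might apply the 'B' bit when selecting subflows is sketched below in Python. The Subflow type, its fields, and the function name are hypothetical. The sketch assumes that "usable" is a local judgment made elsewhere, and it implements only the stated rule: backup subflows are used when no usable subflow with B=0 remains.

```python
from dataclasses import dataclass


@dataclass
class Subflow:
    name: str             # illustrative label only
    backup: bool          # True when the peer signaled B=1 for this subflow
    usable: bool = True   # local view of whether the subflow still works


def eligible_subflows(subflows):
    """Return the subflows a sender should consider for new data."""
    usable = [s for s in subflows if s.usable]
    regular = [s for s in usable if not s.backup]
    # Backup subflows are a last resort: use them only when no usable
    # subflow with B=0 exists.
    return regular if regular else usable
```

Because the backup flag is a request rather than a command, a real sender could still override this selection on the basis of local policy.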
In the event that the available set of paths changes, a host may wish | In the event that the available set of paths changes, a host may wish | |||
to signal a change in priority of subflows to the peer (e.g., a | to signal a change in priority of subflows to the peer (e.g., a | |||
subflow that was previously set as backup should now take priority | subflow that was previously set as backup should now take priority | |||
over all remaining subflows). Therefore, the MP_PRIO option, shown | over all remaining subflows). Therefore, the MP_PRIO option, shown | |||
in Figure 11, can be used to change the 'B' flag of the subflow on | in Figure 11, can be used to change the 'B' flag of the subflow on | |||
which it is sent. | which it is sent. | |||
Another use of the MP_PRIO option is to set the 'B' flag on a subflow | Another use of the MP_PRIO option is to set the 'B' flag on a subflow | |||
to cleanly retire its use before closing it and removing it with | to cleanly retire its use before closing it and removing it with | |||
REMOVE_ADDR (Section 3.4.2), for example to support make-before-break | REMOVE_ADDR (Section 3.4.2), for example to support make-before-break | |||
session continuity, where new subflows are added before the | session continuity, where new subflows are added before the | |||
previously used ones are closed. | previously used ones are closed. | |||
1 2 3 | 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+---------------+---------------+-------+-----+-+ | +---------------+---------------+-------+-----+-+ | |||
| Kind | Length |Subtype| |B| | | Kind | Length |Subtype|(rsv)|B| | |||
+---------------+---------------+-------+-----+-+ | +---------------+---------------+-------+-----+-+ | |||
Figure 11: Change Subflow Priority (MP_PRIO) Option | Figure 11: Change Subflow Priority (MP_PRIO) Option | |||
It should be noted that the backup flag is a request from a data | It should be noted that the backup flag is a request from a data | |||
receiver to a data sender only, and the data sender SHOULD adhere to | receiver to a data sender only, and the data sender SHOULD adhere to | |||
these requests. A host cannot assume that the data sender will do | these requests. A host cannot assume that the data sender will do | |||
so, however, since local policies -- or technical difficulties -- may | so, however, since local policies -- or technical difficulties -- may | |||
override MP_PRIO requests. Note also that this signal applies to a | override MP_PRIO requests. Note also that this signal applies to a | |||
single direction, and so the sender of this option could choose to | single direction, and so the sender of this option could choose to | |||
skipping to change at page 50, line 38 ¶ | skipping to change at page 50, line 20 ¶ | |||
with a DSS option containing a Data ACK. Upon reception of the | with a DSS option containing a Data ACK. Upon reception of the | |||
acknowledgment, the sender has the confirmation that the DSS option | acknowledgment, the sender has the confirmation that the DSS option | |||
passes in both directions and may choose to send fewer DSS options | passes in both directions and may choose to send fewer DSS options | |||
than once per segment. | than once per segment. | |||
If, however, an ACK is received for data (not just for the SYN) | If, however, an ACK is received for data (not just for the SYN) | |||
without a DSS option containing a Data ACK, the sender determines the | without a DSS option containing a Data ACK, the sender determines the | |||
path is not MPTCP capable. In the case of this occurring on an | path is not MPTCP capable. In the case of this occurring on an | |||
additional subflow (i.e., one started with MP_JOIN), the host MUST | additional subflow (i.e., one started with MP_JOIN), the host MUST | |||
close the subflow with a RST, which SHOULD contain a MP_TCPRST option | close the subflow with a RST, which SHOULD contain a MP_TCPRST option | |||
(Section 3.6) with a "Middlebox interferance" reason code. | (Section 3.6) with a "Middlebox interference" reason code. | |||
In the case of such an ACK being received on the first subflow (i.e., | In the case of such an ACK being received on the first subflow (i.e., | |||
that started with MP_CAPABLE), before any additional subflows are | that started with MP_CAPABLE), before any additional subflows are | |||
added, the implementation MUST drop out of an MPTCP mode, back to | added, the implementation MUST drop out of an MPTCP mode, back to | |||
regular TCP. The sender will send one final data sequence mapping, | regular TCP. The sender will send one final data sequence mapping, | |||
with the Data-Level Length value of 0 indicating an infinite mapping | with the Data-Level Length value of 0 indicating an infinite mapping | |||
(to inform the other end in case the path drops options in one | (to inform the other end in case the path drops options in one | |||
direction only), and then revert to sending data on the single | direction only), and then revert to sending data on the single | |||
subflow without any MPTCP options. | subflow without any MPTCP options. | |||
If a subflow breaks during operation, e.g. if it is re-routed and | If a subflow breaks during operation, e.g. if it is re-routed and | |||
MPTCP options are no longer permitted, then once this is detected (by | MPTCP options are no longer permitted, then once this is detected (by | |||
the subflow-level receive buffer filling up), the subflow SHOULD be | the subflow-level receive buffer filling up, since there is no | |||
treated as broken and closed with a RST, since no data can be | mapping available in order to DATA_ACK this data), the subflow SHOULD | |||
be treated as broken and closed with a RST, since no data can be | ||||
delivered to the application layer, and no fallback signal can be | delivered to the application layer, and no fallback signal can be | |||
reliably sent. This RST SHOULD include the MP_TCPRST option | reliably sent. This RST SHOULD include the MP_TCPRST option | |||
(Section 3.6) with a "Middlebox interferance" reason code. | (Section 3.6) with a "Middlebox interference" reason code. | |||
These rules should cover all cases where such a failure could happen: | These rules should cover all cases where such a failure could happen: | |||
whether it's on the forward or reverse path and whether the server or | whether it's on the forward or reverse path and whether the server or | |||
the client first sends data. If lost options on data packets occur | the client first sends data. | |||
on any other subflow apart from the initial subflow, it should be | ||||
treated as a standard path failure. The data would not be DATA_ACKed | ||||
(since there is no mapping for the data), and the subflow can be | ||||
closed with a RST, containing a MP_TCPRST option (Section 3.6) with a | ||||
"Middlebox interferance" reason code. | ||||
So far this section has discussed the loss of MPTCP options, either | So far this section has discussed the loss of MPTCP options, either | |||
initially, or during the course of the connection. As described in | initially, or during the course of the connection. As described in | |||
Section 3.3, each portion of data for which there is a mapping is | Section 3.3, each portion of data for which there is a mapping is | |||
protected by a checksum, if checksums have been negotiated. This | protected by a checksum, if checksums have been negotiated. This | |||
mechanism is used to detect if middleboxes have made any adjustments | mechanism is used to detect if middleboxes have made any adjustments | |||
to the payload (added, removed, or changed data). A checksum will | to the payload (added, removed, or changed data). A checksum will | |||
fail if the data has been changed in any way. This will also detect | fail if the data has been changed in any way. This will also detect | |||
if the length of data on the subflow is increased or decreased, and | if the length of data on the subflow is increased or decreased, and | |||
this means the data sequence mapping is no longer valid. The sender | this means the data sequence mapping is no longer valid. The sender | |||
skipping to change at page 54, line 34 ¶ | skipping to change at page 54, line 12 ¶ | |||
feasible to allow multiple subflows between the same two addresses | feasible to allow multiple subflows between the same two addresses | |||
but using different port pairs, and such a facility could be used to | but using different port pairs, and such a facility could be used to | |||
allow load balancing within the network based on 5-tuples (e.g., some | allow load balancing within the network based on 5-tuples (e.g., some | |||
ECMP implementations [RFC2992]). | ECMP implementations [RFC2992]). | |||
3.9.2. Delayed Subflow Start and Subflow Symmetry | 3.9.2. Delayed Subflow Start and Subflow Symmetry | |||
Many TCP connections are short-lived and consist only of a few | Many TCP connections are short-lived and consist only of a few | |||
segments, and so the overheads of using MPTCP outweigh any benefits. | segments, and so the overheads of using MPTCP outweigh any benefits. | |||
A heuristic is required, therefore, to decide when to start using | A heuristic is required, therefore, to decide when to start using | |||
additional subflows in an MPTCP connection. We expect that | additional subflows in an MPTCP connection. Experimental deployments | |||
experience gathered from deployments will provide further guidance on | have shown that MPTCP can be applied in a range of scenarios so an | |||
this, and will be affected by particular application characteristics | implementation is likely to need to take into account factors | |||
(which are likely to change over time). However, a suggested | including the type of traffic being sent and duration of session, and | |||
general-purpose heuristic that an implementation MAY choose to employ | this information MAY be signalled by the application layer. | |||
is as follows. Results from experimental deployments are needed in | ||||
order to verify the correctness of this proposal. | However, for standard TCP traffic, a suggested general-purpose | |||
heuristic that an implementation MAY choose to employ is as follows. | ||||
If a host has data buffered for its peer (which implies that the | If a host has data buffered for its peer (which implies that the | |||
application has received a request for data), the host opens one | application has received a request for data), the host opens one | |||
subflow for each initial window's worth of data that is buffered. | subflow for each initial window's worth of data that is buffered. | |||
Consideration should also be given to limiting the rate of adding new | Consideration should also be given to limiting the rate of adding new | |||
subflows, as well as limiting the total number of subflows open for a | subflows, as well as limiting the total number of subflows open for a | |||
particular connection. A host may choose to vary these values based | particular connection. A host may choose to vary these values based | |||
on its load or knowledge of traffic and path characteristics. | on its load or knowledge of traffic and path characteristics. | |||
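The suggested heuristic can be read as a simple calculation, sketched here in Python. The function and parameter names are hypothetical. The sketch also assumes one particular reading of the text: the number of initial windows of buffered data is taken as the target total number of subflows, capped by a locally chosen maximum. Rate limiting of new subflows is left out.

```python
def subflows_to_open(buffered_bytes, initial_window_bytes,
                     open_subflows, max_subflows):
    """Return how many additional subflows to open right now.

    Assumes one subflow per full initial window of buffered data,
    counted against the subflows that are already open.
    """
    if initial_window_bytes <= 0:
        raise ValueError("initial window must be positive")
    # One subflow for each initial window's worth of buffered data.
    wanted = buffered_bytes // initial_window_bytes
    # Respect the local cap on total subflows per connection.
    wanted = min(wanted, max_subflows)
    # Never return a negative count when enough subflows already exist.
    return max(0, wanted - open_subflows)
```

A host could vary max_subflows and the effective initial window based on its load or on what it knows about the traffic and paths, as the text suggests.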
skipping to change at page 55, line 21 ¶ | skipping to change at page 54, line 46 ¶ | |||
An additional time-based heuristic could be applied, opening | An additional time-based heuristic could be applied, opening | |||
additional subflows after a given period of time has passed. This | additional subflows after a given period of time has passed. This | |||
would alleviate the above issue, and also provide resilience for low- | would alleviate the above issue, and also provide resilience for low- | |||
bandwidth but long-lived applications. | bandwidth but long-lived applications. | |||
Another issue is that both communicating hosts may simultaneously try | Another issue is that both communicating hosts may simultaneously try | |||
to set up a subflow between the same pair of addresses. This leads | to set up a subflow between the same pair of addresses. This leads | |||
to an inefficient use of resources. | to an inefficient use of resources. | |||
If the the same ports are used on all subflows, as recommended above, | If the same ports are used on all subflows, as recommended above, | |||
then standard TCP simultaneous open logic should take care of this | then standard TCP simultaneous open logic should take care of this | |||
situation and only one subflow will be established between the | situation and only one subflow will be established between the | |||
address pairs. However, this relies on the same ports being used at | address pairs. However, this relies on the same ports being used at | |||
both end hosts. If a host does not support TCP simultaneous open, it | both end hosts. If a host does not support TCP simultaneous open, it | |||
is RECOMMENDED that some element of randomization is applied to the | is RECOMMENDED that some element of randomization is applied to the | |||
time to wait before opening new subflows, so that only one subflow is | time to wait before opening new subflows, so that only one subflow is | |||
created between a given address pair. If, however, hosts signal | created between a given address pair. If, however, hosts signal | |||
additional ports to use (for example, for leveraging ECMP on-path), | additional ports to use (for example, for leveraging ECMP on-path), | |||
this heuristic is not appropriate. | this heuristic is not appropriate. | |||
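The recommended randomization of the wait before opening a new subflow might look like the following Python sketch. The function name, the base delay, and the jitter bound are all assumptions; the draft does not specify values or a distribution, and a uniform jitter is used here only as an example.

```python
import random


def subflow_open_delay(base_delay_s, max_jitter_s, rng=random):
    """Return a randomized wait, in seconds, before opening a subflow.

    rng may be the random module or a random.Random instance, which
    makes the jitter reproducible in tests.
    """
    if base_delay_s < 0 or max_jitter_s < 0:
        raise ValueError("delays must be non-negative")
    # Add uniform jitter so two hosts are unlikely to open a subflow
    # between the same address pair at the same moment.
    return base_delay_s + rng.uniform(0.0, max_jitter_s)
```

As the text notes, this approach is not appropriate when hosts signal additional ports, since multiple subflows between the same address pair are then intended.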
skipping to change at page 55, line 47 ¶ | skipping to change at page 55, line 24 ¶ | |||
Requirements for MPTCP's handling of unexpected signals have been | Requirements for MPTCP's handling of unexpected signals have been | |||
given in Section 3.8. There are other failure cases, however, where | given in Section 3.8. There are other failure cases, however, where | |||
a host can choose appropriate behavior. | a host can choose appropriate behavior. | |||
For example, Section 3.1 suggests that a host SHOULD fall back to | For example, Section 3.1 suggests that a host SHOULD fall back to | |||
trying regular TCP SYNs after one or more failures of MPTCP SYNs for | trying regular TCP SYNs after one or more failures of MPTCP SYNs for | |||
a connection. A host may keep a system-wide cache of such | a connection. A host may keep a system-wide cache of such | |||
information, so that it can back off from using MPTCP, firstly for | information, so that it can back off from using MPTCP, firstly for | |||
that particular destination host, and eventually on a whole | that particular destination host, and eventually on a whole | |||
interface, if MPTCP connections continue failing. | interface, if MPTCP connections continue failing. The duration of | |||
such a cache would be implementation-specific. | ||||
Another failure could occur when the MP_JOIN handshake fails. | Another failure could occur when the MP_JOIN handshake fails. | |||
Section 3.8 specifies that an incorrect handshake MUST lead to the | Section 3.8 specifies that an incorrect handshake MUST lead to the | |||
subflow being closed with a RST. A host operating an active | subflow being closed with a RST. A host operating an active | |||
intrusion detection system may choose to start blocking MP_JOIN | intrusion detection system may choose to start blocking MP_JOIN | |||
packets from the source host if multiple failed MP_JOIN attempts are | packets from the source host if multiple failed MP_JOIN attempts are | |||
seen. From the connection initiator's point of view, if an MP_JOIN | seen. From the connection initiator's point of view, if an MP_JOIN | |||
fails, it SHOULD NOT attempt to connect to the same IP address and | fails, it SHOULD NOT attempt to connect to the same IP address and | |||
port during the lifetime of the connection, unless the other host | port during the lifetime of the connection, unless the other host | |||
refreshes the information with another ADD_ADDR option. Note that | refreshes the information with another ADD_ADDR option. Note that | |||
skipping to change at page 58, line 44 ¶ | skipping to change at page 58, line 23 ¶ | |||
mechanism presented in this document should therefore protect against | mechanism presented in this document should therefore protect against | |||
all forms of flooding and hijacking attacks discussed in [RFC6181]. | all forms of flooding and hijacking attacks discussed in [RFC6181]. | |||
The version negotiation specified in Section 3.1, if differing MPTCP | The version negotiation specified in Section 3.1, if differing MPTCP | |||
versions shared a common negotiation format, would allow an on-path | versions shared a common negotiation format, would allow an on-path | |||
attacker to apply a theoretical bid-down attack. Since the v1 and v0 | attacker to apply a theoretical bid-down attack. Since the v1 and v0 | |||
protocols have a different handshake, such an attack would require | protocols have a different handshake, such an attack would require | |||
the client to re-establish the connection using v0, and this being | the client to re-establish the connection using v0, and this being | |||
supported by the server. Note that an on-path attacker would have | supported by the server. Note that an on-path attacker would have | |||
access to the raw data, negating any other TCP-level security | access to the raw data, negating any other TCP-level security | |||
mechanisms. Also a change from [RFC6824] has removed the subflow | mechanisms. Also a change from RFC6824 has removed the subflow | |||
identifier from the MP_PRIO option (Section 3.3.8), to remove the | identifier from the MP_PRIO option (Section 3.3.8), to remove the | |||
theoretical attack where a subflow could be placed in "backup" mode | theoretical attack where a subflow could be placed in "backup" mode | |||
by an attacker. | by an attacker. | |||
During normal operation, regular TCP protection mechanisms (such as | During normal operation, regular TCP protection mechanisms (such as | |||
ensuring sequence numbers are in-window) will provide the same level | ensuring sequence numbers are in-window) will provide the same level | |||
of protection against attacks on individual TCP subflows as exists | of protection against attacks on individual TCP subflows as exists | |||
for regular TCP today. Implementations will introduce additional | for regular TCP today. Implementations will introduce additional | |||
buffers compared to regular TCP, to reassemble data at the connection | buffers compared to regular TCP, to reassemble data at the connection | |||
level. The application of window sizing will minimize the risk of | level. The application of window sizing will minimize the risk of | |||
skipping to change at page 64, line 7 ¶ | skipping to change at page 63, line 41 ¶ | |||
Iljitsch van Beijnum, Lars Eggert, Marcelo Bagnulo, Robert Hancock, | Iljitsch van Beijnum, Lars Eggert, Marcelo Bagnulo, Robert Hancock, | |||
Pasi Sarolahti, Toby Moncaster, Philip Eardley, Sergio Lembo, | Pasi Sarolahti, Toby Moncaster, Philip Eardley, Sergio Lembo, | |||
Lawrence Conroy, Yoshifumi Nishida, Bob Briscoe, Stein Gjessing, | Lawrence Conroy, Yoshifumi Nishida, Bob Briscoe, Stein Gjessing, | |||
Andrew McGregor, Georg Hampel, Anumita Biswas, Wes Eddy, Alexey | Andrew McGregor, Georg Hampel, Anumita Biswas, Wes Eddy, Alexey | |||
Melnikov, Francis Dupont, Adrian Farrel, Barry Leiba, Robert Sparks, | Melnikov, Francis Dupont, Adrian Farrel, Barry Leiba, Robert Sparks, | |||
Sean Turner, Stephen Farrell, Martin Stiemerling, Gregory Detal, | Sean Turner, Stephen Farrell, Martin Stiemerling, Gregory Detal, | |||
Fabien Duchene, Xavier de Foy, Rahul Jadhav, and Klemens Schragel. | Fabien Duchene, Xavier de Foy, Rahul Jadhav, and Klemens Schragel. | |||
8. IANA Considerations | 8. IANA Considerations | |||
This document obsoletes [RFC6824] and as such IANA is requested to | This document obsoletes RFC6824 and as such IANA is requested to | |||
update the TCP option space registry to point to this document for | update the TCP option space registry to point to this document for | |||
Multipath TCP, as follows: | Multipath TCP, as follows: | |||
+------+--------+-----------------------+---------------+ | +------+--------+-----------------------+---------------+ | |||
| Kind | Length | Meaning | Reference | | | Kind | Length | Meaning | Reference | | |||
+------+--------+-----------------------+---------------+ | +------+--------+-----------------------+---------------+ | |||
| 30 | N | Multipath TCP (MPTCP) | This document | | | 30 | N | Multipath TCP (MPTCP) | This document | | |||
+------+--------+-----------------------+---------------+ | +------+--------+-----------------------+---------------+ | |||
Table 1: TCP Option Kind Numbers | Table 1: TCP Option Kind Numbers | |||
8.1. MPTCP Option Subtypes | 8.1. MPTCP Option Subtypes | |||
The 4-bit MPTCP subtype sub-registry ("MPTCP Option Subtypes" under | The 4-bit MPTCP subtype sub-registry ("MPTCP Option Subtypes" under | |||
the "Transmission Control Protocol (TCP) Parameters" registry) was | the "Transmission Control Protocol (TCP) Parameters" registry) was | |||
defined in [RFC6824]. This document defines one additional subtype | defined in RFC6824. Since RFC6824 was an Experimental not Standards | |||
(ADD_ADDR) and updates the references to this document for all sub- | Track RFC, and since no further entries have occurred beyond those | |||
types except ADD_ADDR, which is deprecated. The updates are listed | pointing to RFC6824, IANA is requested to replace the existing | |||
in the following table. | registry with Table 2 and with the following explanatory note. | |||
Note: This registry specifies the MPTCP Option Subtypes for MPTCP v1, | ||||
which obsoletes the Experimental MPTCP v0. For the MPTCP v0 | ||||
subtypes, please refer to RFC6824. | ||||
+-------+-----------------+-------------------------+---------------+ | +-------+-----------------+-------------------------+---------------+ | |||
| Value | Symbol | Name | Reference | | | Value | Symbol | Name | Reference | | |||
+-------+-----------------+-------------------------+---------------+ | +-------+-----------------+-------------------------+---------------+ | |||
| 0x0 | MP_CAPABLE | Multipath Capable | This | | | 0x0 | MP_CAPABLE | Multipath Capable | This | | |||
| | | | document, | | | | | | document, | | |||
| | | | Section 3.1 | | | | | | Section 3.1 | | |||
| 0x1 | MP_JOIN | Join Connection | This | | | 0x1 | MP_JOIN | Join Connection | This | | |||
| | | | document, | | | | | | document, | | |||
| | | | Section 3.2 | | | | | | Section 3.2 | | |||
skipping to change at page 65, line 43 ¶ | skipping to change at page 65, line 7 ¶ | |||
| | | | document, | | | | | | document, | | |||
| | | | Section 3.6 | | | | | | Section 3.6 | | |||
| 0xf | MP_EXPERIMENTAL | Reserved for private | | | | 0xf | MP_EXPERIMENTAL | Reserved for private | | | |||
| | | experiments | | | | | | experiments | | | |||
+-------+-----------------+-------------------------+---------------+ | +-------+-----------------+-------------------------+---------------+ | |||
Table 2: MPTCP Option Subtypes | Table 2: MPTCP Option Subtypes | |||
Values 0x9 through 0xe are currently unassigned. Option 0xf is | Values 0x9 through 0xe are currently unassigned. Option 0xf is | |||
reserved for use by private experiments. Its use may be formalized | reserved for use by private experiments. Its use may be formalized | |||
in a future specification. | in a future specification. Future assignments in this registry are | |||
to be defined by Standards Action as defined by [RFC8126]. | ||||
Assignments consist of the MPTCP subtype's symbolic name and its | ||||
associated value, and a reference to its specification. | ||||
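As an illustration of how the 4-bit subtype space described above might be handled, the following Python sketch extracts the subtype from a raw MPTCP option and classifies it. It is a minimal sketch, not part of the specification. It assumes the subtype occupies the upper four bits of the octet following Kind and Length, as in the option layout defined elsewhere in the document. The function and table names are hypothetical, and only the Table 2 rows visible in this excerpt are named.

```python
MPTCP_KIND = 30  # TCP option Kind for Multipath TCP (Table 1)

# Only the Table 2 rows visible in this excerpt are named here.
NAMED_SUBTYPES = {0x0: "MP_CAPABLE", 0x1: "MP_JOIN", 0xF: "MP_EXPERIMENTAL"}


def mptcp_subtype(option: bytes) -> int:
    """Return the 4-bit subtype of a raw MPTCP option (Kind, Length, data)."""
    if len(option) < 3:
        raise ValueError("MPTCP option too short to carry a subtype")
    if option[0] != MPTCP_KIND:
        raise ValueError("not an MPTCP option")
    if option[1] != len(option):
        raise ValueError("Length field does not match option size")
    # Assumption: the subtype is the high nibble of the third octet.
    return option[2] >> 4


def classify_subtype(value: int) -> str:
    """Map a subtype value to its symbol or registry status."""
    if not 0 <= value <= 0xF:
        raise ValueError("subtype must fit in 4 bits")
    if value in NAMED_SUBTYPES:
        return NAMED_SUBTYPES[value]
    if 0x9 <= value <= 0xE:
        return "unassigned"
    return "assigned (row not shown in this excerpt)"
```

A receiver could call classify_subtype(mptcp_subtype(raw_option)) and treat "unassigned" values as unknown subtypes.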
8.2. MPTCP Handshake Algorithms | 8.2. MPTCP Handshake Algorithms | |||
IANA has created another sub-registry, "MPTCP Handshake Algorithms" | The "MPTCP Handshake Algorithms" sub-registry under the "Transmission | |||
under the "Transmission Control Protocol (TCP) Parameters" registry, | Control Protocol (TCP) Parameters" registry was defined in RFC6824. | |||
based on the flags in MP_CAPABLE (Section 3.1). IANA is requested to | Since RFC6824 was an Experimental not Standards Track RFC, and since | |||
update the references of this table to this document, as follows: | no further entries have occurred beyond those pointing to RFC6824, | |||
IANA is requested to replace the existing registry with Table 3 and | ||||
with the following explanatory note. | ||||
Note: This registry specifies the MPTCP Handshake Algorithms for | ||||
MPTCP v1, which obsoletes the Experimental MPTCP v0. For the MPTCP | ||||
v0 handshake algorithms, please refer to RFC6824. | ||||
+-------+----------------------------------------+------------------+ | +-------+----------------------------------------+------------------+ | |||
| Flag | Meaning | Reference | | | Flag | Meaning | Reference | | |||
| Bit | | | | | Bit | | | | |||
+-------+----------------------------------------+------------------+ | +-------+----------------------------------------+------------------+ | |||
| A | Checksum required | This document, | | | A | Checksum required | This document, | | |||
| | | Section 3.1 | | | | | Section 3.1 | | |||
| B | Extensibility | This document, | | | B | Extensibility | This document, | | |||
| | | Section 3.1 | | | | | Section 3.1 | | |||
| C | Do not attempt to establish new | This document, | | | C | Do not attempt to establish new | This document, | | |||
skipping to change at page 66, line 33 ¶ | skipping to change at page 66, line 7 ¶ | |||
B, depending on how Extensibility is defined in future | B, depending on how Extensibility is defined in future | |||
specifications; see Section 3.1 for more information. | specifications; see Section 3.1 for more information. | |||
Future assignments in this registry are also to be defined by | Future assignments in this registry are also to be defined by | |||
Standards Action as defined by [RFC8126]. Assignments consist of the | Standards Action as defined by [RFC8126]. Assignments consist of the | |||
value of the flags, a symbolic name for the algorithm, and a | value of the flags, a symbolic name for the algorithm, and a | |||
reference to its specification. | reference to its specification. | |||
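The flag bits registered above can be decoded from the MP_CAPABLE flags octet roughly as follows. This is a sketch under a stated assumption: the octet is laid out A through H from the most significant to the least significant bit, as in the MP_CAPABLE format of Section 3.1. The names FLAG_BITS, KNOWN_MEANINGS, and set_flags are hypothetical, and only meanings that appear in this excerpt are listed.

```python
from typing import List

# Assumption: flags octet is laid out A..H, most significant bit first.
FLAG_BITS = {letter: 0x80 >> i for i, letter in enumerate("ABCDEFGH")}

# Meanings taken only from the text visible in this excerpt.
KNOWN_MEANINGS = {
    "A": "Checksum required",
    "B": "Extensibility",
    "C": "Sender will not accept additional subflows to the source address and port",
}


def set_flags(flags: int) -> List[str]:
    """Return the letters of all flag bits set in the flags octet."""
    if not 0 <= flags <= 0xFF:
        raise ValueError("flags must fit in one octet")
    return [letter for letter, mask in FLAG_BITS.items() if flags & mask]
```

For example, set_flags(0xA0) reports flags A and C, whose meanings can then be looked up in KNOWN_MEANINGS.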
8.3. MP_TCPRST Reason Codes | 8.3. MP_TCPRST Reason Codes | |||
IANA is requested to create a further sub-registry, "MP_TCPRST Reason | IANA is requested to create a further sub-registry, "MPTCP MP_TCPRST | |||
Codes" under the "Transmission Control Protocol (TCP) Parameters" | Reason Codes" under the "Transmission Control Protocol (TCP) | |||
registry, based on the reason code in MP_TCPRST (Section 3.6): | Parameters" registry, based on the reason code in MP_TCPRST | |||
(Section 3.6) message. Initial values for this registry are given in | ||||
Table 4; future assignments are to be defined by Specification | ||||
Required as defined by [RFC8126]. Assignments consist of the value | ||||
of the code, a short description of its meaning, and a reference to | ||||
its specification. The maximum value is 0xff. | ||||
+------+-----------------------------+----------------------------+ | +------+-----------------------------+----------------------------+ | |||
| Code | Meaning | Reference | | | Code | Meaning | Reference | | |||
+------+-----------------------------+----------------------------+ | +------+-----------------------------+----------------------------+ | |||
| 0x00 | Unspecified TCP error | This document, Section 3.6 | | | 0x00 | Unspecified TCP error | This document, Section 3.6 | | |||
| 0x01 | MPTCP specific error | This document, Section 3.6 | | | 0x01 | MPTCP specific error | This document, Section 3.6 | | |||
| 0x02 | Lack of resources | This document, Section 3.6 | | | 0x02 | Lack of resources | This document, Section 3.6 | | |||
| 0x03 | Administratively prohibited | This document, Section 3.6 | | | 0x03 | Administratively prohibited | This document, Section 3.6 | | |||
| 0x04 | Too much outstanding data | This document, Section 3.6 | | | 0x04 | Too much outstanding data | This document, Section 3.6 | | |||
| 0x05 | Unacceptable performance | This document, Section 3.6 | | | 0x05 | Unacceptable performance | This document, Section 3.6 | | |||
skipping to change at page 67, line 18 ¶ | skipping to change at page 66, line 43 ¶ | |||
[RFC0793] Postel, J., "Transmission Control Protocol", STD 7, | [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, | |||
RFC 793, DOI 10.17487/RFC0793, September 1981, | RFC 793, DOI 10.17487/RFC0793, September 1981, | |||
<https://www.rfc-editor.org/info/rfc793>. | <https://www.rfc-editor.org/info/rfc793>. | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, <https://www.rfc- | DOI 10.17487/RFC2119, March 1997, <https://www.rfc- | |||
editor.org/info/rfc2119>. | editor.org/info/rfc2119>. | |||
[RFC6182] Ford, A., Raiciu, C., Handley, M., Barre, S., and J. | ||||
Iyengar, "Architectural Guidelines for Multipath TCP | ||||
Development", RFC 6182, DOI 10.17487/RFC6182, March 2011, | ||||
<https://www.rfc-editor.org/info/rfc6182>. | ||||
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
[SHS] National Institute of Standards and Technology, "Secure Hash | [SHS] National Institute of Standards and Technology, "Secure Hash | |||
Standard", Federal Information Processing Standard | Standard", Federal Information Processing Standard | |||
(FIPS) 180-4, August 2015, | (FIPS) 180-4, August 2015, | |||
<http://nvlpubs.nist.gov/nistpubs/FIPS/ | <http://nvlpubs.nist.gov/nistpubs/FIPS/ | |||
NIST.FIPS.180-4.pdf>. | NIST.FIPS.180-4.pdf>. | |||
skipping to change at page 69, line 5 ¶ | skipping to change at page 68, line 20 ¶ | |||
Address Translator (Traditional NAT)", RFC 3022, | Address Translator (Traditional NAT)", RFC 3022, | |||
DOI 10.17487/RFC3022, January 2001, <https://www.rfc- | DOI 10.17487/RFC3022, January 2001, <https://www.rfc- | |||
editor.org/info/rfc3022>. | editor.org/info/rfc3022>. | |||
[RFC3135] Border, J., Kojo, M., Griner, J., Montenegro, G., and Z. | [RFC3135] Border, J., Kojo, M., Griner, J., Montenegro, G., and Z. | |||
Shelby, "Performance Enhancing Proxies Intended to | Shelby, "Performance Enhancing Proxies Intended to | |||
Mitigate Link-Related Degradations", RFC 3135, | Mitigate Link-Related Degradations", RFC 3135, | |||
DOI 10.17487/RFC3135, June 2001, <https://www.rfc- | DOI 10.17487/RFC3135, June 2001, <https://www.rfc- | |||
editor.org/info/rfc3135>. | editor.org/info/rfc3135>. | |||
[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | ||||
of Explicit Congestion Notification (ECN) to IP", | ||||
RFC 3168, DOI 10.17487/RFC3168, September 2001, | ||||
<https://www.rfc-editor.org/info/rfc3168>. | ||||
[RFC4086] Eastlake 3rd, D., Schiller, J., and S. Crocker, | [RFC4086] Eastlake 3rd, D., Schiller, J., and S. Crocker, | |||
"Randomness Requirements for Security", BCP 106, RFC 4086, | "Randomness Requirements for Security", BCP 106, RFC 4086, | |||
DOI 10.17487/RFC4086, June 2005, <https://www.rfc- | DOI 10.17487/RFC4086, June 2005, <https://www.rfc- | |||
editor.org/info/rfc4086>. | editor.org/info/rfc4086>. | |||
[RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common | [RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common | |||
Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007, | Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007, | |||
<https://www.rfc-editor.org/info/rfc4987>. | <https://www.rfc-editor.org/info/rfc4987>. | |||
[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion | [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion | |||
skipping to change at page 69, line 33 ¶ | skipping to change at page 68, line 43 ¶ | |||
[RFC5961] Ramaiah, A., Stewart, R., and M. Dalal, "Improving TCP's | [RFC5961] Ramaiah, A., Stewart, R., and M. Dalal, "Improving TCP's | |||
Robustness to Blind In-Window Attacks", RFC 5961, | Robustness to Blind In-Window Attacks", RFC 5961, | |||
DOI 10.17487/RFC5961, August 2010, <https://www.rfc- | DOI 10.17487/RFC5961, August 2010, <https://www.rfc- | |||
editor.org/info/rfc5961>. | editor.org/info/rfc5961>. | |||
[RFC6181] Bagnulo, M., "Threat Analysis for TCP Extensions for | [RFC6181] Bagnulo, M., "Threat Analysis for TCP Extensions for | |||
Multipath Operation with Multiple Addresses", RFC 6181, | Multipath Operation with Multiple Addresses", RFC 6181, | |||
DOI 10.17487/RFC6181, March 2011, <https://www.rfc- | DOI 10.17487/RFC6181, March 2011, <https://www.rfc- | |||
editor.org/info/rfc6181>. | editor.org/info/rfc6181>. | |||
[RFC6182] Ford, A., Raiciu, C., Handley, M., Barre, S., and J. | ||||
Iyengar, "Architectural Guidelines for Multipath TCP | ||||
Development", RFC 6182, DOI 10.17487/RFC6182, March 2011, | ||||
<https://www.rfc-editor.org/info/rfc6182>. | ||||
[RFC6234] Eastlake 3rd, D. and T. Hansen, "US Secure Hash Algorithms | [RFC6234] Eastlake 3rd, D. and T. Hansen, "US Secure Hash Algorithms | |||
(SHA and SHA-based HMAC and HKDF)", RFC 6234, | (SHA and SHA-based HMAC and HKDF)", RFC 6234, | |||
DOI 10.17487/RFC6234, May 2011, <https://www.rfc- | DOI 10.17487/RFC6234, May 2011, <https://www.rfc- | |||
editor.org/info/rfc6234>. | editor.org/info/rfc6234>. | |||
[RFC6356] Raiciu, C., Handley, M., and D. Wischik, "Coupled | [RFC6356] Raiciu, C., Handley, M., and D. Wischik, "Coupled | |||
Congestion Control for Multipath Transport Protocols", | Congestion Control for Multipath Transport Protocols", | |||
RFC 6356, DOI 10.17487/RFC6356, October 2011, | RFC 6356, DOI 10.17487/RFC6356, October 2011, | |||
<https://www.rfc-editor.org/info/rfc6356>. | <https://www.rfc-editor.org/info/rfc6356>. | |||
[RFC6528] Gont, F. and S. Bellovin, "Defending against Sequence | [RFC6528] Gont, F. and S. Bellovin, "Defending against Sequence | |||
Number Attacks", RFC 6528, DOI 10.17487/RFC6528, February | Number Attacks", RFC 6528, DOI 10.17487/RFC6528, February | |||
2012, <https://www.rfc-editor.org/info/rfc6528>. | 2012, <https://www.rfc-editor.org/info/rfc6528>. | |||
[RFC6824] Ford, A., Raiciu, C., Handley, M., and O. Bonaventure, | ||||
"TCP Extensions for Multipath Operation with Multiple | ||||
Addresses", RFC 6824, DOI 10.17487/RFC6824, January 2013, | ||||
<https://www.rfc-editor.org/info/rfc6824>. | ||||
[RFC6897] Scharf, M. and A. Ford, "Multipath TCP (MPTCP) Application | [RFC6897] Scharf, M. and A. Ford, "Multipath TCP (MPTCP) Application | |||
Interface Considerations", RFC 6897, DOI 10.17487/RFC6897, | Interface Considerations", RFC 6897, DOI 10.17487/RFC6897, | |||
March 2013, <https://www.rfc-editor.org/info/rfc6897>. | March 2013, <https://www.rfc-editor.org/info/rfc6897>. | |||
[RFC7323] Borman, D., Braden, B., Jacobson, V., and R. | [RFC7323] Borman, D., Braden, B., Jacobson, V., and R. | |||
Scheffenegger, Ed., "TCP Extensions for High Performance", | Scheffenegger, Ed., "TCP Extensions for High Performance", | |||
RFC 7323, DOI 10.17487/RFC7323, September 2014, | RFC 7323, DOI 10.17487/RFC7323, September 2014, | |||
<https://www.rfc-editor.org/info/rfc7323>. | <https://www.rfc-editor.org/info/rfc7323>. | |||
[RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP | [RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP | |||
skipping to change at page 79, line 36 ¶ | skipping to change at page 78, line 36 ¶ | |||
------------------------>|M_TIME WAIT|----------------->| M_CLOSED| | ------------------------>|M_TIME WAIT|----------------->| M_CLOSED| | |||
+-----------+ +---------+ | +-----------+ +---------+ | |||
All subflows in CLOSED | All subflows in CLOSED | |||
------------ | ------------ | |||
delete MPTCP PCB | delete MPTCP PCB | |||
Figure 22: Finite State Machine for Connection Closure | Figure 22: Finite State Machine for Connection Closure | |||
Appendix E. Changes from RFC6824 | Appendix E. Changes from RFC6824 | |||
This section lists the key technical changes between RFC6824 | This section lists the key technical changes between RFC6824, | |||
[RFC6824], specifying MPTCP v0, and this document, which obsoletes | specifying MPTCP v0, and this document, which obsoletes RFC6824 and | |||
RFC6824 and specifies MPTCP v1. Note that this specification is not | specifies MPTCP v1. Note that this specification is not backwards | |||
backwards compatible with RFC6824. | compatible with RFC6824. | |||
o The document incorporates lessons learnt from the various | o The document incorporates lessons learnt from the various | |||
implementations, deployments and experiments gathered in the | implementations, deployments and experiments gathered in the | |||
documents "Use Cases and Operational Experience with Multipath | documents "Use Cases and Operational Experience with Multipath | |||
TCP" [RFC8041] and the IETF Journal article "Multipath TCP | TCP" [RFC8041] and the IETF Journal article "Multipath TCP | |||
Deployments" [deployments]. | Deployments" [deployments]. | |||
o Connection initiation, through the exchange of the MP_CAPABLE | o Connection initiation, through the exchange of the MP_CAPABLE | |||
MPTCP option, is different from RFC6824. In order to permit | MPTCP option, is different from RFC6824. The SYN no longer | |||
servers to act statelessly, the SYN doesn't include A's key (it is | includes the initiator's key, allowing the MP_CAPABLE option on | |||
still sent in the ACK). | the SYN to be shorter in length, and to avoid duplicating the | |||
sending of keying material. | ||||
o This requires MP_CAPABLE to also be sent reliably on the third | o This requires MP_CAPABLE to also be sent reliably on the third | |||
ACK. If safe receipt of the third ACK cannot be inferred, the | ACK. If safe receipt of the third ACK cannot be inferred, the | |||
MP_CAPABLE option must be repeated on the first data packet. | MP_CAPABLE option must be repeated on the first data packet. | |||
o In the Flags field of MP_CAPABLE, C is now assigned to mean that | o In the Flags field of MP_CAPABLE, C is now assigned to mean that | |||
the sender of this option will not accept additional MPTCP | the sender of this option will not accept additional MPTCP | |||
subflows to the source address and port. This is an efficiency | subflows to the source address and port. This is an efficiency | |||
improvement, for example where the sender is behind a strict NAT. | improvement, for example where the sender is behind a strict NAT. | |||
End of changes. 44 change blocks. | ||||
128 lines changed or deleted | 136 lines changed or added | |||