draft-ietf-mptcp-rfc6824bis-07.txt | draft-ietf-mptcp-rfc6824bis-08.txt | |||
---|---|---|---|---|
Internet Engineering Task Force A. Ford | Internet Engineering Task Force A. Ford | |||
Internet-Draft Pexip | Internet-Draft Pexip | |||
Obsoletes: 6824 (if approved) C. Raiciu | Obsoletes: 6824 (if approved) C. Raiciu | |||
Intended status: Experimental U. Politechnica of Bucharest | Intended status: Experimental U. Politechnica of Bucharest | |||
Expires: May 1, 2017 M. Handley | Expires: January 4, 2018 M. Handley | |||
U. College London | U. College London | |||
O. Bonaventure | O. Bonaventure | |||
U. catholique de Louvain | U. catholique de Louvain | |||
C. Paasch | C. Paasch | |||
Apple, Inc. | Apple, Inc. | |||
October 28, 2016 | July 3, 2017 | |||
TCP Extensions for Multipath Operation with Multiple Addresses | TCP Extensions for Multipath Operation with Multiple Addresses | |||
draft-ietf-mptcp-rfc6824bis-07 | draft-ietf-mptcp-rfc6824bis-08 | |||
Abstract | Abstract | |||
TCP/IP communication is currently restricted to a single path per | TCP/IP communication is currently restricted to a single path per | |||
connection, yet multiple paths often exist between peers. The | connection, yet multiple paths often exist between peers. The | |||
simultaneous use of these multiple paths for a TCP/IP session would | simultaneous use of these multiple paths for a TCP/IP session would | |||
improve resource usage within the network and, thus, improve user | improve resource usage within the network and, thus, improve user | |||
experience through higher throughput and improved resilience to | experience through higher throughput and improved resilience to | |||
network failure. | network failure. | |||
skipping to change at page 2, line 7 | skipping to change at page 2, line 7 | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on May 1, 2017. | This Internet-Draft will expire on January 4, 2018. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2016 IETF Trust and the persons identified as the | Copyright (c) 2017 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
skipping to change at page 2, line 43 | skipping to change at page 2, line 43 | |||
2.1. Initiating an MPTCP Connection . . . . . . . . . . . . . 8 | 2.1. Initiating an MPTCP Connection . . . . . . . . . . . . . 8 | |||
2.2. Associating a New Subflow with an Existing MPTCP | 2.2. Associating a New Subflow with an Existing MPTCP | |||
Connection . . . . . . . . . . . . . . . . . . . . . . . 9 | Connection . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
2.3. Informing the Other Host about Another Potential Address 9 | 2.3. Informing the Other Host about Another Potential Address 9 | |||
2.4. Data Transfer Using MPTCP . . . . . . . . . . . . . . . . 10 | 2.4. Data Transfer Using MPTCP . . . . . . . . . . . . . . . . 10 | |||
2.5. Requesting a Change in a Path's Priority . . . . . . . . 11 | 2.5. Requesting a Change in a Path's Priority . . . . . . . . 11 | |||
2.6. Closing an MPTCP Connection . . . . . . . . . . . . . . . 11 | 2.6. Closing an MPTCP Connection . . . . . . . . . . . . . . . 11 | |||
2.7. Notable Features . . . . . . . . . . . . . . . . . . . . 12 | 2.7. Notable Features . . . . . . . . . . . . . . . . . . . . 12 | |||
3. MPTCP Protocol . . . . . . . . . . . . . . . . . . . . . . . 12 | 3. MPTCP Protocol . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
3.1. Connection Initiation . . . . . . . . . . . . . . . . . . 13 | 3.1. Connection Initiation . . . . . . . . . . . . . . . . . . 13 | |||
3.2. Starting a New Subflow . . . . . . . . . . . . . . . . . 19 | 3.2. Starting a New Subflow . . . . . . . . . . . . . . . . . 20 | |||
3.3. General MPTCP Operation . . . . . . . . . . . . . . . . . 25 | 3.3. General MPTCP Operation . . . . . . . . . . . . . . . . . 25 | |||
3.3.1. Data Sequence Mapping . . . . . . . . . . . . . . . . 26 | 3.3.1. Data Sequence Mapping . . . . . . . . . . . . . . . . 27 | |||
3.3.2. Data Acknowledgments . . . . . . . . . . . . . . . . 30 | 3.3.2. Data Acknowledgments . . . . . . . . . . . . . . . . 30 | |||
3.3.3. Closing a Connection . . . . . . . . . . . . . . . . 31 | 3.3.3. Closing a Connection . . . . . . . . . . . . . . . . 31 | |||
3.3.4. Receiver Considerations . . . . . . . . . . . . . . . 32 | 3.3.4. Receiver Considerations . . . . . . . . . . . . . . . 32 | |||
3.3.5. Sender Considerations . . . . . . . . . . . . . . . . 33 | 3.3.5. Sender Considerations . . . . . . . . . . . . . . . . 33 | |||
3.3.6. Reliability and Retransmissions . . . . . . . . . . . 34 | 3.3.6. Reliability and Retransmissions . . . . . . . . . . . 34 | |||
3.3.7. Congestion Control Considerations . . . . . . . . . . 35 | 3.3.7. Congestion Control Considerations . . . . . . . . . . 35 | |||
3.3.8. Subflow Policy . . . . . . . . . . . . . . . . . . . 36 | 3.3.8. Subflow Policy . . . . . . . . . . . . . . . . . . . 36 | |||
3.4. Address Knowledge Exchange (Path Management) . . . . . . 37 | 3.4. Address Knowledge Exchange (Path Management) . . . . . . 37 | |||
3.4.1. Address Advertisement . . . . . . . . . . . . . . . . 38 | 3.4.1. Address Advertisement . . . . . . . . . . . . . . . . 39 | |||
3.4.2. Remove Address . . . . . . . . . . . . . . . . . . . 42 | 3.4.2. Remove Address . . . . . . . . . . . . . . . . . . . 42 | |||
3.5. Fast Close . . . . . . . . . . . . . . . . . . . . . . . 43 | 3.5. Fast Close . . . . . . . . . . . . . . . . . . . . . . . 43 | |||
3.6. Subflow Reset . . . . . . . . . . . . . . . . . . . . . . 44 | 3.6. Subflow Reset . . . . . . . . . . . . . . . . . . . . . . 44 | |||
3.7. MPTCP Experimental Option . . . . . . . . . . . . . . . . 46 | 3.7. MPTCP Experimental Option . . . . . . . . . . . . . . . . 46 | |||
3.8. Fallback . . . . . . . . . . . . . . . . . . . . . . . . 47 | 3.8. Fallback . . . . . . . . . . . . . . . . . . . . . . . . 47 | |||
3.9. Error Handling . . . . . . . . . . . . . . . . . . . . . 51 | 3.9. Error Handling . . . . . . . . . . . . . . . . . . . . . 51 | |||
3.10. Heuristics . . . . . . . . . . . . . . . . . . . . . . . 51 | 3.10. Heuristics . . . . . . . . . . . . . . . . . . . . . . . 51 | |||
3.10.1. Port Usage . . . . . . . . . . . . . . . . . . . . . 51 | 3.10.1. Port Usage . . . . . . . . . . . . . . . . . . . . . 52 | |||
3.10.2. Delayed Subflow Start and Subflow Symmetry . . . . . 52 | 3.10.2. Delayed Subflow Start and Subflow Symmetry . . . . . 52 | |||
3.10.3. Failure Handling . . . . . . . . . . . . . . . . . . 53 | 3.10.3. Failure Handling . . . . . . . . . . . . . . . . . . 53 | |||
4. Semantic Issues . . . . . . . . . . . . . . . . . . . . . . . 54 | 4. Semantic Issues . . . . . . . . . . . . . . . . . . . . . . . 54 | |||
5. Security Considerations . . . . . . . . . . . . . . . . . . . 55 | 5. Security Considerations . . . . . . . . . . . . . . . . . . . 55 | |||
6. Interactions with Middleboxes . . . . . . . . . . . . . . . . 57 | 6. Interactions with Middleboxes . . . . . . . . . . . . . . . . 58 | |||
7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 61 | 7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 61 | |||
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 61 | 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 61 | |||
8.1. MPTCP Option Subtypes . . . . . . . . . . . . . . . . . . 62 | 8.1. MPTCP Option Subtypes . . . . . . . . . . . . . . . . . . 62 | |||
8.2. MPTCP Handshake Algorithms . . . . . . . . . . . . . . . 63 | 8.2. MPTCP Handshake Algorithms . . . . . . . . . . . . . . . 63 | |||
8.3. MP_TCPRST Reason Codes . . . . . . . . . . . . . . . . . 63 | 8.3. MP_TCPRST Reason Codes . . . . . . . . . . . . . . . . . 63 | |||
8.4. Experimental option registry . . . . . . . . . . . . . . 64 | 8.4. Experimental option registry . . . . . . . . . . . . . . 64 | |||
9. References . . . . . . . . . . . . . . . . . . . . . . . . . 64 | 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 64 | |||
9.1. Normative References . . . . . . . . . . . . . . . . . . 64 | 9.1. Normative References . . . . . . . . . . . . . . . . . . 64 | |||
9.2. Informative References . . . . . . . . . . . . . . . . . 65 | 9.2. Informative References . . . . . . . . . . . . . . . . . 65 | |||
Appendix A. Notes on Use of TCP Options . . . . . . . . . . . . 68 | Appendix A. Notes on Use of TCP Options . . . . . . . . . . . . 68 | |||
skipping to change at page 12, line 22 | skipping to change at page 12, line 22 | |||
by a NAT. Setting up a new TCP flow is not possible if the | by a NAT. Setting up a new TCP flow is not possible if the | |||
passive opener is behind a NAT; to allow subflows to be created | passive opener is behind a NAT; to allow subflows to be created | |||
when either end is behind a NAT, MPTCP uses the ADD_ADDR message. | when either end is behind a NAT, MPTCP uses the ADD_ADDR message. | |||
o MPTCP falls back to ordinary TCP if MPTCP operation is not | o MPTCP falls back to ordinary TCP if MPTCP operation is not | |||
possible, for example, if one host is not MPTCP capable or if a | possible, for example, if one host is not MPTCP capable or if a | |||
middlebox alters the payload. | middlebox alters the payload. | |||
o To meet the threats identified in [RFC6181], the following steps | o To meet the threats identified in [RFC6181], the following steps | |||
are taken: keys are sent in the clear in the MP_CAPABLE messages; | are taken: keys are sent in the clear in the MP_CAPABLE messages; | |||
MP_JOIN messages are secured with HMAC-SHA1 ([RFC2104], [sha1]) | MP_JOIN messages are secured with HMAC-SHA256 ([RFC2104], [SHS]) | |||
using those keys; and standard TCP validity checks are made on the | using those keys; and standard TCP validity checks are made on the | |||
other messages (ensuring sequence numbers are in-window | other messages (ensuring sequence numbers are in-window | |||
[RFC5961]). | [RFC5961]). | |||
3. MPTCP Protocol | 3. MPTCP Protocol | |||
This section describes the operation of the MPTCP protocol, and is | This section describes the operation of the MPTCP protocol, and is | |||
subdivided into sections for each key part of the protocol operation. | subdivided into sections for each key part of the protocol operation. | |||
All MPTCP operations are signaled using optional TCP header fields. | All MPTCP operations are signaled using optional TCP header fields. | |||
skipping to change at page 16, line 29 | skipping to change at page 16, line 29 | |||
back to using regular TCP by not sending an MP_CAPABLE in the SYN/ACK. | back to using regular TCP by not sending an MP_CAPABLE in the SYN/ACK. | |||
The ACK carries both A's key and B's key. This is the first time | The ACK carries both A's key and B's key. This is the first time | |||
that A's key is seen on the wire, although it is expected that A will | that A's key is seen on the wire, although it is expected that A will | |||
have generated a key locally before the initial SYN. The echoing of | have generated a key locally before the initial SYN. The echoing of | |||
B's key allows B to operate statelessly, as described above. | B's key allows B to operate statelessly, as described above. | |||
Therefore, A's key must be delivered reliably to B, and in order to | Therefore, A's key must be delivered reliably to B, and in order to | |||
do this, the transmission of this packet must be made reliable. | do this, the transmission of this packet must be made reliable. | |||
If B has data to send first, then the reliable delivery of the ACK | If B has data to send first, then the reliable delivery of the ACK | |||
can be inferred by the receipt of this data with an appropriate MPTCP | can be inferred by the receipt of this data with an MPTCP Data | |||
Data Sequence Signal (DSS) option (Section 3.3). If, however, A | Sequence Signal (DSS) option (Section 3.3). If, however, A wishes to | |||
wishes to send data first, it would not know whether the ACK has | send data first, it would not know whether the ACK has successfully | |||
successfully been received, and thus whether the MPTCP is | been received, and thus whether the MPTCP is successfully | |||
successfully established. Therefore, on the first data A has to send | established. Therefore, on the first data A has to send (if it has | |||
(if it has not received any data from B), it MUST also include an | not received any data from B), it MUST also include an MP_CAPABLE | |||
MP_CAPABLE option, with additional data parameters. This packet may | option, with additional data parameters (the Data-Level Length and | |||
be the third ACK if data is ready to be sent by the application, or | optional Checksum as shown in Figure 4). This packet may be the | |||
may be a later packet if the application only later has data to send. | third ACK if data is ready to be sent by the application, or may be a | |||
This MP_CAPABLE option is in place of the DSS, and simply specifies | later packet if the application only later has data to send. This | |||
the data-level length of the payload, and the checksum (if the use of | MP_CAPABLE option is in place of the DSS, and simply specifies the | |||
data-level length of the payload, and the checksum (if the use of | ||||
checksums is negotiated). This is the minimal data required to | checksums is negotiated). This is the minimal data required to | |||
establish an MPTCP connection - it allows validation of the payload, | establish an MPTCP connection - it allows validation of the payload, | |||
and given it is the first data, the Initial Data Sequence Number | and given it is the first data, the Initial Data Sequence Number | |||
(IDSN) is also known (as it is generated from the key, as described | (IDSN) is also known (as it is generated from the key, as described | |||
below). Conveying the keys on the first data packet allows the TCP | below). Conveying the keys on the first data packet allows the TCP | |||
reliability mechanisms to ensure the packet is successfully | reliability mechanisms to ensure the packet is successfully | |||
delivered. The receiver will acknowledge this data at the connection | delivered. The receiver will acknowledge this data at the connection | |||
level with a Data ACK, as if a DSS option has been received. | level with a Data ACK, as if a DSS option has been received. | |||
There could be situations where both A and B attempt to transmit | ||||
initial data at the same time. For example, if A did not initially | ||||
have data to send, but then needed to transmit data before it had | ||||
received anything from B, it would use an MP_CAPABLE option with data | ||||
parameters (since it would not know if the MP_CAPABLE on the ACK was | ||||
received). In such a situation, B may also have transmitted data | ||||
with a DSS option, but it had not yet been received at A. Therefore, | ||||
B has received data with an MP_CAPABLE mapping after it has sent data | ||||
with a DSS option. To ensure these situations can be handled, it | ||||
follows that the data parameters in an MP_CAPABLE are semantically | ||||
equivalent to those in a DSS option and can be used interchangeably. | ||||
Similar situations could occur when the MP_CAPABLE with data is lost | ||||
and retransmitted. Furthermore, in the case of TCP Segmentation | ||||
Offloading, the MP_CAPABLE with data parameters may be duplicated | ||||
across multiple packets, and implementations must also be able to | ||||
cope with duplicate MP_CAPABLE mappings as well as duplicate DSS | ||||
mappings. | ||||
Additionally, the MP_CAPABLE exchange allows the safe passage of | Additionally, the MP_CAPABLE exchange allows the safe passage of | |||
MPTCP options on SYN packets to be determined. If any of these | MPTCP options on SYN packets to be determined. If any of these | |||
options are dropped, MPTCP will gracefully fall back to regular | options are dropped, MPTCP will gracefully fall back to regular | |||
single-path TCP, as documented in Section 3.8. Note that new | single-path TCP, as documented in Section 3.8. Note that new | |||
subflows MUST NOT be established (using the process documented in | subflows MUST NOT be established (using the process documented in | |||
Section 3.2) until a Data Sequence Signal (DSS) option has been | Section 3.2) until a Data Sequence Signal (DSS) option has been | |||
successfully received across the path (as documented in Section 3.3). | successfully received across the path (as documented in Section 3.3). | |||
The first 4 bits of the first octet in the MP_CAPABLE option | The first 4 bits of the first octet in the MP_CAPABLE option | |||
(Figure 4) define the MPTCP option subtype (see Section 8; for | (Figure 4) define the MPTCP option subtype (see Section 8; for | |||
skipping to change at page 17, line 41 | skipping to change at page 18, line 14 | |||
C: The third bit, labeled "C", is set to "1" to indicate that the | C: The third bit, labeled "C", is set to "1" to indicate that the | |||
sender of this option will not accept additional MPTCP subflows to | sender of this option will not accept additional MPTCP subflows to | |||
the source address and port, and therefore the receiver MUST NOT | the source address and port, and therefore the receiver MUST NOT | |||
try to open any additional subflows towards this address and port. | try to open any additional subflows towards this address and port. | |||
This is an efficiency improvement for situations where the sender | This is an efficiency improvement for situations where the sender | |||
knows a restriction is in place, for example if the sender is | knows a restriction is in place, for example if the sender is | |||
behind a strict NAT, or operating behind a legacy Layer 4 load | behind a strict NAT, or operating behind a legacy Layer 4 load | |||
balancer. | balancer. | |||
D through H: The remaining bits, labeled "C" through "H", are used | D through H: The remaining bits, labeled "D" through "H", are used | |||
for crypto algorithm negotiation. Currently only the rightmost | for crypto algorithm negotiation. Currently only the rightmost | |||
bit, labeled "H", is assigned. Bit "H" indicates the use of HMAC- | bit, labeled "H", is assigned. Bit "H" indicates the use of HMAC- | |||
SHA1 (as defined in Section 3.2). An implementation that only | SHA1 (as defined in Section 3.2). An implementation that only | |||
supports this method MUST set bit "H" to 1, and bits "C" through | supports this method MUST set bit "H" to 1, and bits "D" through | |||
"G" to 0. | "G" to 0. | |||
A crypto algorithm MUST be specified. If flag bits C through H are | A crypto algorithm MUST be specified. If flag bits D through H are | |||
all 0, the MP_CAPABLE option MUST be treated as invalid and ignored | all 0, the MP_CAPABLE option MUST be treated as invalid and ignored | |||
(that is, it must be treated as a regular TCP handshake). | (that is, it must be treated as a regular TCP handshake). | |||
The selection of the authentication algorithm also impacts the | The selection of the authentication algorithm also impacts the | |||
algorithm used to generate the token and the Initial Data Sequence | algorithm used to generate the token and the Initial Data Sequence | |||
Number (IDSN). In this specification, with only the SHA-1 algorithm | Number (IDSN). In this specification, with only the SHA-256 | |||
(bit "H") specified and selected, the token MUST be a truncated (most | algorithm (bit "H") specified and selected, the token MUST be a | |||
significant 32 bits) SHA-1 hash ([sha1], [RFC6234]) of the key. A | truncated (most significant 32 bits) SHA-256 hash ([SHS], [RFC6234]) | |||
different, 64-bit truncation (the least significant 64 bits) of the | of the key. A different, 64-bit truncation (the least significant 64 | |||
SHA-1 hash of the key MUST be used as the IDSN. Note that the key | bits) of the SHA-256 hash of the key MUST be used as the IDSN. Note | |||
MUST be hashed in network byte order. Also note that the "least | that the key MUST be hashed in network byte order. Also note that | |||
significant" bits MUST be the rightmost bits of the SHA-1 digest, as | the "least significant" bits MUST be the rightmost bits of the | |||
per [sha1]. Future specifications of the use of the crypto bits may | SHA-256 digest, as per [SHS]. Future specifications of the use of | |||
choose to specify different algorithms for token and IDSN generation. | the crypto bits may choose to specify different algorithms for token | |||
and IDSN generation. | ||||
Both the crypto and checksum bits negotiate capabilities in similar | Both the crypto and checksum bits negotiate capabilities in similar | |||
ways. For the Checksum Required bit (labeled "A"), if either host | ways. For the Checksum Required bit (labeled "A"), if either host | |||
requires the use of checksums, checksums MUST be used. In other | requires the use of checksums, checksums MUST be used. In other | |||
words, the only way for checksums not to be used is if both hosts in | words, the only way for checksums not to be used is if both hosts in | |||
their SYNs set A=0. This decision is confirmed by the setting of the | their SYNs set A=0. This decision is confirmed by the setting of the | |||
"A" bit in the third packet (the ACK) of the handshake. For example, | "A" bit in the third packet (the ACK) of the handshake. For example, | |||
if the initiator sets A=0 in the SYN, but the responder sets A=1 in | if the initiator sets A=0 in the SYN, but the responder sets A=1 in | |||
the SYN/ACK, checksums MUST be used in both directions, and the | the SYN/ACK, checksums MUST be used in both directions, and the | |||
initiator will set A=1 in the ACK. The decision whether to use | initiator will set A=1 in the ACK. The decision whether to use | |||
skipping to change at page 19, line 26 | skipping to change at page 19, line 49 | |||
made first will be up to local policy. It is possible that MPTCP and | made first will be up to local policy. It is possible that MPTCP and | |||
non-MPTCP SYNs could get reordered in the network. Therefore, the | non-MPTCP SYNs could get reordered in the network. Therefore, the | |||
final state is inferred from the presence or absence of the | final state is inferred from the presence or absence of the | |||
MP_CAPABLE option in the third packet of the TCP handshake. If this | MP_CAPABLE option in the third packet of the TCP handshake. If this | |||
option is not present, the connection SHOULD fall back to regular | option is not present, the connection SHOULD fall back to regular | |||
TCP, as documented in Section 3.8. | TCP, as documented in Section 3.8. | |||
The initial data sequence number on an MPTCP connection is generated | The initial data sequence number on an MPTCP connection is generated | |||
from the key. The algorithm for IDSN generation is also determined | from the key. The algorithm for IDSN generation is also determined | |||
from the negotiated authentication algorithm. In this specification, | from the negotiated authentication algorithm. In this specification, | |||
with only the SHA-1 algorithm specified and selected, the IDSN of a | with only the SHA-256 algorithm specified and selected, the IDSN of a | |||
host MUST be the least significant 64 bits of the SHA-1 hash of its | host MUST be the least significant 64 bits of the SHA-256 hash of its | |||
key, i.e., IDSN-A = Hash(Key-A) and IDSN-B = Hash(Key-B). This | key, i.e., IDSN-A = Hash(Key-A) and IDSN-B = Hash(Key-B). This | |||
deterministic generation of the IDSN allows a receiver to ensure that | deterministic generation of the IDSN allows a receiver to ensure that | |||
there are no gaps in sequence space at the start of the connection. | there are no gaps in sequence space at the start of the connection. | |||
The SYN with MP_CAPABLE occupies the first octet of data sequence | The SYN with MP_CAPABLE occupies the first octet of data sequence | |||
space, although this does not need to be acknowledged at the | space, although this does not need to be acknowledged at the | |||
connection level until the first data is sent (see Section 3.3). | connection level until the first data is sent (see Section 3.3). | |||
3.2. Starting a New Subflow | 3.2. Starting a New Subflow | |||
Once an MPTCP connection has begun with the MP_CAPABLE exchange, | Once an MPTCP connection has begun with the MP_CAPABLE exchange, | |||
skipping to change at page 20, line 18 | skipping to change at page 20, line 39 | |||
algorithm. An MP_JOIN option is present in the SYN, SYN/ACK, and ACK | algorithm. An MP_JOIN option is present in the SYN, SYN/ACK, and ACK | |||
of the three-way handshake, although in each case with a different | of the three-way handshake, although in each case with a different | |||
format. | format. | |||
In the first MP_JOIN on the SYN packet, illustrated in Figure 5, the | In the first MP_JOIN on the SYN packet, illustrated in Figure 5, the | |||
initiator sends a token, random number, and address ID. | initiator sends a token, random number, and address ID. | |||
The token is used to identify the MPTCP connection and is a | The token is used to identify the MPTCP connection and is a | |||
cryptographic hash of the receiver's key, as exchanged in the initial | cryptographic hash of the receiver's key, as exchanged in the initial | |||
MP_CAPABLE handshake (Section 3.1). In this specification, the | MP_CAPABLE handshake (Section 3.1). In this specification, the | |||
tokens presented in this option are generated by the SHA-1 ([sha1], | tokens presented in this option are generated by the SHA-256 ([SHS], | |||
[RFC6234]) algorithm, truncated to the most significant 32 bits. The | [RFC6234]) algorithm, truncated to the most significant 32 bits. The | |||
token included in the MP_JOIN option is the token that the receiver | token included in the MP_JOIN option is the token that the receiver | |||
of the packet uses to identify this connection; i.e., Host A will | of the packet uses to identify this connection; i.e., Host A will | |||
send Token-B (which is generated from Key-B). Note that the hash | send Token-B (which is generated from Key-B). Note that the hash | |||
generation algorithm can be overridden by the choice of cryptographic | generation algorithm can be overridden by the choice of cryptographic | |||
handshake algorithm, as defined in Section 3.1. | handshake algorithm, as defined in Section 3.1. | |||
The MP_JOIN SYN sends not only the token (which is static for a | The MP_JOIN SYN sends not only the token (which is static for a | |||
connection) but also random numbers (nonces) that are used to prevent | connection) but also random numbers (nonces) that are used to prevent | |||
replay attacks on the authentication method. Recommendations for the | replay attacks on the authentication method. Recommendations for the | |||
skipping to change at page 21, line 46 | skipping to change at page 22, line 19 | |||
that the 32-bit token in the MP_JOIN SYN gives sufficient protection | that the 32-bit token in the MP_JOIN SYN gives sufficient protection | |||
against blind state exhaustion attacks; therefore, there is no need | against blind state exhaustion attacks; therefore, there is no need | |||
to provide mechanisms to allow a responder to operate statelessly at | to provide mechanisms to allow a responder to operate statelessly at | |||
the MP_JOIN stage. | the MP_JOIN stage. | |||
An HMAC is sent by both hosts -- by the initiator (Host A) in the | An HMAC is sent by both hosts -- by the initiator (Host A) in the | |||
third packet (the ACK) and by the responder (Host B) in the second | third packet (the ACK) and by the responder (Host B) in the second | |||
packet (the SYN/ACK). Doing the HMAC exchange at this stage allows | packet (the SYN/ACK). Doing the HMAC exchange at this stage allows | |||
both hosts to have first exchanged random data (in the first two SYN | both hosts to have first exchanged random data (in the first two SYN | |||
packets) that is used as the "message". This specification defines | packets) that is used as the "message". This specification defines | |||
that HMAC as defined in [RFC2104] is used, along with the SHA-1 hash | that HMAC as defined in [RFC2104] is used, along with the SHA-256 | |||
algorithm [sha1] (potentially implemented as in [RFC6234]), thus | hash algorithm [SHS] (potentially implemented as in [RFC6234]), thus | |||
generating a 160-bit / 20-octet HMAC. Due to option space | generating a 160-bit / 20-octet HMAC. Due to option space | |||
limitations, the HMAC included in the SYN/ACK is truncated to the | limitations, the HMAC included in the SYN/ACK is truncated to the | |||
leftmost 64 bits, but this is acceptable since random numbers are | leftmost 64 bits, but this is acceptable since random numbers are | |||
used; thus, an attacker only has one chance to guess the HMAC | used; thus, an attacker only has one chance to guess the HMAC | |||
correctly (if the HMAC is incorrect, the TCP connection is closed, so | correctly (if the HMAC is incorrect, the TCP connection is closed, so | |||
a new MP_JOIN negotiation with a new random number is required). | a new MP_JOIN negotiation with a new random number is required). | |||
The initiator's authentication information is sent in its first ACK | The initiator's authentication information is sent in its first ACK | |||
(the third packet of the handshake), as shown in Figure 7. This data | (the third packet of the handshake), as shown in Figure 7. This data | |||
needs to be sent reliably, since it is the only time this HMAC is | needs to be sent reliably, since it is the only time this HMAC is | |||
skipping to change at page 29, line 32 | skipping to change at page 29, line 42 | |||
incrementing the upper 32 bits of sequence number each time the lower | incrementing the upper 32 bits of sequence number each time the lower | |||
32 bits wrap. A sanity check MUST be implemented to ensure that a | 32 bits wrap. A sanity check MUST be implemented to ensure that a | |||
wrap occurs at an expected time (e.g., the sequence number jumps from | wrap occurs at an expected time (e.g., the sequence number jumps from | |||
a very high number to a very low number) and is not triggered by out- | a very high number to a very low number) and is not triggered by out- | |||
of-order packets. | of-order packets. | |||
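One possible way to implement the wrap handling and its sanity check is to pick, among the neighbouring values of the upper 32 bits, the 64-bit candidate closest to the last known data sequence number. The sketch below is an assumption about how an implementation might do this; neither draft mandates this particular method, and `expand_dsn` is a hypothetical name.

```python
MOD32 = 1 << 32
MOD64 = 1 << 64


def expand_dsn(last_dsn64: int, dsn32: int) -> int:
    """Infer a full 64-bit data sequence number from its lower 32 bits.

    Candidates are built from the previous, current, and next value of the
    upper 32 bits; the one nearest to last_dsn64 wins. A late, out-of-order
    packet therefore maps to the previous epoch instead of causing a
    spurious increment of the upper half.
    """
    upper = last_dsn64 >> 32
    candidates = [(((upper + delta) % MOD32) << 32) | dsn32
                  for delta in (-1, 0, 1)]

    def distance(candidate: int) -> int:
        diff = (candidate - last_dsn64) % MOD64
        return min(diff, MOD64 - diff)

    return min(candidates, key=distance)


if __name__ == "__main__":
    # Lower 32 bits jump from very high to very low: a genuine wrap.
    print(hex(expand_dsn(0x1FFFFFFF0, 0x00000010)))
    # An old packet arriving after the wrap stays in the previous epoch.
    print(hex(expand_dsn(0x200000010, 0xFFFFFFF0)))
```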
As with the standard TCP sequence number, the data sequence number | As with the standard TCP sequence number, the data sequence number | |||
should not start at zero, but at a random value to make blind session | should not start at zero, but at a random value to make blind session | |||
hijacking harder. This specification requires setting the initial | hijacking harder. This specification requires setting the initial | |||
data sequence number (IDSN) of each host to the least significant 64 | data sequence number (IDSN) of each host to the least significant 64 | |||
bits of the SHA-1 hash of the host's key, as described in | bits of the SHA-256 hash of the host's key, as described in | |||
Section 3.1. This is required also in order for the receiver to know | Section 3.1. This is required also in order for the receiver to know | |||
what the expected IDSN is, and thus determine if any initial | what the expected IDSN is, and thus determine if any initial | |||
connection-level packets are missing; this is particularly relevant | connection-level packets are missing; this is particularly relevant | |||
if two subflows start transmitting simultaneously. | if two subflows start transmitting simultaneously. | |||
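As a minimal sketch of the IDSN derivation, assuming that "least significant 64 bits" means the last eight octets of the digest read in network byte order (the function name `initial_dsn` is hypothetical, and the hash is selectable because -07 uses SHA-1 and -08 uses SHA-256):

```python
import hashlib


def initial_dsn(key: bytes, hash_name: str = "sha256") -> int:
    """Derive the initial data sequence number from a host's key.

    Assumption: the least significant 64 bits of the hash are the final
    8 octets of the digest, interpreted as a big-endian integer.
    """
    digest = hashlib.new(hash_name, key).digest()
    return int.from_bytes(digest[-8:], "big")


if __name__ == "__main__":
    key_a = bytes(range(8))  # example 64-bit key, for illustration only
    print(hex(initial_dsn(key_a)))           # draft -08 behaviour (SHA-256)
    print(hex(initial_dsn(key_a, "sha1")))   # draft -07 behaviour (SHA-1)
```

Since both hosts know both keys after the handshake, the receiver can compute the sender's IDSN with the same function and detect a missing first data segment.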
A data sequence mapping does not need to be included in every MPTCP | A data sequence mapping does not need to be included in every MPTCP | |||
packet, as long as the subflow sequence space in that packet is | packet, as long as the subflow sequence space in that packet is | |||
covered by a mapping known at the receiver. This can be used to | covered by a mapping known at the receiver. This can be used to | |||
reduce overhead in cases where the mapping is known in advance; one | reduce overhead in cases where the mapping is known in advance; one | |||
such case is when there is a single subflow between the hosts, | such case is when there is a single subflow between the hosts, | |||
skipping to change at page 39, line 45 ¶ | skipping to change at page 40, line 6 ¶ | |||
the explicit specification of a different port is required. If no | the explicit specification of a different port is required. If no | |||
port is specified, MPTCP SHOULD attempt to connect to the specified | port is specified, MPTCP SHOULD attempt to connect to the specified | |||
address on the same port as is already in use by the subflow on which | address on the same port as is already in use by the subflow on which | |||
the ADD_ADDR signal was sent; this is discussed in more detail in | the ADD_ADDR signal was sent; this is discussed in more detail in | |||
Section 3.10. | Section 3.10. | |||
The Truncated HMAC present in this Option is the rightmost 64 bits of | The Truncated HMAC present in this Option is the rightmost 64 bits of | |||
an HMAC, negotiated and calculated in the same way as for MP_JOIN as | an HMAC, negotiated and calculated in the same way as for MP_JOIN as | |||
described in Section 3.2. For this specification of MPTCP, as there | described in Section 3.2. For this specification of MPTCP, as there | |||
is only one hash algorithm option specified, this will be HMAC as | is only one hash algorithm option specified, this will be HMAC as | |||
defined in [RFC2104], using the SHA-1 hash algorithm [sha1], | defined in [RFC2104], using the SHA-256 hash algorithm [SHS], | |||
implemented as in [RFC6234]. In the same way as for MP_JOIN, the key | implemented as in [RFC6234]. In the same way as for MP_JOIN, the key | |||
for the HMAC algorithm, in the case of the message transmitted by | for the HMAC algorithm, in the case of the message transmitted by | |||
Host A, will be Key-A followed by Key-B, and in the case of Host B, | Host A, will be Key-A followed by Key-B, and in the case of Host B, | |||
Key-B followed by Key-A. These are the keys that were exchanged in | Key-B followed by Key-A. These are the keys that were exchanged in | |||
the original MP_CAPABLE handshake. The message for the HMAC is the | the original MP_CAPABLE handshake. The message for the HMAC is the | |||
Address ID, IP Address, and Port which precede the HMAC in the | Address ID, IP Address, and Port which precede the HMAC in the | |||
ADD_ADDR option. If the port is not present in the ADD_ADDR option, | ADD_ADDR option. If the port is not present in the ADD_ADDR option, | |||
the HMAC message will nevertheless include two octets of value zero. | the HMAC message will nevertheless include two octets of value zero. | |||
The rationale for the HMAC is to prevent unauthorized entities from | The rationale for the HMAC is to prevent unauthorized entities from | |||
injecting ADD_ADDR signals in an attempt to hijack a connection. | injecting ADD_ADDR signals in an attempt to hijack a connection. | |||
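The ADD_ADDR HMAC construction above can be sketched as follows, using the SHA-256 variant from -08. The function name and argument layout are hypothetical; the encoding of the Address ID as one octet and of the Port as two octets in network byte order is an assumption based on the usual option layout.

```python
import hashlib
import hmac
import ipaddress
import struct
from typing import Optional


def add_addr_hmac(key_sender: bytes, key_receiver: bytes,
                  address_id: int, address: str,
                  port: Optional[int] = None) -> bytes:
    """Return the rightmost 64 bits of the ADD_ADDR HMAC.

    Key: sender's key followed by receiver's key (Key-A || Key-B for Host A).
    Message: Address ID, IP address, and Port; two zero octets stand in for
    the Port when the option does not carry one.
    """
    packed_ip = ipaddress.ip_address(address).packed  # 4 or 16 octets
    message = (struct.pack("!B", address_id)
               + packed_ip
               + struct.pack("!H", port if port is not None else 0))
    full = hmac.new(key_sender + key_receiver, message, hashlib.sha256).digest()
    return full[-8:]  # rightmost 64 bits = last 8 octets


if __name__ == "__main__":
    key_a, key_b = b"\x11" * 8, b"\x22" * 8  # illustrative keys only
    print(add_addr_hmac(key_a, key_b, 1, "192.0.2.10").hex())
    print(add_addr_hmac(key_a, key_b, 2, "2001:db8::1", 8443).hex())
```

An on-path observer that never saw the MP_CAPABLE keys cannot produce a valid tag for an address of its choosing.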
skipping to change at page 44, line 15 ¶ | skipping to change at page 44, line 25 ¶ | |||
Host A receives an MP_FASTCLOSE instead of a TCP RST, both hosts | Host A receives an MP_FASTCLOSE instead of a TCP RST, both hosts | |||
attempted fast closure simultaneously. Host A should reply with a | attempted fast closure simultaneously. Host A should reply with a | |||
TCP RST and tear down the connection. | TCP RST and tear down the connection. | |||
o If Host A does not receive a TCP RST in reply to its MP_FASTCLOSE | o If Host A does not receive a TCP RST in reply to its MP_FASTCLOSE | |||
after one retransmission timeout (RTO) (the RTO of the subflow | after one retransmission timeout (RTO) (the RTO of the subflow | |||
where the MPTCP_RST has been sent), it SHOULD retransmit the | where the MPTCP_RST has been sent), it SHOULD retransmit the | |||
MP_FASTCLOSE. The number of retransmissions SHOULD be limited to | MP_FASTCLOSE. The number of retransmissions SHOULD be limited to | |||
prevent this connection from being retained for a long time, but | prevent this connection from being retained for a long time, but | |||
this limit is implementation specific. A RECOMMENDED number is 3. | this limit is implementation specific. A RECOMMENDED number is 3. | |||
If no TCP RST is received in response, Host A SHOULD send a TCP | ||||
RST itself when it releases state in order to clear any remaining | ||||
state at middleboxes. | ||||
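The retransmission rule in this bullet, including the final TCP RST added in -08, might be structured as in the sketch below. The three callables are hypothetical hooks standing in for a real stack: `send_fastclose` transmits the option, `wait_for_rst` blocks for up to one RTO of the subflow and reports whether a TCP RST arrived, and `send_rst` emits a TCP RST.

```python
from typing import Callable


def fast_close(send_fastclose: Callable[[], None],
               wait_for_rst: Callable[[], bool],
               send_rst: Callable[[], None],
               max_retransmissions: int = 3) -> bool:
    """Run the MP_FASTCLOSE exchange; return True if the peer replied.

    One initial transmission plus at most max_retransmissions retries,
    each followed by a wait of one RTO. If no TCP RST ever arrives, a
    TCP RST is sent before state is released so that middleboxes can
    clear their own state.
    """
    for _ in range(1 + max_retransmissions):
        send_fastclose()
        if wait_for_rst():
            return True
    send_rst()
    return False


if __name__ == "__main__":
    log = []
    answered = fast_close(lambda: log.append("MP_FASTCLOSE"),
                          lambda: False,              # peer stays silent
                          lambda: log.append("RST"))
    print(answered, log)
```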
3.6. Subflow Reset | 3.6. Subflow Reset | |||
As discussed in Section 3.5 above, the MP_FASTCLOSE option provides a | As discussed in Section 3.5 above, the MP_FASTCLOSE option provides a | |||
connection-level reset roughly analogous to a TCP RST. Regular TCP | connection-level reset roughly analogous to a TCP RST. Regular TCP | |||
RST options remain in use at the subflow level to indicate that the | RST options remain in use at the subflow level to indicate that the | |||
receiving host has no knowledge of the MPTCP subflow or TCP | receiving host has no knowledge of the MPTCP subflow or TCP | |||
connection to which the packet belongs. | connection to which the packet belongs. | |||
However, in MPTCP, there may be many reasons for rejecting the | However, in MPTCP, there may be many reasons for rejecting the | |||
skipping to change at page 63, line 12 ¶ | skipping to change at page 63, line 12 ¶ | |||
Values 0x9 through 0xe are currently unassigned. | Values 0x9 through 0xe are currently unassigned. | |||
8.2. MPTCP Handshake Algorithms | 8.2. MPTCP Handshake Algorithms | |||
IANA has created another sub-registry, "MPTCP Handshake Algorithms" | IANA has created another sub-registry, "MPTCP Handshake Algorithms" | |||
under the "Transmission Control Protocol (TCP) Parameters" registry, | under the "Transmission Control Protocol (TCP) Parameters" registry, | |||
based on the flags in MP_CAPABLE (Section 3.1). IANA is requested to | based on the flags in MP_CAPABLE (Section 3.1). IANA is requested to | |||
update the references of this table to this document, as follows: | update the references of this table to this document, as follows: | |||
+----------+-------------------+----------------------------+ | +---------+----------------------------------+----------------------+ | |||
| Flag Bit | Meaning | Reference | | | Flag | Meaning | Reference | | |||
+----------+-------------------+----------------------------+ | | Bit | | | | |||
| A | Checksum required | This document, Section 3.1 | | +---------+----------------------------------+----------------------+ | |||
| B | Extensibility | This document, Section 3.1 | | | A | Checksum required | This document, | | |||
| C-G | Unassigned | | | | | | Section 3.1 | | |||
| H | HMAC-SHA1 | This document, Section 3.2 | | | B | Extensibility | This document, | | |||
+----------+-------------------+----------------------------+ | | | | Section 3.1 | | |||
| C | Do not attempt to connect to | This document, | | ||||
| | source address | Section 3.1 | | ||||
| D-G | Unassigned | | | ||||
| H | HMAC-SHA1 | This document, | | ||||
| | | Section 3.2 | | ||||
+---------+----------------------------------+----------------------+ | ||||
Table 3: MPTCP Handshake Algorithms | Table 3: MPTCP Handshake Algorithms | |||
Note that the meanings of bits C through H can be dependent upon bit | Note that the meanings of bits D through H can be dependent upon bit | |||
B, depending on how Extensibility is defined in future | B, depending on how Extensibility is defined in future | |||
specifications; see Section 3.1 for more information. | specifications; see Section 3.1 for more information. | |||
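To make the registry concrete, the sketch below decodes a flags octet into the bits A through H and attaches the meanings from the -08 column of Table 3. It assumes that A is the most significant bit of the octet and H the least significant; that layout is defined in Section 3.1, which is not part of this diff excerpt, so treat it as an assumption. The names `MEANINGS` and `decode_flags` are hypothetical.

```python
from typing import Dict

# Meanings as listed in the -08 column of Table 3; D through G are unassigned.
MEANINGS: Dict[str, str] = {
    "A": "Checksum required",
    "B": "Extensibility",
    "C": "Do not attempt to connect to source address",
    "H": "HMAC-SHA1",
}


def decode_flags(octet: int) -> Dict[str, bool]:
    """Map each flag letter to whether its bit is set.

    Assumption: bit A is 0x80 and bit H is 0x01.
    """
    return {name: bool(octet & (0x80 >> index))
            for index, name in enumerate("ABCDEFGH")}


if __name__ == "__main__":
    flags = decode_flags(0x81)
    for name, is_set in flags.items():
        if is_set:
            print(name, MEANINGS.get(name, "Unassigned"))
```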
Future assignments in this registry are also to be defined by | Future assignments in this registry are also to be defined by | |||
Standards Action as defined by [RFC5226]. Assignments consist of the | Standards Action as defined by [RFC5226]. Assignments consist of the | |||
value of the flags, a symbolic name for the algorithm, and a | value of the flags, a symbolic name for the algorithm, and a | |||
reference to its specification. | reference to its specification. | |||
8.3. MP_TCPRST Reason Codes | 8.3. MP_TCPRST Reason Codes | |||
skipping to change at page 64, line 46 ¶ | skipping to change at page 65, line 15 ¶ | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<http://www.rfc-editor.org/info/rfc2119>. | <http://www.rfc-editor.org/info/rfc2119>. | |||
[RFC6182] Ford, A., Raiciu, C., Handley, M., Barre, S., and J. | [RFC6182] Ford, A., Raiciu, C., Handley, M., Barre, S., and J. | |||
Iyengar, "Architectural Guidelines for Multipath TCP | Iyengar, "Architectural Guidelines for Multipath TCP | |||
Development", RFC 6182, DOI 10.17487/RFC6182, March 2011, | Development", RFC 6182, DOI 10.17487/RFC6182, March 2011, | |||
<http://www.rfc-editor.org/info/rfc6182>. | <http://www.rfc-editor.org/info/rfc6182>. | |||
[sha1] National Institute of Standards and Technology, "Secure Hash | [SHS] National Institute of Standards and Technology, "Secure Hash | |||
Standard", Federal Information Processing Standard | Standard", Federal Information Processing Standard | |||
(FIPS) 180-3, October 2008, | (FIPS) 180-4, August 2015, | |||
<http://csrc.nist.gov/publications/fips/fips180-3/ | <http://nvlpubs.nist.gov/nistpubs/FIPS/ | |||
fips180-3_final.pdf>. | NIST.FIPS.180-4.pdf>. | |||
9.2. Informative References | 9.2. Informative References | |||
[howhard] Raiciu, C., Paasch, C., Barre, S., Ford, A., Honda, M., | [howhard] Raiciu, C., Paasch, C., Barre, S., Ford, A., Honda, M., | |||
Duchene, F., Bonaventure, O., and M. Handley, "How Hard | Duchene, F., Bonaventure, O., and M. Handley, "How Hard | |||
Can It Be? Designing and Implementing a Deployable | Can It Be? Designing and Implementing a Deployable | |||
Multipath TCP", Usenix Symposium on Networked Systems | Multipath TCP", Usenix Symposium on Networked Systems | |||
Design and Implementation 2012, 2012, | Design and Implementation 2012, 2012, | |||
<https://www.usenix.org/conference/nsdi12/how-hard-can-it- | <https://www.usenix.org/conference/nsdi12/how-hard-can-it- | |||
be-designing-and-implementing-deployable-multipath-tcp>. | be-designing-and-implementing-deployable-multipath-tcp>. | |||
skipping to change at page 66, line 31 ¶ | skipping to change at page 66, line 49 ¶ | |||
[RFC4086] Eastlake 3rd, D., Schiller, J., and S. Crocker, | [RFC4086] Eastlake 3rd, D., Schiller, J., and S. Crocker, | |||
"Randomness Requirements for Security", BCP 106, RFC 4086, | "Randomness Requirements for Security", BCP 106, RFC 4086, | |||
DOI 10.17487/RFC4086, June 2005, | DOI 10.17487/RFC4086, June 2005, | |||
<http://www.rfc-editor.org/info/rfc4086>. | <http://www.rfc-editor.org/info/rfc4086>. | |||
[RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common | [RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common | |||
Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007, | Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007, | |||
<http://www.rfc-editor.org/info/rfc4987>. | <http://www.rfc-editor.org/info/rfc4987>. | |||
[RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an | [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an | |||
IANA Considerations Section in RFCs", BCP 26, RFC 5226, | IANA Considerations Section in RFCs", RFC 5226, | |||
DOI 10.17487/RFC5226, May 2008, | DOI 10.17487/RFC5226, May 2008, | |||
<http://www.rfc-editor.org/info/rfc5226>. | <http://www.rfc-editor.org/info/rfc5226>. | |||
[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion | [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion | |||
Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, | Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, | |||
<http://www.rfc-editor.org/info/rfc5681>. | <http://www.rfc-editor.org/info/rfc5681>. | |||
[RFC5961] Ramaiah, A., Stewart, R., and M. Dalal, "Improving TCP's | [RFC5961] Ramaiah, A., Stewart, R., and M. Dalal, "Improving TCP's | |||
Robustness to Blind In-Window Attacks", RFC 5961, | Robustness to Blind In-Window Attacks", RFC 5961, | |||
DOI 10.17487/RFC5961, August 2010, | DOI 10.17487/RFC5961, August 2010, | |||
End of changes. 28 change blocks. | ||||
55 lines changed or deleted | 84 lines changed or added | |||