draft-ietf-mptcp-rfc6824bis-15.txt   draft-ietf-mptcp-rfc6824bis-16.txt 
Internet Engineering Task Force A. Ford Internet Engineering Task Force A. Ford
Internet-Draft Pexip Internet-Draft Pexip
Obsoletes: 6824 (if approved) C. Raiciu Obsoletes: 6824 (if approved) C. Raiciu
Intended status: Standards Track U. Politechnica of Bucharest Intended status: Standards Track U. Politechnica of Bucharest
Expires: November 9, 2019 M. Handley Expires: November 23, 2019 M. Handley
U. College London U. College London
O. Bonaventure O. Bonaventure
U. catholique de Louvain U. catholique de Louvain
C. Paasch C. Paasch
Apple, Inc. Apple, Inc.
May 8, 2019 May 22, 2019
TCP Extensions for Multipath Operation with Multiple Addresses TCP Extensions for Multipath Operation with Multiple Addresses
draft-ietf-mptcp-rfc6824bis-15 draft-ietf-mptcp-rfc6824bis-16
Abstract Abstract
TCP/IP communication is currently restricted to a single path per TCP/IP communication is currently restricted to a single path per
connection, yet multiple paths often exist between peers. The connection, yet multiple paths often exist between peers. The
simultaneous use of these multiple paths for a TCP/IP session would simultaneous use of these multiple paths for a TCP/IP session would
improve resource usage within the network and, thus, improve user improve resource usage within the network and, thus, improve user
experience through higher throughput and improved resilience to experience through higher throughput and improved resilience to
network failure. network failure.
skipping to change at page 2, line 7 skipping to change at page 2, line 7
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on November 9, 2019. This Internet-Draft will expire on November 23, 2019.
Copyright Notice Copyright Notice
Copyright (c) 2019 IETF Trust and the persons identified as the Copyright (c) 2019 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 3, line 9 skipping to change at page 3, line 9
3.3.3. Closing a Connection . . . . . . . . . . . . . . . . 34 3.3.3. Closing a Connection . . . . . . . . . . . . . . . . 34
3.3.4. Receiver Considerations . . . . . . . . . . . . . . . 35 3.3.4. Receiver Considerations . . . . . . . . . . . . . . . 35
3.3.5. Sender Considerations . . . . . . . . . . . . . . . . 37 3.3.5. Sender Considerations . . . . . . . . . . . . . . . . 37
3.3.6. Reliability and Retransmissions . . . . . . . . . . . 37 3.3.6. Reliability and Retransmissions . . . . . . . . . . . 37
3.3.7. Congestion Control Considerations . . . . . . . . . . 39 3.3.7. Congestion Control Considerations . . . . . . . . . . 39
3.3.8. Subflow Policy . . . . . . . . . . . . . . . . . . . 39 3.3.8. Subflow Policy . . . . . . . . . . . . . . . . . . . 39
3.4. Address Knowledge Exchange (Path Management) . . . . . . 40 3.4. Address Knowledge Exchange (Path Management) . . . . . . 40
3.4.1. Address Advertisement . . . . . . . . . . . . . . . . 42 3.4.1. Address Advertisement . . . . . . . . . . . . . . . . 42
3.4.2. Remove Address . . . . . . . . . . . . . . . . . . . 45 3.4.2. Remove Address . . . . . . . . . . . . . . . . . . . 45
3.5. Fast Close . . . . . . . . . . . . . . . . . . . . . . . 46 3.5. Fast Close . . . . . . . . . . . . . . . . . . . . . . . 46
3.6. Subflow Reset . . . . . . . . . . . . . . . . . . . . . . 47 3.6. Subflow Reset . . . . . . . . . . . . . . . . . . . . . . 48
3.7. Fallback . . . . . . . . . . . . . . . . . . . . . . . . 49 3.7. Fallback . . . . . . . . . . . . . . . . . . . . . . . . 49
3.8. Error Handling . . . . . . . . . . . . . . . . . . . . . 53 3.8. Error Handling . . . . . . . . . . . . . . . . . . . . . 53
3.9. Heuristics . . . . . . . . . . . . . . . . . . . . . . . 53 3.9. Heuristics . . . . . . . . . . . . . . . . . . . . . . . 53
3.9.1. Port Usage . . . . . . . . . . . . . . . . . . . . . 53 3.9.1. Port Usage . . . . . . . . . . . . . . . . . . . . . 54
3.9.2. Delayed Subflow Start and Subflow Symmetry . . . . . 54 3.9.2. Delayed Subflow Start and Subflow Symmetry . . . . . 54
3.9.3. Failure Handling . . . . . . . . . . . . . . . . . . 55 3.9.3. Failure Handling . . . . . . . . . . . . . . . . . . 55
4. Semantic Issues . . . . . . . . . . . . . . . . . . . . . . . 55 4. Semantic Issues . . . . . . . . . . . . . . . . . . . . . . . 56
5. Security Considerations . . . . . . . . . . . . . . . . . . . 57 5. Security Considerations . . . . . . . . . . . . . . . . . . . 57
6. Interactions with Middleboxes . . . . . . . . . . . . . . . . 60 6. Interactions with Middleboxes . . . . . . . . . . . . . . . . 60
7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 63 7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 63
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 63 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 64
8.1. MPTCP Option Subtypes . . . . . . . . . . . . . . . . . . 64 8.1. MPTCP Option Subtypes . . . . . . . . . . . . . . . . . . 64
8.2. MPTCP Handshake Algorithms . . . . . . . . . . . . . . . 65 8.2. MPTCP Handshake Algorithms . . . . . . . . . . . . . . . 65
8.3. MP_TCPRST Reason Codes . . . . . . . . . . . . . . . . . 66 8.3. MP_TCPRST Reason Codes . . . . . . . . . . . . . . . . . 66
9. References . . . . . . . . . . . . . . . . . . . . . . . . . 66 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 67
9.1. Normative References . . . . . . . . . . . . . . . . . . 66 9.1. Normative References . . . . . . . . . . . . . . . . . . 67
9.2. Informative References . . . . . . . . . . . . . . . . . 67 9.2. Informative References . . . . . . . . . . . . . . . . . 68
Appendix A. Notes on Use of TCP Options . . . . . . . . . . . . 70 Appendix A. Notes on Use of TCP Options . . . . . . . . . . . . 71
Appendix B. TCP Fast Open and MPTCP . . . . . . . . . . . . . . 71 Appendix B. TCP Fast Open and MPTCP . . . . . . . . . . . . . . 72
B.1. TFO cookie request with MPTCP . . . . . . . . . . . . . . 71 B.1. TFO cookie request with MPTCP . . . . . . . . . . . . . . 72
B.2. Data sequence mapping under TFO . . . . . . . . . . . . . 72 B.2. Data sequence mapping under TFO . . . . . . . . . . . . . 73
B.3. Connection establishment examples . . . . . . . . . . . . 73 B.3. Connection establishment examples . . . . . . . . . . . . 74
Appendix C. Control Blocks . . . . . . . . . . . . . . . . . . . 75 Appendix C. Control Blocks . . . . . . . . . . . . . . . . . . . 76
C.1. MPTCP Control Block . . . . . . . . . . . . . . . . . . . 75 C.1. MPTCP Control Block . . . . . . . . . . . . . . . . . . . 76
C.1.1. Authentication and Metadata . . . . . . . . . . . . . 75 C.1.1. Authentication and Metadata . . . . . . . . . . . . . 76
C.1.2. Sending Side . . . . . . . . . . . . . . . . . . . . 76 C.1.2. Sending Side . . . . . . . . . . . . . . . . . . . . 77
C.1.3. Receiving Side . . . . . . . . . . . . . . . . . . . 76 C.1.3. Receiving Side . . . . . . . . . . . . . . . . . . . 77
C.2. TCP Control Blocks . . . . . . . . . . . . . . . . . . . 76 C.2. TCP Control Blocks . . . . . . . . . . . . . . . . . . . 77
C.2.1. Sending Side . . . . . . . . . . . . . . . . . . . . 77 C.2.1. Sending Side . . . . . . . . . . . . . . . . . . . . 78
C.2.2. Receiving Side . . . . . . . . . . . . . . . . . . . 77 C.2.2. Receiving Side . . . . . . . . . . . . . . . . . . . 78
Appendix D. Finite State Machine . . . . . . . . . . . . . . . . 77 Appendix D. Finite State Machine . . . . . . . . . . . . . . . . 78
Appendix E. Changes from RFC6184 . . . . . . . . . . . . . . . . 78 Appendix E. Changes from RFC6824 . . . . . . . . . . . . . . . . 79
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 80 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 81
1. Introduction 1. Introduction
Multipath TCP (MPTCP) is a set of extensions to regular TCP [RFC0793] Multipath TCP (MPTCP) is a set of extensions to regular TCP [RFC0793]
to provide a Multipath TCP [RFC6182] service, which enables a to provide a Multipath TCP [RFC6182] service, which enables a
transport connection to operate across multiple paths simultaneously. transport connection to operate across multiple paths simultaneously.
This document presents the protocol changes required to add multipath This document presents the protocol changes required to add multipath
capability to TCP; specifically, those for signaling and setting up capability to TCP; specifically, those for signaling and setting up
multiple paths ("subflows"), managing these subflows, reassembly of multiple paths ("subflows"), managing these subflows, reassembly of
data, and termination of sessions. This is not the only information data, and termination of sessions. This is not the only information
skipping to change at page 11, line 41 skipping to change at page 11, line 41
implicitly informed about the new address. implicitly informed about the new address.
In some circumstances, a host may want to advertise to the remote In some circumstances, a host may want to advertise to the remote
host the availability of an address without establishing a new host the availability of an address without establishing a new
subflow, for example, when a NAT prevents setup in one direction. In subflow, for example, when a NAT prevents setup in one direction. In
the example below, Host A informs Host B about its alternative IP the example below, Host A informs Host B about its alternative IP
address/port pair (IP#-A2). Host B may later send an MP_JOIN to this address/port pair (IP#-A2). Host B may later send an MP_JOIN to this
new address. The ADD_ADDR option contains an HMAC to authenticate the new address. The ADD_ADDR option contains an HMAC to authenticate the
address as having been sent from the originator of the connection. address as having been sent from the originator of the connection.
The receiver of this option echoes it back to the client to indicate The receiver of this option echoes it back to the client to indicate
successful reception. Further details are in Section 3.4.1. successful receipt. Further details are in Section 3.4.1.
Host A Host B Host A Host B
------ ------ ------ ------
ADD_ADDR -> ADD_ADDR ->
[Echo-flag=0, [Echo-flag=0,
IP#-A2, IP#-A2,
IP#-A2's Address ID, IP#-A2's Address ID,
HMAC of IP#-A2] HMAC of IP#-A2]
<- ADD_ADDR <- ADD_ADDR
skipping to change at page 13, line 33 skipping to change at page 13, line 33
Host A Host B Host A Host B
------ ------ ------ ------
MP_PRIO -> MP_PRIO ->
2.6. Closing an MPTCP Connection 2.6. Closing an MPTCP Connection
When a host wants to close an existing subflow, but not the whole When a host wants to close an existing subflow, but not the whole
connection, it can initiate a regular TCP FIN/ACK exchange. connection, it can initiate a regular TCP FIN/ACK exchange.
When Host A wants to inform Host B that it has no more data to send, When Host A wants to inform Host B that it has no more data to send,
it signals this "DATA_FIN" as part of the Data Sequence Signal (see it signals this "Data FIN" as part of the Data Sequence Signal (see
above). It has the same semantics and behavior as a regular TCP FIN, above). It has the same semantics and behavior as a regular TCP FIN,
but at the connection level. Once all the data on the MPTCP but at the connection level. Once all the data on the MPTCP
connection has been successfully received, then this message is connection has been successfully received, then this message is
acknowledged at the connection level with a DATA_ACK. Further acknowledged at the connection level with a Data ACK. Further
details are in Section 3.3.3. details are in Section 3.3.3.
Host A Host B Host A Host B
------ ------ ------ ------
DATA_SEQUENCE_SIGNAL -> DSS ->
[DATA_FIN] [Data FIN]
<- (MPTCP DATA_ACK) <- DSS
[Data ACK]
There is an additional method of connection closure, referred to as There is an additional method of connection closure, referred to as
"Fast Close", which is analogous to closing a single-path TCP "Fast Close", which is analogous to closing a single-path TCP
connection with a RST signal. The MP_FASTCLOSE signal is used to connection with a RST signal. The MP_FASTCLOSE signal is used to
indicate to the peer that the connection will be abruptly closed and indicate to the peer that the connection will be abruptly closed and
no data will be accepted anymore. This can be used on an ACK no data will be accepted anymore. This can be used on an ACK
(ensuring reliability of the signal), or a RST (which is not). Both (ensuring reliability of the signal), or a RST (which is not). Both
examples are shown in the following diagrams. Further details are in examples are shown in the following diagrams. Further details are in
Section 3.5. Section 3.5.
skipping to change at page 20, line 38 skipping to change at page 20, line 38
A: The leftmost bit, labeled "A", SHOULD be set to 1 to indicate A: The leftmost bit, labeled "A", SHOULD be set to 1 to indicate
"Checksum Required", unless the system administrator has decided "Checksum Required", unless the system administrator has decided
that checksums are not required (for example, if the environment that checksums are not required (for example, if the environment
is controlled and no middleboxes exist that might adjust the is controlled and no middleboxes exist that might adjust the
payload). payload).
B: The second bit, labeled "B", is an extensibility flag, and MUST be B: The second bit, labeled "B", is an extensibility flag, and MUST be
set to 0 for current implementations. This will be used for an set to 0 for current implementations. This will be used for an
extensibility mechanism in a future specification, and the impact extensibility mechanism in a future specification, and the impact
of this flag will be defined at a later date. If receiving a of this flag will be defined at a later date. It is expected, but
message with the 'B' flag set to 1, and this is not understood, not mandated, that this flag would be used as part of an
then the MP_CAPABLE in this SYN MUST be silently ignored, which alternative security mechanism that does not require a full
triggers a fallback to regular TCP; the sender is expected to version upgrade of the protocol, but does require redefining some
retry with a format compatible with this legacy specification. elements of the handshake. If receiving a message with the 'B'
Note that the length of the MP_CAPABLE option, and the meanings of flag set to 1, and this is not understood, then the MP_CAPABLE in
bits "C" through "H", may be altered by setting B=1. this SYN MUST be silently ignored, which triggers a fallback to
regular TCP; the sender is expected to retry with a format
compatible with this legacy specification. Note that the length
of the MP_CAPABLE option, and the meanings of bits "D" through
"H", may be altered by setting B=1.
C: The third bit, labeled "C", is set to "1" to indicate that the C: The third bit, labeled "C", is set to "1" to indicate that the
sender of this option will not accept additional MPTCP subflows to sender of this option will not accept additional MPTCP subflows to
the source address and port, and therefore the receiver MUST NOT the source address and port, and therefore the receiver MUST NOT
try to open any additional subflows towards this address and port. try to open any additional subflows towards this address and port.
This is an efficiency improvement for situations where the sender This is an efficiency improvement for situations where the sender
knows a restriction is in place, for example if the sender is knows a restriction is in place, for example if the sender is
behind a strict NAT, or operating behind a legacy Layer 4 load behind a strict NAT, or operating behind a legacy Layer 4 load
balancer. balancer.
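As an illustration only, the handling of the first three flag bits described above can be sketched as follows. The sketch assumes the flags arrive as a single octet with "A" as the most significant bit; the names CapableFlags and parse_mp_capable_flags are hypothetical and do not come from this specification.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed bit positions: "A" is the leftmost (most significant) bit of an
# 8-bit flags field, followed by "B" and "C".
FLAG_A_CHECKSUM_REQUIRED = 0x80
FLAG_B_EXTENSIBILITY = 0x40
FLAG_C_NO_MORE_SUBFLOWS = 0x20


@dataclass(frozen=True)
class CapableFlags:
    """Hypothetical container for the flags this sketch interprets."""
    checksum_required: bool
    no_more_subflows: bool


def parse_mp_capable_flags(flags: int) -> Optional[CapableFlags]:
    """Interpret bits A, B and C of an MP_CAPABLE flags octet.

    Returns None when B=1, modelling "silently ignore the MP_CAPABLE",
    which results in a fallback to regular TCP.
    """
    if not 0 <= flags <= 0xFF:
        raise ValueError("flags must fit in one octet")
    if flags & FLAG_B_EXTENSIBILITY:
        # B=1 is not understood by this sketch, so the option is ignored.
        return None
    return CapableFlags(
        checksum_required=bool(flags & FLAG_A_CHECKSUM_REQUIRED),
        no_more_subflows=bool(flags & FLAG_C_NO_MORE_SUBFLOWS),
    )
```

A caller that receives None would continue the handshake as regular TCP; a result with no_more_subflows set tells the receiver not to open further subflows towards the sender's source address and port.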
skipping to change at page 29, line 36 skipping to change at page 29, line 36
o M = Data Sequence Number (DSN), Subflow Sequence Number (SSN), o M = Data Sequence Number (DSN), Subflow Sequence Number (SSN),
Data-Level Length, and Checksum (if negotiated) present Data-Level Length, and Checksum (if negotiated) present
o m = Data sequence number is 8 octets (if not set, DSN is 4 octets) o m = Data sequence number is 8 octets (if not set, DSN is 4 octets)
The flags 'a' and 'm' only have meaning if the corresponding 'A' or The flags 'a' and 'm' only have meaning if the corresponding 'A' or
'M' flags are set; otherwise, they will be ignored. The maximum 'M' flags are set; otherwise, they will be ignored. The maximum
length of this option, with all flags set, is 28 octets. length of this option, with all flags set, is 28 octets.
The 'F' flag indicates "DATA_FIN". If present, this means that this The 'F' flag indicates "Data FIN". If present, this means that this
mapping covers the final data from the sender. This is the mapping covers the final data from the sender. This is the
connection-level equivalent to the FIN flag in single-path TCP. A connection-level equivalent to the FIN flag in single-path TCP. A
connection is not closed unless there has been a DATA_FIN exchange, connection is not closed unless there has been a Data FIN exchange,
or an implementation-specific, connection-level timeout. The purpose or an implementation-specific, connection-level timeout. The purpose
of the DATA_FIN and the interactions between this flag, the subflow- of the Data FIN and the interactions between this flag, the subflow-
level FIN flag, and the data sequence mapping are described in level FIN flag, and the data sequence mapping are described in
Section 3.3.3. The remaining reserved bits MUST be set to zero by an Section 3.3.3. The remaining reserved bits MUST be set to zero by an
implementation of this specification. implementation of this specification.
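The relationship between the DSS flags and the option length can be made concrete with a short, non-normative sketch. It assumes the flag bit positions of the DSS option figure (not reproduced in this excerpt), with 'A' as the least significant bit, and a fixed 4-octet header; the function name dss_option_length is hypothetical.

```python
# Assumed bit positions within the DSS flags field (least significant first).
DSS_FLAG_A = 0x01  # Data ACK present
DSS_FLAG_a = 0x02  # Data ACK is 8 octets (otherwise 4)
DSS_FLAG_M = 0x04  # DSN, SSN, Data-Level Length (and Checksum) present
DSS_FLAG_m = 0x08  # DSN is 8 octets (otherwise 4)
DSS_FLAG_F = 0x10  # Data FIN


def dss_option_length(flags: int, checksum_negotiated: bool) -> int:
    """Return the DSS option length in octets implied by the flags."""
    length = 4  # kind, length, subtype/reserved, flags
    if flags & DSS_FLAG_A:
        length += 8 if flags & DSS_FLAG_a else 4
    if flags & DSS_FLAG_M:
        length += 8 if flags & DSS_FLAG_m else 4  # Data Sequence Number
        length += 4  # Subflow Sequence Number
        length += 2  # Data-Level Length
        if checksum_negotiated:
            length += 2  # Checksum
    # 'a' and 'm' are ignored unless 'A' or 'M' respectively is set, and
    # 'F' does not change the length.
    return length
```

With all flags set and checksums negotiated, the function yields the 28-octet maximum stated above.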
Note that the checksum is only present in this option if the use of Note that the checksum is only present in this option if the use of
MPTCP checksumming has been negotiated at the MP_CAPABLE handshake MPTCP checksumming has been negotiated at the MP_CAPABLE handshake
(see Section 3.1). The presence of the checksum can be inferred from (see Section 3.1). The presence of the checksum can be inferred from
the length of the option. If a checksum is present, but its use had the length of the option. If a checksum is present, but its use had
not been negotiated in the MP_CAPABLE handshake, the checksum field not been negotiated in the MP_CAPABLE handshake, the checksum field
MUST be ignored. If a checksum is not present when its use has been MUST be ignored. If a checksum is not present when its use has been
skipping to change at page 42, line 33 skipping to change at page 42, line 33
All address IDs learned via either MP_JOIN or ADD_ADDR SHOULD be All address IDs learned via either MP_JOIN or ADD_ADDR SHOULD be
stored by the receiver in a data structure that gathers all the stored by the receiver in a data structure that gathers all the
Address ID to address mappings for a connection (identified by a Address ID to address mappings for a connection (identified by a
token pair). In this way, there is a stored mapping between Address token pair). In this way, there is a stored mapping between Address
ID, observed source address, and token pair for future processing of ID, observed source address, and token pair for future processing of
control information for a connection. Note that an implementation control information for a connection. Note that an implementation
MAY discard incoming address advertisements at will, for example, for MAY discard incoming address advertisements at will, for example, for
avoiding updating mapping state, or because advertised addresses are avoiding updating mapping state, or because advertised addresses are
of no use to it (for example, IPv6 addresses when it has IPv4 only). of no use to it (for example, IPv6 addresses when it has IPv4 only).
Therefore, a host MUST treat address advertisements as soft state, Therefore, a host MUST treat address advertisements as soft state,
and it MAY choose to refresh advertisements periodically. and it MAY choose to refresh advertisements periodically. Note also
that an implementation MAY choose to cache these address
advertisements even if they are not currently relevant but may be
relevant in the future, such as IPv4 addresses when IPv6 connectivity
is available but IPv4 is awaiting DHCP.
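One possible shape for the mapping store described above is sketched below. It is an assumption-laden illustration, not a required design: the class AddressTable, its methods, and the use of a (local token, remote token) tuple as the connection key are all hypothetical, and the IPv6 test is a simple textual check.

```python
from typing import Dict, Optional, Tuple

# Hypothetical connection key: (local token, remote token).
TokenPair = Tuple[int, int]


class AddressTable:
    """Soft-state store of Address ID to address mappings per connection."""

    def __init__(self, accept_ipv6: bool = True) -> None:
        self._by_connection: Dict[TokenPair, Dict[int, str]] = {}
        self._accept_ipv6 = accept_ipv6

    def learn(self, tokens: TokenPair, address_id: int, address: str) -> bool:
        """Record a mapping learned via MP_JOIN or ADD_ADDR.

        Returns False when the advertisement is discarded, which an
        implementation MAY do for addresses that are of no use to it.
        """
        if ":" in address and not self._accept_ipv6:
            return False
        # A repeated advertisement simply refreshes the stored entry.
        self._by_connection.setdefault(tokens, {})[address_id] = address
        return True

    def lookup(self, tokens: TokenPair, address_id: int) -> Optional[str]:
        """Return the address for an Address ID, or None if unknown."""
        return self._by_connection.get(tokens, {}).get(address_id)
```

Because entries are soft state, a lookup that returns None is not an error; the peer may refresh the advertisement later. An implementation that prefers to cache currently unusable addresses would simply construct the table with accept_ipv6 left at its default.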
This option is shown in Figure 12. The illustration is sized for This option is shown in Figure 12. The illustration is sized for
IPv4 addresses. For IPv6, the length of the address will be 16 IPv4 addresses. For IPv6, the length of the address will be 16
octets (instead of 4). octets (instead of 4).
The 2 octets that specify the TCP port number to use are optional and The 2 octets that specify the TCP port number to use are optional and
their presence can be inferred from the length of the option. their presence can be inferred from the length of the option.
Although it is expected that the majority of use cases will use the Although it is expected that the majority of use cases will use the
same port pairs as used for the initial subflow (e.g., port 80 same port pairs as used for the initial subflow (e.g., port 80
remains port 80 on all subflows, as does the ephemeral port at the remains port 80 on all subflows, as does the ephemeral port at the
skipping to change at page 45, line 8 skipping to change at page 45, line 21
expected that an MPTCP implementation will send the ADD_ADDR option expected that an MPTCP implementation will send the ADD_ADDR option
on separate ACKs. As discussed earlier, however, an MPTCP on separate ACKs. As discussed earlier, however, an MPTCP
implementation MUST NOT treat duplicate ACKs with any MPTCP option, implementation MUST NOT treat duplicate ACKs with any MPTCP option,
with the exception of the DSS option, as indications of congestion with the exception of the DSS option, as indications of congestion
[RFC5681], and an MPTCP implementation SHOULD NOT send more than two [RFC5681], and an MPTCP implementation SHOULD NOT send more than two
duplicate ACKs in a row for signaling purposes. duplicate ACKs in a row for signaling purposes.
3.4.2. Remove Address 3.4.2. Remove Address
If, during the lifetime of an MPTCP connection, a previously If, during the lifetime of an MPTCP connection, a previously
announced address becomes invalid (e.g., if the interface announced address becomes invalid (e.g., if the interface disappears,
disappears), the affected host SHOULD announce this so that the peer or an IPv6 address is no longer preferred), the affected host SHOULD
can remove subflows related to this address. A host MAY also choose announce this so that the peer can remove subflows related to this
to announce that a valid IP address should not be used any longer, address. Even if an address is not in use by an MPTCP connection, if
for example for make-before-break session continuity. it has been previously announced, an implementation SHOULD announce
its removal. A host MAY also choose to announce that a valid IP
address should not be used any longer, for example for make-before-
break session continuity.
This is achieved through the Remove Address (REMOVE_ADDR) option This is achieved through the Remove Address (REMOVE_ADDR) option
(Figure 13), which will remove a previously added address (or list of (Figure 13), which will remove a previously added address (or list of
addresses) from a connection and terminate any subflows currently addresses) from a connection and terminate any subflows currently
using that address. using that address.
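The sender-side bookkeeping implied by the text above can be sketched as follows. This is a minimal illustration under stated assumptions: the class LocalAddressAnnouncer and its methods are hypothetical, addresses are plain strings, and the sketch only decides which Address IDs to list in a REMOVE_ADDR option; it does not build or send the option.

```python
from typing import Dict, Iterable, List


class LocalAddressAnnouncer:
    """Tracks locally announced addresses so removals can be signalled."""

    def __init__(self) -> None:
        self._announced: Dict[int, str] = {}  # Address ID -> address

    def announce(self, address_id: int, address: str) -> None:
        """Remember that an address was advertised to the peer."""
        self._announced[address_id] = address

    def addresses_to_remove(self, invalid: Iterable[str]) -> List[int]:
        """Return Address IDs to announce as removed.

        Every previously announced address that is now invalid is
        included, whether or not a subflow currently uses it.
        """
        invalid_set = set(invalid)
        ids = sorted(
            aid for aid, addr in self._announced.items()
            if addr in invalid_set
        )
        for aid in ids:
            del self._announced[aid]
        return ids
```

An address that was never announced produces no Address ID, so nothing is signalled for it; a second call for the same invalid address returns an empty list because the entry has already been withdrawn.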
For security purposes, if a host receives a REMOVE_ADDR option, it For security purposes, if a host receives a REMOVE_ADDR option, it
must ensure the affected path(s) are no longer in use before it must ensure the affected path(s) are no longer in use before it
instigates closure. The receipt of REMOVE_ADDR SHOULD first trigger instigates closure. The receipt of REMOVE_ADDR SHOULD first trigger
the sending of a TCP keepalive [RFC1122] on the path, and if a the sending of a TCP keepalive [RFC1122] on the path, and if a
skipping to change at page 50, line 44 skipping to change at page 51, line 14
mapping available in order to DATA_ACK this data), the subflow SHOULD mapping available in order to DATA_ACK this data), the subflow SHOULD
be treated as broken and closed with a RST, since no data can be be treated as broken and closed with a RST, since no data can be
delivered to the application layer, and no fallback signal can be delivered to the application layer, and no fallback signal can be
reliably sent. This RST SHOULD include the MP_TCPRST option reliably sent. This RST SHOULD include the MP_TCPRST option
(Section 3.6) with a "Middlebox interference" reason code. (Section 3.6) with a "Middlebox interference" reason code.
These rules should cover all cases where such a failure could happen: These rules should cover all cases where such a failure could happen:
whether it's on the forward or reverse path and whether the server or whether it's on the forward or reverse path and whether the server or
the client first sends data. the client first sends data.
So far this section has discussed the lost of MPTCP options, either So far this section has discussed the loss of MPTCP options, either
initially, or during the course of the connection. As described in initially, or during the course of the connection. As described in
Section 3.3, each portion of data for which there is a mapping is Section 3.3, each portion of data for which there is a mapping is
protected by a checksum, if checksums have been negotiated. This protected by a checksum, if checksums have been negotiated. This
mechanism is used to detect if middleboxes have made any adjustments mechanism is used to detect if middleboxes have made any adjustments
to the payload (added, removed, or changed data). A checksum will to the payload (added, removed, or changed data). A checksum will
fail if the data has been changed in any way. This will also detect fail if the data has been changed in any way. This will also detect
if the length of data on the subflow is increased or decreased, and if the length of data on the subflow is increased or decreased, and
this means the data sequence mapping is no longer valid. The sender this means the data sequence mapping is no longer valid. The sender
no longer knows what subflow-level sequence number the receiver is no longer knows what subflow-level sequence number the receiver is
genuinely operating at (the middlebox will be faking ACKs in return), genuinely operating at (the middlebox will be faking ACKs in return),
skipping to change at page 58, line 4 skipping to change at page 58, line 23
hash of this key as the connection identification "token". The keys hash of this key as the connection identification "token". The keys
are concatenated and used as keys for creating Hash-based Message are concatenated and used as keys for creating Hash-based Message
Authentication Codes (HMACs) used on subflow setup, in order to Authentication Codes (HMACs) used on subflow setup, in order to
verify that the parties in the handshake are the same as in the verify that the parties in the handshake are the same as in the
original connection setup. It also provides verification that the original connection setup. It also provides verification that the
peer can receive traffic at this new address. Replay attacks would peer can receive traffic at this new address. Replay attacks would
still be possible when only keys are used; therefore, the handshakes still be possible when only keys are used; therefore, the handshakes
use single-use random numbers (nonces) at both ends -- this ensures use single-use random numbers (nonces) at both ends -- this ensures
the HMAC will never be the same on two handshakes. Guidance on the HMAC will never be the same on two handshakes. Guidance on
generating random numbers suitable for use as keys is given in generating random numbers suitable for use as keys is given in
[RFC4086] and discussed in Section 3.1. HMAC is also used to secure [RFC4086] and discussed in Section 3.1. HMAC is also used to secure
the ADD_ADDR option, due to the threats identified in [RFC7430]. the ADD_ADDR option, due to the threats identified in [RFC7430].
The use of crypto capability bits in the initial connection handshake The use of crypto capability bits in the initial connection handshake
to negotiate use of a particular algorithm allows the deployment of to negotiate use of a particular algorithm allows the deployment of
additional crypto mechanisms in the future. Note that this would be additional crypto mechanisms in the future. Note that this
susceptible to bid-down attacks only if the attacker was on-path (and negotiation would be susceptible to a bid-down attack by an on-path
thus would be able to modify the data anyway). The security active attacker who could modify the crypto capability bits response
mechanism presented in this document should therefore protect against from the receiver to use a less secure crypto mechanism. However, an
all forms of flooding and hijacking attacks discussed in [RFC6181]. on-path attacker would be able to man-in-the-middle the data anyway,
so the risk here is minimal. The security mechanism presented in
this document should therefore protect against all forms of flooding
and hijacking attacks discussed in [RFC6181].
The version negotiation specified in Section 3.1, if differing MPTCP The version negotiation specified in Section 3.1, if differing MPTCP
versions shared a common negotiation format, would allow an on-path versions shared a common negotiation format, would allow an on-path
attacker to apply a theoretical bid-down attack. Since the v1 and v0 attacker to apply a theoretical bid-down attack. Since the v1 and v0
protocols have a different handshake, such an attack would require protocols have a different handshake, such an attack would require
the client to re-establish the connection using v0, and this being the client to re-establish the connection using v0, and this being
supported by the server. Note that an on-path attacker would have supported by the server. Note that an on-path attacker would have
access to the raw data, negating any other TCP-level security access to the raw data, negating any other TCP-level security
mechanisms. Also a change from RFC6824 has removed the subflow mechanisms. Also a change from RFC6824 has removed the subflow
identifier from the MP_PRIO option (Section 3.3.8), to remove the identifier from the MP_PRIO option (Section 3.3.8), to remove the
skipping to change at page 62, line 34 skipping to change at page 62, line 47
the connection level on a different subflow. the connection level on a different subflow.
o Firewalls [RFC2979] might perform initial sequence number o Firewalls [RFC2979] might perform initial sequence number
randomization on TCP connections. MPTCP uses relative sequence randomization on TCP connections. MPTCP uses relative sequence
numbers in data sequence mapping to cope with this. Like NATs, numbers in data sequence mapping to cope with this. Like NATs,
firewalls will not permit many incoming connections, so MPTCP firewalls will not permit many incoming connections, so MPTCP
supports address signaling (ADD_ADDR) so that a multiaddressed supports address signaling (ADD_ADDR) so that a multiaddressed
host can invite its peer behind the firewall/NAT to connect out to host can invite its peer behind the firewall/NAT to connect out to
its additional interface. its additional interface.
o Intrusion Detection Systems look out for traffic patterns and o Intrusion Detection/Prevention Systems (IDS/IPS) observe packet
content that could threaten a network. Multipath will mean that streams for patterns and content that could threaten a network.
such data is potentially spread, so it is more difficult for an MPTCP may require the instrumentation of additional paths, and an
IDS to analyze the whole traffic, and potentially increases the MPTCP-aware IDS/IPS would need to read MPTCP tokens to correlate
risk of false positives. However, a MPTCP-aware IDS can read data from multiple subflows to maintain comparable visibility into
tokens to correlate multiple subflows and reassemble them for all of the traffic between devices. Without such changes, an IDS
analysis. would get an incomplete view of the traffic, increasing the risk
of missing traffic of interest (false negatives), and increasing
the chances of erroneously identifying a subflow as a risk due to
only seeing partial data (false positives).
o Application-level middleboxes such as content-aware firewalls may o Application-level middleboxes such as content-aware firewalls may
alter the payload within a subflow, such as rewriting URIs in HTTP alter the payload within a subflow, such as rewriting URIs in HTTP
traffic. MPTCP will detect these using the checksum and close the traffic. MPTCP will detect these using the checksum and close the
affected subflow(s), if there are other subflows that can be used. affected subflow(s), if there are other subflows that can be used.
If all subflows are affected, multipath will fall back to TCP, If all subflows are affected, multipath will fall back to TCP,
allowing such middleboxes to change the payload. MPTCP-aware allowing such middleboxes to change the payload. MPTCP-aware
middleboxes should be able to adjust the payload and MPTCP middleboxes should be able to adjust the payload and MPTCP
metadata in order not to break the connection. metadata in order not to break the connection.
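The IDS/IPS bullet above says an MPTCP-aware monitor would read MPTCP tokens to correlate subflows. A minimal sketch of that bookkeeping follows. It assumes the v1 token derivation (the most significant 32 bits of SHA-256 over the 64-bit key) and that the monitor has observed both keys from the MP_CAPABLE handshake. The class name, method names, and connection identifiers are hypothetical and are not part of the specification.

```python
import hashlib


def token_from_key(key: bytes) -> int:
    """Assumed v1 derivation: most significant 32 bits of SHA-256(key)."""
    if len(key) != 8:
        raise ValueError("MPTCP keys are 64 bits (8 bytes)")
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big")


class SubflowCorrelator:
    """Hypothetical monitor-side table that groups subflows by MPTCP token."""

    def __init__(self):
        self._conn_by_token = {}  # token -> connection id
        self._subflows = {}       # connection id -> list of 4-tuples

    def on_mp_capable(self, conn_id, four_tuple, key_a, key_b):
        # Either host's token may later appear in an MP_JOIN, so index both.
        for key in (key_a, key_b):
            self._conn_by_token[token_from_key(key)] = conn_id
        self._subflows.setdefault(conn_id, []).append(four_tuple)

    def on_mp_join(self, token, four_tuple):
        # Returns the matching connection id, or None for an unknown token.
        conn_id = self._conn_by_token.get(token)
        if conn_id is not None:
            self._subflows[conn_id].append(four_tuple)
        return conn_id

    def subflows(self, conn_id):
        return list(self._subflows.get(conn_id, []))
```

A monitor that sees only some of the paths cannot populate this table completely, which is the visibility gap the bullet describes.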
skipping to change at page 66, line 10 skipping to change at page 66, line 43
Future assignments in this registry are also to be defined by Future assignments in this registry are also to be defined by
Standards Action as defined by [RFC8126]. Assignments consist of the Standards Action as defined by [RFC8126]. Assignments consist of the
value of the flags, a symbolic name for the algorithm, and a value of the flags, a symbolic name for the algorithm, and a
reference to its specification. reference to its specification.
8.3. MP_TCPRST Reason Codes 8.3. MP_TCPRST Reason Codes
IANA is requested to create a further sub-registry, "MPTCP MP_TCPRST IANA is requested to create a further sub-registry, "MPTCP MP_TCPRST
Reason Codes" under the "Transmission Control Protocol (TCP) Reason Codes" under the "Transmission Control Protocol (TCP)
Parameters" registry, based on the reason code in MP_TCPRST Parameters" registry, based on the reason code in MP_TCPRST
(Section 3.6) message. Initial values for this registry are give in (Section 3.6) message. Initial values for this registry are given in
Table 4; future assignments are to be defined by Specification Table 4; future assignments are to be defined by Specification
Required as defined by [RFC8126]. Assignments consist of the value Required as defined by [RFC8126]. Assignments consist of the value
of the code, a short description of its meaning, and a reference to of the code, a short description of its meaning, and a reference to
its specification. The maximum value is 0xff. its specification. The maximum value is 0xff.
As guidance to the Designated Expert [RFC8126], assignments should
not normally be refused unless codepoint space is becoming scarce,
providing that there is a clear distinction from other, already-
existing codes, and also providing there is sufficient guidance for
implementors both sending and receiving these codes.
+------+-----------------------------+----------------------------+ +------+-----------------------------+----------------------------+
| Code | Meaning | Reference | | Code | Meaning | Reference |
+------+-----------------------------+----------------------------+ +------+-----------------------------+----------------------------+
| 0x00 | Unspecified TCP error | This document, Section 3.6 | | 0x00 | Unspecified TCP error | This document, Section 3.6 |
| 0x01 | MPTCP specific error | This document, Section 3.6 | | 0x01 | MPTCP specific error | This document, Section 3.6 |
| 0x02 | Lack of resources | This document, Section 3.6 | | 0x02 | Lack of resources | This document, Section 3.6 |
| 0x03 | Administratively prohibited | This document, Section 3.6 | | 0x03 | Administratively prohibited | This document, Section 3.6 |
| 0x04 | Too much outstanding data | This document, Section 3.6 | | 0x04 | Too much outstanding data | This document, Section 3.6 |
| 0x05 | Unacceptable performance | This document, Section 3.6 | | 0x05 | Unacceptable performance | This document, Section 3.6 |
| 0x06 | Middlebox interference | This document, Section 3.6 | | 0x06 | Middlebox interference | This document, Section 3.6 |
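As an illustration of how an implementation might represent the reason codes listed so far, the sketch below maps the values 0x00 through 0x06 to symbolic names. The table is truncated at this point in the diff, so any further codes are deliberately omitted. The enum and function names are hypothetical and are not defined by the draft.

```python
from enum import IntEnum


class MpTcpRstReason(IntEnum):
    """Reason codes visible in the table above (names are illustrative)."""
    UNSPECIFIED_TCP_ERROR = 0x00
    MPTCP_SPECIFIC_ERROR = 0x01
    LACK_OF_RESOURCES = 0x02
    ADMINISTRATIVELY_PROHIBITED = 0x03
    TOO_MUCH_OUTSTANDING_DATA = 0x04
    UNACCEPTABLE_PERFORMANCE = 0x05
    MIDDLEBOX_INTERFERENCE = 0x06


def describe_reason(code: int) -> str:
    """Return a name for a received code, tolerating values not listed here."""
    if not 0 <= code <= 0xFF:
        raise ValueError("reason code must fit in one octet (maximum 0xff)")
    try:
        return MpTcpRstReason(code).name
    except ValueError:
        # Codes assigned later through the registry are reported, not rejected.
        return f"UNASSIGNED_0x{code:02x}"
```

Reporting unknown values instead of rejecting them keeps a receiver usable when the registry gains new codepoints.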
skipping to change at page 78, line 34 skipping to change at page 79, line 34
| snd DATA_ACK[DFIN] V delete MPTCP PCB V | snd DATA_ACK[DFIN] V delete MPTCP PCB V
\ +-----------+ +---------+ \ +-----------+ +---------+
------------------------>|M_TIME WAIT|----------------->| M_CLOSED| ------------------------>|M_TIME WAIT|----------------->| M_CLOSED|
+-----------+ +---------+ +-----------+ +---------+
All subflows in CLOSED All subflows in CLOSED
------------ ------------
delete MPTCP PCB delete MPTCP PCB
Figure 22: Finite State Machine for Connection Closure Figure 22: Finite State Machine for Connection Closure
Appendix E. Changes from RFC6184 Appendix E. Changes from RFC6824
This section lists the key technical changes between RFC6824, This section lists the key technical changes between RFC6824,
specifying MPTCP v0, and this document, which obsoletes RFC6824 and specifying MPTCP v0, and this document, which obsoletes RFC6824 and
specifies MPTCP v1. Note that this specification is not backwards specifies MPTCP v1. Note that this specification is not backwards
compatible with RFC6824. compatible with RFC6824.
o The document incorporates lessons learnt from the various o The document incorporates lessons learnt from the various
implementations, deployments and experiments gathered in the implementations, deployments and experiments gathered in the
documents "Use Cases and Operational Experience with Multipath documents "Use Cases and Operational Experience with Multipath
TCP" [RFC8041] and the IETF Journal article "Multipath TCP TCP" [RFC8041] and the IETF Journal article "Multipath TCP
 End of changes. 26 change blocks. 
65 lines changed or deleted 88 lines changed or added
