draft-ietf-taps-impl-03.txt | draft-ietf-taps-impl-04.txt | |||
---|---|---|---|---|
TAPS Working Group A. Brunstrom, Ed. | TAPS Working Group A. Brunstrom, Ed. | |||
Internet-Draft Karlstad University | Internet-Draft Karlstad University | |||
Intended status: Informational T. Pauly, Ed. | Intended status: Informational T. Pauly, Ed. | |||
Expires: September 12, 2019 Apple Inc. | Expires: January 9, 2020 Apple Inc. | |||
T. Enghardt | T. Enghardt | |||
TU Berlin | TU Berlin | |||
K-J. Grinnemo | K-J. Grinnemo | |||
Karlstad University | Karlstad University | |||
T. Jones | T. Jones | |||
University of Aberdeen | University of Aberdeen | |||
P. Tiesel | P. Tiesel | |||
TU Berlin | TU Berlin | |||
C. Perkins | C. Perkins | |||
University of Glasgow | University of Glasgow | |||
M. Welzl | M. Welzl | |||
University of Oslo | University of Oslo | |||
March 11, 2019 | July 08, 2019 | |||
Implementing Interfaces to Transport Services | Implementing Interfaces to Transport Services | |||
draft-ietf-taps-impl-03 | draft-ietf-taps-impl-04 | |||
Abstract | Abstract | |||
The Transport Services architecture [I-D.ietf-taps-arch] defines a | The Transport Services architecture [I-D.ietf-taps-arch] defines a | |||
system that allows applications to use transport networking protocols | system that allows applications to use transport networking protocols | |||
flexibly. This document serves as a guide to implementation on how | flexibly. This document serves as a guide to implementation on how | |||
to build such a system. | to build such a system. | |||
Status of This Memo | Status of This Memo | |||
skipping to change at page 1, line 46 ¶ | skipping to change at page 1, line 46 ¶ | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on September 12, 2019. | This Internet-Draft will expire on January 9, 2020. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2019 IETF Trust and the persons identified as the | Copyright (c) 2019 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
skipping to change at page 2, line 29 ¶ | skipping to change at page 2, line 29 ¶ | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
2. Implementing Basic Objects . . . . . . . . . . . . . . . . . 3 | 2. Implementing Basic Objects . . . . . . . . . . . . . . . . . 3 | |||
3. Implementing Pre-Establishment . . . . . . . . . . . . . . . 4 | 3. Implementing Pre-Establishment . . . . . . . . . . . . . . . 4 | |||
3.1. Configuration-time errors . . . . . . . . . . . . . . . . 5 | 3.1. Configuration-time errors . . . . . . . . . . . . . . . . 5 | |||
3.2. Role of system policy . . . . . . . . . . . . . . . . . . 5 | 3.2. Role of system policy . . . . . . . . . . . . . . . . . . 5 | |||
4. Implementing Connection Establishment . . . . . . . . . . . . 6 | 4. Implementing Connection Establishment . . . . . . . . . . . . 6 | |||
4.1. Candidate Gathering . . . . . . . . . . . . . . . . . . . 7 | 4.1. Candidate Gathering . . . . . . . . . . . . . . . . . . . 7 | |||
4.1.1. Structuring Options as a Tree . . . . . . . . . . . . 7 | 4.1.1. Gathering Endpoint Candidates . . . . . . . . . . . . 7 | |||
4.1.2. Branch Types . . . . . . . . . . . . . . . . . . . . 9 | 4.1.2. Structuring Options as a Tree . . . . . . . . . . . . 9 | |||
4.2. Branching Order-of-Operations . . . . . . . . . . . . . . 11 | 4.1.3. Branch Types . . . . . . . . . . . . . . . . . . . . 10 | |||
4.3. Sorting Branches . . . . . . . . . . . . . . . . . . . . 12 | 4.2. Branching Order-of-Operations . . . . . . . . . . . . . . 13 | |||
4.4. Candidate Racing . . . . . . . . . . . . . . . . . . . . 13 | 4.3. Sorting Branches . . . . . . . . . . . . . . . . . . . . 14 | |||
4.4.1. Delayed . . . . . . . . . . . . . . . . . . . . . . . 14 | 4.4. Candidate Racing . . . . . . . . . . . . . . . . . . . . 15 | |||
4.4.2. Failover . . . . . . . . . . . . . . . . . . . . . . 15 | 4.4.1. Delayed . . . . . . . . . . . . . . . . . . . . . . . 16 | |||
4.5. Completing Establishment . . . . . . . . . . . . . . . . 15 | 4.4.2. Failover . . . . . . . . . . . . . . . . . . . . . . 16 | |||
4.5.1. Determining Successful Establishment . . . . . . . . 16 | 4.5. Completing Establishment . . . . . . . . . . . . . . . . 17 | |||
4.6. Establishing multiplexed connections . . . . . . . . . . 17 | 4.5.1. Determining Successful Establishment . . . . . . . . 17 | |||
4.7. Handling racing with "unconnected" protocols . . . . . . 17 | 4.6. Establishing multiplexed connections . . . . . . . . . . 18 | |||
4.8. Implementing listeners . . . . . . . . . . . . . . . . . 18 | 4.7. Handling racing with "unconnected" protocols . . . . . . 19 | |||
4.8.1. Implementing listeners for Connected Protocols . . . 18 | 4.8. Implementing listeners . . . . . . . . . . . . . . . . . 19 | |||
4.8.2. Implementing listeners for Unconnected Protocols . . 18 | 4.8.1. Implementing listeners for Connected Protocols . . . 20 | |||
4.8.3. Implementing listeners for Multiplexed Protocols . . 18 | 4.8.2. Implementing listeners for Unconnected Protocols . . 20 | |||
5. Implementing Data Transfer . . . . . . . . . . . . . . . . . 19 | 4.8.3. Implementing listeners for Multiplexed Protocols . . 20 | |||
5.1. Data transfer for streams, datagrams, and frames . . . . 19 | 5. Implementing Data Transfer . . . . . . . . . . . . . . . . . 20 | |||
5.1.1. Sending Messages . . . . . . . . . . . . . . . . . . 19 | 5.1. Data transfer for streams, datagrams, and frames . . . . 20 | |||
5.1.2. Receiving Messages . . . . . . . . . . . . . . . . . 21 | 5.1.1. Sending Messages . . . . . . . . . . . . . . . . . . 21 | |||
5.2. Handling of data for fast-open protocols . . . . . . . . 22 | 5.1.2. Receiving Messages . . . . . . . . . . . . . . . . . 23 | |||
6. Implementing Maintenance . . . . . . . . . . . . . . . . . . 23 | 5.2. Handling of data for fast-open protocols . . . . . . . . 23 | |||
6.1. Managing Connections . . . . . . . . . . . . . . . . . . 23 | 6. Implementing Maintenance . . . . . . . . . . . . . . . . . . 24 | |||
6.2. Handling Path Changes . . . . . . . . . . . . . . . . . . 24 | 6.1. Managing Connections . . . . . . . . . . . . . . . . . . 24 | |||
7. Implementing Termination . . . . . . . . . . . . . . . . . . 24 | 6.2. Handling Path Changes . . . . . . . . . . . . . . . . . . 26 | |||
8. Cached State . . . . . . . . . . . . . . . . . . . . . . . . 25 | ||||
8.1. Protocol state caches . . . . . . . . . . . . . . . . . . 26 | 7. Implementing Termination . . . . . . . . . . . . . . . . . . 26 | |||
8.2. Performance caches . . . . . . . . . . . . . . . . . . . 26 | 8. Cached State . . . . . . . . . . . . . . . . . . . . . . . . 27 | |||
9. Specific Transport Protocol Considerations . . . . . . . . . 27 | 8.1. Protocol state caches . . . . . . . . . . . . . . . . . . 27 | |||
9.1. TCP . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 | 8.2. Performance caches . . . . . . . . . . . . . . . . . . . 28 | |||
9.2. UDP . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 | 9. Specific Transport Protocol Considerations . . . . . . . . . 29 | |||
9.3. SCTP . . . . . . . . . . . . . . . . . . . . . . . . . . 28 | 9.1. TCP . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 | |||
9.4. TLS . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 | 9.2. UDP . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 | |||
9.5. HTTP . . . . . . . . . . . . . . . . . . . . . . . . . . 29 | 9.3. TLS . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 | |||
9.6. QUIC . . . . . . . . . . . . . . . . . . . . . . . . . . 29 | 9.4. DTLS . . . . . . . . . . . . . . . . . . . . . . . . . . 34 | |||
9.7. HTTP/2 transport . . . . . . . . . . . . . . . . . . . . 30 | 9.5. HTTP . . . . . . . . . . . . . . . . . . . . . . . . . . 34 | |||
10. Rendezvous and Environment Discovery . . . . . . . . . . . . 30 | 9.6. QUIC . . . . . . . . . . . . . . . . . . . . . . . . . . 35 | |||
11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 32 | 9.7. HTTP/2 transport . . . . . . . . . . . . . . . . . . . . 36 | |||
12. Security Considerations . . . . . . . . . . . . . . . . . . . 32 | 9.8. SCTP . . . . . . . . . . . . . . . . . . . . . . . . . . 36 | |||
12.1. Considerations for Candidate Gathering . . . . . . . . . 32 | 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 37 | |||
12.2. Considerations for Candidate Racing . . . . . . . . . . 32 | 11. Security Considerations . . . . . . . . . . . . . . . . . . . 37 | |||
13. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 33 | 11.1. Considerations for Candidate Gathering . . . . . . . . . 37 | |||
14. References . . . . . . . . . . . . . . . . . . . . . . . . . 33 | 11.2. Considerations for Candidate Racing . . . . . . . . . . 37 | |||
14.1. Normative References . . . . . . . . . . . . . . . . . . 33 | 12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 38 | |||
14.2. Informative References . . . . . . . . . . . . . . . . . 34 | 13. References . . . . . . . . . . . . . . . . . . . . . . . . . 38 | |||
Appendix A. Additional Properties . . . . . . . . . . . . . . . 35 | 13.1. Normative References . . . . . . . . . . . . . . . . . . 38 | |||
A.1. Properties Affecting Sorting of Branches . . . . . . . . 35 | 13.2. Informative References . . . . . . . . . . . . . . . . . 39 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 35 | Appendix A. Additional Properties . . . . . . . . . . . . . . . 40 | |||
A.1. Properties Affecting Sorting of Branches . . . . . . . . 40 | ||||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 40 | ||||
1. Introduction | 1. Introduction | |||
The Transport Services architecture [I-D.ietf-taps-arch] defines a | The Transport Services architecture [I-D.ietf-taps-arch] defines a | |||
system that allows applications to use transport networking protocols | system that allows applications to use transport networking protocols | |||
flexibly. The interface such a system exposes to applications is | flexibly. The interface such a system exposes to applications is | |||
defined as the Transport Services API [I-D.ietf-taps-interface]. | defined as the Transport Services API [I-D.ietf-taps-interface]. | |||
This API is designed to be generic across multiple transport | This API is designed to be generic across multiple transport | |||
protocols and sets of protocol features. | protocols and sets of protocol features. | |||
skipping to change at page 6, line 44 ¶ | skipping to change at page 6, line 44 ¶ | |||
calling Initiate. (At this point, any constraints or requirements | calling Initiate. (At this point, any constraints or requirements | |||
the application may have on the connection are available from pre- | the application may have on the connection are available from pre- | |||
establishment.) The process can be considered complete once there is | establishment.) The process can be considered complete once there is | |||
at least one Protocol Stack that has completed any required setup to | at least one Protocol Stack that has completed any required setup to | |||
the point that it can transmit and receive the application's data. | the point that it can transmit and receive the application's data. | |||
Connection establishment is divided into two top-level steps: | Connection establishment is divided into two top-level steps: | |||
Candidate Gathering, to identify the paths, protocols, and endpoints | Candidate Gathering, to identify the paths, protocols, and endpoints | |||
to use, and Candidate Racing, in which the necessary protocol | to use, and Candidate Racing, in which the necessary protocol | |||
handshakes are conducted so that the transport system can select | handshakes are conducted so that the transport system can select | |||
which set to use. | which set to use. This document structures candidates for racing as | |||
a tree. | ||||
The most simple example of this process might involve identifying the | The most simple example of this process might involve identifying the | |||
single IP address to which the implementation wishes to connect, | single IP address to which the implementation wishes to connect, | |||
using the system's current default interface or path, and starting a | using the system's current default interface or path, and starting a | |||
TCP handshake to establish a stream to the specified IP address. | TCP handshake to establish a stream to the specified IP address. | |||
However, each step may also vary depending on the requirements of the | However, each step may also vary depending on the requirements of the | |||
connection: if the endpoint is defined as a hostname and port, then | connection: if the endpoint is defined as a hostname and port, then | |||
there may be multiple resolved addresses that are available; there | there may be multiple resolved addresses that are available; there | |||
may also be multiple interfaces or paths available, other than the | may also be multiple interfaces or paths available, other than the | |||
default system interface; and some protocols may not need any | default system interface; and some protocols may not need any | |||
skipping to change at page 7, line 40 ¶ | skipping to change at page 7, line 41 ¶ | |||
section is the algorithm defining which of these options to try, | section is the algorithm defining which of these options to try, | |||
when, and in what order. | when, and in what order. | |||
4.1. Candidate Gathering | 4.1. Candidate Gathering | |||
The step of gathering candidates involves identifying which paths, | The step of gathering candidates involves identifying which paths, | |||
protocols, and endpoints may be used for a given Connection. This | protocols, and endpoints may be used for a given Connection. This | |||
list is determined by the requirements, prohibitions, and preferences | list is determined by the requirements, prohibitions, and preferences | |||
of the application as specified in the Selection Properties. | of the application as specified in the Selection Properties. | |||
4.1.1. Structuring Options as a Tree | 4.1.1. Gathering Endpoint Candidates | |||
Both Local and Remote Endpoint Candidates must be discovered during | ||||
connection establishment. To support ICE, or similar protocols, that | ||||
involve out-of-band indirect signalling to exchange candidates with | ||||
the Remote Endpoint, it's important to be able to query the set of | ||||
candidate Local Endpoints, and give the protocol stack a set of | ||||
candidate Remote Endpoints, before it attempts to establish | ||||
connections. | ||||
4.1.1.1. Local Endpoint candidates | ||||
The set of possible Local Endpoints is gathered. In the simple case, | ||||
this merely enumerates the local interfaces and protocols, and allocates | ||||
ephemeral source ports. For example, a system that has WiFi and | ||||
Ethernet and supports IPv4 and IPv6 might gather four candidate | ||||
locals (IPv4 on Ethernet, IPv6 on Ethernet, IPv4 on WiFi, and IPv6 on | ||||
WiFi) that can form the source for a transient. | ||||
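As a minimal sketch of the simple (non-NAT) case described above, the candidate set can be viewed as the cross product of interfaces and network protocols. The function name `gather_local_candidates` and the tuple layout are assumptions made for illustration; they are not part of the Transport Services API, and ephemeral port allocation is omitted.

```python
from itertools import product


def gather_local_candidates(interfaces, families):
    """Return one (family, interface) pair per interface/protocol combination.

    Hypothetical helper: a real implementation would query the operating
    system for its interfaces and also allocate ephemeral source ports.
    """
    return [(family, interface)
            for interface, family in product(interfaces, families)]


# The example from the text: WiFi and Ethernet, each with IPv4 and IPv6.
candidates = gather_local_candidates(["Ethernet", "WiFi"], ["IPv4", "IPv6"])
for family, interface in candidates:
    print(f"{family} on {interface}")
```

With two interfaces and two address families, this yields the four candidate locals listed in the example.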
If NAT traversal is required, the process of gathering Local | ||||
Endpoints becomes broadly equivalent to the ICE candidate gathering | ||||
phase [RFC5245]. The endpoint determines its server reflexive Local | ||||
Endpoints (i.e., the translated address of a local, on the other side | ||||
of a NAT) and relayed locals (e.g., via a TURN server or other | ||||
relay), for each interface and network protocol. These are added to | ||||
the set of candidate Local Endpoints for this connection. | ||||
Gathering Local Endpoints is primarily a local operation, although it | ||||
might involve exchanges with a STUN server to derive server reflexive | ||||
locals, or with a TURN server or other relay to derive relayed | ||||
locals. It does not involve communication with the Remote Endpoint. | ||||
4.1.1.2. Remote Endpoint Candidates | ||||
The Remote Endpoint is typically a name that needs to be resolved | ||||
into a set of possible addresses that can be used for communication. | ||||
Resolving the Remote Endpoint is the process of recursively | ||||
performing such name lookups, until fully resolved, to return the set | ||||
of candidates for the remote of this connection. | ||||
How this is done will depend on the type of the Remote Endpoint, and | ||||
can also be specific to each Local Endpoint. A common case is when | ||||
the Remote Endpoint is a DNS name, in which case it is resolved to | ||||
give a set of IPv4 and IPv6 addresses representing that name. Some | ||||
types of remote might require more complex resolution. Resolving the | ||||
Remote Endpoint for a peer-to-peer connection might involve | ||||
communication with a rendezvous server, which in turn contacts the | ||||
peer to gain consent to communicate and retrieve its set of candidate | ||||
locals, which are returned and form the candidate remote addresses | ||||
for contacting that peer. | ||||
Resolving the remote is not a local operation. It will involve a | ||||
directory service, and can require communication with the remote to | ||||
rendezvous and exchange peer addresses. This can expose some or all | ||||
of the candidate locals to the remote. | ||||
4.1.2. Structuring Options as a Tree | ||||
When an implementation responsible for connection establishment needs | When an implementation responsible for connection establishment needs | |||
to consider multiple options, it should logically structure these | to consider multiple options, it should logically structure these | |||
options as a hierarchical tree. Each leaf node of the tree | options as a hierarchical tree. Each leaf node of the tree | |||
represents a single, coherent connection attempt, with an Endpoint, a | represents a single, coherent connection attempt, with an Endpoint, a | |||
Path, and a set of protocols that can directly negotiate and send | Path, and a set of protocols that can directly negotiate and send | |||
data on the network. Each node in the tree that is not a leaf | data on the network. Each node in the tree that is not a leaf | |||
represents a connection attempt that is either underspecified, or | represents a connection attempt that is either underspecified, or | |||
else includes multiple distinct options. For example, when | else includes multiple distinct options. For example, when | |||
connecting on an IP network, a connection attempt to a hostname and | connecting on an IP network, a connection attempt to a hostname and | |||
skipping to change at page 9, line 19 ¶ | skipping to change at page 10, line 33 ¶ | |||
a single interface with a single protocol. | a single interface with a single protocol. | |||
1 [192.0.2.1:80, Wi-Fi, TCP] | 1 [192.0.2.1:80, Wi-Fi, TCP] | |||
A parent node may also only have one child (or leaf) node, such as | A parent node may also only have one child (or leaf) node, such as | |||
when a hostname resolves to only a single IP address. | when a hostname resolves to only a single IP address. | |||
1 [www.example.com:80, Wi-Fi, TCP] | 1 [www.example.com:80, Wi-Fi, TCP] | |||
1.1 [192.0.2.1:80, Wi-Fi, TCP] | 1.1 [192.0.2.1:80, Wi-Fi, TCP] | |||
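The tree structure can be illustrated with a small sketch. The class name `CandidateNode` and its fields are hypothetical, chosen to mirror the [Endpoint, Path, Protocol] notation used in the examples; an actual implementation may represent nodes quite differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CandidateNode:
    """One node of the candidate tree (illustrative names, not from the API)."""
    endpoint: str
    path: str
    protocol: str
    children: List["CandidateNode"] = field(default_factory=list)

    def leaves(self):
        """Return the leaf nodes, i.e. the fully specified connection attempts."""
        if not self.children:
            return [self]
        result = []
        for child in self.children:
            result.extend(child.leaves())
        return result


# A hostname that resolves to a single address: one parent, one leaf.
root = CandidateNode("www.example.com:80", "Wi-Fi", "TCP")
root.children.append(CandidateNode("192.0.2.1:80", "Wi-Fi", "TCP"))
attempts = [(n.endpoint, n.path, n.protocol) for n in root.leaves()]
```

Only the leaves are attempted on the network; a node without children is its own leaf, which covers the single-node tree shown earlier.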
4.1.2. Branch Types | 4.1.3. Branch Types | |||
There are three types of branching from a parent node into one or | There are three types of branching from a parent node into one or | |||
more child nodes. Any parent node of the tree must only use one type | more child nodes. Any parent node of the tree must only use one type | |||
of branching. | of branching. | |||
4.1.2.1. Derived Endpoints | 4.1.3.1. Derived Endpoints | |||
If a connection originally targets a single endpoint, there may be | If a connection originally targets a single endpoint, there may be | |||
multiple endpoints of different types that can be derived from the | multiple endpoints of different types that can be derived from the | |||
original. The connection library should order the derived endpoints | original. The connection library should order the derived endpoints | |||
according to application preference, system policy and expected | according to application preference, system policy and expected | |||
performance. | performance. | |||
DNS hostname-to-address resolution is the most common method of | DNS hostname-to-address resolution is the most common method of | |||
endpoint derivation. When trying to connect to a hostname endpoint | endpoint derivation. When trying to connect to a hostname endpoint | |||
on a traditional IP network, the implementation should send DNS | on a traditional IP network, the implementation should send DNS | |||
skipping to change at page 10, line 4 ¶ | skipping to change at page 11, line 18 ¶ | |||
1.1 [2001:DB8::1.80, Wi-Fi, TCP] | 1.1 [2001:DB8::1.80, Wi-Fi, TCP] | |||
1.2 [192.0.2.1:80, Wi-Fi, TCP] | 1.2 [192.0.2.1:80, Wi-Fi, TCP] | |||
1.3 [2001:DB8::2.80, Wi-Fi, TCP] | 1.3 [2001:DB8::2.80, Wi-Fi, TCP] | |||
1.4 [2001:DB8::3.80, Wi-Fi, TCP] | 1.4 [2001:DB8::3.80, Wi-Fi, TCP] | |||
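One ordering that would produce a list like the one above is to interleave address families, starting with the preferred family. The sketch below is an assumption about how such an ordering might be computed, not the algorithm this document mandates; the function name `interleave` is hypothetical.

```python
def interleave(preferred, other):
    """Alternate between two address lists, starting with the preferred one.

    When one list runs out, the remaining addresses of the other list
    follow in their original order.
    """
    ordered = []
    for i in range(max(len(preferred), len(other))):
        if i < len(preferred):
            ordered.append(preferred[i])
        if i < len(other):
            ordered.append(other[i])
    return ordered


ipv6 = ["2001:DB8::1", "2001:DB8::2", "2001:DB8::3"]
ipv4 = ["192.0.2.1"]
order = interleave(ipv6, ipv4)
```

Here `order` places the single IPv4 address second, between the first and second IPv6 addresses, matching branches 1.1 through 1.4.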
DNS-Based Service Discovery can also provide an endpoint derivation | DNS-Based Service Discovery can also provide an endpoint derivation | |||
step. When trying to connect to a named service, the client may | step. When trying to connect to a named service, the client may | |||
discover one or more hostname and port pairs on the local network | discover one or more hostname and port pairs on the local network | |||
using multicast DNS. These hostnames should each be treated as a | using multicast DNS. These hostnames should each be treated as a | |||
branch which can be attempted independently from other hostnames. | branch which can be attempted independently from other hostnames. | |||
Each of these hostnames may also resolve to one or more addresses, | Each of these hostnames may also resolve to one or more addresses, | |||
thus creating multiple layers of branching. | thus creating multiple layers of branching. | |||
1 [term-printer._ipp._tcp.meeting.ietf.org, Wi-Fi, TCP] | 1 [term-printer._ipp._tcp.meeting.ietf.org, Wi-Fi, TCP] | |||
1.1 [term-printer.meeting.ietf.org:631, Wi-Fi, TCP] | 1.1 [term-printer.meeting.ietf.org:631, Wi-Fi, TCP] | |||
1.1.1 [31.133.160.18.631, Wi-Fi, TCP] | 1.1.1 [31.133.160.18.631, Wi-Fi, TCP] | |||
4.1.2.2. Alternate Paths | 4.1.3.2. Alternate Paths | |||
If a client has multiple network interfaces available to it, such as | If a client has multiple network interfaces available to it, such as | |||
a mobile client with both Wi-Fi and Cellular connectivity, it can | a mobile client with both Wi-Fi and Cellular connectivity, it can | |||
attempt a connection over either interface. This represents a branch | attempt a connection over either interface. This represents a branch | |||
point in the connection establishment. As with derived endpoints, | point in the connection establishment. As with derived endpoints, | |||
the interfaces should be ranked based on preference, system policy, | the interfaces should be ranked based on preference, system policy, | |||
and performance. Attempts should be started on one interface, and | and performance. Attempts should be started on one interface, and | |||
then on other interfaces successively after delays based on expected | then on other interfaces successively after delays based on expected | |||
round-trip-time or other available metrics. | round-trip-time or other available metrics. | |||
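The staggered start described above can be sketched as a simple schedule. The function name, the tuple layout, and the choice of waiting one expected round-trip time per path are illustrative assumptions; an implementation may derive delays from other metrics.

```python
def schedule_attempts(paths):
    """Compute start offsets (in seconds) for attempts on ranked paths.

    `paths` is a list of (name, preference_rank, expected_rtt_seconds);
    a lower rank means more preferred. Each later attempt is delayed by
    the expected RTT of the path tried just before it.
    """
    ranked = sorted(paths, key=lambda p: p[1])
    schedule = []
    start = 0.0
    for name, _rank, rtt in ranked:
        schedule.append((name, start))
        start += rtt  # give this path roughly one RTT before trying the next
    return schedule


plan = schedule_attempts([("Cellular", 2, 0.08), ("Wi-Fi", 1, 0.02)])
for name, offset in plan:
    print(f"start {name} at t+{offset:.2f}s")
```

In this example the Wi-Fi attempt starts immediately and the Cellular attempt starts only after Wi-Fi has had about one round trip to succeed.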
skipping to change at page 10, line 36 ¶ | skipping to change at page 12, line 5 ¶ | |||
This same approach applies to any situation in which the client is | This same approach applies to any situation in which the client is | |||
aware of multiple links or views of the network. Multiple Paths, | aware of multiple links or views of the network. Multiple Paths, | |||
each with a coherent set of addresses, routes, DNS server, and more, | each with a coherent set of addresses, routes, DNS server, and more, | |||
may share a single interface. A path may also represent a virtual | may share a single interface. A path may also represent a virtual | |||
interface service such as a Virtual Private Network (VPN). | interface service such as a Virtual Private Network (VPN). | |||
The list of available paths should be constrained by any requirements | The list of available paths should be constrained by any requirements | |||
or prohibitions the application sets, as well as system policy. | or prohibitions the application sets, as well as system policy. | |||
4.1.2.3. Protocol Options | 4.1.3.3. Protocol Options | |||
Differences in possible protocol compositions and options can also | Differences in possible protocol compositions and options can also | |||
provide a branching point in connection establishment. This allows | provide a branching point in connection establishment. This allows | |||
clients to be resilient to situations in which a certain protocol is | clients to be resilient to situations in which a certain protocol is | |||
not functioning on a server or network. | not functioning on a server or network. | |||
This approach is commonly used for connections with optional proxy | This approach is commonly used for connections with optional proxy | |||
server configurations. A single connection may be allowed to use an | server configurations. A single connection may be allowed to use an | |||
HTTP-based proxy, a SOCKS-based proxy, or connect directly. These | HTTP-based proxy, a SOCKS-based proxy, or connect directly. These | |||
options should be ranked and attempted in succession. | options should be ranked and attempted in succession. | |||
skipping to change at page 27, line 33 ¶ | skipping to change at page 29, line 18 ¶ | |||
given protocol stack, can be stored for a long period of time (hours | given protocol stack, can be stored for a long period of time (hours | |||
or longer), since it is expected that the capabilities of the Remote | or longer), since it is expected that the capabilities of the Remote | |||
Endpoint are not changing very quickly. On the other hand, Round | Endpoint are not changing very quickly. On the other hand, Round | |||
Trip Time observed by TCP over a particular network path may vary | Trip Time observed by TCP over a particular network path may vary | |||
over a relatively short time interval. For such values, the | over a relatively short time interval. For such values, the | |||
implementation should remove them from the cache more quickly, or | implementation should remove them from the cache more quickly, or | |||
treat older values with less confidence/weight. | treat older values with less confidence/weight. | |||
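One way to treat older values with less weight is exponential decay by age. This is a sketch under stated assumptions: the half-life values and the function name `confidence` are hypothetical and are not specified by this document.

```python
def confidence(age_seconds, half_life_seconds):
    """Weight of a cached value that halves every `half_life_seconds`."""
    return 0.5 ** (age_seconds / half_life_seconds)


# Assumed half-lives: RTT samples age quickly, capability flags slowly.
RTT_HALF_LIFE = 60.0              # one minute (illustrative)
CAPABILITY_HALF_LIFE = 6 * 3600.0  # six hours (illustrative)

rtt_weight = confidence(60.0, RTT_HALF_LIFE)
capability_weight = confidence(60.0, CAPABILITY_HALF_LIFE)
```

After one minute the RTT sample carries half its original weight, while a capability entry of the same age is still close to full weight; entries whose weight falls below some threshold could then be evicted.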
9. Specific Transport Protocol Considerations | 9. Specific Transport Protocol Considerations | |||
Each protocol that can run as part of a Transport Services | ||||
implementation defines both its API mapping and its implementation | ||||
details. | ||||
API mappings for a protocol apply most to Connections in which the | ||||
given protocol is the "top" of the Protocol Stack. For example, the | ||||
mapping of the "Send" function for TCP applies to Connections in | ||||
which the application directly sends over TCP. If HTTP/2 is used on | ||||
top of TCP, the HTTP/2 mappings take precedence. | ||||
Each protocol has a notion of Connectedness. Possible values for | ||||
Connectedness are: | ||||
o Unconnected. Unconnected protocols do not establish explicit | ||||
state between endpoints, and do not perform a handshake during | ||||
Connection establishment. | ||||
o Connected. Connected protocols establish state between endpoints, | ||||
and perform a handshake during Connection establishment. The | ||||
handshake may be 0-RTT to send data or resume a session, but | ||||
bidirectional traffic is required to confirm connectedness. | ||||
o Multiplexing Connected. Multiplexing Connected protocols share | ||||
properties with Connected protocols, but also explicitly support | ||||
opening multiple application-level flows. This means that they | ||||
can support cloning new Connection objects without a new explicit | ||||
handshake. | ||||
Protocols also define a notion of Data Unit. Possible values for | ||||
Data Unit are: | ||||
o Byte-stream. Byte-stream protocols do not define any Message | ||||
boundaries of their own apart from the end of a stream in each | ||||
direction. | ||||
o Datagram. Datagram protocols define Message boundaries at the | ||||
same level of transmission, such that only complete (not partial) | ||||
Messages are supported. | ||||
o Message. Message protocols support Message boundaries that can be | ||||
sent and received either as complete or partial Messages. Maximum | ||||
Message lengths can be defined, and Messages can be partially | ||||
reliable. | ||||
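The two classifications above can be modelled as enumerations. The sketch below is illustrative only: the type names, the `PROTOCOL_TRAITS` table, and the helper functions are assumptions, and only the TCP entry (Connected, Byte-stream, as given for TCP in Section 9.1) is filled in.

```python
from enum import Enum


class Connectedness(Enum):
    UNCONNECTED = "Unconnected"
    CONNECTED = "Connected"
    MULTIPLEXING_CONNECTED = "Multiplexing Connected"


class DataUnit(Enum):
    BYTE_STREAM = "Byte-stream"
    DATAGRAM = "Datagram"
    MESSAGE = "Message"


# Hypothetical lookup table; only TCP is populated here.
PROTOCOL_TRAITS = {
    "TCP": (Connectedness.CONNECTED, DataUnit.BYTE_STREAM),
}


def needs_handshake(protocol):
    """Connected and Multiplexing Connected protocols perform a handshake."""
    return PROTOCOL_TRAITS[protocol][0] is not Connectedness.UNCONNECTED


def can_clone_without_handshake(protocol):
    """Only Multiplexing Connected protocols clone without a new handshake."""
    return PROTOCOL_TRAITS[protocol][0] is Connectedness.MULTIPLEXING_CONNECTED
```

A transport system could consult such a table during racing, for example to decide whether a new Connection object requires its own handshake.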
9.1. TCP | 9.1. TCP | |||
Connection lifetime for TCP translates fairly simply into the | Connectedness: Connected | |||
abstraction presented to an application. When the TCP three-way | ||||
handshake is complete, its layer of the Protocol Stack can be | ||||
considered Ready (established). This event will cause racing of | ||||
Protocol Stack options to complete if TCP is the top-level protocol, | ||||
at which point the application can be notified that the Connection is | ||||
Ready to send and receive. | ||||
If the application sends a Close, that can translate to a graceful | Data Unit: Byte-stream | |||
termination of the TCP connection, which is performed by sending a | ||||
FIN to the remote endpoint. If the application sends an Abort, then | ||||
the TCP state can be closed abruptly, leading to a RST being sent to | ||||
the peer. | ||||
Without a layer of framing (a top-level protocol in the established | API mappings for TCP are as follows: | |||
Protocol Stack that preserves message boundaries, or an application- | ||||
supplied deframer) on top of TCP, the receiver side of the transport | Connection Object: TCP connections between two hosts map directly to | |||
system implementation can only treat the incoming stream of bytes as | Connection objects. | |||
a single Message, terminated by a FIN when the Remote Endpoint closes | ||||
the Connection. | Initiate: Calling "Initiate" on a TCP Connection causes it to | |||
reserve a local port, and send a SYN to the Remote Endpoint. | ||||
InitiateWithSend: Early idempotent data is sent on a TCP Connection | ||||
in the SYN, as TCP Fast Open data. | ||||
Ready: A TCP Connection is ready once the three-way handshake is | ||||
complete. | ||||
InitiateError: TCP can throw various errors during connection setup. | ||||
Specifically, it is important to handle a RST being sent by the | ||||
peer during the handshake. | ||||
ConnectionError: Once established, TCP throws errors whenever the | ||||
connection is disconnected, for example upon receiving a RST from | ||||
the peer or hitting a TCP retransmission timeout. | ||||
Listen: Calling "Listen" for TCP binds a local port and prepares it | ||||
to receive inbound SYN packets from peers. | ||||
ConnectionReceived: TCP Listeners will deliver new connections once | ||||
they have replied to an inbound SYN with a SYN-ACK. | ||||
Clone: Calling "Clone" on a TCP Connection creates a new Connection | ||||
with equivalent parameters. The two Connections are otherwise | ||||
independent. | ||||
Send: TCP does not on its own preserve Message boundaries. Calling | ||||
"Send" on a TCP connection lays out the bytes on the TCP send | ||||
stream without any other delineation. Any Message marked as Final | ||||
will cause TCP to send a FIN once the Message has been completely | ||||
written. | ||||
Receive: TCP delivers a stream of bytes without any Message | ||||
delineation. All data delivered in the "Received" or | ||||
"ReceivedPartial" event will be part of a single stream-wide | ||||
Message that is marked Final (unless a MessageFramer is used). | ||||
EndOfMessage will be delivered when the TCP Connection has | ||||
received a FIN from the peer. | ||||
Close: Calling "Close" on a TCP Connection indicates that the | ||||
Connection should be gracefully closed by sending a FIN to the | ||||
peer and waiting for a FIN-ACK before delivering the "Closed" | ||||
event. | ||||
Abort: Calling "Abort" on a TCP Connection indicates that the | ||||
Connection should be immediately closed by sending a RST to the | ||||
peer. | ||||
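The byte-stream behavior described for "Send", "Receive", and "Close"
can be illustrated with a minimal sketch over the Python standard
library socket module on the loopback interface. The function name
tcp_stream_roundtrip is hypothetical and is not part of the Transport
Services API; the sketch only shows that separate sends arrive as one
undelineated stream that ends when the FIN is received.

```python
import socket


def tcp_stream_roundtrip(chunks):
    """Send each chunk with its own call, then read until the peer's FIN.

    Hypothetical helper for illustration only.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # Listen: bind a local port
    listener.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(5.0)
    client.connect(listener.getsockname())  # Initiate: SYN and handshake
    server, _ = listener.accept()           # ConnectionReceived
    server.settimeout(5.0)

    try:
        for chunk in chunks:
            client.sendall(chunk)        # Send: bytes laid out on the stream
        client.shutdown(socket.SHUT_WR)  # Final Message: TCP sends a FIN

        received = bytearray()
        while True:
            data = server.recv(4096)
            if not data:  # an empty read means the FIN has arrived
                break
            received += data
        return bytes(received)
    finally:
        client.close()
        server.close()
        listener.close()
```

The receiver cannot tell where one send ended and the next began,
which is why a framer is needed to recover Message boundaries.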
9.2. UDP | 9.2. UDP | |||
UDP as a direct transport does not provide any handshake or | Connectedness: Unconnected | |||
connectivity state, so the notion of the transport protocol becoming | ||||
Ready or established is degenerate. Once the system has validated | ||||
that there is a route on which to send and receive UDP datagrams, the | ||||
protocol is considered Ready. Similarly, a Close or Abort has no | ||||
meaning to the on-the-wire protocol, but simply leads to the local | ||||
state being torn down. | ||||
When sending and receiving messages over UDP, each Message should | Data Unit: Datagram | |||
correspond to a single UDP datagram. The Message can contain | ||||
metadata about the packet, such as the ECN bits applied to the | ||||
packet. | ||||
9.3. SCTP | API mappings for UDP are as follows: | |||
To support sender-side stream schedulers (which are implemented on | Connection Object: UDP connections represent a pair of specific IP | |||
the sender side), a receiver-side Transport System should always | addresses and ports on two hosts. | |||
support message interleaving [RFC8260]. | ||||
SCTP messages can be very large. To allow the reception of large | Initiate: Calling "Initiate" on a UDP Connection causes it to | |||
messages in pieces, a "partial flag" can be used to inform a (native | reserve a local port, but does not generate any traffic. | |||
SCTP) receiving application that a message is incomplete. After | ||||
receiving the "partial flag", this application would know that the | ||||
next receive calls will only deliver remaining parts of the same | ||||
message (i.e., no messages or partial messages will arrive on other | ||||
streams until the message is complete) (see Section 8.1.20 in | ||||
[RFC6458]). The "partial flag" can therefore facilitate the | ||||
implementation of the receiver buffer in the receiving application, | ||||
at the cost of limiting multiplexing and temporarily creating head- | ||||
of-line blocking delay at the receiver. | ||||
When a Transport System transfers a Message, it seems natural to map | InitiateWithSend: Early data on a UDP Connection does not have any | |||
the Message object to SCTP messages in order to support properties | special meaning. The data is sent whenever the Connection is | |||
such as "Ordered" or "Lifetime" (which maps onto partially reliable | Ready. | |||
delivery with a SCTP_PR_SCTP_TTL policy [RFC6458]). However, since | ||||
multiplexing of Connections onto SCTP streams may happen, and would | ||||
be hidden from the application, the Transport System requires a per- | ||||
stream receiver buffer anyway, so this potential benefit is lost and | ||||
the "partial flag" becomes unnecessary for the system. | ||||
The problem of long messages either requiring large receiver-side | Ready: A UDP Connection is ready once the system has reserved a | |||
buffers or getting in the way of multiplexing is addressed by message | local port and has a path to send to the Remote Endpoint. | |||
interleaving [RFC8260], which is yet another reason why a receiver- | ||||
side transport system supporting SCTP should implement this | ||||
mechanism. | ||||
9.4. TLS | InitiateError: UDP Connections can only generate errors on | |||
initiation due to port conflicts on the local system. | ||||
ConnectionError: Once in use, UDP throws errors upon receiving ICMP | ||||
notifications indicating failures in the network. | ||||
Listen: Calling "Listen" for UDP binds a local port and prepares it | ||||
to receive inbound UDP datagrams from peers. | ||||
ConnectionReceived: UDP Listeners will deliver new connections once | ||||
they have received traffic from a new Remote Endpoint. | ||||
Clone: Calling "Clone" on a UDP Connection creates a new Connection | ||||
with equivalent parameters. The two Connections are otherwise | ||||
independent. | ||||
Send: Calling "Send" on a UDP connection sends the data as the | ||||
payload of a complete UDP datagram. Marking Messages as Final | ||||
does not change anything in the datagram's contents. | ||||
Receive: UDP only delivers complete Messages to "Received", each of | ||||
which represents a single datagram received in a UDP packet. | ||||
Close: Calling "Close" on a UDP Connection releases the local port | ||||
reservation. | ||||
Abort: Calling "Abort" on a UDP Connection is identical to calling | ||||
"Close". | ||||
9.3. TLS | ||||
The mapping of a TLS stream abstraction into the application is | The mapping of a TLS stream abstraction into the application is | |||
equivalent to the contract provided by TCP (see Section 9.1). The | equivalent to the contract provided by TCP (see Section 9.1), and | |||
Ready state should be determined by the completion of the TLS | builds upon many of the actions of TCP connections. | |||
handshake, which involves potentially several more round trips beyond | ||||
the TCP handshake. The application should not be notified that the | Connectedness: Connected | |||
Connection is Ready until TLS is established. | ||||
Data Unit: Byte-stream | ||||
Connection Object: Connection objects represent a single TLS | ||||
connection running over a TCP connection between two hosts. | ||||
Initiate: Calling "Initiate" on a TLS Connection causes it to first | ||||
initiate a TCP connection. Once the TCP protocol is Ready, the | ||||
TLS handshake will be performed as a client (starting by sending a | ||||
"client_hello", and so on). | ||||
InitiateWithSend: Early idempotent data is supported by TLS 1.3, and | ||||
sends encrypted application data in the first TLS message when | ||||
performing session resumption. For older versions of TLS, or if a | ||||
session is not being resumed, the initial data will be delayed | ||||
until the TLS handshake is complete. TCP Fast Open can also be | ||||
enabled automatically. | ||||
Ready: A TLS Connection is ready once the underlying TCP connection | ||||
is Ready, the TLS handshake is complete, and keys have been | ||||
established to encrypt application data. | ||||
InitiateError: In addition to TCP initiation errors, TLS can | ||||
generate errors during its handshake. Examples of errors include a | ||||
failure of the peer to successfully authenticate, the peer | ||||
rejecting the local authentication, or a failure to match versions | ||||
or algorithms. | ||||
ConnectionError: TLS connections will generate TCP errors, or errors | ||||
due to failures to rekey or decrypt received messages. | ||||
Listen: Calling "Listen" for TLS listens on TCP, and sets up | ||||
received connections to perform server-side TLS handshakes. | ||||
ConnectionReceived: TLS Listeners will deliver new connections once | ||||
they have successfully completed both TCP and TLS handshakes. | ||||
Clone: As with TCP, calling "Clone" on a TLS Connection creates a | ||||
new Connection with equivalent parameters. The two Connections | ||||
are otherwise independent. | ||||
Send: Like TCP, TLS does not preserve message boundaries. Although | ||||
application data is framed natively in TLS, there is not a general | ||||
guarantee that these TLS messages represent semantically | ||||
meaningful application stream boundaries. Rather, sending data on | ||||
a TLS Connection only guarantees that the application data will be | ||||
transmitted in an encrypted form. Marking Messages as Final | ||||
causes a "close_notify" to be generated once the data has been | ||||
written. | ||||
Receive: Like TCP, TLS delivers a stream of bytes without any | ||||
Message delineation. The data is decrypted prior to being | ||||
delivered to the application. If a "close_notify" is received, | ||||
the stream-wide Message will be delivered with EndOfMessage set. | ||||
Close: Calling "Close" on a TLS Connection indicates that the | ||||
Connection should be gracefully closed by sending a "close_notify" | ||||
to the peer and waiting for a corresponding "close_notify" before | ||||
delivering the "Closed" event. | ||||
Abort: Calling "Abort" on a TLS Connection indicates that the | ||||
Connection should be immediately closed by sending a | ||||
"close_notify", optionally preceded by "user_canceled", to the | ||||
peer. Implementations do not need to wait to receive | ||||
"close_notify" before delivering the "Closed" event. | ||||
9.4. DTLS | ||||
DTLS follows the same behavior as TLS (Section 9.3), with the notable | ||||
exception of not inheriting behavior directly from TCP. Differences | ||||
from TLS are detailed below, and all cases not explicitly mentioned | ||||
should be considered the same as TLS. | ||||
Connectedness: Connected | ||||
Data Unit: Datagram | ||||
Connection Object: Connection objects represent a single DTLS | ||||
connection running over a set of UDP ports between two hosts. | ||||
Initiate: Calling "Initiate" on a DTLS Connection causes it to | ||||
reserve a local UDP port and begin sending handshake messages to | ||||
the peer over UDP. These messages are reliable, and will be | ||||
automatically retransmitted. | ||||
Ready: A DTLS Connection is ready once the DTLS handshake is complete | ||||
and keys have been established to encrypt application data. | ||||
Send: Sending over DTLS does preserve message boundaries in the same | ||||
way that UDP datagrams do. Marking a Message as Final does send a | ||||
"close_notify" like TLS. | ||||
Receive: Receiving over DTLS delivers one decrypted Message for each | ||||
received DTLS datagram. If a "close_notify" is received, a | ||||
Message will be delivered that is marked as Final. | ||||
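The datagram behavior that DTLS shares with UDP (Section 9.2), in
which each "Send" produces exactly one complete Message at the
receiver, can be illustrated with a minimal sketch. As an assumption
made for brevity, the sketch uses plain UDP on the loopback interface
without DTLS encryption, and the function name udp_datagram_roundtrip
is hypothetical.

```python
import socket


def udp_datagram_roundtrip(messages):
    """Send each message as one datagram and receive it as one unit.

    Hypothetical helper for illustration only; plain UDP, no DTLS.
    """
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # Listen: bind a local port
    receiver.settimeout(5.0)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Initiate: fixes the remote address but generates no traffic
        sender.connect(receiver.getsockname())
        delivered = []
        for message in messages:
            sender.send(message)                 # one Send, one datagram
            data, _ = receiver.recvfrom(65535)   # one complete Message
            delivered.append(data)
        return delivered
    finally:
        sender.close()   # Close: only releases the local port
        receiver.close()
```

Unlike a byte-stream, the boundaries between the sent Messages are
preserved without any framer.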
9.5. HTTP | 9.5. HTTP | |||
HTTP requests and responses map naturally into Messages, since they | HTTP requests and responses map naturally into Messages, since they | |||
are delineated chunks of data with metadata that can be sent over a | are delineated chunks of data with metadata that can be sent over a | |||
transport. To that end, HTTP can be seen as the most prevalent | transport. To that end, HTTP can be seen as the most prevalent | |||
framing protocol that runs on top of streams like TCP, TLS, etc. | framing protocol that runs on top of streams like TCP, TLS, etc. | |||
In order to use a transport Connection that provides HTTP Message | In order to use a transport Connection that provides HTTP Message | |||
support, the establishment and closing of the connection can be | support, the establishment and closing of the connection can be | |||
treated as it would without the framing protocol. Sending and | treated as it would without the framing protocol. Sending and | |||
receiving of Messages, however, changes to treat each Message as a | receiving of Messages, however, changes to treat each Message as a | |||
well-delineated HTTP request or response, with the content of the | well-delineated HTTP request or response, with the content of the | |||
Message representing the body, and the Headers being provided in | Message representing the body, and the Headers being provided in | |||
Message metadata. | Message metadata. | |||
Connectedness: Multiplexing Connected | ||||
Data Unit: Message | ||||
Connection Object: Connection objects represent a flow of HTTP | ||||
messages between a client and a server, which may be an HTTP/1.1 | ||||
connection over TCP, or a single stream in an HTTP/2 connection. | ||||
Initiate: Calling "Initiate" on an HTTP connection initiates a TCP or | ||||
TLS connection as a client. | ||||
Clone: Calling "Clone" on an HTTP Connection opens a new stream on | ||||
an existing HTTP/2 connection when possible. If the underlying | ||||
version does not support multiplexed streams, calling "Clone" | ||||
simply creates a new parallel connection. | ||||
Send: When an application sends an HTTP Message, it is expected to | ||||
provide HTTP header values as a MessageContext in a canonical | ||||
form, along with any associated HTTP message body as the Message | ||||
data. The HTTP header values are encoded in the specific version | ||||
format upon sending. | ||||
Receive: HTTP Connections deliver Messages in which HTTP header | ||||
values are attached to MessageContexts, and HTTP bodies are | ||||
carried in Message data. | ||||
Close: Calling "Close" on an HTTP Connection will only close the | ||||
underlying TLS or TCP connection if the HTTP version does not | ||||
support multiplexing. For HTTP/2, for example, closing the | ||||
connection only closes a specific stream. | ||||
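The "Send" mapping above, in which header values travel in the
MessageContext and the body travels as Message data, can be sketched
as follows. This is a minimal illustration that assumes HTTP/1.1 as
the version-specific format; the names HttpMessage and
encode_http11_request are hypothetical and are not defined by the
Transport Services API.

```python
from dataclasses import dataclass, field


@dataclass
class HttpMessage:
    """Hypothetical Message: headers as context, body as data."""
    headers: dict = field(default_factory=dict)
    body: bytes = b""


def encode_http11_request(method, target, message):
    """Encode canonical header values and body in HTTP/1.1 wire format."""
    headers = dict(message.headers)
    # Delineate the body so that the receiver can find the Message end.
    headers.setdefault("Content-Length", str(len(message.body)))
    lines = [f"{method} {target} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + message.body
```

An HTTP/2 mapping would take the same HttpMessage and encode the
header values differently, which is why the application supplies
them in a canonical form rather than as wire bytes.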
9.6. QUIC | 9.6. QUIC | |||
QUIC provides a multi-streaming interface to an encrypted transport. | QUIC provides a multi-streaming interface to an encrypted transport. | |||
Each stream can be viewed as equivalent to a TLS stream over TCP, so | Each stream can be viewed as equivalent to a TLS stream over TCP, so | |||
a natural mapping is to present each QUIC stream as an individual | a natural mapping is to present each QUIC stream as an individual | |||
Connection. The protocol for the stream will be considered Ready | Connection. The protocol for the stream will be considered Ready | |||
whenever the underlying QUIC connection is established to the point | whenever the underlying QUIC connection is established to the point | |||
that this stream's data can be sent. For streams after the first | that this stream's data can be sent. For streams after the first | |||
stream, this will likely be an immediate operation. | stream, this will likely be an immediate operation. | |||
Closing a single QUIC stream, presented to the application as a | Closing a single QUIC stream, presented to the application as a | |||
Connection, does not imply closing the underlying QUIC connection | Connection, does not imply closing the underlying QUIC connection | |||
itself. Rather, the implementation may choose to close the QUIC | itself. Rather, the implementation may choose to close the QUIC | |||
connection once all streams have been closed (possibly after some | connection once all streams have been closed (often after some | |||
timeout), or after an individual stream Connection sends an Abort. | timeout), or after an individual stream Connection sends an Abort. | |||
Messages over a direct QUIC stream should be represented similarly to | Connectedness: Multiplexing Connected | |||
the TCP stream (one Message per direction, see Section 9.1), unless a | ||||
framing mapping is used on top of QUIC. | Data Unit: Stream | |||
Connection Object: Connection objects represent a single QUIC stream | ||||
on a QUIC connection. | ||||
9.7. HTTP/2 transport | 9.7. HTTP/2 transport | |||
Similar to QUIC (Section 9.6), HTTP/2 provides a multi-streaming | Similar to QUIC (Section 9.6), HTTP/2 provides a multi-streaming | |||
interface. This will generally use HTTP as the unit of Messages over | interface. This will generally use HTTP as the unit of Messages over | |||
the streams, in which each stream can be represented as a transport | the streams, in which each stream can be represented as a transport | |||
Connection. The lifetime of streams and the HTTP/2 connection should | Connection. The lifetime of streams and the HTTP/2 connection should | |||
be managed as described for QUIC. | be managed as described for QUIC. | |||
It is possible to treat each HTTP/2 stream as a raw byte-stream | It is possible to treat each HTTP/2 stream as a raw byte-stream | |||
instead of a carrier for HTTP messages, in which case the Messages | instead of a carrier for HTTP messages, in which case the Messages | |||
over the streams can be represented similarly to the TCP stream (one | over the streams can be represented similarly to the TCP stream (one | |||
Message per direction, see Section 9.1). | Message per direction, see Section 9.1). | |||
10. Rendezvous and Environment Discovery | Connectedness: Multiplexing Connected | |||
The connection establishment process outlined in Section 4 is | ||||
appropriate for client-server connections, but needs to be expanded | ||||
in peer-to-peer Rendezvous scenarios, as follows: | ||||
o Gathering Local Endpoint candidates | ||||
The set of possible Local Endpoints is gathered. In the simple | ||||
case, this merely enumerates the local interfaces and protocols, and | ||||
allocates ephemeral source ports. For example, a system that has | ||||
WiFi and Ethernet and supports IPv4 and IPv6 might gather four | ||||
candidate locals (IPv4 on Ethernet, IPv6 on Ethernet, IPv4 on | ||||
WiFi, and IPv6 on WiFi) that can form the source for a transient. | ||||
If NAT traversal is required, the process of gathering Local | ||||
Endpoints becomes broadly equivalent to the ICE candidate | ||||
gathering phase [RFC5245]. The endpoint determines its server | ||||
reflexive Local Endpoints (i.e., the translated address of a | ||||
local, on the other side of a NAT) and relayed locals (e.g., via a | ||||
TURN server or other relay), for each interface and network | ||||
protocol. These are added to the set of candidate Local Endpoints | ||||
for this connection. | ||||
Gathering Local Endpoints is primarily a local operation, although | ||||
it might involve exchanges with a STUN server to derive server | ||||
reflexive locals, or with a TURN server or other relay to derive | ||||
relayed locals. It does not involve communication with the Remote | ||||
Endpoint. | ||||
o Gathering Remote Endpoint Candidates | ||||
The Remote Endpoint is typically a name that needs to be resolved | ||||
into a set of possible addresses that can be used for | ||||
communication. Resolving the Remote Endpoint is the process of | ||||
recursively performing such name lookups, until fully resolved, to | ||||
return the set of candidates for the remote of this connection. | ||||
How this is done will depend on the type of the Remote Endpoint, | ||||
and can also be specific to each Local Endpoint. A common case is | ||||
when the Remote Endpoint is a DNS name, in which case it is | ||||
resolved to give a set of IPv4 and IPv6 addresses representing | ||||
that name. Some types of remote might require more complex | ||||
resolution. Resolving the Remote Endpoint for a peer-to-peer | ||||
connection might involve communication with a rendezvous server, | ||||
which in turn contacts the peer to gain consent to communicate and | ||||
retrieve its set of candidate locals, which are returned and form | ||||
the candidate remote addresses for contacting that peer. | ||||
Resolving the remote is _not_ a local operation. It will involve | ||||
a directory service, and can require communication with the remote | ||||
to rendezvous and exchange peer addresses. This can expose some | ||||
or all of the candidate locals to the remote. | ||||
o Establishing Connections | Data Unit: Stream | |||
The set of candidate Local Endpoints and the set of candidate | Connection Object: Connection objects represent a single HTTP/2 | |||
Remote Endpoints are paired, to derive a priority-ordered set of | stream on an HTTP/2 connection. | |||
Candidate Paths that can potentially be used to establish a | ||||
Connection. | ||||
Then, communication is attempted over each candidate path, in | 9.8. SCTP | |||
priority order. If there are multiple candidates with the same | ||||
priority, then connection establishment proceeds simultaneously | ||||
and uses the transient that wins the race to be established. | ||||
Otherwise, connection establishment is sequential, paced at a rate | ||||
that should not congest the network. Depending on the chosen | ||||
transport, this phase might involve racing TCP connections to a | ||||
server over IPv4 and IPv6 [RFC8305], or it could involve a STUN | ||||
exchange to establish peer-to-peer UDP connectivity [RFC5245], or | ||||
some other means. | ||||
o Confirming and Maintaining Connections | To support sender-side stream schedulers (which are implemented on | |||
the sender side), a receiver-side Transport System should always | ||||
support message interleaving [RFC8260]. | ||||
Once connectivity has been established, unused resources can be | SCTP messages can be very large. To allow the reception of large | |||
released and the chosen path can be confirmed. This is primarily | messages in pieces, a "partial flag" can be used to inform a (native | |||
required when establishing peer-to-peer connectivity, where | SCTP) receiving application that a message is incomplete. After | |||
connections supporting relayed locals that were not required can | receiving the "partial flag", this application would know that the | |||
be closed, and where an associated signalling operation might be | next receive calls will only deliver remaining parts of the same | |||
needed to inform middleboxes and proxies of the chosen path. | message (i.e., no messages or partial messages will arrive on other | |||
Keep-alive messages may also be sent, as appropriate, to ensure | streams until the message is complete) (see Section 8.1.20 in | |||
NAT and firewall state is maintained, so the Connection remains | [RFC6458]). The "partial flag" can therefore facilitate the | |||
operational. | implementation of the receiver buffer in the receiving application, | |||
at the cost of limiting multiplexing and temporarily creating head- | ||||
of-line blocking delay at the receiver. | ||||
To support ICE, or similar protocols, that involve an out-of-band | When a Transport System transfers a Message, it seems natural to map | |||
indirect signalling exchange to exchange candidates with the Remote | the Message object to SCTP messages in order to support properties | |||
Endpoint, it's important to be able to query the set of candidate | such as "Ordered" or "Lifetime" (which maps onto partially reliable | |||
Local Endpoints, and give the protocol stack a set of candidate | delivery with a SCTP_PR_SCTP_TTL policy [RFC6458]). However, since | |||
Remote Endpoints, before it attempts to establish connections. | multiplexing of Connections onto SCTP streams may happen, and would | |||
be hidden from the application, the Transport System requires a per- | ||||
stream receiver buffer anyway, so this potential benefit is lost and | ||||
the "partial flag" becomes unnecessary for the system. | ||||
(TO-DO: It is expected that a single abstract algorithm can be | The problem of long messages either requiring large receiver-side | |||
identified that supports both the peer-to-peer and client-server | buffers or getting in the way of multiplexing is addressed by message | |||
connection racing, allowing this text to be merged with Section 4) | interleaving [RFC8260], which is yet another reason why a receiver- | |||
side transport system supporting SCTP should implement this | ||||
mechanism. | ||||
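The Rendezvous steps of gathering Local Endpoint candidates and
pairing them with Remote Endpoint candidates can be sketched as
follows. This is a minimal illustration under stated assumptions:
the Candidate type, the function names, and the example priority
rule (prefer IPv6, then Ethernet) are all hypothetical, pairs are
only formed within the same address family, and NAT traversal is
not modeled.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Candidate:
    """Hypothetical endpoint candidate."""
    family: str  # "IPv4" or "IPv6"
    label: str   # interface name for locals, address for remotes


def gather_local_candidates(interfaces, families):
    """Enumerate one local candidate per interface and protocol."""
    return [Candidate(family, iface)
            for iface, family in product(interfaces, families)]


def example_priority(path):
    """Hypothetical rule: lower sorts first; IPv6, then Ethernet."""
    local, _remote = path
    return (local.family != "IPv6", local.label != "Ethernet")


def pair_candidates(local_candidates, remote_candidates, priority):
    """Pair locals with remotes and return a priority-ordered list."""
    paths = [(loc, rem)
             for loc, rem in product(local_candidates, remote_candidates)
             if loc.family == rem.family]
    return sorted(paths, key=priority)
```

With WiFi and Ethernet interfaces and both IPv4 and IPv6, gathering
yields the four candidate locals named in the text, and connection
attempts would then proceed over the returned paths in order.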
11. IANA Considerations | 10. IANA Considerations | |||
RFC-EDITOR: Please remove this section before publication. | RFC-EDITOR: Please remove this section before publication. | |||
This document has no actions for IANA. | This document has no actions for IANA. | |||
12. Security Considerations | 11. Security Considerations | |||
12.1. Considerations for Candidate Gathering | 11.1. Considerations for Candidate Gathering | |||
Implementations should avoid downgrade attacks that allow network | Implementations should avoid downgrade attacks that allow network | |||
interference to cause the implementation to select less secure, or | interference to cause the implementation to select less secure, or | |||
entirely insecure, combinations of paths and protocols. | entirely insecure, combinations of paths and protocols. | |||
12.2. Considerations for Candidate Racing | 11.2. Considerations for Candidate Racing | |||
See Section 5.2 for security considerations around racing with 0-RTT | See Section 5.2 for security considerations around racing with 0-RTT | |||
data. | data. | |||
An attacker that knows a particular device is racing several options | An attacker that knows a particular device is racing several options | |||
during connection establishment may be able to block packets for the | during connection establishment may be able to block packets for the | |||
first connection attempt, thus inducing the device to fall back to a | first connection attempt, thus inducing the device to fall back to a | |||
secondary attempt. This is a problem if the secondary attempts have | secondary attempt. This is a problem if the secondary attempts have | |||
worse security properties that enable further attacks. | worse security properties that enable further attacks. | |||
Implementations should ensure that all options have equivalent | Implementations should ensure that all options have equivalent | |||
security properties to avoid incentivizing attacks. | security properties to avoid incentivizing attacks. | |||
Since results from the network can determine how a connection attempt | Since results from the network can determine how a connection attempt | |||
tree is built, such as when DNS returns a list of resolved endpoints, | tree is built, such as when DNS returns a list of resolved endpoints, | |||
it is possible for the network to cause an implementation to consume | it is possible for the network to cause an implementation to consume | |||
significant on-device resources. Implementations should limit the | significant on-device resources. Implementations should limit the | |||
maximum amount of state allowed for any given node, including the | maximum amount of state allowed for any given node, including the | |||
number of child nodes, especially when the state is based on results | number of child nodes, especially when the state is based on results | |||
from the network. | from the network. | |||
13. Acknowledgements | 12. Acknowledgements | |||
This work has received funding from the European Union's Horizon 2020 | This work has received funding from the European Union's Horizon 2020 | |||
research and innovation programme under grant agreement No. 644334 | research and innovation programme under grant agreement No. 644334 | |||
(NEAT). | (NEAT). | |||
This work has been supported by Leibniz Prize project funds of DFG - | This work has been supported by Leibniz Prize project funds of DFG - | |||
German Research Foundation: Gottfried Wilhelm Leibniz-Preis 2011 (FKZ | German Research Foundation: Gottfried Wilhelm Leibniz-Preis 2011 (FKZ | |||
FE 570/4-1). | FE 570/4-1). | |||
This work has been supported by the UK Engineering and Physical | This work has been supported by the UK Engineering and Physical | |||
Sciences Research Council under grant EP/R04144X/1. | Sciences Research Council under grant EP/R04144X/1. | |||
Thanks to Stuart Cheshire, Josh Graessley, David Schinazi, and Eric | Thanks to Stuart Cheshire, Josh Graessley, David Schinazi, and Eric | |||
Kinnear for their implementation and design efforts, including Happy | Kinnear for their implementation and design efforts, including Happy | |||
Eyeballs, that heavily influenced this work. | Eyeballs, that heavily influenced this work. | |||
14. References | 13. References | |||
14.1. Normative References | 13.1. Normative References | |||
[I-D.ietf-taps-arch] | [I-D.ietf-taps-arch] | |||
Pauly, T., Trammell, B., Brunstrom, A., Fairhurst, G., | Pauly, T., Trammell, B., Brunstrom, A., Fairhurst, G., | |||
Perkins, C., Tiesel, P., and C. Wood, "An Architecture for | Perkins, C., Tiesel, P., and C. Wood, "An Architecture for | |||
Transport Services", draft-ietf-taps-arch-02 (work in | Transport Services", draft-ietf-taps-arch-03 (work in | |||
progress), October 2018. | progress), March 2019. | |||
[I-D.ietf-taps-interface] | [I-D.ietf-taps-interface] | |||
Trammell, B., Welzl, M., Enghardt, T., Fairhurst, G., | Trammell, B., Welzl, M., Enghardt, T., Fairhurst, G., | |||
Kuehlewind, M., Perkins, C., Tiesel, P., and C. Wood, "An | Kuehlewind, M., Perkins, C., Tiesel, P., and C. Wood, "An | |||
Abstract Application Layer Interface to Transport | Abstract Application Layer Interface to Transport | |||
Services", draft-ietf-taps-interface-02 (work in | Services", draft-ietf-taps-interface-03 (work in | |||
progress), October 2018. | progress), March 2019. | |||
[I-D.ietf-taps-minset] | [I-D.ietf-taps-minset] | |||
Welzl, M. and S. Gjessing, "A Minimal Set of Transport | Welzl, M. and S. Gjessing, "A Minimal Set of Transport | |||
Services for End Systems", draft-ietf-taps-minset-11 (work | Services for End Systems", draft-ietf-taps-minset-11 (work | |||
in progress), September 2018. | in progress), September 2018. | |||
[RFC6458] Stewart, R., Tuexen, M., Poon, K., Lei, P., and V. | [RFC6458] Stewart, R., Tuexen, M., Poon, K., Lei, P., and V. | |||
Yasevich, "Sockets API Extensions for the Stream Control | Yasevich, "Sockets API Extensions for the Stream Control | |||
Transmission Protocol (SCTP)", RFC 6458, | Transmission Protocol (SCTP)", RFC 6458, | |||
DOI 10.17487/RFC6458, December 2011, | DOI 10.17487/RFC6458, December 2011, | |||
skipping to change at page 34, line 35 ¶ | skipping to change at page 39, line 35 ¶ | |||
[RFC8305] Schinazi, D. and T. Pauly, "Happy Eyeballs Version 2: | [RFC8305] Schinazi, D. and T. Pauly, "Happy Eyeballs Version 2: | |||
Better Connectivity Using Concurrency", RFC 8305, | Better Connectivity Using Concurrency", RFC 8305, | |||
DOI 10.17487/RFC8305, December 2017, | DOI 10.17487/RFC8305, December 2017, | |||
<https://www.rfc-editor.org/info/rfc8305>. | <https://www.rfc-editor.org/info/rfc8305>. | |||
[RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol | [RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol | |||
Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, | Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, | |||
<https://www.rfc-editor.org/info/rfc8446>. | <https://www.rfc-editor.org/info/rfc8446>. | |||
14.2. Informative References | 13.2. Informative References | |||
[I-D.ietf-quic-transport] | [I-D.ietf-quic-transport] | |||
Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed | Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed | |||
and Secure Transport", draft-ietf-quic-transport-18 (work | and Secure Transport", draft-ietf-quic-transport-20 (work | |||
in progress), January 2019. | in progress), April 2019. | |||
[NEAT-flow-mapping] | [NEAT-flow-mapping] | |||
"Transparent Flow Mapping for NEAT (in Workshop on Future | "Transparent Flow Mapping for NEAT (in Workshop on Future | |||
of Internet Transport (FIT 2017))", n.d. | of Internet Transport (FIT 2017))", n.d. | |||
[RFC5245] Rosenberg, J., "Interactive Connectivity Establishment | [RFC5245] Rosenberg, J., "Interactive Connectivity Establishment | |||
(ICE): A Protocol for Network Address Translator (NAT) | (ICE): A Protocol for Network Address Translator (NAT) | |||
Traversal for Offer/Answer Protocols", RFC 5245, | Traversal for Offer/Answer Protocols", RFC 5245, | |||
DOI 10.17487/RFC5245, April 2010, | DOI 10.17487/RFC5245, April 2010, | |||
<https://www.rfc-editor.org/info/rfc5245>. | <https://www.rfc-editor.org/info/rfc5245>. | |||
skipping to change at page 36, line 38 ¶ | skipping to change at page 41, line 38 ¶ | |||
Tom Jones | Tom Jones | |||
University of Aberdeen | University of Aberdeen | |||
Fraser Noble Building | Fraser Noble Building | |||
Aberdeen, AB24 3UE | Aberdeen, AB24 3UE | |||
UK | UK | |||
Email: tom@erg.abdn.ac.uk | Email: tom@erg.abdn.ac.uk | |||
Philipp S. Tiesel | Philipp S. Tiesel | |||
TU Berlin | TU Berlin | |||
Marchstrasse 23 | Einsteinufer 25 | |||
10587 Berlin | 10587 Berlin | |||
Germany | Germany | |||
Email: philipp@inet.tu-berlin.de | Email: philipp@tiesel.net | |||
Colin Perkins | Colin Perkins | |||
University of Glasgow | University of Glasgow | |||
School of Computing Science | School of Computing Science | |||
Glasgow G12 8QQ | Glasgow G12 8QQ | |||
United Kingdom | United Kingdom | |||
Email: csp@csperkins.org | Email: csp@csperkins.org | |||
Michael Welzl | Michael Welzl | |||
University of Oslo | University of Oslo | |||
End of changes. 49 change blocks. | ||||
227 lines changed or deleted | 440 lines changed or added | |||