draft-ietf-taps-impl-06.txt | draft-ietf-taps-impl-07.txt | |||
---|---|---|---|---|
TAPS Working Group A. Brunstrom, Ed. | TAPS Working Group A. Brunstrom, Ed. | |||
Internet-Draft Karlstad University | Internet-Draft Karlstad University | |||
Intended status: Informational T. Pauly, Ed. | Intended status: Informational T. Pauly, Ed. | |||
Expires: 10 September 2020 Apple Inc. | Expires: 14 January 2021 Apple Inc. | |||
T. Enghardt | T. Enghardt | |||
TU Berlin | Netflix | |||
K-J. Grinnemo | K-J. Grinnemo | |||
Karlstad University | Karlstad University | |||
T. Jones | T. Jones | |||
University of Aberdeen | University of Aberdeen | |||
P. Tiesel | P. Tiesel | |||
TU Berlin | TU Berlin | |||
C. Perkins | C. Perkins | |||
University of Glasgow | University of Glasgow | |||
M. Welzl | M. Welzl | |||
University of Oslo | University of Oslo | |||
9 March 2020 | 13 July 2020 | |||
Implementing Interfaces to Transport Services | Implementing Interfaces to Transport Services | |||
draft-ietf-taps-impl-06 | draft-ietf-taps-impl-07 | |||
Abstract | Abstract | |||
The Transport Services architecture [I-D.ietf-taps-arch] defines a | The Transport Services (TAPS) system enables applications to use | |||
system that allows applications to use transport networking protocols | transport protocols flexibly for network communication and defines a | |||
flexibly. This document serves as a guide to implementation on how | protocol-independent TAPS Application Programming Interface (API) | |||
to build such a system. | that is based on an asynchronous, event-driven interaction pattern. | |||
This document serves as a guide to implementation on how to build | ||||
such a system. | ||||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on 10 September 2020. | This Internet-Draft will expire on 14 January 2021. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2020 IETF Trust and the persons identified as the | Copyright (c) 2020 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents (https://trustee.ietf.org/ | |||
license-info) in effect on the date of publication of this document. | license-info) in effect on the date of publication of this document. | |||
Please review these documents carefully, as they describe your rights | Please review these documents carefully, as they describe your rights | |||
skipping to change at page 2, line 31 ¶ | skipping to change at page 2, line 31 ¶ | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
2. Implementing Connection Objects . . . . . . . . . . . . . . . 4 | 2. Implementing Connection Objects . . . . . . . . . . . . . . . 4 | |||
3. Implementing Pre-Establishment . . . . . . . . . . . . . . . 5 | 3. Implementing Pre-Establishment . . . . . . . . . . . . . . . 5 | |||
3.1. Configuration-time errors . . . . . . . . . . . . . . . . 5 | 3.1. Configuration-time errors . . . . . . . . . . . . . . . . 5 | |||
3.2. Role of system policy . . . . . . . . . . . . . . . . . . 6 | 3.2. Role of system policy . . . . . . . . . . . . . . . . . . 6 | |||
4. Implementing Connection Establishment . . . . . . . . . . . . 7 | 4. Implementing Connection Establishment . . . . . . . . . . . . 7 | |||
4.1. Candidate Gathering . . . . . . . . . . . . . . . . . . . 8 | 4.1. Candidate Gathering . . . . . . . . . . . . . . . . . . . 8 | |||
4.1.1. Gathering Endpoint Candidates . . . . . . . . . . . . 8 | 4.1.1. Gathering Endpoint Candidates . . . . . . . . . . . . 8 | |||
4.1.2. Structuring Options as a Tree . . . . . . . . . . . . 9 | 4.1.2. Structuring Options as a Tree . . . . . . . . . . . . 9 | |||
4.1.3. Branch Types . . . . . . . . . . . . . . . . . . . . 11 | 4.1.3. Branch Types . . . . . . . . . . . . . . . . . . . . 11 | |||
4.2. Branching Order-of-Operations . . . . . . . . . . . . . . 13 | 4.1.4. Branching Order-of-Operations . . . . . . . . . . . . 13 | |||
4.3. Sorting Branches . . . . . . . . . . . . . . . . . . . . 14 | 4.1.5. Sorting Branches . . . . . . . . . . . . . . . . . . 14 | |||
4.4. Candidate Racing . . . . . . . . . . . . . . . . . . . . 16 | 4.2. Candidate Racing . . . . . . . . . . . . . . . . . . . . 16 | |||
4.4.1. Delayed . . . . . . . . . . . . . . . . . . . . . . . 16 | 4.2.1. Immediate . . . . . . . . . . . . . . . . . . . . . . 16 | |||
4.4.2. Failover . . . . . . . . . . . . . . . . . . . . . . 17 | 4.2.2. Delayed . . . . . . . . . . . . . . . . . . . . . . . 17 | |||
4.5. Completing Establishment . . . . . . . . . . . . . . . . 17 | 4.2.3. Failover . . . . . . . . . . . . . . . . . . . . . . 17 | |||
4.5.1. Determining Successful Establishment . . . . . . . . 18 | 4.3. Completing Establishment . . . . . . . . . . . . . . . . 18 | |||
4.6. Establishing multiplexed connections . . . . . . . . . . 19 | 4.3.1. Determining Successful Establishment . . . . . . . . 19 | |||
4.7. Handling racing with "unconnected" protocols . . . . . . 19 | 4.4. Establishing multiplexed connections . . . . . . . . . . 19 | |||
4.8. Implementing listeners . . . . . . . . . . . . . . . . . 20 | 4.5. Handling racing with "unconnected" protocols . . . . . . 20 | |||
4.8.1. Implementing listeners for Connected Protocols . . . 20 | 4.6. Implementing listeners . . . . . . . . . . . . . . . . . 20 | |||
4.8.2. Implementing listeners for Unconnected Protocols . . 21 | 4.6.1. Implementing listeners for Connected Protocols . . . 21 | |||
4.8.3. Implementing listeners for Multiplexed Protocols . . 21 | 4.6.2. Implementing listeners for Unconnected Protocols . . 21 | |||
4.6.3. Implementing listeners for Multiplexed Protocols . . 21 | ||||
5. Implementing Sending and Receiving Data . . . . . . . . . . . 21 | 5. Implementing Sending and Receiving Data . . . . . . . . . . . 21 | |||
5.1. Sending Messages . . . . . . . . . . . . . . . . . . . . 22 | 5.1. Sending Messages . . . . . . . . . . . . . . . . . . . . 22 | |||
5.1.1. Message Properties . . . . . . . . . . . . . . . . . 22 | 5.1.1. Message Properties . . . . . . . . . . . . . . . . . 22 | |||
5.1.2. Send Completion . . . . . . . . . . . . . . . . . . . 23 | 5.1.2. Send Completion . . . . . . . . . . . . . . . . . . . 23 | |||
5.1.3. Batching Sends . . . . . . . . . . . . . . . . . . . 23 | 5.1.3. Batching Sends . . . . . . . . . . . . . . . . . . . 24 | |||
5.2. Receiving Messages . . . . . . . . . . . . . . . . . . . 24 | 5.2. Receiving Messages . . . . . . . . . . . . . . . . . . . 24 | |||
5.3. Handling of data for fast-open protocols . . . . . . . . 24 | 5.3. Handling of data for fast-open protocols . . . . . . . . 24 | |||
6. Implementing Message Framers . . . . . . . . . . . . . . . . 25 | 6. Implementing Message Framers . . . . . . . . . . . . . . . . 25 | |||
6.1. Defining Message Framers . . . . . . . . . . . . . . . . 26 | 6.1. Defining Message Framers . . . . . . . . . . . . . . . . 26 | |||
6.2. Sender-side Message Framing . . . . . . . . . . . . . . . 27 | 6.2. Sender-side Message Framing . . . . . . . . . . . . . . . 27 | |||
6.3. Receiver-side Message Framing . . . . . . . . . . . . . . 27 | 6.3. Receiver-side Message Framing . . . . . . . . . . . . . . 27 | |||
7. Implementing Connection Management . . . . . . . . . . . . . 28 | 7. Implementing Connection Management . . . . . . . . . . . . . 28 | |||
7.1. Pooled Connection . . . . . . . . . . . . . . . . . . . . 29 | 7.1. Pooled Connection . . . . . . . . . . . . . . . . . . . . 29 | |||
7.2. Handling Path Changes . . . . . . . . . . . . . . . . . . 29 | 7.2. Handling Path Changes . . . . . . . . . . . . . . . . . . 29 | |||
8. Implementing Connection Termination . . . . . . . . . . . . . 30 | 8. Implementing Connection Termination . . . . . . . . . . . . . 30 | |||
9. Cached State . . . . . . . . . . . . . . . . . . . . . . . . 31 | 9. Cached State . . . . . . . . . . . . . . . . . . . . . . . . 31 | |||
9.1. Protocol state caches . . . . . . . . . . . . . . . . . . 31 | 9.1. Protocol state caches . . . . . . . . . . . . . . . . . . 31 | |||
9.2. Performance caches . . . . . . . . . . . . . . . . . . . 32 | 9.2. Performance caches . . . . . . . . . . . . . . . . . . . 32 | |||
10. Specific Transport Protocol Considerations . . . . . . . . . 33 | 10. Specific Transport Protocol Considerations . . . . . . . . . 33 | |||
10.1. TCP . . . . . . . . . . . . . . . . . . . . . . . . . . 34 | 10.1. TCP . . . . . . . . . . . . . . . . . . . . . . . . . . 34 | |||
10.2. UDP . . . . . . . . . . . . . . . . . . . . . . . . . . 35 | 10.2. UDP . . . . . . . . . . . . . . . . . . . . . . . . . . 35 | |||
10.3. UDP Multicast Receive . . . . . . . . . . . . . . . . . 36 | 10.3. UDP Multicast Receive . . . . . . . . . . . . . . . . . 37 | |||
10.4. TLS . . . . . . . . . . . . . . . . . . . . . . . . . . 38 | 10.4. TLS . . . . . . . . . . . . . . . . . . . . . . . . . . 38 | |||
10.5. DTLS . . . . . . . . . . . . . . . . . . . . . . . . . . 39 | 10.5. DTLS . . . . . . . . . . . . . . . . . . . . . . . . . . 40 | |||
10.6. HTTP . . . . . . . . . . . . . . . . . . . . . . . . . . 40 | 10.6. HTTP . . . . . . . . . . . . . . . . . . . . . . . . . . 40 | |||
10.7. QUIC . . . . . . . . . . . . . . . . . . . . . . . . . . 41 | 10.7. QUIC . . . . . . . . . . . . . . . . . . . . . . . . . . 41 | |||
10.8. HTTP/2 transport . . . . . . . . . . . . . . . . . . . . 41 | 10.8. HTTP/2 transport . . . . . . . . . . . . . . . . . . . . 42 | |||
10.9. SCTP . . . . . . . . . . . . . . . . . . . . . . . . . . 42 | 10.9. SCTP . . . . . . . . . . . . . . . . . . . . . . . . . . 42 | |||
11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 44 | 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 44 | |||
12. Security Considerations . . . . . . . . . . . . . . . . . . . 44 | 12. Security Considerations . . . . . . . . . . . . . . . . . . . 45 | |||
12.1. Considerations for Candidate Gathering . . . . . . . . . 44 | 12.1. Considerations for Candidate Gathering . . . . . . . . . 45 | |||
12.2. Considerations for Candidate Racing . . . . . . . . . . 44 | 12.2. Considerations for Candidate Racing . . . . . . . . . . 45 | |||
13. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 45 | 13. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 45 | |||
14. References . . . . . . . . . . . . . . . . . . . . . . . . . 45 | 14. References . . . . . . . . . . . . . . . . . . . . . . . . . 46 | |||
14.1. Normative References . . . . . . . . . . . . . . . . . . 45 | 14.1. Normative References . . . . . . . . . . . . . . . . . . 46 | |||
14.2. Informative References . . . . . . . . . . . . . . . . . 46 | 14.2. Informative References . . . . . . . . . . . . . . . . . 47 | |||
Appendix A. Additional Properties . . . . . . . . . . . . . . . 47 | Appendix A. Additional Properties . . . . . . . . . . . . . . . 48 | |||
A.1. Properties Affecting Sorting of Branches . . . . . . . . 47 | A.1. Properties Affecting Sorting of Branches . . . . . . . . 48 | |||
Appendix B. Reasons for errors . . . . . . . . . . . . . . . . . 47 | Appendix B. Reasons for errors . . . . . . . . . . . . . . . . . 49 | |||
Appendix C. Existing Implementations . . . . . . . . . . . . . . 48 | Appendix C. Existing Implementations . . . . . . . . . . . . . . 50 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 49 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 51 | |||
1. Introduction | 1. Introduction | |||
The Transport Services architecture [I-D.ietf-taps-arch] defines a | The Transport Services architecture [I-D.ietf-taps-arch] defines a | |||
system that allows applications to use transport networking protocols | system that allows applications to use transport networking protocols | |||
flexibly. The interface such a system exposes to applications is | flexibly. The interface such a system exposes to applications is | |||
defined as the Transport Services API [I-D.ietf-taps-interface]. | defined as the Transport Services API [I-D.ietf-taps-interface]. | |||
This API is designed to be generic across multiple transport | This API is designed to be generic across multiple transport | |||
protocols and sets of protocol features. | protocols and sets of protocol features. | |||
skipping to change at page 4, line 12 ¶ | skipping to change at page 4, line 18 ¶ | |||
an application into decisions on how to establish connections, and | an application into decisions on how to establish connections, and | |||
how to transfer data over those connections once established. The | how to transfer data over those connections once established. The | |||
terminology used in this document is based on the Architecture | terminology used in this document is based on the Architecture | |||
[I-D.ietf-taps-arch]. | [I-D.ietf-taps-arch]. | |||
2. Implementing Connection Objects | 2. Implementing Connection Objects | |||
The connection objects that are exposed to applications for Transport | The connection objects that are exposed to applications for Transport | |||
Services are: | Services are: | |||
* the Preconnection, the bundle of properties that describes the | * the Preconnection, the bundle of Properties that describes the | |||
application constraints on the transport; | application constraints on the transport; | |||
* the Connection, the basic object that represents a flow of data in | * the Connection, the basic object that represents a flow of data as | |||
either direction between the Local and Remote Endpoints; | Messages in either direction between the Local and Remote | |||
Endpoints; | ||||
* and the Listener, a passive waiting object that delivers new | * and the Listener, a passive waiting object that delivers new | |||
Connections. | Connections. | |||
Preconnection objects should be implemented as bundles of properties | Preconnection objects should be implemented as bundles of properties | |||
that an application can both read and write. Once a Preconnection | that an application can both read and write. Once a Preconnection | |||
has been used to create an outbound Connection or a Listener, the | has been used to create an outbound Connection or a Listener, the | |||
implementation should ensure that the copy of the properties held by | implementation should ensure that the copy of the properties held by | |||
the Connection or Listener is immutable. This may involve performing | the Connection or Listener is immutable. This may involve performing | |||
a deep-copy if the application is still able to modify properties on | a deep-copy if the application is still able to modify properties on | |||
skipping to change at page 4, line 45 ¶ | skipping to change at page 4, line 52 ¶ | |||
a specific Protocol Stack. The notion of a Connection maps to many | a specific Protocol Stack. The notion of a Connection maps to many | |||
different protocols, depending on the Protocol Stack. For example, | different protocols, depending on the Protocol Stack. For example, | |||
the Connection may ultimately represent the interface into a TCP | the Connection may ultimately represent the interface into a TCP | |||
connection, a TLS session over TCP, a UDP flow with fully-specified | connection, a TLS session over TCP, a UDP flow with fully-specified | |||
local and remote endpoints, a DTLS session, an SCTP stream, a QUIC | local and remote endpoints, a DTLS session, an SCTP stream, a QUIC | |||
stream, or an HTTP/2 stream. | stream, or an HTTP/2 stream. | |||
Listener objects are created with a Preconnection, at which point | Listener objects are created with a Preconnection, at which point | |||
their configuration should be considered immutable by the | their configuration should be considered immutable by the | |||
implementation. The process of listening is described in | implementation. The process of listening is described in | |||
Section 4.8. | Section 4.6. | |||
3. Implementing Pre-Establishment | 3. Implementing Pre-Establishment | |||
During pre-establishment the application specifies the Endpoints to | During pre-establishment the application specifies the Endpoints to | |||
be used for communication as well as its preferences via Selection | be used for communication as well as its preferences via Selection | |||
Properties and, if desired, also Connection Properties. Generally, | Properties and, if desired, also Connection Properties. Generally, | |||
Connection Properties should be configured as early as possible, as | Connection Properties should be configured as early as possible, | |||
they may serve as input to decisions that are made by the | because they can serve as input to decisions that are made by the | |||
implementation (the Capacity Profile may guide usage of a protocol | implementation (e.g., the Capacity Profile can guide usage of a | |||
offering scavenger-type congestion control, for example). In the | protocol offering scavenger-type congestion control). | |||
remainder of this document, we only refer to Selection Properties | ||||
because they are the more typical case and have to be handled by all | ||||
implementations. | ||||
The implementation stores these objects and properties as part of the | The implementation stores these properties as a part of the | |||
Preconnection object for use during connection establishment. For | Preconnection object for use during connection establishment. For | |||
Selection Properties that are not provided by the application, the | Selection Properties that are not provided by the application, the | |||
implementation must use the default values specified in the Transport | implementation must use the default values specified in the Transport | |||
Services API ([I-D.ietf-taps-interface]). | Services API ([I-D.ietf-taps-interface]). | |||
3.1. Configuration-time errors | 3.1. Configuration-time errors | |||
The transport system should have a list of supported protocols | The transport system should have a list of supported protocols | |||
available, which each have transport features reflecting the | available, which each have transport features reflecting the | |||
capabilities of the protocol. Once an application specifies its | capabilities of the protocol. Once an application specifies its | |||
Transport Parameters, the transport system should match the required | Transport Properties, the transport system matches the required and | |||
and prohibited properties against the transport features of the | prohibited properties against the transport features of the available | |||
available protocols. | protocols. | |||
In the following cases, failure should be detected during pre- | In the following cases, failure should be detected during pre- | |||
establishment: | establishment: | |||
* The application requested Protocol Properties that include | * A request by an application for Protocol Properties that include | |||
requirements or prohibitions that cannot be satisfied by any of | requirements or prohibitions that cannot be satisfied by any of | |||
the available protocols. For example, if an application requires | the available protocols. For example, if an application requires | |||
"Configure Reliability per Message", but no such protocol is | "Configure Reliability per Message", but no such protocol is | |||
available on the host running the transport system, e.g., because | available on the host running the transport system, this should | |||
SCTP is not supported by the operating system, this should result | result in an error, e.g., when SCTP is not supported by the | |||
in an error. | operating system. | |||
* The application requested Protocol Properties that are in conflict | * A request by an application for Protocol Properties that are in | |||
with each other, i.e., the required and prohibited properties | conflict with each other, i.e., the required and prohibited | |||
cannot be satisfied by the same protocol. For example, if an | properties cannot be satisfied by the same protocol. For example, | |||
application prohibits "Reliable Data Transfer" but then requires | if an application prohibits "Reliable Data Transfer" but then | |||
"Configure Reliability per Message", this mismatch should result | requires "Configure Reliability per Message", this mismatch should | |||
in an error. | result in an error. | |||
It is important to fail as early as possible in such cases in order | It is important that such cases fail as early as possible, to avoid | |||
to avoid allocating resources, e.g., to endpoint resolution, only to | allocating resources (e.g., to endpoint resolution) only to find out | |||
find out later that there is no protocol that satisfies the | later that there is no protocol that satisfies the requirements. | |||
requirements. | ||||
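The matching step described in Section 3.1 can be sketched as follows. This is a minimal illustration, not part of either draft revision: the `Protocol` type, the `ConfigurationError` exception, the `eligible_protocols` function, and the feature names are all hypothetical. It assumes each protocol advertises a set of transport features and that a request fails if no single protocol offers every required feature and none of the prohibited ones.

```python
from dataclasses import dataclass


class ConfigurationError(Exception):
    """Raised during pre-establishment when no protocol can satisfy the request."""


@dataclass(frozen=True)
class Protocol:
    name: str
    features: frozenset  # transport features this protocol offers


def eligible_protocols(required, prohibited, available):
    """Return the protocols that offer every required and no prohibited feature."""
    required, prohibited = frozenset(required), frozenset(prohibited)
    both = required & prohibited
    if both:
        raise ConfigurationError(f"required and prohibited at once: {sorted(both)}")
    matches = [
        p for p in available
        if required <= p.features and not (prohibited & p.features)
    ]
    if not matches:
        raise ConfigurationError("no available protocol satisfies the properties")
    return matches


# Hypothetical feature sets; real ones come from the implementation's protocol list.
TCP = Protocol("TCP", frozenset({"reliability"}))
UDP = Protocol("UDP", frozenset())
SCTP = Protocol("SCTP", frozenset({"reliability", "per-msg-reliability"}))
```

Under these assumptions, requiring "per-msg-reliability" on a host that only offers TCP and UDP raises the error, and so does requiring it while prohibiting "reliability", because the only protocol offering the former also offers the latter. Both failures surface before any endpoint resolution is attempted.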
3.2. Role of system policy | 3.2. Role of system policy | |||
The properties specified during pre-establishment have a close | The properties specified during pre-establishment have a close | |||
connection to system policy. The implementation is responsible for | relationship to system policy. The implementation is responsible for | |||
combining and reconciling several different sources of preferences | combining and reconciling several different sources of preferences | |||
when establishing Connections. These include, but are not limited | when establishing Connections. These include, but are not limited | |||
to: | to: | |||
1. Application preferences, i.e., preferences specified during | 1. Application preferences, i.e., preferences specified during | |||
pre-establishment via Selection Properties. | pre-establishment via Selection Properties. | |||
2. Dynamic system policy, i.e., policy compiled from internally and | 2. Dynamic system policy, i.e., policy compiled from internally and | |||
externally acquired information about available network | externally acquired information about available network | |||
interfaces, supported transport protocols, and current/previous | interfaces, supported transport protocols, and current/previous | |||
Connections. Examples of ways to externally retrieve policy- | Connections. Examples of ways to externally retrieve policy- | |||
support information are through OS-specific statistics/ | support information are through OS-specific statistics/ | |||
measurement tools and tools that reside on middleboxes and | measurement tools and tools that reside on middleboxes and | |||
routers. | routers. | |||
3. Default implementation policy, i.e., predefined policy by OS or | 3. Default implementation policy, i.e., predefined policy by OS or | |||
application. | application. | |||
In general, any protocol or path used for a connection must conform | In general, any protocol or path used for a connection must conform | |||
to all three sources of constraints. Any violation of any of the | to all three sources of constraints. A violation of any of the | |||
layers should cause a protocol or path to be considered ineligible | layers should cause a protocol or path to be considered ineligible | |||
for use. For an example of application preferences leading to | for use. For an example of application preferences leading to | |||
constraints, an application may prohibit the use of metered network | constraints, an application may prohibit the use of metered network | |||
interfaces for a given Connection to avoid user cost. Similarly, the | interfaces for a given Connection to avoid user cost. Similarly, the | |||
system policy at a given time may prohibit the use of such a metered | system policy at a given time may prohibit the use of such a metered | |||
network interface from the application's process. Lastly, the | network interface from the application's process. Lastly, the | |||
implementation itself may default to disallowing certain network | implementation itself may default to disallowing certain network | |||
interfaces unless explicitly requested by the application and allowed | interfaces unless explicitly requested by the application and allowed | |||
by the system. | by the system. | |||
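One way to picture the reconciliation of the three sources is to treat each as a predicate over a candidate path and to accept a path only when every predicate allows it. The sketch below is illustrative only: `Path`, `is_eligible`, and the three example policies are hypothetical names, and it simplifies the implementation default to a fixed rule rather than one that the application and system can override.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    interface: str
    metered: bool


def is_eligible(path, application, system, implementation):
    """A path is usable only if all three sources of constraints allow it."""
    return all(allows(path) for allows in (application, system, implementation))


# Hypothetical policies, each a predicate over a Path.
app_no_metered = lambda path: not path.metered       # application preference
system_allows_all = lambda path: True                 # dynamic system policy
impl_default = lambda path: path.interface != "VPN"   # implementation default

paths = [Path("Wi-Fi", False), Path("LTE", True), Path("VPN", False)]
usable = [p.interface for p in paths
          if is_eligible(p, app_no_metered, system_allows_all, impl_default)]
```

Here the metered LTE path is rejected by the application preference and the VPN path by the assumed implementation default, so only Wi-Fi remains eligible.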
skipping to change at page 7, line 47 ¶ | skipping to change at page 7, line 47 ¶ | |||
establishment options as a single, aggregate connection | establishment options as a single, aggregate connection | |||
establishment. The aggregate set conceptually includes every valid | establishment. The aggregate set conceptually includes every valid | |||
combination of endpoints, paths, and protocols. As an example, | combination of endpoints, paths, and protocols. As an example, | |||
consider an implementation that initiates a TCP connection to a | consider an implementation that initiates a TCP connection to a | |||
hostname + port endpoint, and has two valid interfaces available (Wi- | hostname + port endpoint, and has two valid interfaces available (Wi- | |||
Fi and LTE). The hostname resolves to a single IPv4 address on the | Fi and LTE). The hostname resolves to a single IPv4 address on the | |||
Wi-Fi network, and resolves to the same IPv4 address on the LTE | Wi-Fi network, and resolves to the same IPv4 address on the LTE | |||
network, as well as a single IPv6 address. The aggregate set of | network, as well as a single IPv6 address. The aggregate set of | |||
connection establishment options can be viewed as follows: | connection establishment options can be viewed as follows: | |||
Aggregate [Endpoint: www.example.com:80] [Interface: Any] [Protocol: TCP] | Aggregate [Endpoint: www.example.com:80] [Interface: Any] [Protocol: TCP] | |||
|-> [Endpoint: 192.0.2.1:80] [Interface: Wi-Fi] [Protocol: TCP] | |-> [Endpoint: 192.0.2.1:80] [Interface: Wi-Fi] [Protocol: TCP] | |||
|-> [Endpoint: 192.0.2.1:80] [Interface: LTE] [Protocol: TCP] | |-> [Endpoint: 192.0.2.1:80] [Interface: LTE] [Protocol: TCP] | |||
|-> [Endpoint: 2001:DB8::1.80] [Interface: LTE] [Protocol: TCP] | |-> [Endpoint: 2001:DB8::1.80] [Interface: LTE] [Protocol: TCP] | |||
Any one of these sub-entries on the aggregate connection attempt | Any one of these sub-entries on the aggregate connection attempt | |||
would satisfy the original application intent. The concern of this | would satisfy the original application intent. The concern of this | |||
section is the algorithm defining which of these options to try, | section is the algorithm defining which of these options to try, | |||
when, and in what order. | when, and in what order. | |||
During Candidate Gathering, an implementation first excludes all | During Candidate Gathering, an implementation first excludes all | |||
protocols and paths that match a Prohibit or do not match all Require | protocols and paths that match a Prohibit or do not match all Require | |||
properties. Then, the implementation will sort branches according to | properties. Then, the implementation will sort branches according to | |||
Preferred properties, Avoided properties, and possibly other | Preferred properties, Avoided properties, and possibly other | |||
criteria. | criteria. | |||
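The exclude-then-sort step can be sketched with the three branches from the example above. This is an assumption-laden illustration: the `Candidate` type, the `gather` function, the property names ("ipv4", "ipv6", "metered"), and the scoring rule (count of matched Preferred properties minus count of matched Avoided properties) are hypothetical choices, not something either draft revision prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    endpoint: str
    interface: str
    protocol: str
    properties: frozenset = frozenset()


def gather(candidates, require=(), prohibit=(), prefer=(), avoid=()):
    """Drop ineligible candidates, then order the rest by preference."""
    require, prohibit = frozenset(require), frozenset(prohibit)
    prefer, avoid = frozenset(prefer), frozenset(avoid)
    # Exclude anything that matches a Prohibit or misses a Require.
    eligible = [c for c in candidates
                if require <= c.properties and not (prohibit & c.properties)]

    def score(c):
        # Assumed ranking: +1 per Preferred match, -1 per Avoided match.
        return len(prefer & c.properties) - len(avoid & c.properties)

    # sorted() is stable, so equally scored branches keep their input order.
    return sorted(eligible, key=score, reverse=True)


branches = [
    Candidate("192.0.2.1:80", "Wi-Fi", "TCP", frozenset({"ipv4"})),
    Candidate("192.0.2.1:80", "LTE", "TCP", frozenset({"ipv4", "metered"})),
    Candidate("2001:DB8::1.80", "LTE", "TCP", frozenset({"ipv6", "metered"})),
]
ordered = gather(branches, prefer={"ipv6"}, avoid={"metered"})
```

With these inputs no branch is excluded; the Wi-Fi branch and the IPv6 branch on LTE tie and keep their input order, and the IPv4 branch on LTE sorts last. Passing `prohibit={"metered"}` instead would leave only the Wi-Fi branch.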
skipping to change at page 8, line 25 ¶ | skipping to change at page 8, line 25 ¶ | |||
4.1. Candidate Gathering | 4.1. Candidate Gathering | |||
The step of gathering candidates involves identifying which paths, | The step of gathering candidates involves identifying which paths, | |||
protocols, and endpoints may be used for a given Connection. This | protocols, and endpoints may be used for a given Connection. This | |||
list is determined by the requirements, prohibitions, and preferences | list is determined by the requirements, prohibitions, and preferences | |||
of the application as specified in the Selection Properties. | of the application as specified in the Selection Properties. | |||
4.1.1. Gathering Endpoint Candidates | 4.1.1. Gathering Endpoint Candidates | |||
Both Local and Remote Endpoint Candidates must be discovered during | Both Local and Remote Endpoint Candidates must be discovered during | |||
connection establishment. To support ICE, or similar protocols, that | connection establishment. To support Interactive Connectivity | |||
involve out-of-band indirect signalling to exchange candidates with | Establishment (ICE) [RFC8445], or similar protocols, that involve | |||
the Remote Endpoint, it's important to be able to query the set of | out-of-band indirect signalling to exchange candidates with the | |||
Remote Endpoint, it's important to be able to query the set of | ||||
candidate Local Endpoints, and give the protocol stack a set of | candidate Local Endpoints, and give the protocol stack a set of | |||
candidate Remote Endpoints, before it attempts to establish | candidate Remote Endpoints, before it attempts to establish | |||
connections. | connections. | |||
4.1.1.1. Local Endpoint candidates | 4.1.1.1. Local Endpoint candidates | |||
The set of possible Local Endpoints is gathered. In the simple case, | The set of possible Local Endpoints is gathered. In the simple case, | |||
this merely enumerates the local interfaces and protocols, and allocates | this merely enumerates the local interfaces and protocols, and allocates | |||
ephemeral source ports. For example, a system that has WiFi and | ephemeral source ports. For example, a system that has WiFi and | |||
Ethernet and supports IPv4 and IPv6 might gather four candidate | Ethernet and supports IPv4 and IPv6 might gather four candidate | |||
locals (IPv4 on Ethernet, IPv6 on Ethernet, IPv4 on WiFi, and IPv6 on | locals (IPv4 on Ethernet, IPv6 on Ethernet, IPv4 on WiFi, and IPv6 on | |||
WiFi) that can form the source for a transient. | WiFi) that can form the source for a transient. | |||
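In the simple case, the enumeration amounts to a cross product of interfaces and network protocols. The following sketch assumes a hypothetical `local_candidates` helper and uses port 0 as a stand-in for "let the operating system pick an ephemeral source port"; a real implementation would query the system for its interfaces and addresses.

```python
from itertools import product


def local_candidates(interfaces, ip_versions):
    """Pair every interface with every IP version; port 0 means 'ephemeral'."""
    return [(interface, version, 0)
            for interface, version in product(interfaces, ip_versions)]


candidates = local_candidates(["Ethernet", "WiFi"], ["IPv4", "IPv6"])
```

For the two interfaces and two IP versions in the example, this yields the four candidate locals listed in the text.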
If NAT traversal is required, the process of gathering Local | If NAT traversal is required, the process of gathering Local | |||
Endpoints becomes broadly equivalent to the ICE candidate gathering | Endpoints becomes broadly equivalent to the ICE candidate gathering | |||
phase [RFC5245]. The endpoint determines its server reflexive Local | phase (see Section 5.1.1 of [RFC8445]). The endpoint determines its | |||
Endpoints (i.e., the translated address of a local, on the other side | server reflexive Local Endpoints (i.e., the translated address of a | |||
of a NAT) and relayed locals (e.g., via a TURN server or other | local, on the other side of a NAT, e.g., via a STUN server [RFC5389]) | |||
and relayed locals (e.g., via a TURN server [RFC5766] or other | ||||
relay), for each interface and network protocol. These are added to | relay), for each interface and network protocol. These are added to | |||
the set of candidate Local Endpoints for this connection. | the set of candidate Local Endpoints for this connection. | |||
Gathering Local Endpoints is primarily a local operation, although it | Gathering Local Endpoints is primarily a local operation, although it | |||
might involve exchanges with a STUN server to derive server reflexive | might involve exchanges with a STUN server to derive server reflexive | |||
locals, or with a TURN server or other relay to derive relayed | locals, or with a TURN server or other relay to derive relayed | |||
locals. It does not involve communication with the Remote Endpoint. | locals. However, it does not involve communication with the Remote | |||
Endpoint. | ||||
4.1.1.2. Remote Endpoint Candidates | 4.1.1.2. Remote Endpoint Candidates | |||
The Remote Endpoint is typically a name that needs to be resolved | The Remote Endpoint is typically a name that needs to be resolved | |||
into a set of possible addresses that can be used for communication. | into a set of possible addresses that can be used for communication. | |||
Resolving the Remote Endpoint is the process of recursively | Resolving the Remote Endpoint is the process of recursively | |||
performing such name lookups, until fully resolved, to return the set | performing such name lookups, until fully resolved, to return the set | |||
of candidates for the remote of this connection. | of candidates for the remote of this connection. | |||
How this is done will depend on the type of the Remote Endpoint, and | How this is done will depend on the type of the Remote Endpoint, and | |||
skipping to change at page 9, line 40 ¶ | skipping to change at page 9, line 44 ¶ | |||
4.1.2. Structuring Options as a Tree | 4.1.2. Structuring Options as a Tree | |||
When an implementation responsible for connection establishment needs | When an implementation responsible for connection establishment needs | |||
to consider multiple options, it should logically structure these | to consider multiple options, it should logically structure these | |||
options as a hierarchical tree. Each leaf node of the tree | options as a hierarchical tree. Each leaf node of the tree | |||
represents a single, coherent connection attempt, with an Endpoint, a | represents a single, coherent connection attempt, with an Endpoint, a | |||
Path, and a set of protocols that can directly negotiate and send | Path, and a set of protocols that can directly negotiate and send | |||
data on the network. Each node in the tree that is not a leaf | data on the network. Each node in the tree that is not a leaf | |||
represents a connection attempt that is either underspecified, or | represents a connection attempt that is either underspecified, or | |||
else includes multiple distinct options. For example. when | else includes multiple distinct options. For example, when | |||
connecting on an IP network, a connection attempt to a hostname and | connecting on an IP network, a connection attempt to a hostname and | |||
port is underspecified, because the connection attempt requires a | port is underspecified, because the connection attempt requires a | |||
resolved IP address as its remote endpoint. In this case, the node | resolved IP address as its remote endpoint. In this case, the node | |||
represented by the connection attempt to the hostname is a parent | represented by the connection attempt to the hostname is a parent | |||
node, with child nodes for each IP address. Similarly, an | node, with child nodes for each IP address. Similarly, an | |||
implementation that is allowed to connect using multiple interfaces | implementation that is allowed to connect using multiple interfaces | |||
will have a parent node of the tree for the decision between the | will have a parent node of the tree for the decision between the | |||
paths, with a branch for each interface. | paths, with a branch for each interface. | |||
The example aggregate connection attempt above can be drawn as a tree | The example aggregate connection attempt above can be drawn as a tree | |||
by grouping the addresses resolved on the same interface into | by grouping the addresses resolved on the same interface into | |||
branches: | branches: | |||
|| | || | |||
+==========================+ | +==========================+ | |||
| www.example.com:80/Any | | | www.example.com:80/Any | | |||
+==========================+ | +==========================+ | |||
// \\ | // \\ | |||
+==========================+ +==========================+ | +==========================+ +==========================+ | |||
| www.example.com:80/Wi-Fi | | www.example.com:80/LTE | | | www.example.com:80/Wi-Fi | | www.example.com:80/LTE | | |||
+==========================+ +==========================+ | +==========================+ +==========================+ | |||
|| // \\ | || // \\ | |||
+====================+ +====================+ +======================+ | +====================+ +====================+ +======================+ | |||
| 192.0.2.1:80/Wi-Fi | | 192.0.2.1:80/LTE | | 2001:DB8::1.80/LTE | | | 192.0.2.1:80/Wi-Fi | | 192.0.2.1:80/LTE | | 2001:DB8::1.80/LTE | | |||
+====================+ +====================+ +======================+ | +====================+ +====================+ +======================+ | |||
The rest of this section will use a notation scheme to represent this | The rest of this section will use a notation scheme to represent this | |||
tree. The parent (or trunk) node of the tree will be represented by | tree. The parent (or trunk) node of the tree will be represented by | |||
a single integer, such as "1". Each child of that node will have an | a single integer, such as "1". Each child of that node will have an | |||
integer that identifies it, from 1 to the number of children. That | integer that identifies it, from 1 to the number of children. That | |||
child node will be uniquely identified by concatenating its integer | child node will be uniquely identified by concatenating its integer | |||
to its parent's identifier with a dot in between, such as "1.1" and | to its parent's identifier with a dot in between, such as "1.1" and | |||
"1.2". Each node will be summarized by a tuple of three elements: | "1.2". Each node will be summarized by a tuple of three elements: | |||
Endpoint, Path, and Protocol. The above example can now be written | Endpoint, Path, and Protocol. The above example can now be written | |||
more succinctly as: | more succinctly as: | |||
skipping to change at page 11, line 20 ¶ | skipping to change at page 11, line 20 ¶ | |||
4.1.3. Branch Types | 4.1.3. Branch Types | |||
There are three types of branching from a parent node into one or | There are three types of branching from a parent node into one or | |||
more child nodes. Any parent node of the tree must only use one type | more child nodes. Any parent node of the tree must only use one type | |||
of branching. | of branching. | |||
4.1.3.1. Derived Endpoints | 4.1.3.1. Derived Endpoints | |||
If a connection originally targets a single endpoint, there may be | If a connection originally targets a single endpoint, there may be | |||
multiple endpoints of different types that can be derived from the | multiple endpoints of different types that can be derived from the | |||
original. The connection library should order the derived endpoints | original. The connection library creates an ordered list of the | |||
according to application preference, system policy and expected | derived endpoints according to application preference, system policy | |||
performance. | and expected performance. | |||
DNS hostname-to-address resolution is the most common method of | DNS hostname-to-address resolution is the most common method of | |||
endpoint derivation. When trying to connect to a hostname endpoint | endpoint derivation. When trying to connect to a hostname endpoint | |||
on a traditional IP network, the implementation should send DNS | on a traditional IP network, the implementation should send DNS | |||
queries for both A (IPv4) and AAAA (IPv6) records if both are | queries for both A (IPv4) and AAAA (IPv6) records if both are | |||
supported on the local link. The algorithm for ordering and racing | supported on the local link. The algorithm for ordering and racing | |||
these addresses should follow the recommendations in Happy Eyeballs | these addresses should follow the recommendations in Happy Eyeballs | |||
[RFC8305]. | [RFC8305]. | |||
1 [www.example.com:80, Wi-Fi, TCP] | 1 [www.example.com:80, Wi-Fi, TCP] | |||
1.1 [2001:DB8::1.80, Wi-Fi, TCP] | 1.1 [2001:DB8::1.80, Wi-Fi, TCP] | |||
1.2 [192.0.2.1:80, Wi-Fi, TCP] | 1.2 [192.0.2.1:80, Wi-Fi, TCP] | |||
1.3 [2001:DB8::2.80, Wi-Fi, TCP] | 1.3 [2001:DB8::2.80, Wi-Fi, TCP] | |||
1.4 [2001:DB8::3.80, Wi-Fi, TCP] | 1.4 [2001:DB8::3.80, Wi-Fi, TCP] | |||
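The dotted-identifier notation used in the listing above can be produced mechanically from a tree. The following Python sketch is one possible representation, not the draft's own data model: `Node` and `render` are hypothetical names, and the indentation of child lines is purely cosmetic.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    endpoint: str
    path: str
    protocol: str
    children: list = field(default_factory=list)

    def add(self, endpoint, path, protocol):
        """Append a child node and return it."""
        child = Node(endpoint, path, protocol)
        self.children.append(child)
        return child


def render(node, ident="1"):
    """Return lines in the 'id [Endpoint, Path, Protocol]' notation."""
    depth = ident.count(".")
    lines = [f"{'  ' * depth}{ident} [{node.endpoint}, {node.path}, {node.protocol}]"]
    # Children are numbered from 1 and appended to the parent's identifier.
    for n, child in enumerate(node.children, start=1):
        lines.extend(render(child, f"{ident}.{n}"))
    return lines


root = Node("www.example.com:80", "Wi-Fi", "TCP")
for addr in ["2001:DB8::1.80", "192.0.2.1:80", "2001:DB8::2.80", "2001:DB8::3.80"]:
    root.add(addr, "Wi-Fi", "TCP")

lines = render(root)
print("\n".join(lines))
```

The children are added in the interleaved address-family order shown in the listing; the ordering itself would come from a Happy Eyeballs style algorithm, which this sketch does not implement.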
DNS-Based Service Discovery can also provide an endpoint derivation | DNS-Based Service Discovery [RFC6763] can also provide an endpoint | |||
step. When trying to connect to a named service, the client may | derivation step. When trying to connect to a named service, the | |||
discover one or more hostname and port pairs on the local network | client may discover one or more hostname and port pairs on the local | |||
using multicast DNS. These hostnames should each be treated as a | network using multicast DNS [RFC6762]. These hostnames should each | |||
branch which can be attempted independently from other hostnames. | be treated as a branch that can be attempted independently from other | |||
Each of these hostnames may also resolve to one or more addresses, | hostnames. Each of these hostnames might resolve to one or more | |||
thus creating multiple layers of branching. | addresses, which would create multiple layers of branching. | |||
1 [term-printer._ipp._tcp.meeting.ietf.org, Wi-Fi, TCP] | 1 [term-printer._ipp._tcp.meeting.ietf.org, Wi-Fi, TCP] | |||
1.1 [term-printer.meeting.ietf.org:631, Wi-Fi, TCP] | 1.1 [term-printer.meeting.ietf.org:631, Wi-Fi, TCP] | |||
1.1.1 [31.133.160.18:631, Wi-Fi, TCP] | 1.1.1 [31.133.160.18:631, Wi-Fi, TCP] | |||
4.1.3.2. Alternate Paths | 4.1.3.2. Alternate Paths | |||
If a client has multiple network interfaces available to it, such as | If a client has multiple network interfaces available to it, e.g., a | |||
mobile client with both Wi-Fi and Cellular connectivity, it can | mobile client with both Wi-Fi and Cellular connectivity, it can | |||
attempt a connection over either interface. This represents a branch | attempt a connection over any of the interfaces. This represents a | |||
point in the connection establishment. Like with derived endpoints, | branch point in the connection establishment. Similar to a derived | |||
the interfaces should be ranked based on preference, system policy, | endpoint, the interfaces should be ranked based on preference, system | |||
and performance. Attempts should be started on one interface, and | policy, and performance. Attempts should be started on one | |||
then on other interfaces successively after delays based on expected | interface, and then on other interfaces successively after delays | |||
round-trip-time or other available metrics. | based on expected round-trip-time or other available metrics. | |||
1 [192.0.2.1:80, Any, TCP] | 1 [192.0.2.1:80, Any, TCP] | |||
1.1 [192.0.2.1:80, Wi-Fi, TCP] | 1.1 [192.0.2.1:80, Wi-Fi, TCP] | |||
1.2 [192.0.2.1:80, LTE, TCP] | 1.2 [192.0.2.1:80, LTE, TCP] | |||
This same approach applies to any situation in which the client is | This same approach applies to any situation in which the client is | |||
aware of multiple links or views of the network. Multiple Paths, | aware of multiple links or views of the network. Multiple Paths, | |||
each with a coherent set of addresses, routes, DNS server, and more, | each with a coherent set of addresses, routes, DNS server, and more, | |||
may share a single interface. A path may also represent a virtual | may share a single interface. A path may also represent a virtual | |||
interface service such as a Virtual Private Network (VPN). | interface service such as a Virtual Private Network (VPN). | |||
skipping to change at page 12, line 37 ¶ | skipping to change at page 12, line 37 ¶ | |||
or prohibitions the application sets, as well as system policy. | or prohibitions the application sets, as well as system policy. | |||
4.1.3.3. Protocol Options | 4.1.3.3. Protocol Options | |||
Differences in possible protocol compositions and options can also | Differences in possible protocol compositions and options can also | |||
provide a branching point in connection establishment. This allows | provide a branching point in connection establishment. This allows | |||
clients to be resilient to situations in which a certain protocol is | clients to be resilient to situations in which a certain protocol is | |||
not functioning on a server or network. | not functioning on a server or network. | |||
This approach is commonly used for connections with optional proxy | This approach is commonly used for connections with optional proxy | |||
server configurations. A single connection may be allowed to use an | server configurations. A single connection might have several | |||
HTTP-based proxy, a SOCKS-based proxy, or connect directly. These | options available: an HTTP-based proxy, a SOCKS-based proxy, or no | |||
options should be ranked and attempted in succession. | proxy. These options should be ranked and attempted in succession. | |||
1 [www.example.com:80, Any, HTTP/TCP] | 1 [www.example.com:80, Any, HTTP/TCP] | |||
1.1 [192.0.2.8:80, Any, HTTP/HTTP Proxy/TCP] | 1.1 [192.0.2.8:80, Any, HTTP/HTTP Proxy/TCP] | |||
1.2 [192.0.2.7:10234, Any, HTTP/SOCKS/TCP] | 1.2 [192.0.2.7:10234, Any, HTTP/SOCKS/TCP] | |||
1.3 [www.example.com:80, Any, HTTP/TCP] | 1.3 [www.example.com:80, Any, HTTP/TCP] | |||
1.3.1 [192.0.2.1:80, Any, HTTP/TCP] | 1.3.1 [192.0.2.1:80, Any, HTTP/TCP] | |||
This approach also allows a client to attempt different sets of | This approach also allows a client to attempt different sets of | |||
application and transport protocols that may provide preferable | application and transport protocols that, when available, could | |||
characteristics when available. For example, the protocol options | provide preferable features. For example, the protocol options could | |||
could involve QUIC [I-D.ietf-quic-transport] over UDP on one branch, | involve QUIC [I-D.ietf-quic-transport] over UDP on one branch, and | |||
and HTTP/2 [RFC7540] over TLS over TCP on the other: | HTTP/2 [RFC7540] over TLS over TCP on the other: | |||
1 [www.example.com:443, Any, Any HTTP] | 1 [www.example.com:443, Any, Any HTTP] | |||
1.1 [www.example.com:443, Any, QUIC/UDP] | 1.1 [www.example.com:443, Any, QUIC/UDP] | |||
1.1.1 [192.0.2.1:443, Any, QUIC/UDP] | 1.1.1 [192.0.2.1:443, Any, QUIC/UDP] | |||
1.2 [www.example.com:443, Any, HTTP2/TLS/TCP] | 1.2 [www.example.com:443, Any, HTTP2/TLS/TCP] | |||
1.2.1 [192.0.2.1:443, Any, HTTP2/TLS/TCP] | 1.2.1 [192.0.2.1:443, Any, HTTP2/TLS/TCP] | |||
Another example is racing SCTP with TCP: | Another example is racing SCTP with TCP: | |||
1 [www.example.com:80, Any, Any Stream] | 1 [www.example.com:80, Any, Any Stream] | |||
1.1 [www.example.com:80, Any, SCTP] | 1.1 [www.example.com:80, Any, SCTP] | |||
1.1.1 [192.0.2.1:80, Any, SCTP] | 1.1.1 [192.0.2.1:80, Any, SCTP] | |||
1.2 [www.example.com:80, Any, TCP] | 1.2 [www.example.com:80, Any, TCP] | |||
1.2.1 [192.0.2.1:80, Any, TCP] | 1.2.1 [192.0.2.1:80, Any, TCP] | |||
Implementations that support racing protocols and protocol options | Implementations that support racing protocols and protocol options | |||
should maintain a history of which protocols and protocol options | should maintain a history of which protocols and protocol options | |||
successfully established, on a per-network basis (see Section 9.2). | successfully established, on a per-network and per-endpoint basis | |||
This information can influence future racing decisions to prioritize | (see Section 9.2). This information can influence future racing | |||
or prune branches. | decisions to prioritize or prune branches. | |||
4.2. Branching Order-of-Operations | 4.1.4. Branching Order-of-Operations | |||
Branch types must occur in a specific order relative to one another | Branch types must occur in a specific order relative to one another | |||
to avoid creating leaf nodes with invalid or incompatible settings. | to avoid creating leaf nodes with invalid or incompatible settings. | |||
In the example above, it would be invalid to branch for derived | In the example above, it would be invalid to branch for derived | |||
endpoints (the DNS results for www.example.com) before branching | endpoints (the DNS results for www.example.com) before branching | |||
between interface paths, since usable DNS results on one network may | between interface paths, since there are situations when the results | |||
not necessarily be the same as DNS results on another network due to | will be different across networks due to private names or different | |||
local network entities, supported address families, or enterprise | supported IP versions. Implementations must be careful to branch in | |||
network configurations. Implementations must be careful to branch in | ||||
an order that results in usable leaf nodes whenever there are | an order that results in usable leaf nodes whenever there are | |||
multiple branch types that could be used from a single node. | multiple branch types that could be used from a single node. | |||
The order of operations for branching, where lower numbers are acted | The order of operations for branching, where lower numbers are acted | |||
upon first, should be: | upon first, should be: | |||
1. Alternate Paths | 1. Alternate Paths | |||
2. Protocol Options | 2. Protocol Options | |||
3. Derived Endpoints | 3. Derived Endpoints | |||
Branching between paths is the first in the list because results | Branching between paths is the first in the list because results | |||
across multiple interfaces are likely not related to one another: | across multiple interfaces are likely not related to one another: | |||
endpoint resolution may return different results, especially when | endpoint resolution may return different results, especially when | |||
using locally resolved host and service names, and which protocols | using locally resolved host and service names, and which protocols | |||
are supported and preferred may differ across interfaces. Thus, if | are supported and preferred may differ across interfaces. Thus, if | |||
multiple paths are attempted, the overall connection can be seen as a | multiple paths are attempted, the overall connection can be seen as a | |||
race between the available paths or interfaces. | race between the available paths or interfaces. | |||
Protocol options are checked next in order. Whether or not a set of | Protocol options are next checked in order. Whether or not a set of | |||
protocols, or protocol-specific options, can successfully connect is | protocols, or protocol-specific options, can successfully connect is | |||
generally not dependent on which specific IP address is used. | generally not dependent on which specific IP address is used. | |||
Furthermore, the protocol stacks being attempted may influence or | Furthermore, the protocol stacks being attempted may influence or | |||
altogether change the endpoints being used. Adding a proxy to a | altogether change the endpoints being used. Adding a proxy to a | |||
connection's branch will change the endpoint to the proxy's IP | connection's branch will change the endpoint to the proxy's IP | |||
address or hostname. Choosing an alternate protocol may also modify | address or hostname. Choosing an alternate protocol may also modify | |||
the ports that should be selected. | the ports that should be selected. | |||
Branching for derived endpoints is the final step, and may have | Branching for derived endpoints is the final step, and may have | |||
multiple layers of derivation or resolution, such as DNS service | multiple layers of derivation or resolution, such as DNS service | |||
resolution and DNS hostname resolution. | resolution and DNS hostname resolution. | |||
For example, if the application has indicated both a preference for | For example, if the application has indicated both a preference for | |||
WiFi over LTE and for a feature only available in SCTP, branches will | WiFi over LTE and for a feature only available in SCTP, branches will | |||
be first sorted according to path selection, with WiFi at the top. | be first sorted according to path selection, with WiFi at the top. | |||
Then, branches with SCTP will be sorted to the top within their | Then, branches with SCTP will be sorted to the top within their | |||
subtree according to the properties influencing protocol selection. | subtree according to the properties influencing protocol selection. | |||
However, if the implementation has cached the information that SCTP | However, if the implementation has current cache information that | |||
is not available on the path over WiFi, there is no SCTP node in the | SCTP is not available on the path over WiFi, there is no SCTP node in | |||
WiFi subtree. Here, the path over WiFi will be tried first, and, if | the WiFi subtree. Here, the path over WiFi will be tried first, and, | |||
connection establishment succeeds, TCP will be used. So the | if connection establishment succeeds, TCP will be used. So the | |||
Selection Property of preferring WiFi takes precedence over the | Selection Property of preferring WiFi takes precedence over the | |||
Property that led to a preference for SCTP. | Property that led to a preference for SCTP. | |||
1. [www.example.com:80, Any, Any Stream] | 1. [www.example.com:80, Any, Any Stream] | |||
1.1 [192.0.2.1:80, Wi-Fi, Any Stream] | 1.1 [192.0.2.1:80, Wi-Fi, Any Stream] | |||
1.1.1 [192.0.2.1:80, Wi-Fi, TCP] | 1.1.1 [192.0.2.1:80, Wi-Fi, TCP] | |||
1.2 [192.0.3.1:80, LTE, Any Stream] | 1.2 [192.0.3.1:80, LTE, Any Stream] | |||
1.2.1 [192.0.3.1:80, LTE, SCTP] | 1.2.1 [192.0.3.1:80, LTE, SCTP] | |||
1.2.2 [192.0.3.1:80, LTE, TCP] | 1.2.2 [192.0.3.1:80, LTE, TCP] | |||
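The branching order can be sketched as three nested loops: paths on the outside, protocol options next, and derived endpoints innermost. This is a simplified Python illustration under stated assumptions: `build_leaves`, `protocols_on`, and `resolve` are hypothetical helpers, the cached "no SCTP over Wi-Fi" knowledge is hard-coded, and a lookup table stands in for per-path DNS resolution.

```python
def build_leaves(hostname, paths, protocols_on, resolve):
    """Branch by path, then protocol, then derived endpoint."""
    leaves = []
    for path in paths:                            # 1. Alternate Paths
        for proto in protocols_on(path):          # 2. Protocol Options
            for addr in resolve(hostname, path):  # 3. Derived Endpoints
                leaves.append((addr, path, proto))
    return leaves


# Hypothetical inputs mirroring the example above.
paths = ["Wi-Fi", "LTE"]  # already sorted: Wi-Fi preferred over LTE


def protocols_on(path):
    # Assumption: cached information says SCTP is unavailable over Wi-Fi.
    return ["TCP"] if path == "Wi-Fi" else ["SCTP", "TCP"]


def resolve(hostname, path):
    # Stand-in for per-path DNS; results may differ between paths.
    table = {"Wi-Fi": ["192.0.2.1:80"], "LTE": ["192.0.3.1:80"]}
    return table[path]


leaves = build_leaves("www.example.com:80", paths, protocols_on, resolve)
for leaf in leaves:
    print(leaf)
```

The leaves come out in the same order as nodes 1.1.1, 1.2.1, and 1.2.2 in the listing: TCP over Wi-Fi is attempted before SCTP over LTE, because the path preference is applied first.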
4.3. Sorting Branches | 4.1.5. Sorting Branches | |||
Implementations should sort the branches of the tree of connection | Implementations should sort the branches of the tree of connection | |||
options in order of their preference rank. Leaf nodes on branches | options in order of their preference rank, from most preferred to | |||
with higher rankings represent connection attempts that will be raced | least preferred. Leaf nodes on branches with higher rankings | |||
first. Implementations should order the branches to reflect the | represent connection attempts that will be raced first. | |||
preferences expressed by the application for its new connection, | Implementations should order the branches to reflect the preferences | |||
including Selection Properties, which are specified in | expressed by the application for its new connection, including | |||
Selection Properties, which are specified in | ||||
[I-D.ietf-taps-interface]. | [I-D.ietf-taps-interface]. | |||
In addition to the properties provided by the application, an | In addition to the properties provided by the application, an | |||
implementation may include additional criteria such as cached | implementation may include additional criteria such as cached | |||
performance estimates, see Section 9.2, or system policy, see | performance estimates, see Section 9.2, or system policy, see | |||
Section 3.2, in the ranking. Two examples of how Selection and | Section 3.2, in the ranking. Two examples of how Selection and | |||
Connection Properties may be used to sort branches are provided | Connection Properties may be used to sort branches are provided | |||
below: | below: | |||
* "Interface Instance or Type": If the application specifies an | * "Interface Instance or Type": If the application specifies an | |||
interface type to be preferred or avoided, implementations should | interface type to be preferred or avoided, implementations should | |||
rank paths accordingly. If the application specifies an interface | accordingly rank the paths. If the application specifies an | |||
type to be required or prohibited, we expect an implementation to | interface type to be required or prohibited, an implementation is | |||
not include the non-conforming paths into the three. | expected to not include the non-conforming paths. | |||
* "Capacity Profile": An implementation may use the Capacity Profile | * "Capacity Profile": An implementation can use the Capacity Profile | |||
to prefer paths optimized for the application's expected traffic | to prefer paths that match an application's expected traffic | |||
pattern according to cached performance estimates, see | pattern. This match will use cached performance estimates, see | |||
Section 9.2: | Section 9.2: | |||
- Scavenger: Prefer paths with the highest expected available | - Scavenger: Prefer paths with the highest expected available | |||
bandwidth, based on observed maximum throughput | capacity, based on the observed maximum throughput; | |||
- Low Latency/Interactive: Prefer paths with the lowest expected | - Low Latency/Interactive: Prefer paths with the lowest expected | |||
Round Trip Time | Round Trip Time, based on observed round trip time estimates; | |||
- Constant-Rate Streaming: Prefer paths that can satisfy the | - Constant-Rate Streaming: Prefer paths that are expected to | |||
requested Stream Send or Stream Receive Bitrate, based on | satisfy the requested Stream Send or Stream Receive Bitrate, | |||
observed maximum throughput | based on the observed maximum throughput. | |||
Implementations should process properties in the following order: | Implementations process the Properties in the following order: | |||
Prohibit, Require, Prefer, Avoid. If Selection Properties contain | Prohibit, Require, Prefer, Avoid. If Selection Properties contain | |||
any prohibited properties, the implementation should first purge | any prohibited properties, the implementation should first purge | |||
branches containing nodes with these properties. For required | branches containing nodes with these properties. For required | |||
properties, it should only keep branches that satisfy these | properties, it should only keep branches that satisfy these | |||
requirements. Finally, it should order branches according to | requirements. Finally, it should order the branches according to the | |||
preferred properties, and finally use avoided properties as a | preferred properties, and finally use any avoided properties as a | |||
tiebreaker. When ordering branches, an implementation may give more | tiebreaker. When ordering branches, an implementation can give more | |||
weight to properties that the application has explicitly set than to | weight to properties that the application has explicitly set, than to | |||
properties that are default. | the properties that are default. | |||
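The Prohibit, Require, Prefer, Avoid order can be sketched as two filters followed by a sort. In this Python illustration, each branch is reduced to a name plus a set of string tags; that representation, the tag names, and the function `sort_branches` are assumptions made for the example and do not come from the TAPS interface. The extra weighting of explicitly set properties over defaults is not modelled.

```python
def sort_branches(branches, prohibit=(), require=(), prefer=(), avoid=()):
    """Filter and rank branches given as (name, set_of_property_tags)."""
    prohibit, require = set(prohibit), set(require)
    prefer, avoid = set(prefer), set(avoid)

    # 1. Prohibit: purge branches that carry any prohibited property.
    kept = [b for b in branches if not (b[1] & prohibit)]
    # 2. Require: keep only branches that carry every required property.
    kept = [b for b in kept if require <= b[1]]

    # 3. Prefer: more preferred properties rank higher.
    # 4. Avoid: fewer avoided properties break ties.
    def key(branch):
        return (-len(branch[1] & prefer), len(branch[1] & avoid))

    return sorted(kept, key=key)  # stable sort keeps the input order otherwise


branches = [
    ("LTE/TCP", {"cellular", "reliable"}),
    ("Eth-A/TCP", {"wired", "reliable", "metered"}),
    ("Wi-Fi/UDP", {"wifi"}),
    ("Eth-B/TCP", {"wired", "reliable"}),
    ("Wi-Fi/TCP", {"wifi", "reliable"}),
]
ranked = sort_branches(
    branches,
    prohibit={"cellular"},
    require={"reliable"},
    prefer={"wifi"},
    avoid={"metered"},
)
names = [name for name, _ in ranked]
print(names)
```

Here "LTE/TCP" is purged as prohibited, "Wi-Fi/UDP" is dropped for lacking a required property, "Wi-Fi/TCP" ranks first as preferred, and the avoided "metered" tag only decides the order of the two remaining Ethernet branches.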
As the available protocols and paths on a specific system and in a | The available protocols and paths on a specific system and in a | |||
specific context may vary, the result of sorting and the outcome of | specific context can change; therefore, the result of sorting and the | |||
racing may vary even given the same Selection and Connection | outcome of racing may vary, even when using the same Selection and | |||
Properties. However, an implementation ought to aim to provide a | Connection Properties. However, an implementation ought to provide a | |||
consistent outcome to applications, e.g., by preferring protocols and | consistent outcome to applications, e.g., by preferring protocols and | |||
paths that existing Connections with similar Properties are already | paths that are already used by existing Connections that specified | |||
using. | similar Properties. | |||
4.4. Candidate Racing | 4.2. Candidate Racing | |||
The primary goal of the Candidate Racing process is to successfully | The primary goal of the Candidate Racing process is to successfully | |||
negotiate a protocol stack to an endpoint over an interface--to | negotiate a protocol stack to an endpoint over an interface--to | |||
connect a single leaf node of the tree--with as little delay and as | connect a single leaf node of the tree--with as little delay and as | |||
few unnecessary connection attempts as possible. Optimizing these | few unnecessary connection attempts as possible. Optimizing these | |||
two factors improves the user experience, while minimizing network | two factors improves the user experience, while minimizing network | |||
load. | load. | |||
This section covers the dynamic aspect of connection establishment. | This section covers the dynamic aspect of connection establishment. | |||
While the tree described above is a useful conceptual and | The tree described above is a useful conceptual and architectural | |||
architectural model, an implementation does not know what the full | model. However, an implementation is unable to know the full tree | |||
tree may become up front, nor will many of the possible branches be | before it is formed and many of the possible branches ultimately | |||
used in the common case. | might not be used. | |||
There are three different approaches to racing the attempts for | There are three different approaches to racing the attempts for | |||
different nodes of the connection establishment tree: | different nodes of the connection establishment tree: | |||
1. Immediate | 1. Immediate | |||
2. Delayed | 2. Delayed | |||
3. Failover | 3. Failover | |||
Each approach is appropriate in different use-cases and branch types. | Each approach is appropriate in different use-cases and branch types. | |||
However, to avoid consuming unnecessary network resources, | However, to avoid consuming unnecessary network resources, | |||
implementations should not use immediate racing as a default | implementations should not use immediate racing as a default | |||
approach. | approach. | |||
The timing algorithms for racing should remain independent across | The timing algorithms for racing should remain independent across | |||
branches of the tree. Any timers or racing logic is isolated to a | branches of the tree. Any timers or racing logic is isolated to a | |||
given parent node, and is not ordered precisely with regards to other | given parent node, and is not ordered precisely with regards to other | |||
children of other nodes. | children of other nodes. | |||
4.4.1. Delayed | 4.2.1. Immediate | |||
Immediate racing is when multiple alternate branches are started | ||||
without waiting for any one branch to make progress before starting | ||||
the next alternative. This means the attempts are effectively | ||||
simultaneous. Immediate racing should be avoided by implementations, | ||||
since it consumes extra network resources and establishes state that | ||||
might not be used. | ||||
4.2.2. Delayed | ||||
Delayed racing can be used whenever a single node of the tree has | Delayed racing can be used whenever a single node of the tree has | |||
multiple child nodes. Based on the order determined when building | multiple child nodes. Based on the order determined when building | |||
the tree, the first child node will be initiated immediately, | the tree, the first child node will be initiated immediately, | |||
followed by the next child node after some delay. Once that second | followed by the next child node after some delay. Once that second | |||
child node is initiated, the third child node (if present) will begin | child node is initiated, the third child node (if present) will begin | |||
after another delay, and so on until all child nodes have been | after another delay, and so on until all child nodes have been | |||
initiated, or one of the child nodes successfully completes its | initiated, or one of the child nodes successfully completes its | |||
negotiation. | negotiation. | |||
skipping to change at page 17, line 27 ¶ | skipping to change at page 17, line 42 ¶ | |||
Any delay should have a defined minimum and maximum value based on | Any delay should have a defined minimum and maximum value based on | |||
the branch type. Generally, branches between paths and protocols | the branch type. Generally, branches between paths and protocols | |||
should have longer delays than branches between derived endpoints. | should have longer delays than branches between derived endpoints. | |||
The maximum delay should be considered with regard to how long a | The maximum delay should be considered with regard to how long a | |||
user is expected to wait for the connection to complete. | user is expected to wait for the connection to complete. | |||
If a child node fails to connect before the delay timer has fired for | If a child node fails to connect before the delay timer has fired for | |||
the next child, the next child should be started immediately. | the next child, the next child should be started immediately. | |||
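The delayed racing schedule described above can be sketched as a small deterministic simulation. This is only an illustration under stated assumptions: the `Attempt` type, the `delayed_race` function, and the use of integer milliseconds are hypothetical and not part of any TAPS API. The sketch also assumes that a failure of any already-started child, occurring after the most recent start, triggers the next child immediately.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Attempt:
    name: str
    duration_ms: int   # time from start until this attempt resolves
    succeeds: bool     # whether it resolves as a success or a failure


def delayed_race(attempts: List[Attempt],
                 delay_ms: int) -> Tuple[List[Tuple[str, int]], Optional[str]]:
    """Simulate delayed racing over children in their preferred order.

    Returns the (name, start time) of every attempt that was started and
    the name of the first attempt to succeed (None if all fail).
    """
    starts: List[Tuple[str, int]] = []
    # (resolution time, succeeded, name) for each started attempt
    resolutions: List[Tuple[int, bool, str]] = []
    now = 0
    for index, attempt in enumerate(attempts):
        if index > 0:
            previous_start = starts[-1][1]
            timer = previous_start + delay_ms
            # A failure after the previous start but before the timer
            # fires causes the next child to start immediately.
            failures = [t for t, ok, _ in resolutions
                        if not ok and t > previous_start]
            now = min([timer] + failures)
        # Stop starting children once any attempt has already succeeded.
        if any(ok and t <= now for t, ok, _ in resolutions):
            break
        starts.append((attempt.name, now))
        resolutions.append((now + attempt.duration_ms,
                            attempt.succeeds, attempt.name))
    successes = sorted((t, name) for t, ok, name in resolutions if ok)
    winner = successes[0][1] if successes else None
    return starts, winner
```

For example, with a 250 ms delay, a first child that fails after 100 ms causes the second child to start at 100 ms rather than at 250 ms. A real implementation would additionally clamp the delay between the minimum and maximum values chosen for the branch type.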
4.4.2. Failover | 4.2.3. Failover | |||
If an implementation or application has a strong preference for one | If an implementation or application has a strong preference for one | |||
branch over another, the branching node may choose to wait until one | branch over another, the branching node may choose to wait until one | |||
child has failed before starting the next. Failure of a leaf node is | child has failed before starting the next. Failure of a leaf node is | |||
determined by its protocol negotiation failing or timing out; failure | determined by its protocol negotiation failing or timing out; failure | |||
of a parent branching node is determined by all of its children | of a parent branching node is determined by all of its children | |||
failing. | failing. | |||
An example in which failover is recommended is a race between a | An example in which failover is recommended is a race between a | |||
protocol stack that uses a proxy and a protocol stack that bypasses | protocol stack that uses a proxy and a protocol stack that bypasses | |||
the proxy. Failover is useful in case the proxy is down or | the proxy. Failover is useful in case the proxy is down or | |||
misconfigured, but any more aggressive type of racing may end up | misconfigured, but any more aggressive type of racing may end up | |||
unnecessarily avoiding a proxy that was preferred by policy. | unnecessarily avoiding a proxy that was preferred by policy. | |||
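A minimal sketch of failover, assuming each candidate is represented by a hypothetical callable that returns True on success and False when negotiation fails or times out. The names `failover`, `via_proxy`, and `direct` are illustrative only.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def failover(candidates: Sequence[Tuple[str, Callable[[], bool]]]) -> Optional[str]:
    """Try candidates strictly in preference order.

    The next candidate is started only after the previous one has
    failed. Returns the name of the first candidate that connects,
    or None if every candidate fails.
    """
    for name, connect in candidates:
        if connect():
            return name
    return None


# Example: the proxied stack is preferred by policy; the direct stack
# is only attempted once the proxy attempt has failed.
tried: List[str] = []


def via_proxy() -> bool:
    tried.append("proxy")
    return False  # simulate a proxy that is down


def direct() -> bool:
    tried.append("direct")
    return True


chosen = failover([("proxy", via_proxy), ("direct", direct)])
```

Because the direct attempt never starts while the proxy attempt is still pending, a slow but working proxy is not bypassed.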
4.5. Completing Establishment | 4.3. Completing Establishment | |||
The process of connection establishment completes when one leaf node | The process of connection establishment completes when one leaf node | |||
of the tree has completed negotiation with the remote endpoint | of the tree has completed negotiation with the remote endpoint | |||
successfully, or else all nodes of the tree have failed to connect. | successfully, or else all nodes of the tree have failed to connect. | |||
The first leaf node to complete its connection is then used by the | The first leaf node to complete its connection is then used by the | |||
application to send and receive data. | application to send and receive data. | |||
It is useful to process success and failure throughout the tree by | Successes and failures of a given attempt should be reported up to | |||
child nodes reporting to their parent nodes (towards the trunk of the | parent nodes (towards the trunk of the tree). For example, in the | |||
tree). For example, in the following case, if 1.1.1 fails to | following case, if 1.1.1 fails to connect, it reports the failure to | |||
connect, it reports the failure to 1.1. Since 1.1 has no other child | 1.1. Since 1.1 has no other child nodes, it also has failed and | |||
nodes, it also has failed and reports that failure to 1. Because 1.2 | reports that failure to 1. Because 1.2 has not yet failed, 1 is not | |||
has not yet failed, 1 is not considered to have failed. Since 1.2 | considered to have failed. Since 1.2 has not yet started, it is | |||
has not yet started, it is started and the process continues. | started and the process continues. Similarly, if 1.1.1 successfully | |||
Similarly, if 1.1.1 successfully connects, then it marks 1.1 as | connects, then it marks 1.1 as connected, which propagates to the | |||
connected, which propagates to the trunk node 1. At this point, the | trunk node 1. At this point, the connection as a whole is considered | |||
connection as a whole is considered to be successfully connected and | to be successfully connected and ready to process application data. | |||
ready to process application data. | ||||
1 [www.example.com:80, Any, TCP] | 1 [www.example.com:80, Any, TCP] | |||
1.1 [www.example.com:80, Wi-Fi, TCP] | 1.1 [www.example.com:80, Wi-Fi, TCP] | |||
1.1.1 [192.0.2.1:80, Wi-Fi, TCP] | 1.1.1 [192.0.2.1:80, Wi-Fi, TCP] | |||
1.2 [www.example.com:80, LTE, TCP] | 1.2 [www.example.com:80, LTE, TCP] | |||
... | ... | |||
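The reporting of success and failure towards the trunk can be sketched as follows. The `Node` class and its state names are hypothetical and only illustrate the propagation rule for the example tree above: a parent fails once all of its children have failed, and a success propagates all the way to the trunk.

```python
from typing import List, Optional


class Node:
    """One node of the candidate tree (illustrative only)."""

    def __init__(self, label: str, parent: Optional["Node"] = None) -> None:
        self.label = label
        self.parent = parent
        self.children: List["Node"] = []
        self.state = "pending"  # "pending", "connected", or "failed"
        if parent is not None:
            parent.children.append(self)

    def report_failure(self) -> None:
        self.state = "failed"
        parent = self.parent
        # A parent is considered failed only when every child has failed.
        if parent is not None and all(c.state == "failed" for c in parent.children):
            parent.report_failure()

    def report_success(self) -> None:
        self.state = "connected"
        # Success propagates unconditionally towards the trunk.
        if self.parent is not None:
            self.parent.report_success()


def build_example_tree() -> List[Node]:
    n1 = Node("1 [www.example.com:80, Any, TCP]")
    n11 = Node("1.1 [www.example.com:80, Wi-Fi, TCP]", n1)
    n111 = Node("1.1.1 [192.0.2.1:80, Wi-Fi, TCP]", n11)
    n12 = Node("1.2 [www.example.com:80, LTE, TCP]", n1)
    return [n1, n11, n111, n12]
```

Calling `report_failure()` on node 1.1.1 marks 1.1 as failed but leaves 1 pending, since 1.2 has not failed. Calling `report_success()` on 1.1.1 instead marks 1.1 and 1 as connected.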
If a leaf node has successfully completed its connection, all other | If a leaf node has successfully completed its connection, all other | |||
attempts should be made ineligible for use by the application for the | attempts should be made ineligible for use by the application for the | |||
original request. New connection attempts that involve transmitting | original request. New connection attempts that involve transmitting | |||
data on the network should not be started after another leaf node has | data on the network ought not to be started after another leaf node | |||
completed successfully, as the connection as a whole has been | has already successfully completed, because the connection as a whole | |||
established. An implementation may choose to let certain handshakes | has now been established. An implementation may choose to let | |||
and negotiations complete in order to gather metrics to influence | certain handshakes and negotiations complete in order to gather | |||
future connections. Similarly, an implementation may choose to hold | metrics to influence future connections. Keeping additional | |||
onto fully established leaf nodes that were not the first to | connections is generally not recommended since those attempts were | |||
establish for use as part of a Pooled Connection, see Section 7.1, or | slower to connect and may exhibit less desirable properties. | |||
in future connections. In both cases, keeping additional connections | ||||
is generally not recommended since those attempts were slower to | ||||
connect and may exhibit less desirable properties. | ||||
4.5.1. Determining Successful Establishment | 4.3.1. Determining Successful Establishment | |||
Implementations may select the criteria by which a leaf node is | Implementations may select the criteria by which a leaf node is | |||
considered to be successfully connected differently on a per-protocol | considered to be successfully connected differently on a per-protocol | |||
basis. If the only protocol being used is a transport protocol with | basis. If the only protocol being used is a transport protocol with | |||
a clear handshake, like TCP, then the obvious choice is to declare | a clear handshake, like TCP, then the obvious choice is to declare | |||
that node "connected" when the last packet of the three-way handshake | that node "connected" when the last packet of the three-way handshake | |||
has been received. If the only protocol being used is an | has been received. If the only protocol being used is an | |||
"unconnected" protocol, like UDP, the implementation may consider the | "unconnected" protocol, like UDP, the implementation may consider the | |||
node fully "connected" the moment it determines a route is present, | node fully "connected" the moment it determines a route is present, | |||
before sending any packets on the network, see further Section 4.7. | before sending any packets on the network, see further Section 4.5. | |||
For protocol stacks with multiple handshakes, the decision becomes | For protocol stacks with multiple handshakes, the decision becomes | |||
more nuanced. If the protocol stack involves both TLS and TCP, an | more nuanced. If the protocol stack involves both TLS and TCP, an | |||
implementation could determine that a leaf node is connected after | implementation could determine that a leaf node is connected after | |||
the TCP handshake is complete, or it can wait for the TLS handshake | the TCP handshake is complete, or it can wait for the TLS handshake | |||
to complete as well. The benefit of declaring completion when the | to complete as well. The benefit of declaring completion when the | |||
TCP handshake finishes, and thus stopping the race for other branches | TCP handshake finishes, and thus stopping the race for other branches | |||
of the tree, is that there will be less burden on the network from | of the tree, is that there will be less burden on the network from | |||
other connection attempts. On the other hand, by waiting until the | other connection attempts. On the other hand, by waiting until the | |||
TLS handshake is complete, an implementation avoids the scenario in | TLS handshake is complete, an implementation avoids the scenario in | |||
which a TCP handshake completes quickly, but TLS negotiation is | which a TCP handshake completes quickly, but TLS negotiation is | |||
either very slow or fails altogether in particular network conditions | either very slow or fails altogether in particular network conditions | |||
or to a particular endpoint. To avoid the issue of TLS possibly | or to a particular endpoint. To avoid the issue of TLS possibly | |||
failing, the implementation should not generate a Ready event for the | failing, the implementation should not generate a Ready event for the | |||
Connection until TLS is established. | Connection until TLS is established. | |||
If all of the leaf nodes fail to connect during racing, i.e. none of | If all of the leaf nodes fail to connect during racing, i.e. none of | |||
the configurations that satisfy all requirements given in the | the configurations that satisfy all requirements given in the | |||
Transport Parameters actually work over the available paths, then the | Transport Properties actually work over the available paths, then the | |||
transport system should notify the application with an InitiateError | transport system should notify the application with an InitiateError | |||
event. An InitiateError event should also be generated in case the | event. An InitiateError event should also be generated in case the | |||
transport system finds no usable candidates to race. | transport system finds no usable candidates to race. | |||
4.6. Establishing multiplexed connections | 4.4. Establishing multiplexed connections | |||
Multiplexing several Connections over a single underlying transport | Multiplexing several Connections over a single underlying transport | |||
connection requires that the Connections to be multiplexed belong to | connection requires that the Connections to be multiplexed belong to | |||
the same Connection Group (as is indicated by the application using | the same Connection Group (as is indicated by the application using | |||
the Clone call). When the underlying transport connection supports | the Clone call). When the underlying transport connection supports | |||
multi-streaming, the Transport System can map each Connection in the | multi-streaming, the Transport System can map each Connection in the | |||
Connection Group to a different stream. Thus, when the Connections | Connection Group to a different stream. Thus, when the Connections | |||
that are offered to an application by the Transport System are | that are offered to an application by the Transport System are | |||
multiplexed, the Transport System may implement the establishment of | multiplexed, the Transport System may implement the establishment of | |||
a new Connection by simply beginning to use a new stream of an | a new Connection by simply beginning to use a new stream of an | |||
skipping to change at page 19, line 41 ¶ | skipping to change at page 20, line 12 ¶ | |||
connection establishment procedure. This, then, also means that | connection establishment procedure. This, then, also means that | |||
there may not be any "establishment" message (like a TCP SYN), but | there may not be any "establishment" message (like a TCP SYN), but | |||
the application can simply start sending or receiving. Therefore, | the application can simply start sending or receiving. Therefore, | |||
when the Initiate action of a Transport System is called without | when the Initiate action of a Transport System is called without | |||
Messages being handed over, it cannot be guaranteed that the other | Messages being handed over, it cannot be guaranteed that the other | |||
endpoint will have any way to know about this, and hence a passive | endpoint will have any way to know about this, and hence a passive | |||
endpoint's ConnectionReceived event may not be called upon an active | endpoint's ConnectionReceived event may not be called upon an active | |||
endpoint's Initiate. Instead, calling the ConnectionReceived event | endpoint's Initiate. Instead, calling the ConnectionReceived event | |||
may be delayed until the first Message arrives. | may be delayed until the first Message arrives. | |||
4.7. Handling racing with "unconnected" protocols | 4.5. Handling racing with "unconnected" protocols | |||
While protocols that use an explicit handshake to validate a | While protocols that use an explicit handshake to validate a | |||
Connection to a peer can be used for racing multiple establishment | Connection to a peer can be used for racing multiple establishment | |||
attempts in parallel, "unconnected" protocols such as raw UDP do not | attempts in parallel, "unconnected" protocols such as raw UDP do not | |||
offer a way to validate the presence of a peer or the usability of a | offer a way to validate the presence of a peer or the usability of a | |||
Connection without application feedback. An implementation should | Connection without application feedback. An implementation should | |||
consider such a protocol stack to be established as soon as a local | consider such a protocol stack to be established as soon as a local | |||
route to the peer endpoint is confirmed. | route to the peer endpoint is confirmed. | |||
However, if a peer is not reachable over the network using the | However, if a peer is not reachable over the network using the | |||
unconnected protocol, or data cannot be exchanged for any other | unconnected protocol, or data cannot be exchanged for any other | |||
reason, the application may want to attempt using another candidate | reason, the application may want to attempt using another candidate | |||
Protocol Stack. The implementation should maintain the list of other | Protocol Stack. The implementation should maintain the list of other | |||
candidate Protocol Stacks that were eligible to use. In the case | candidate Protocol Stacks that were eligible to use. | |||
that the application signals that the initial Protocol Stack is | ||||
failing for some reason and that another option should be attempted, | ||||
the Connection can be updated to point to the next candidate Protocol | ||||
Stack. This can be viewed as an application-driven form of Protocol | ||||
Stack racing. | ||||
4.8. Implementing listeners | 4.6. Implementing listeners | |||
When an implementation is asked to Listen, it registers with the | When an implementation is asked to Listen, it registers with the | |||
system to wait for incoming traffic to the Local Endpoint. If no | system to wait for incoming traffic to the Local Endpoint. If no | |||
Local Endpoint is specified, the implementation should either use an | Local Endpoint is specified, the implementation should use an | |||
ephemeral port or generate an error. | ephemeral port. | |||
If the Selection Properties do not require a single network interface | If the Selection Properties do not require a single network interface | |||
or path, but allow the use of multiple paths, the Listener object | or path, but allow the use of multiple paths, the Listener object | |||
should register for incoming traffic on all of the network interfaces | should register for incoming traffic on all of the network interfaces | |||
or paths that conform to the Properties. The set of available paths | or paths that conform to the Properties. The set of available paths | |||
can change over time, so the implementation should monitor network | can change over time, so the implementation should monitor network | |||
path changes and register and de-register the Listener across all | path changes and register and de-register the Listener across all | |||
usable paths. When using multiple paths, the Listener is generally | usable paths. When using multiple paths, the Listener is generally | |||
expected to use the same port for listening on each. | expected to use the same port for listening on each. | |||
If the Selection Properties allow multiple protocols to be used for | If the Selection Properties allow multiple protocols to be used for | |||
listening, and the implementation supports it, the Listener object | listening, and the implementation supports it, the Listener object | |||
should register across the eligible protocols for each path. This | should support receiving inbound connections for each eligible | |||
means that inbound Connections delivered by the implementation may | protocol on each eligible path. | |||
have heterogeneous protocol stacks. | ||||
4.8.1. Implementing listeners for Connected Protocols | 4.6.1. Implementing listeners for Connected Protocols | |||
Connected protocols such as TCP and TLS-over-TCP have a strong | Connected protocols such as TCP and TLS-over-TCP have a strong | |||
mapping between the Local and Remote Endpoints (five-tuple) and their | mapping between the Local and Remote Endpoints (five-tuple) and their | |||
protocol connection state. These map well into Connection objects. | protocol connection state. These map into Connection objects. | |||
Whenever a new inbound handshake is being started, the Listener | Whenever a new inbound handshake is being started, the Listener | |||
should generate a new Connection object and pass it to the | should generate a new Connection object and pass it to the | |||
application. | application. | |||
4.8.2. Implementing listeners for Unconnected Protocols | 4.6.2. Implementing listeners for Unconnected Protocols | |||
Unconnected protocols such as UDP and UDP-lite generally do not | Unconnected protocols such as UDP and UDP-lite generally do not | |||
provide the same mechanisms that connected protocols do to offer | provide the same mechanisms that connected protocols do to offer | |||
Connection objects. Implementations should wait for incoming packets | Connection objects. Implementations should wait for incoming packets | |||
for unconnected protocols on a listening port and should perform | for unconnected protocols on a listening port and should perform | |||
five-tuple matching of packets to either existing Connection objects | five-tuple matching of packets to either existing Connection objects | |||
or the creation of new Connection objects. On platforms with | or the creation of new Connection objects. On platforms with | |||
facilities to create a "virtual connection" for unconnected protocols, | facilities to create a "virtual connection" for unconnected protocols, | |||
implementations should use these mechanisms to minimise the handling | implementations should use these mechanisms to minimise the handling | |||
of datagrams intended for already created Connection objects. | of datagrams intended for already created Connection objects. | |||
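Five-tuple matching for an unconnected protocol can be sketched with a dictionary keyed by the five-tuple. The `UdpListener` class and its method names are hypothetical; a real implementation would receive datagrams from a socket and hand new Connection objects to the application through an event.

```python
from typing import Dict, List, Tuple

# (protocol, local address, local port, remote address, remote port)
FiveTuple = Tuple[str, str, int, str, int]


class UdpListener:
    """Demultiplex inbound datagrams onto per-peer connection state."""

    def __init__(self, local_addr: str, local_port: int) -> None:
        self.local_addr = local_addr
        self.local_port = local_port
        # Each "connection" is modelled here as a list of received payloads.
        self.connections: Dict[FiveTuple, List[bytes]] = {}

    def on_datagram(self, remote_addr: str, remote_port: int, payload: bytes) -> bool:
        """Deliver a datagram; return True if it created a new connection."""
        key: FiveTuple = ("udp", self.local_addr, self.local_port,
                          remote_addr, remote_port)
        is_new = key not in self.connections
        if is_new:
            self.connections[key] = []
        self.connections[key].append(payload)
        return is_new
```

Where the platform offers a "virtual connection" facility, the lookup for already known peers would be delegated to it instead of being performed for every datagram on the listening port.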
4.8.3. Implementing listeners for Multiplexed Protocols | 4.6.3. Implementing listeners for Multiplexed Protocols | |||
Protocols that provide multiplexing of streams into a single five- | Protocols that provide multiplexing of streams into a single five- | |||
tuple can listen both for entirely new connections (a new HTTP/2 | tuple can listen both for entirely new connections (a new HTTP/2 | |||
stream on a new TCP connection, for example) and for new sub- | stream on a new TCP connection, for example) and for new sub- | |||
connections (a new HTTP/2 stream on an existing connection). If the | connections (a new HTTP/2 stream on an existing connection). If the | |||
abstraction of Connection presented to the application is mapped to | abstraction of Connection presented to the application is mapped to | |||
the multiplexed stream, then the Listener should deliver new | the multiplexed stream, then the Listener should deliver new | |||
Connection objects in the same way for either case. The | Connection objects in the same way for either case. The | |||
implementation should allow the application to introspect the | implementation should allow the application to introspect the | |||
Connection Group marked on the Connections to determine the grouping | Connection Group marked on the Connections to determine the grouping | |||
skipping to change at page 22, line 17 ¶ | skipping to change at page 22, line 23 ¶ | |||
The effect of the application sending a Message is determined by the | The effect of the application sending a Message is determined by the | |||
top-level protocol in the established Protocol Stack. That is, if | top-level protocol in the established Protocol Stack. That is, if | |||
the top-level protocol provides an abstraction of framed messages | the top-level protocol provides an abstraction of framed messages | |||
over a connection, the receiving application will be able to obtain | over a connection, the receiving application will be able to obtain | |||
multiple Messages on that connection, even if the framing protocol is | multiple Messages on that connection, even if the framing protocol is | |||
built on a byte-stream protocol like TCP. | built on a byte-stream protocol like TCP. | |||
5.1.1. Message Properties | 5.1.1. Message Properties | |||
* Lifetime: this should be implemented by removing the Message from | * Lifetime: this should be implemented by removing the Message from | |||
its queue of pending Messages after the Lifetime has expired. A | the queue of pending Messages after the Lifetime has expired. A | |||
queue of pending Messages within the transport system | queue of pending Messages within the transport system | |||
implementation that have yet to be handed to the Protocol Stack | implementation that have yet to be handed to the Protocol Stack | |||
can always support this property, but once a Message has been sent | can always support this property, but once a Message has been sent | |||
into the send buffer of a protocol, only certain protocols may | into the send buffer of a protocol, only certain protocols may | |||
support de-queueing a message. For example, TCP cannot remove | support removing a message. For example, an implementation cannot | |||
bytes from its send buffer, while in case of SCTP, such control | remove bytes from a TCP send buffer, while it can remove data from | |||
over the SCTP send buffer can be exercised using the partial | an SCTP send buffer using the partial reliability extension | |||
reliability extension [RFC8303]. When there is no standing queue | [RFC8303]. When there is no standing queue of Messages within the | |||
of Messages within the system, and the Protocol Stack does not | system, and the Protocol Stack does not support the removal of a | |||
support removing a Message from its buffer, this property may be | Message from the stack's send buffer, this property may be | |||
ignored. | ignored. | |||
* Priority: this represents the ability to prioritize a Message over | * Priority: this represents the ability to prioritize a Message over | |||
other Messages. This can be implemented by the system re-ordering | other Messages. This can be implemented by the system re-ordering | |||
Messages that have yet to be handed to the Protocol Stack, or by | Messages that have yet to be handed to the Protocol Stack, or by | |||
giving relative priority hints to protocols that support | giving relative priority hints to protocols that support | |||
priorities per Message. For example, an implementation of HTTP/2 | priorities per Message. For example, an implementation of HTTP/2 | |||
could choose to send Messages of different Priority on streams of | could choose to send Messages of different Priority on streams of | |||
different priority. | different priority. | |||
* Ordered: when this is false, it disables the requirement of in- | * Ordered: when this is false, this disables the requirement of in- | |||
order-delivery for protocols that support configurable ordering. | order-delivery for protocols that support configurable ordering. | |||
* Idempotent: when this is true, it means that the Message can be | * Safely Replayable: when this is true, this means that the Message | |||
used by mechanisms that might transfer it multiple times - e.g., | can be used by mechanisms that might transfer it multiple times - | |||
as a result of racing multiple transports or as part of TCP Fast | e.g., as a result of racing multiple transports or as part of TCP | |||
Open. | Fast Open. Also, protocols that do not protect against duplicated | |||
messages, such as UDP, can only be used with Messages that are | ||||
Safely Replayable. | ||||
* Final: when this is true, it means that a transport connection can | * Final: when this is true, this means that a transport connection | |||
be closed immediately after its transmission. | can be closed immediately after transmission of the message. | |||
* Corruption Protection Length: when this is set to any value other | * Corruption Protection Length: when this is set to any value other | |||
than -1, it limits the required checksum in protocols that allow | than "Full Coverage", it sets the minimum protection in protocols | |||
limiting the checksum length (e.g. UDP-Lite). | that allow limiting the checksum length (e.g. UDP-Lite). | |||
* Transmission Profile: TBD - because it's not final in the API yet. | * Reliable Data Transfer (Message): When true, the property | |||
Old text follows: when this is set to "Interactive/Low Latency", | specifies that the Message must be reliably transmitted. When | |||
the Message should be sent immediately, even when this comes at | false, and if unreliable transmission is supported by the | |||
the cost of using the network capacity less efficiently. For | underlying protocol, then the Message should be unreliably | |||
example, small messages can sometimes be bundled to fit into a | transmitted. If the underlying protocol does not support | |||
single data packet for the sake of reducing header overhead; such | unreliable transmission, the Message should be reliably | |||
bundling should not be used. For example, in case of TCP, the | transmitted. | |||
Nagle algorithm should be disabled when Interactive/Low Latency is | ||||
selected as the capacity profile. Scavenger/Bulk can translate | ||||
into usage of a congestion control mechanism such as LEDBAT, and/ | ||||
or the capacity profile can lead to a choice of a DSCP value as | ||||
described in [I-D.ietf-taps-minset]). | ||||
* Singular Transmission: when this is true, the application requests | * Message Capacity Profile Override: When true, this expresses a | |||
to avoid transport-layer segmentation or network-layer | wish to override the Generic Connection Property "Capacity | |||
fragmentation. Some transports implement network-layer | Profile" for this Message. Depending on the value, this can, for | |||
fragmentation avoidance (Path MTU Discovery) without exposing this | example, be implemented by changing the DSCP value of the | |||
functionality to the application; in this case, only transport- | associated packet (note that the guidelines in Section 6 of | |||
layer segmentation should be avoided, by fitting the message into | [RFC7657] apply; e.g., the DSCP value should not be changed for | |||
a single transport-layer segment or otherwise failing. Otherwise, | different packets within a reliable transport protocol session or | |||
network-layer fragmentation should be avoided--e.g. by requesting | DCCP connection). | |||
the IP Don't Fragment bit to be set in case of UDP(-Lite) and IPv4 | ||||
(SET_DF in [RFC8304]). | * No Fragmentation: When set, this property limits the message size | |||
to the Maximum Message Size Before Fragmentation or Segmentation | ||||
(see Section 10.1.7 of [I-D.ietf-taps-interface]). Messages | ||||
larger than this size generate an error. Setting this avoids | ||||
transport-layer segmentation or network-layer fragmentation. When | ||||
used with transports running over IP version 4 the Don't Fragment | ||||
bit will be set to avoid on-path IP fragmentation ([RFC8304]). | ||||
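The Lifetime and Priority properties can both be handled in the queue of Messages that have not yet been handed to the Protocol Stack. The following is a minimal sketch; the `PendingQueue` class is hypothetical, the clock is passed in explicitly to keep the example deterministic, and the convention that a numerically lower priority value is sent first is an assumption of this sketch rather than a statement about the API.

```python
import heapq
import itertools
from typing import List, Optional, Tuple


class PendingQueue:
    """Messages waiting to be handed to the Protocol Stack."""

    def __init__(self) -> None:
        # Heap entries: (priority, sequence number, expiry time, data).
        # The sequence number keeps FIFO order within one priority level.
        self._heap: List[Tuple[int, int, Optional[float], bytes]] = []
        self._seq = itertools.count()

    def enqueue(self, data: bytes, now: float,
                lifetime: Optional[float] = None, priority: int = 0) -> None:
        expires = None if lifetime is None else now + lifetime
        heapq.heappush(self._heap, (priority, next(self._seq), expires, data))

    def dequeue(self, now: float) -> Optional[bytes]:
        """Return the next Message to send, dropping expired ones."""
        while self._heap:
            _, _, expires, data = heapq.heappop(self._heap)
            if expires is None or now < expires:
                return data
            # Lifetime has expired: the Message is silently removed here;
            # a full implementation would also notify the application.
        return None
```

Once a Message has left this queue and entered a protocol's send buffer, removal depends on the protocol, as noted for TCP and SCTP above.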
5.1.2. Send Completion | 5.1.2. Send Completion | |||
The application should be notified whenever a Message or partial | The application should be notified whenever a Message or partial | |||
Message has been consumed by the Protocol Stack, or has failed to | Message has been consumed by the Protocol Stack, or has failed to | |||
send. The meaning of the Message being consumed by the stack may | send. The meaning of the Message being consumed by the stack may | |||
vary depending on the protocol. For a basic datagram protocol like | vary depending on the protocol. For a basic datagram protocol like | |||
UDP, this may correspond to the time when the packet is sent into the | UDP, this may correspond to the time when the packet is sent into the | |||
interface driver. For a protocol that buffers data in queues, like | interface driver. For a protocol that buffers data in queues, like | |||
TCP, this may correspond to when the data has entered the send | TCP, this may correspond to when the data has entered the send | |||
buffer. | buffer. | |||
5.1.3. Batching Sends | 5.1.3. Batching Sends | |||
Since sending a Message may involve a context switch between the | Since sending a Message may involve a context switch between the | |||
application and the transport system, sending patterns that involve | application and the transport system, sending patterns that involve | |||
multiple small Messages can incur high overhead if each needs to be | multiple small Messages can incur high overhead if each needs to be | |||
enqueued separately. To avoid this, the application should have a | enqueued separately. To avoid this, the application can indicate a | |||
way to indicate a batch of Send actions, during which time the | batch of Send actions through the API. When this is used, the | |||
implementation will hold off on processing Messages until the batch | implementation should hold off on processing Messages until the batch | |||
is complete. This can also help context switches when enqueuing data | is complete. | |||
in the interface driver if the operation can be batched. | ||||
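One possible way to expose batching is a scoped construct during which Send actions are held back. The sketch below uses a Python context manager; the `Connection` class, its `batch` method, and the `flushes` list (which stands in for handing Messages to the Protocol Stack) are all hypothetical.

```python
from contextlib import contextmanager
from typing import Iterator, List


class Connection:
    def __init__(self) -> None:
        self._batch_depth = 0
        self._held: List[bytes] = []
        # Each entry records the Messages handed to the stack in one step.
        self.flushes: List[List[bytes]] = []

    def send(self, message: bytes) -> None:
        self._held.append(message)
        if self._batch_depth == 0:
            self._flush()  # not batching: process immediately

    def _flush(self) -> None:
        if self._held:
            self.flushes.append(self._held)
            self._held = []

    @contextmanager
    def batch(self) -> Iterator["Connection"]:
        """Hold off on processing Messages until the batch is complete."""
        self._batch_depth += 1
        try:
            yield self
        finally:
            self._batch_depth -= 1
            if self._batch_depth == 0:
                self._flush()


conn = Connection()
conn.send(b"single")
with conn.batch():
    conn.send(b"first")
    conn.send(b"second")
```

After this runs, the first Message has been processed on its own and the two batched Messages have been processed together in a single step.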
5.2. Receiving Messages | 5.2. Receiving Messages | |||
Similar to sending, Receiving a Message is determined by the top- | Similar to sending, Receiving a Message is determined by the top- | |||
level protocol in the established Protocol Stack. The main | level protocol in the established Protocol Stack. The main | |||
difference with Receiving is that the size and boundaries of the | difference with Receiving is that the size and boundaries of the | |||
Message are not known beforehand. The application can communicate in | Message are not known beforehand. The application can communicate in | |||
its Receive action the parameters for the Message, which can help the | its Receive action the parameters for the Message, which can help the | |||
implementation know how much data to deliver and when. For example, | implementation know how much data to deliver and when. For example, | |||
if the application only wants to receive a complete Message, the | if the application only wants to receive a complete Message, the | |||
implementation should wait until an entire Message (datagram, stream, | implementation should wait until an entire Message (datagram, stream, | |||
or frame) is read before delivering any Message content to the | or frame) is read before delivering any Message content to the | |||
application. This requires the implementation to understand where | application. This requires the implementation to understand where | |||
messages end, either via a supplied deframer or because the top-level | messages end, either via a supplied deframer or because the top-level | |||
protocol in the established Protocol Stack preserves message | protocol in the established Protocol Stack preserves message | |||
boundaries; if, on the other hand, the top-level protocol only | boundaries. If the top-level protocol only supports a byte-stream | |||
supports a byte-stream and no deframers were supported, the | and no framers were supported, the application can control the flow | |||
application must specify the minimum number of bytes of Message | of received data by specifying the minimum number of bytes of Message | |||
content it wants to receive (which may be just a single byte) to | content it wants to receive at one time. | |||
control the flow of received data. | ||||
If a Connection becomes finished before a requested Receive action | If a Connection becomes finished before a requested Receive action | |||
can be satisfied, the implementation should deliver any partial | can be satisfied, the implementation should deliver any partial | |||
Message content outstanding, or if none is available, an indication | Message content outstanding, or if none is available, an indication | |||
that there will be no more received Messages. | that there will be no more received Messages. | |||
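The byte-stream case of Section 5.2 can be sketched as follows. This
is an illustration under stated assumptions rather than the draft's
interface: "ReceiveBuffer" and its method names are hypothetical, and
the return value (data, no_more_data) is an invented convention. A
Receive is satisfied once the requested minimum number of bytes is
available; if the Connection finishes first, whatever partial content
remains is delivered, followed by an empty result that signals that no
more data will arrive.

```python
class ReceiveBuffer:
    """Hypothetical receive-side buffer for a byte-stream Connection."""

    def __init__(self):
        self._data = bytearray()
        self._finished = False

    def on_transport_data(self, data):
        self._data += data

    def on_transport_finished(self):
        self._finished = True

    def receive(self, min_incomplete_length=1, max_length=None):
        """Return (data, no_more_data), or None if the request must wait."""
        available = len(self._data)
        enough = available > 0 and available >= min_incomplete_length
        if not enough and not self._finished:
            return None           # keep the Receive pending
        n = available if max_length is None else min(available, max_length)
        chunk = bytes(self._data[:n])
        del self._data[:n]
        # The flag is set only once the Connection is finished and drained.
        return chunk, self._finished and not self._data
```

An implementation would normally deliver these results through events
rather than return values; the synchronous form is used here only to
keep the sketch short.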
5.3. Handling of data for fast-open protocols | 5.3. Handling of data for fast-open protocols | |||
Several protocols allow sending higher-level protocol or application | Several protocols allow sending higher-level protocol or application | |||
data within the first packet of their protocol establishment, such as | data within the first packet of their protocol establishment, such as | |||
TCP Fast Open [RFC7413] and TLS 1.3 [RFC8446]. This approach is | TCP Fast Open [RFC7413] and TLS 1.3 [RFC8446]. This approach is | |||
referred to as sending Zero-RTT (0-RTT) data. This is a desirable | referred to as sending Zero-RTT (0-RTT) data. This is a desirable | |||
property, but poses challenges to an implementation that uses racing | property, but poses challenges to an implementation that uses racing | |||
during connection establishment. | during connection establishment. | |||
If the application has 0-RTT data to send in any protocol handshakes, | If the application has 0-RTT data to send in any protocol handshakes, | |||
it needs to provide this data before the handshakes have begun. When | it needs to provide this data before the handshakes have begun. When | |||
racing, this means that the data should be provided before the | racing, this means that the data should be provided before the | |||
process of connection establishment has begun. If the application | process of connection establishment has begun. If the application | |||
wants to send 0-RTT data, it must indicate this to the implementation | wants to send 0-RTT data, it must indicate this to the implementation | |||
by setting the Idempotent send parameter to true when sending the | by setting the "Safely Replayable" send parameter to true when | |||
data. In general, 0-RTT data may be replayed (for example, if a TCP | sending the data. In general, 0-RTT data may be replayed (for | |||
SYN contains data, and the SYN is retransmitted, the data will be | example, if a TCP SYN contains data, and the SYN is retransmitted, | |||
retransmitted as well), but racing means that different leaf nodes | the data will be retransmitted as well but may be considered as a new | |||
have the opportunity to send the same data independently. If data is | connection instead of a retransmission). Also, when racing | |||
truly idempotent, this should be permissible. | connections, different leaf nodes have the opportunity to send the | |||
same data independently. If data is truly safely replayable, this | ||||
should be permissible. | ||||
Once the application has provided its 0-RTT data, an implementation | Once the application has provided its 0-RTT data, an implementation | |||
should keep a copy of this data and provide it to each new leaf node | should keep a copy of this data and provide it to each new leaf node | |||
that is started and for which a 0-RTT protocol is being used. | that is started and for which a 0-RTT protocol is being used. | |||
It is also possible that protocol stacks within a particular leaf | It is also possible that protocol stacks within a particular leaf | |||
node use 0-RTT handshakes without any idempotent application data. | node use 0-RTT handshakes without any safely replayable application | |||
For example, TCP Fast Open could use a Client Hello from TLS as its | data. For example, TCP Fast Open could use a Client Hello from TLS | |||
0-RTT data, shortening the cumulative handshake time. | as its 0-RTT data, shortening the cumulative handshake time. | |||
0-RTT handshakes often rely on previous state, such as TCP Fast Open | 0-RTT handshakes often rely on previous state, such as TCP Fast Open | |||
cookies, previously established TLS tickets, or out-of-band | cookies, previously established TLS tickets, or out-of-band | |||
distributed pre-shared keys (PSKs). Implementations should be aware | distributed pre-shared keys (PSKs). Implementations should be aware | |||
of security concerns around using these tokens across multiple | of security concerns around using these tokens across multiple | |||
addresses or paths when racing. In the case of TLS, any given ticket | addresses or paths when racing. In the case of TLS, any given ticket | |||
or PSK should only be used on one leaf node. If implementations have | or PSK should only be used on one leaf node, since servers will | |||
multiple tickets available from a previous connection, each leaf node | likely reject duplicate tickets in order to prevent replays (see | |||
attempt must use a different ticket. In effect, each leaf node will | Section 8.1 of [RFC8446]). If implementations have multiple tickets | |||
send the same early application data, yet encoded (encrypted) | available from a previous connection, each leaf node attempt can use | |||
differently on the wire. | a different ticket. In effect, each leaf node will send the same | |||
early application data, yet encoded (encrypted) differently on the | ||||
wire. | ||||
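The handling of 0-RTT data while racing can be sketched as follows.
This is a simplified illustration, assuming TLS-style tickets; the
names "ZeroRttPlanner", "LeafPlan" and "plan_leaf" are hypothetical.
The planner keeps one copy of the application's early data, hands each
ticket to at most one leaf node, and only attaches early data when the
application marked it as safely replayable and a ticket is available.

```python
from collections import namedtuple

LeafPlan = namedtuple("LeafPlan", ["leaf", "ticket", "early_data"])


class ZeroRttPlanner:
    """Hypothetical helper that assigns tickets and early data to leaves."""

    def __init__(self, early_data, safely_replayable, tickets):
        self._early_data = bytes(early_data)   # copy kept for every new leaf
        self._safely_replayable = safely_replayable
        self._tickets = list(tickets)          # each ticket is used only once

    def plan_leaf(self, leaf):
        ticket = self._tickets.pop(0) if self._tickets else None
        can_send_early = ticket is not None and self._safely_replayable
        early_data = self._early_data if can_send_early else None
        return LeafPlan(leaf, ticket, early_data)


planner = ZeroRttPlanner(b"GET / HTTP/1.1\r\n\r\n", True, ["ticket-1", "ticket-2"])
first = planner.plan_leaf("ipv6-path")
second = planner.plan_leaf("ipv4-path")
third = planner.plan_leaf("ipv4-alternate")   # no ticket left: full handshake
```

A leaf that receives no ticket in this sketch falls back to a regular
handshake and sends the data once the Connection is ready.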
6. Implementing Message Framers | 6. Implementing Message Framers | |||
Message Framers are pieces of code that define simple transformations | Message Framers are pieces of code that define simple transformations | |||
between application Message data and raw transport protocol data. A | between application Message data and raw transport protocol data. A | |||
Framer can encapsulate or encode outbound Messages, and decapsulate | Framer can encapsulate or encode outbound Messages, and decapsulate | |||
or decode inbound data into Messages. | or decode inbound data into Messages. | |||
While many protocols can be represented as Message Framers, for the | While many protocols can be represented as Message Framers, for the | |||
purposes of the Transport Services interface these are ways for | purposes of the Transport Services interface these are ways for | |||
skipping to change at page 27, line 20 ¶ | skipping to change at page 27, line 29 ¶ | |||
to modify the Protocol Stack based on a handshake result. | to modify the Protocol Stack based on a handshake result. | |||
otherFramer := NewMessageFramer() | otherFramer := NewMessageFramer() | |||
MessageFramer.PrependFramer(Connection, otherFramer) | MessageFramer.PrependFramer(Connection, otherFramer) | |||
6.2. Sender-side Message Framing | 6.2. Sender-side Message Framing | |||
Message Framers generate an event whenever a Connection sends a new | Message Framers generate an event whenever a Connection sends a new | |||
Message. | Message. | |||
MessageFramer -> NewSentMessage<Connection, MessageData, MessageContext, IsEndOfMessage> | MessageFramer -> NewSentMessage<Connection, MessageData, MessageContext, IsEndOfMessage> | |||
Upon receiving this event, a framer implementation is responsible for | Upon receiving this event, a framer implementation is responsible for | |||
performing any necessary transformations and sending the resulting | performing any necessary transformations and sending the resulting | |||
data back to the Message Framer, which will in turn send it to the | data back to the Message Framer, which will in turn send it to the | |||
next protocol. Implementations SHOULD ensure that there is a way to | next protocol. Implementations SHOULD ensure that there is a way to | |||
pass the original data through without copying to improve | pass the original data through without copying to improve | |||
performance. | performance. | |||
MessageFramer.Send(Connection, Data) | MessageFramer.Send(Connection, Data) | |||
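As an illustration of the sender-side transformation, the following
sketch frames a Message with a length prefix. The 4-byte big-endian
header and the function name "frame_message" are assumptions made for
this example and are not defined by the draft. Returning the header
and the original payload as separate buffers is one way to pass the
original data through without copying it.

```python
import struct


def frame_message(message_data):
    """Return the buffers to send for one Message (hypothetical framing)."""
    header = struct.pack("!I", len(message_data))   # assumed 4-byte length prefix
    # The payload object is passed through unchanged, so no copy is made here.
    return [header, message_data]


buffers = frame_message(b"hello")
```

A framer implementation would then hand each buffer to the Message
Framer in order, for example through the Send call shown above.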
skipping to change at page 27, line 51 ¶ | skipping to change at page 28, line 13 ¶ | |||
available to parse. | available to parse. | |||
MessageFramer -> HandleReceivedData<Connection> | MessageFramer -> HandleReceivedData<Connection> | |||
Upon receiving this event, the framer implementation can inspect the | Upon receiving this event, the framer implementation can inspect the | |||
inbound data. The data is parsed from a particular cursor | inbound data. The data is parsed from a particular cursor | |||
representing the unprocessed data. The application requests a | representing the unprocessed data. The application requests a | |||
specific amount of data it needs to have available in order to parse. | specific amount of data it needs to have available in order to parse. | |||
If the data is not available, the parse fails. | If the data is not available, the parse fails. | |||
MessageFramer.Parse(Connection, MinimumIncompleteLength, MaximumLength) -> (Data, MessageContext, IsEndOfMessage) | MessageFramer.Parse(Connection, MinimumIncompleteLength, MaximumLength) -> (Data, MessageContext, IsEndOfMessage) | |||
The framer implementation can directly advance the receive cursor | The framer implementation can directly advance the receive cursor | |||
once it has parsed data to effectively discard data (for example, | once it has parsed data to effectively discard data (for example, | |||
discard a header once the content has been parsed). | discard a header once the content has been parsed). | |||
To deliver a Message to the application, the framer implementation | To deliver a Message to the application, the framer implementation | |||
can either directly deliver data that it has allocated, or deliver a | can either directly deliver data that it has allocated, or deliver a | |||
range of data directly from the underlying transport and | range of data directly from the underlying transport and | |||
simultaneously advance the receive cursor. | simultaneously advance the receive cursor. | |||
MessageFramer.AdvanceReceiveCursor(Connection, Length) | MessageFramer.AdvanceReceiveCursor(Connection, Length) | |||
MessageFramer.DeliverAndAdvanceReceiveCursor(Connection, MessageContext, Length, IsEndOfMessage) | MessageFramer.DeliverAndAdvanceReceiveCursor(Connection, MessageContext, Length, IsEndOfMessage) | |||
MessageFramer.Deliver(Connection, MessageContext, Data, IsEndOfMessage) | MessageFramer.Deliver(Connection, MessageContext, Data, IsEndOfMessage) | |||
Note that "MessageFramer.DeliverAndAdvanceReceiveCursor" allows the | Note that "MessageFramer.DeliverAndAdvanceReceiveCursor" allows the | |||
framer implementation to earmark bytes as part of a Message even | framer implementation to earmark bytes as part of a Message even | |||
before they are received by the transport. This allows the delivery | before they are received by the transport. This allows the delivery | |||
of very large Messages without requiring the implementation to | of very large Messages without requiring the implementation to | |||
directly inspect all of the bytes. | directly inspect all of the bytes. | |||
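The cursor-based calls above can be sketched for a length-prefixed
protocol as follows. This is a minimal model and not the draft's
interface: "FakeFramerConnection" is a hypothetical stand-in for the
Message Framer, the 4-byte big-endian header is an assumption, and,
as a simplification, the sketch waits for the complete body instead of
earmarking bytes that have not yet arrived.

```python
import struct

HEADER_LEN = 4  # assumed: 4-byte big-endian length prefix


class FakeFramerConnection:
    """Hypothetical stand-in for the receive side of a Message Framer."""

    def __init__(self):
        self.inbound = bytearray()   # unprocessed bytes; index 0 is the cursor
        self.delivered = []          # Messages handed to the application

    def parse(self, minimum_incomplete_length, maximum_length):
        # Modelled on Parse: fails (None) unless enough bytes are available.
        if len(self.inbound) < minimum_incomplete_length:
            return None
        return bytes(self.inbound[:maximum_length])

    def advance_receive_cursor(self, length):
        del self.inbound[:length]

    def deliver_and_advance_receive_cursor(self, length):
        self.delivered.append(bytes(self.inbound[:length]))
        del self.inbound[:length]


def handle_received_data(conn):
    """Deliver every complete length-prefixed Message currently buffered."""
    while True:
        header = conn.parse(HEADER_LEN, HEADER_LEN)
        if header is None:
            return                   # header incomplete: wait for more data
        (body_len,) = struct.unpack("!I", header)
        total = HEADER_LEN + body_len
        if conn.parse(total, total) is None:
            return                   # body incomplete: wait for more data
        conn.advance_receive_cursor(HEADER_LEN)        # discard the header
        conn.deliver_and_advance_receive_cursor(body_len)
```

Calling "handle_received_data" each time new bytes arrive leaves any
incomplete Message in place until the remaining bytes are available.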
To provide an example, a simple protocol that parses a length as a | To provide an example, a simple protocol that parses a length as a | |||
header value would receive the "HandleReceivedData" event, and call | header value would receive the "HandleReceivedData" event, and call | |||
"Parse" with a minimum and maximum set to the length of the header | "Parse" with a minimum and maximum set to the length of the header | |||
skipping to change at page 30, line 31 ¶ | skipping to change at page 30, line 38 ¶ | |||
all supported protocols. Hence, as is common with all reliable | all supported protocols. Hence, as is common with all reliable | |||
transport protocols, after a Close action, the application can expect | transport protocols, after a Close action, the application can expect | |||
to have its reliability requirements honored regarding the data it | to have its reliability requirements honored regarding the data it | |||
has given to the Transport System, but it cannot expect to be able to | has given to the Transport System, but it cannot expect to be able to | |||
read any more data after calling Close. | read any more data after calling Close. | |||
Abort differs from Close only in that no guarantees are given | Abort differs from Close only in that no guarantees are given | |||
regarding data that the application has handed over to the Transport | regarding data that the application has handed over to the Transport | |||
System before calling Abort. | System before calling Abort. | |||
As explained in Section 4.6, when a new stream is multiplexed on an | As explained in Section 4.4, when a new stream is multiplexed on an | |||
already existing connection of a Transport Protocol Instance, there | already existing connection of a Transport Protocol Instance, there | |||
is no need for a connection establishment procedure. Because the | is no need for a connection establishment procedure. Because the | |||
Connections that are offered by the Transport System can be | Connections that are offered by the Transport System can be | |||
implemented as streams that are multiplexed on a transport protocol's | implemented as streams that are multiplexed on a transport protocol's | |||
connection, it can therefore not be guaranteed that one Endpoint's | connection, it can therefore not be guaranteed that one Endpoint's | |||
Initiate action provokes a ConnectionReceived event at its peer. | Initiate action provokes a ConnectionReceived event at its peer. | |||
For Close (provoking a Finished event) and Abort (provoking a | For Close (provoking a Finished event) and Abort (provoking a | |||
ConnectionError event), the same logic applies: while it is desirable | ConnectionError event), the same logic applies: while it is desirable | |||
to be informed when a peer closes or aborts a Connection, whether | to be informed when a peer closes or aborts a Connection, whether | |||
skipping to change at page 32, line 43 ¶ | skipping to change at page 33, line 6 ¶ | |||
options. Eligible options that historically had significantly better | options. Eligible options that historically had significantly better | |||
performance than others should be selected first when gathering | performance than others should be selected first when gathering | |||
candidates (see Section 4.1) to ensure better performance for the | candidates (see Section 4.1) to ensure better performance for the | |||
application. | application. | |||
The reasonable lifetime for cached performance values will vary | The reasonable lifetime for cached performance values will vary | |||
depending on the nature of the value. Certain information, like the | depending on the nature of the value. Certain information, like the | |||
connection establishment success rate to a Remote Endpoint using a | connection establishment success rate to a Remote Endpoint using a | |||
given protocol stack, can be stored for a long period of time (hours | given protocol stack, can be stored for a long period of time (hours | |||
or longer), since it is expected that the capabilities of the Remote | or longer), since it is expected that the capabilities of the Remote | |||
Endpoint are not changing very quickly. On the other hand, Round | Endpoint are not changing very quickly. On the other hand, the Round | |||
Trip Time observed by TCP over a particular network path may vary | Trip Time observed by TCP over a particular network path may vary | |||
over a relatively short time interval. For such values, the | over a relatively short time interval. For such values, the | |||
implementation should remove them from the cache more quickly, or | implementation should remove them from the cache more quickly, or | |||
treat older values with less confidence/weight. | treat older values with less confidence/weight. | |||
[I-D.ietf-tcpm-2140bis] provides guidance about sharing of TCP | ||||
Control Block information between connections on initialization. | ||||
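The differing lifetimes for cached performance values can be sketched
as follows. The lifetimes, the linear confidence weight, and the names
"PerformanceCache" and "LIFETIMES" are assumptions chosen for
illustration; the draft does not prescribe specific values or a
weighting function.

```python
import time

# Assumed lifetimes in seconds: long for slowly changing values,
# short for values that vary quickly.
LIFETIMES = {
    "establishment_success_rate": 6 * 3600.0,
    "rtt": 30.0,
}


class PerformanceCache:
    """Hypothetical cache that ages values according to their kind."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}

    def put(self, key, kind, value):
        self._entries[(key, kind)] = (value, self._clock())

    def get(self, key, kind):
        """Return (value, weight) with weight in (0, 1], or None if expired."""
        entry = self._entries.get((key, kind))
        if entry is None:
            return None
        value, stored_at = entry
        age = self._clock() - stored_at
        lifetime = LIFETIMES[kind]
        if age >= lifetime:
            del self._entries[(key, kind)]   # too old: remove from the cache
            return None
        return value, 1.0 - age / lifetime   # older values carry less weight
```

The injectable clock is only there to make the ageing behaviour easy
to exercise; a real implementation might persist entries and key them
by Remote Endpoint, path and Protocol Stack.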
10. Specific Transport Protocol Considerations | 10. Specific Transport Protocol Considerations | |||
Each protocol that can run as part of a Transport Services | Each protocol that can run as part of a Transport Services | |||
implementation defines both its API mapping as well as implementation | implementation defines both its API mapping as well as implementation | |||
details. API mappings for a protocol apply most to Connections in | details. API mappings for a protocol apply most to Connections in | |||
which the given protocol is the "top" of the Protocol Stack. For | which the given protocol is the "top" of the Protocol Stack. For | |||
example, the mapping of the "Send" function for TCP applies to | example, the mapping of the "Send" function for TCP applies to | |||
Connections in which the application directly sends over TCP. If | Connections in which the application directly sends over TCP. If | |||
HTTP/2 is used on top of TCP, the HTTP/2 mappings take precedence. | HTTP/2 is used on top of TCP, the HTTP/2 mappings take precedence. | |||
skipping to change at page 33, line 49 ¶ | skipping to change at page 34, line 14 ¶ | |||
* Datagram. Datagram protocols define Message boundaries at the | * Datagram. Datagram protocols define Message boundaries at the | |||
same level of transmission, such that only complete (not partial) | same level of transmission, such that only complete (not partial) | |||
Messages are supported. | Messages are supported. | |||
* Message. Message protocols support Message boundaries that can be | * Message. Message protocols support Message boundaries that can be | |||
sent and received either as complete or partial Messages. Maximum | sent and received either as complete or partial Messages. Maximum | |||
Message lengths can be defined, and Messages can be partially | Message lengths can be defined, and Messages can be partially | |||
reliable. | reliable. | |||
Below, primitives in the style of | Below, terms in capitals with a dot (e.g., "CONNECT.SCTP") refer to | |||
"CATEGORY.[SUBCATEGORY].PRIMITIVENAME.PROTOCOL" (e.g., | the primitives with the same name in section 4 of [RFC8303]. For | |||
"CONNECT.SCTP") refer to the primitives with the same name in section | further implementation details, the description of these primitives | |||
4 of [RFC8303]. For further implementation details, the description | in [RFC8303] points to section 3 of [RFC8303] and section 3 of | |||
of these primitives in [RFC8303] points to section 3, which refers | [RFC8304], which refers back to the relevant specifications for each | |||
back to the specifications for each protocol. This back-tracking | protocol. This back-tracking method applies to all elements of | |||
method applies to all elements of [I-D.ietf-taps-minset] (see | [I-D.ietf-taps-minset] (see appendix D of [I-D.ietf-taps-interface]): | |||
appendix D of [I-D.ietf-taps-interface]): they are listed in appendix | they are listed in appendix A of [I-D.ietf-taps-minset] with an | |||
A of [I-D.ietf-taps-minset] with an implementation hint in the same | implementation hint in the same style, pointing back to section 4 of | |||
style, pointing back to section 4 of [RFC8303]. | [RFC8303]. | |||
10.1. TCP | 10.1. TCP | |||
Connectedness: Connected | Connectedness: Connected | |||
Data Unit: Byte-stream | Data Unit: Byte-stream | |||
API mappings for TCP are as follows: | API mappings for TCP are as follows: | |||
Connection Object: TCP connections between two hosts map directly to | Connection Object: TCP connections between two hosts map directly to | |||
Connection objects. | Connection objects. | |||
Initiate: CONNECT.TCP. Calling "Initiate" on a TCP Connection | Initiate: CONNECT.TCP. Calling "Initiate" on a TCP Connection | |||
causes it to reserve a local port, and send a SYN to the Remote | causes it to reserve a local port, and send a SYN to the Remote | |||
Endpoint. | Endpoint. | |||
InitiateWithSend: CONNECT.TCP with parameter "user message". Early | InitiateWithSend: CONNECT.TCP with parameter "user message". Early | |||
idempotent data is sent on a TCP Connection in the SYN, as TCP | safely replayable data is sent on a TCP Connection in the SYN, as | |||
Fast Open data. | TCP Fast Open data. | |||
Ready: A TCP Connection is ready once the three-way handshake is | Ready: A TCP Connection is ready once the three-way handshake is | |||
complete. | complete. | |||
InitiateError: Failure of CONNECT.TCP. TCP can throw various errors | InitiateError: Failure of CONNECT.TCP. TCP can throw various errors | |||
during connection setup. Specifically, it is important to handle | during connection setup. Specifically, it is important to handle | |||
a RST being sent by the peer during the handshake. | a RST being sent by the peer during the handshake. | |||
ConnectionError: Once established, TCP throws errors whenever the | ConnectionError: Once established, TCP throws errors whenever the | |||
connection is disconnected, such as due to receiving a RST from | connection is disconnected, such as due to receiving a RST from | |||
skipping to change at page 38, line 26 ¶ | skipping to change at page 38, line 40 ¶ | |||
Data Unit: Byte-stream | Data Unit: Byte-stream | |||
Connection Object: Connection objects represent a single TLS | Connection Object: Connection objects represent a single TLS | |||
connection running over a TCP connection between two hosts. | connection running over a TCP connection between two hosts. | |||
Initiate: Calling "Initiate" on a TLS Connection causes it to first | Initiate: Calling "Initiate" on a TLS Connection causes it to first | |||
initiate a TCP connection. Once the TCP protocol is Ready, the | initiate a TCP connection. Once the TCP protocol is Ready, the | |||
TLS handshake will be performed as a client (starting by sending a | TLS handshake will be performed as a client (starting by sending a | |||
"client_hello", and so on). | "client_hello", and so on). | |||
InitiateWithSend: Early idempotent data is supported by TLS 1.3, and | InitiateWithSend: Early safely replayable data is supported by TLS | |||
sends encrypted application data in the first TLS message when | 1.3, and sends encrypted application data in the first TLS message | |||
performing session resumption. For older versions of TLS, or if a | when performing session resumption. For older versions of TLS, or | |||
session is not being resumed, the initial data will be delayed | if a session is not being resumed, the initial data will be | |||
until the TLS handshake is complete. TCP Fast Option can also be | delayed until the TLS handshake is complete. TCP Fast Open can | |||
enabled automatically. | also be enabled automatically. | |||
Ready: A TLS Connection is ready once the underlying TCP connection | Ready: A TLS Connection is ready once the underlying TCP connection | |||
is Ready, and TLS handshake is also complete and keys have been | is Ready, and TLS handshake is also complete and keys have been | |||
established to encrypt application data. | established to encrypt application data. | |||
InitiateError: In addition to TCP initiation errors, TLS can | InitiateError: In addition to TCP initiation errors, TLS can | |||
generate errors during its handshake. Examples of errors include a | generate errors during its handshake. Examples of errors include a | |||
failure of the peer to successfully authenticate, the peer | failure of the peer to successfully authenticate, the peer | |||
rejecting the local authentication, or a failure to match versions | rejecting the local authentication, or a failure to match versions | |||
or algorithms. | or algorithms. | |||
skipping to change at page 44, line 28 ¶ | skipping to change at page 45, line 7 ¶ | |||
"Close". | "Close". | |||
11. IANA Considerations | 11. IANA Considerations | |||
RFC-EDITOR: Please remove this section before publication. | RFC-EDITOR: Please remove this section before publication. | |||
This document has no actions for IANA. | This document has no actions for IANA. | |||
12. Security Considerations | 12. Security Considerations | |||
[I-D.ietf-taps-arch] outlines general security considerations and | |||
requirements for any system that implements the TAPS architecture. | |||
[I-D.ietf-taps-interface] provides further discussion on security and | ||||
privacy implications of the TAPS API. This document provides | ||||
additional guidance on implementation specifics for the TAPS API and | ||||
as such the security considerations in both of these documents apply. | ||||
The next two subsections discuss further considerations that are | ||||
specific to mechanisms specified in this document. | ||||
12.1. Considerations for Candidate Gathering | 12.1. Considerations for Candidate Gathering | |||
Implementations should avoid downgrade attacks that allow network | Implementations should avoid downgrade attacks that allow network | |||
interference to cause the implementation to select less secure, or | interference to cause the implementation to select less secure, or | |||
entirely insecure, combinations of paths and protocols. | entirely insecure, combinations of paths and protocols. | |||
12.2. Considerations for Candidate Racing | 12.2. Considerations for Candidate Racing | |||
See Section 5.3 for security considerations around racing with 0-RTT | See Section 5.3 for security considerations around racing with 0-RTT | |||
data. | data. | |||
skipping to change at page 45, line 35 ¶ | skipping to change at page 46, line 23 ¶ | |||
Eyeballs, that heavily influenced this work. | Eyeballs, that heavily influenced this work. | |||
14. References | 14. References | |||
14.1. Normative References | 14.1. Normative References | |||
[I-D.ietf-taps-arch] | [I-D.ietf-taps-arch] | |||
Pauly, T., Trammell, B., Brunstrom, A., Fairhurst, G., | Pauly, T., Trammell, B., Brunstrom, A., Fairhurst, G., | |||
Perkins, C., Tiesel, P., and C. Wood, "An Architecture for | Perkins, C., Tiesel, P., and C. Wood, "An Architecture for | |||
Transport Services", Work in Progress, Internet-Draft, | Transport Services", Work in Progress, Internet-Draft, | |||
draft-ietf-taps-arch-06, 23 December 2019, | draft-ietf-taps-arch-07, 9 March 2020, | |||
<http://www.ietf.org/internet-drafts/draft-ietf-taps-arch- | <http://www.ietf.org/internet-drafts/draft-ietf-taps-arch- | |||
06.txt>. | 07.txt>. | |||
[I-D.ietf-taps-interface] | [I-D.ietf-taps-interface] | |||
Trammell, B., Welzl, M., Enghardt, T., Fairhurst, G., | Trammell, B., Welzl, M., Enghardt, T., Fairhurst, G., | |||
Kuehlewind, M., Perkins, C., Tiesel, P., Wood, C., and T. | Kuehlewind, M., Perkins, C., Tiesel, P., Wood, C., and T. | |||
Pauly, "An Abstract Application Layer Interface to | Pauly, "An Abstract Application Layer Interface to | |||
Transport Services", Work in Progress, Internet-Draft, | Transport Services", Work in Progress, Internet-Draft, | |||
draft-ietf-taps-interface-05, 4 November 2019, | draft-ietf-taps-interface-06, 9 March 2020, | |||
<http://www.ietf.org/internet-drafts/draft-ietf-taps- | <http://www.ietf.org/internet-drafts/draft-ietf-taps- | |||
interface-05.txt>. | interface-06.txt>. | |||
[I-D.ietf-taps-minset] | [I-D.ietf-taps-minset] | |||
Welzl, M. and S. Gjessing, "A Minimal Set of Transport | Welzl, M. and S. Gjessing, "A Minimal Set of Transport | |||
Services for End Systems", Work in Progress, Internet- | Services for End Systems", Work in Progress, Internet- | |||
Draft, draft-ietf-taps-minset-11, 27 September 2018, | Draft, draft-ietf-taps-minset-11, 27 September 2018, | |||
<http://www.ietf.org/internet-drafts/draft-ietf-taps- | <http://www.ietf.org/internet-drafts/draft-ietf-taps- | |||
minset-11.txt>. | minset-11.txt>. | |||
[RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP | [RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP | |||
Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014, | Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014, | |||
skipping to change at page 46, line 46 ¶ | skipping to change at page 47, line 35 ¶ | |||
[RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol | [RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol | |||
Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, | Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, | |||
<https://www.rfc-editor.org/info/rfc8446>. | <https://www.rfc-editor.org/info/rfc8446>. | |||
14.2. Informative References | 14.2. Informative References | |||
[I-D.ietf-quic-transport] | [I-D.ietf-quic-transport] | |||
Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed | Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed | |||
and Secure Transport", Work in Progress, Internet-Draft, | and Secure Transport", Work in Progress, Internet-Draft, | |||
draft-ietf-quic-transport-27, 21 February 2020, | draft-ietf-quic-transport-29, 9 June 2020, | |||
<http://www.ietf.org/internet-drafts/draft-ietf-quic- | <http://www.ietf.org/internet-drafts/draft-ietf-quic- | |||
transport-27.txt>. | transport-29.txt>. | |||
[I-D.ietf-tcpm-2140bis] | ||||
Touch, J., Welzl, M., and S. Islam, "TCP Control Block | ||||
Interdependence", Work in Progress, Internet-Draft, draft- | ||||
ietf-tcpm-2140bis-05, 29 April 2020, <http://www.ietf.org/ | ||||
internet-drafts/draft-ietf-tcpm-2140bis-05.txt>. | ||||
[NEAT-flow-mapping] | [NEAT-flow-mapping] | |||
"Transparent Flow Mapping for NEAT (in Workshop on Future | "Transparent Flow Mapping for NEAT", Workshop on Future of | |||
of Internet Transport (FIT 2017))", 2017. | Internet Transport (FIT 2017), 2017. | |||
[RFC5245] Rosenberg, J., "Interactive Connectivity Establishment | [RFC5389] Rosenberg, J., Mahy, R., Matthews, P., and D. Wing, | |||
(ICE): A Protocol for Network Address Translator (NAT) | "Session Traversal Utilities for NAT (STUN)", RFC 5389, | |||
Traversal for Offer/Answer Protocols", RFC 5245, | DOI 10.17487/RFC5389, October 2008, | |||
DOI 10.17487/RFC5245, April 2010, | <https://www.rfc-editor.org/info/rfc5389>. | |||
<https://www.rfc-editor.org/info/rfc5245>. | ||||
[RFC5766] Mahy, R., Matthews, P., and J. Rosenberg, "Traversal Using | ||||
Relays around NAT (TURN): Relay Extensions to Session | ||||
Traversal Utilities for NAT (STUN)", RFC 5766, | ||||
DOI 10.17487/RFC5766, April 2010, | ||||
<https://www.rfc-editor.org/info/rfc5766>. | ||||
[RFC6762] Cheshire, S. and M. Krochmal, "Multicast DNS", RFC 6762, | ||||
DOI 10.17487/RFC6762, February 2013, | ||||
<https://www.rfc-editor.org/info/rfc6762>. | ||||
[RFC6763] Cheshire, S. and M. Krochmal, "DNS-Based Service | ||||
Discovery", RFC 6763, DOI 10.17487/RFC6763, February 2013, | ||||
<https://www.rfc-editor.org/info/rfc6763>. | ||||
[RFC7657] Black, D., Ed. and P. Jones, "Differentiated Services | ||||
(Diffserv) and Real-Time Communication", RFC 7657, | ||||
DOI 10.17487/RFC7657, November 2015, | ||||
<https://www.rfc-editor.org/info/rfc7657>. | ||||
[RFC8445] Keranen, A., Holmberg, C., and J. Rosenberg, "Interactive | ||||
Connectivity Establishment (ICE): A Protocol for Network | ||||
Address Translator (NAT) Traversal", RFC 8445, | ||||
DOI 10.17487/RFC8445, July 2018, | ||||
<https://www.rfc-editor.org/info/rfc8445>. | ||||
Appendix A. Additional Properties | Appendix A. Additional Properties | |||
This appendix discusses implementation considerations for additional | This appendix discusses implementation considerations for additional | |||
parameters and properties that could be used to enhance transport | parameters and properties that could be used to enhance transport | |||
protocol and/or path selection, or the transmission of messages given | protocol and/or path selection, or the transmission of messages given | |||
a Protocol Stack that implements them. These are not part of the | a Protocol Stack that implements them. These are not part of the | |||
interface, and may be removed from the final document, but are | interface, and may be removed from the final document, but are | |||
presented here to support discussion within the TAPS working group as | presented here to support discussion within the TAPS working group as | |||
to whether they should be added to a future revision of the base | to whether they should be added to a future revision of the base | |||
specification. | specification. | |||
A.1. Properties Affecting Sorting of Branches | A.1. Properties Affecting Sorting of Branches | |||
In addition to the Protocol and Path Selection Properties discussed | In addition to the Protocol and Path Selection Properties discussed | |||
in Section 4.3, the following properties under discussion can | in Section 4.1.5, the following properties under discussion can | |||
influence branch sorting: | influence branch sorting: | |||
* Bounds on Send or Receive Rate: If the application indicates a | * Bounds on Send or Receive Rate: If the application indicates a | |||
bound on the expected Send or Receive bitrate, an implementation | bound on the expected Send or Receive bitrate, an implementation | |||
may prefer a path that can likely provide the desired bandwidth, | may prefer a path that can likely provide the desired bandwidth, | |||
based on cached maximum throughput, see Section 9.2. The | based on cached maximum throughput, see Section 9.2. The | |||
application may know the Send or Receive Bitrate from metadata in | application may know the Send or Receive Bitrate from metadata in | |||
adaptive HTTP streaming, such as MPEG-DASH. | adaptive HTTP streaming, such as MPEG-DASH. | |||
* Cost Preferences: If the application indicates a preference to | * Cost Preferences: If the application indicates a preference to | |||
skipping to change at page 49, line 11 ¶ | skipping to change at page 50, line 26 ¶ | |||
- Network.framework is a transport-level API built for C, | - Network.framework is a transport-level API built for C, | |||
Objective-C, and Swift. It is a connect-by-name API that supports | Objective-C, and Swift. It is a connect-by-name API that supports | |||
transport security protocols. It provides userspace | transport security protocols. It provides userspace | |||
implementations of TCP, UDP, TLS, DTLS, proxy protocols, and | implementations of TCP, UDP, TLS, DTLS, proxy protocols, and | |||
allows extension via custom framers. | allows extension via custom framers. | |||
- Documentation: https://developer.apple.com/documentation/ | - Documentation: https://developer.apple.com/documentation/ | |||
network (https://developer.apple.com/documentation/network) | network (https://developer.apple.com/documentation/network) | |||
* NEAT: | * NEAT and NEATPy: | |||
- NEAT is the output of the European H2020 research project | - NEAT is the output of the European H2020 research project | |||
"NEAT"; it is a user-space library for protocol-independent | "NEAT"; it is a user-space library for protocol-independent | |||
communication on top of TCP, UDP and SCTP, with many more | communication on top of TCP, UDP and SCTP, with many more | |||
features such as a policy manager. | features such as a policy manager. | |||
- Code: https://github.com/NEAT-project/neat (https://github.com/ | - Code: https://github.com/NEAT-project/neat (https://github.com/ | |||
NEAT-project/neat) | NEAT-project/neat) | |||
- NEAT project: https://www.neat-project.org (https://www.neat- | - NEAT project: https://www.neat-project.org (https://www.neat- | |||
project.org) | project.org) | |||
- NEATPy is a Python shim over NEAT which updates the NEAT API to | ||||
be in line with version 6 of the TAPS interface draft. | ||||
- Code: https://github.com/theagilepadawan/NEATPy | ||||
(https://github.com/theagilepadawan/NEATPy) | ||||
* PyTAPS: | * PyTAPS: | |||
- A TAPS implementation based on Python asyncio, offering | - A TAPS implementation based on Python asyncio, offering | |||
protocol-independent communication to applications on top of | protocol-independent communication to applications on top of | |||
TCP, UDP and TLS, with support for multicast. | TCP, UDP and TLS, with support for multicast. | |||
- Code: https://github.com/fg-inet/python-asyncio-taps | - Code: https://github.com/fg-inet/python-asyncio-taps | |||
(https://github.com/fg-inet/python-asyncio-taps) | (https://github.com/fg-inet/python-asyncio-taps) | |||
Authors' Addresses | Authors' Addresses | |||
Anna Brunstrom (editor) | Anna Brunstrom (editor) | |||
Karlstad University | Karlstad University | |||
Universitetsgatan 2 | Universitetsgatan 2 | |||
SE- 651 88 Karlstad | 651 88 Karlstad | |||
Sweden | Sweden | |||
Email: anna.brunstrom@kau.se | Email: anna.brunstrom@kau.se | |||
Tommy Pauly (editor) | Tommy Pauly (editor) | |||
Apple Inc. | Apple Inc. | |||
One Apple Park Way | One Apple Park Way | |||
Cupertino, California 95014, | Cupertino, California 95014, | |||
United States of America | United States of America | |||
Email: tpauly@apple.com | Email: tpauly@apple.com | |||
Theresa Enghardt | Theresa Enghardt | |||
TU Berlin | Netflix | |||
Marchstrasse 23 | 121 Albright Way | |||
10587 Berlin | Los Gatos, CA 95032, | |||
Germany | United States of America | |||
Email: theresa@inet.tu-berlin.de | Email: ietf@tenghardt.net | |||
Karl-Johan Grinnemo | Karl-Johan Grinnemo | |||
Karlstad University | Karlstad University | |||
Universitetsgatan 2 | Universitetsgatan 2 | |||
SE- 651 88 Karlstad | 651 88 Karlstad | |||
Sweden | Sweden | |||
Email: karl-johan.grinnemo@kau.se | Email: karl-johan.grinnemo@kau.se | |||
Tom Jones | Tom Jones | |||
University of Aberdeen | University of Aberdeen | |||
Fraser Noble Building | Fraser Noble Building | |||
Aberdeen, AB24 3UE | Aberdeen, AB24 3UE | |||
United Kingdom | United Kingdom | |||
End of changes. 112 change blocks. | ||||
317 lines changed or deleted | 374 lines changed or added | |||