--- 1/draft-ietf-taps-impl-07.txt 2020-11-02 12:13:15.432534765 -0800 +++ 2/draft-ietf-taps-impl-08.txt 2020-11-02 12:13:15.528537200 -0800 @@ -1,31 +1,31 @@ TAPS Working Group A. Brunstrom, Ed. Internet-Draft Karlstad University Intended status: Informational T. Pauly, Ed. -Expires: 14 January 2021 Apple Inc. +Expires: 6 May 2021 Apple Inc. T. Enghardt Netflix K-J. Grinnemo Karlstad University T. Jones University of Aberdeen P. Tiesel TU Berlin C. Perkins University of Glasgow M. Welzl University of Oslo - 13 July 2020 + 2 November 2020 Implementing Interfaces to Transport Services - draft-ietf-taps-impl-07 + draft-ietf-taps-impl-08 Abstract The Transport Services (TAPS) system enables applications to use transport protocols flexibly for network communication and defines a protocol-independent TAPS Application Programming Interface (API) that is based on an asynchronous, event-driven interaction pattern. This document serves as a guide to implementation on how to build such a system. @@ -37,21 +37,21 @@ Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at https://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." - This Internet-Draft will expire on 14 January 2021. + This Internet-Draft will expire on 6 May 2021. Copyright Notice Copyright (c) 2020 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/ license-info) in effect on the date of publication of this document. 
Please review these documents carefully, as they describe your rights @@ -68,72 +68,70 @@ 3.1. Configuration-time errors . . . . . . . . . . . . . . . . 5 3.2. Role of system policy . . . . . . . . . . . . . . . . . . 6 4. Implementing Connection Establishment . . . . . . . . . . . . 7 4.1. Candidate Gathering . . . . . . . . . . . . . . . . . . . 8 4.1.1. Gathering Endpoint Candidates . . . . . . . . . . . . 8 4.1.2. Structuring Options as a Tree . . . . . . . . . . . . 9 4.1.3. Branch Types . . . . . . . . . . . . . . . . . . . . 11 4.1.4. Branching Order-of-Operations . . . . . . . . . . . . 13 4.1.5. Sorting Branches . . . . . . . . . . . . . . . . . . 14 4.2. Candidate Racing . . . . . . . . . . . . . . . . . . . . 16 - 4.2.1. Immediate . . . . . . . . . . . . . . . . . . . . . . 16 - 4.2.2. Delayed . . . . . . . . . . . . . . . . . . . . . . . 17 - 4.2.3. Failover . . . . . . . . . . . . . . . . . . . . . . 17 + 4.2.1. Simultaneous . . . . . . . . . . . . . . . . . . . . 16 + 4.2.2. Staggered . . . . . . . . . . . . . . . . . . . . . . 17 + 4.2.3. Failover . . . . . . . . . . . . . . . . . . . . . . 18 4.3. Completing Establishment . . . . . . . . . . . . . . . . 18 4.3.1. Determining Successful Establishment . . . . . . . . 19 - 4.4. Establishing multiplexed connections . . . . . . . . . . 19 + 4.4. Establishing multiplexed connections . . . . . . . . . . 20 4.5. Handling racing with "unconnected" protocols . . . . . . 20 4.6. Implementing listeners . . . . . . . . . . . . . . . . . 20 4.6.1. Implementing listeners for Connected Protocols . . . 21 4.6.2. Implementing listeners for Unconnected Protocols . . 21 4.6.3. Implementing listeners for Multiplexed Protocols . . 21 - 5. Implementing Sending and Receiving Data . . . . . . . . . . . 21 + 5. Implementing Sending and Receiving Data . . . . . . . . . . . 22 5.1. Sending Messages . . . . . . . . . . . . . . . . . . . . 22 5.1.1. Message Properties . . . . . . . . . . . . . . . . . 22 - 5.1.2. 
Send Completion . . . . . . . . . . . . . . . . . . . 23 + 5.1.2. Send Completion . . . . . . . . . . . . . . . . . . . 24 5.1.3. Batching Sends . . . . . . . . . . . . . . . . . . . 24 5.2. Receiving Messages . . . . . . . . . . . . . . . . . . . 24 - 5.3. Handling of data for fast-open protocols . . . . . . . . 24 - 6. Implementing Message Framers . . . . . . . . . . . . . . . . 25 + 5.3. Handling of data for fast-open protocols . . . . . . . . 25 + 6. Implementing Message Framers . . . . . . . . . . . . . . . . 26 6.1. Defining Message Framers . . . . . . . . . . . . . . . . 26 - 6.2. Sender-side Message Framing . . . . . . . . . . . . . . . 27 - 6.3. Receiver-side Message Framing . . . . . . . . . . . . . . 27 - 7. Implementing Connection Management . . . . . . . . . . . . . 28 - 7.1. Pooled Connection . . . . . . . . . . . . . . . . . . . . 29 - 7.2. Handling Path Changes . . . . . . . . . . . . . . . . . . 29 - 8. Implementing Connection Termination . . . . . . . . . . . . . 30 - 9. Cached State . . . . . . . . . . . . . . . . . . . . . . . . 31 - 9.1. Protocol state caches . . . . . . . . . . . . . . . . . . 31 - 9.2. Performance caches . . . . . . . . . . . . . . . . . . . 32 - 10. Specific Transport Protocol Considerations . . . . . . . . . 33 - 10.1. TCP . . . . . . . . . . . . . . . . . . . . . . . . . . 34 - 10.2. UDP . . . . . . . . . . . . . . . . . . . . . . . . . . 35 - 10.3. UDP Multicast Receive . . . . . . . . . . . . . . . . . 37 - 10.4. TLS . . . . . . . . . . . . . . . . . . . . . . . . . . 38 - 10.5. DTLS . . . . . . . . . . . . . . . . . . . . . . . . . . 40 - 10.6. HTTP . . . . . . . . . . . . . . . . . . . . . . . . . . 40 - 10.7. QUIC . . . . . . . . . . . . . . . . . . . . . . . . . . 41 - 10.8. HTTP/2 transport . . . . . . . . . . . . . . . . . . . . 42 - 10.9. SCTP . . . . . . . . . . . . . . . . . . . . . . . . . . 42 - 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 44 - 12. Security Considerations . . . . . . . . . 
. . . . . . . . . . 45 - 12.1. Considerations for Candidate Gathering . . . . . . . . . 45 - 12.2. Considerations for Candidate Racing . . . . . . . . . . 45 - 13. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 45 - 14. References . . . . . . . . . . . . . . . . . . . . . . . . . 46 - 14.1. Normative References . . . . . . . . . . . . . . . . . . 46 - 14.2. Informative References . . . . . . . . . . . . . . . . . 47 - Appendix A. Additional Properties . . . . . . . . . . . . . . . 48 - A.1. Properties Affecting Sorting of Branches . . . . . . . . 48 - Appendix B. Reasons for errors . . . . . . . . . . . . . . . . . 49 - Appendix C. Existing Implementations . . . . . . . . . . . . . . 50 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 51 + 6.2. Sender-side Message Framing . . . . . . . . . . . . . . . 28 + 6.3. Receiver-side Message Framing . . . . . . . . . . . . . . 28 + 7. Implementing Connection Management . . . . . . . . . . . . . 29 + 7.1. Pooled Connection . . . . . . . . . . . . . . . . . . . . 30 + 7.2. Handling Path Changes . . . . . . . . . . . . . . . . . . 30 + 8. Implementing Connection Termination . . . . . . . . . . . . . 31 + 9. Cached State . . . . . . . . . . . . . . . . . . . . . . . . 32 + 9.1. Protocol state caches . . . . . . . . . . . . . . . . . . 33 + 9.2. Performance caches . . . . . . . . . . . . . . . . . . . 33 + 10. Specific Transport Protocol Considerations . . . . . . . . . 34 + 10.1. TCP . . . . . . . . . . . . . . . . . . . . . . . . . . 35 + 10.2. MPTCP . . . . . . . . . . . . . . . . . . . . . . . . . 37 + 10.3. UDP . . . . . . . . . . . . . . . . . . . . . . . . . . 37 + 10.4. UDP-Lite . . . . . . . . . . . . . . . . . . . . . . . . 38 + 10.5. UDP Multicast Receive . . . . . . . . . . . . . . . . . 38 + 10.6. SCTP . . . . . . . . . . . . . . . . . . . . . . . . . . 40 + 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 42 + 12. Security Considerations . . . . . . . . . . . . . . . 
. . . . 42 + 12.1. Considerations for Candidate Gathering . . . . . . . . . 43 + 12.2. Considerations for Candidate Racing . . . . . . . . . . 43 + 13. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 43 + 14. References . . . . . . . . . . . . . . . . . . . . . . . . . 43 + 14.1. Normative References . . . . . . . . . . . . . . . . . . 44 + 14.2. Informative References . . . . . . . . . . . . . . . . . 45 + Appendix A. API Mapping Template . . . . . . . . . . . . . . . . 46 + Appendix B. Additional Properties . . . . . . . . . . . . . . . 47 + B.1. Properties Affecting Sorting of Branches . . . . . . . . 47 + Appendix C. Reasons for errors . . . . . . . . . . . . . . . . . 47 + Appendix D. Existing Implementations . . . . . . . . . . . . . . 48 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 49 1. Introduction The Transport Services architecture [I-D.ietf-taps-arch] defines a system that allows applications to use transport networking protocols flexibly. The interface such a system exposes to applications is defined as the Transport Services API [I-D.ietf-taps-interface]. This API is designed to be generic across multiple transport protocols and sets of protocols features. @@ -709,81 +707,87 @@ This section covers the dynamic aspect of connection establishment. The tree described above is a useful conceptual and architectural model. However, an implementation is unable to know the full tree before it is formed and many of the possible branches ultimately might not be used. There are three different approaches to racing the attempts for different nodes of the connection establishment tree: - 1. Immediate + 1. Simultaneous - 2. Delayed + 2. Staggered 3. Failover Each approach is appropriate in different use-cases and branch types. However, to avoid consuming unnecessary network resources, - implementations should not use immediate racing as a default + implementations should not use simultaneous racing as a default approach. 
The timing algorithms for racing should remain independent across branches of the tree. Any timers or racing logic is isolated to a given parent node, and is not ordered precisely with regards to other children of other nodes. -4.2.1. Immediate +4.2.1. Simultaneous - Immediate racing is when multiple alternate branches are started + Simultaneous racing is when multiple alternate branches are started without waiting for any one branch to make progress before starting the next alternative. This means the attempts are effectively - simultaneous. Immediate racing should be avoided by implementations, - since it consumes extra network resources and establishes state that - might not be used. + simultaneous. Simultaneous racing should be avoided by + implementations, since it consumes extra network resources and + establishes state that might not be used. -4.2.2. Delayed +4.2.2. Staggered - Delayed racing can be used whenever a single node of the tree has + Staggered racing can be used whenever a single node of the tree has multiple child nodes. Based on the order determined when building the tree, the first child node will be initiated immediately, followed by the next child node after some delay. Once that second child node is initiated, the third child node (if present) will begin after another delay, and so on until all child nodes have been initiated, or one of the child nodes successfully completes its negotiation. - Delayed racing attempts occur in parallel. Implementations should - not terminate an earlier child connection attempt upon starting a - secondary child. + Staggered racing attempts can proceed in parallel. Implementations + should not terminate an earlier child connection attempt upon + starting a secondary child. - The delay between starting child nodes should be based on the - properties of the previously started child node. 
For example, if the - first child represents an IP address with a known route, and the - second child represents another IP address, the delay between - starting the first and second IP addresses can be based on the - expected retransmission cadence for the first child's connection - (derived from historical round-trip-time). Alternatively, if the - first child represents a branch on a Wi-Fi interface, and the second - child represents a branch on an LTE interface, the delay should be - based on the expected time in which the branch for the first - interface would be able to establish a connection, based on link - quality and historical round-trip-time. + If a child node fails to connect before the delay time has expired + for the next child, the next child should be started immediately. - Any delay should have a defined minimum and maximum value based on - the branch type. Generally, branches between paths and protocols - should have longer delays than branches between derived endpoints. - The maximum delay should be considered with regards to how long a - user is expected to wait for the connection to complete. + Staggered racing between IP addresses for a generic Connection should + follow the Happy Eyeballs algorithm described in [RFC8305]. + [RFC8421] provides guidance for racing when performing Interactive + Connectivity Establishment (ICE). - If a child node fails to connect before the delay timer has fired for - the next child, the next child should be started immediately. + Generally, the delay before starting a given child node ought to be + based on the length of time the previously started child node is + expected to take before it succeeds or makes progress in connection + establishment. Algorithms like Happy Eyeballs choose a delay based + on how long the transport connection handshake is expected to take. 
+ When performing staggered races in multiple branch types (such as + racing between network interfaces, and then racing between IP + addresses), a longer delay may be chosen for some branch types. For + example, when racing between network interfaces, the delay should + also take into account the amount of time it takes to prepare the + network interface (such as radio association) and name resolution + over that interface, in addition to the delay that would be added for + a single transport connection handshake. + + Since the staggered delay can be chosen based on dynamic information, + such as predicted round-trip time, implementations should define + upper and lower bounds for delay times. These bounds are + implementation-specific, and may differ based on which branch type is + being used. 4.2.3. Failover If an implementation or application has a strong preference for one branch over another, the branching node may choose to wait until one child has failed before starting the next. Failure of a leaf node is determined by its protocol negotiation failing or timing out; failure of a parent branching node is determined by all of its children failing. @@ -1092,25 +1096,32 @@ content it wants to receive at one time. If a Connection becomes finished before a requested Receive action can be satisfied, the implementation should deliver any partial Message content outstanding, or if none is available, an indication that there will be no more received Messages. 5.3. Handling of data for fast-open protocols Several protocols allow sending higher-level protocol or application - data within the first packet of their protocol establishment, such as - TCP Fast Open [RFC7413] and TLS 1.3 [RFC8446]. This approach is - referred to as sending Zero-RTT (0-RTT) data. This is a desirable - property, but poses challenges to an implementation that uses racing - during connection establishment. 
+ data during their protocol establishment, such as TCP Fast Open + [RFC7413] and TLS 1.3 [RFC8446]. This approach is referred to as + sending Zero-RTT (0-RTT) data. This is a desirable property, but + poses challenges to an implementation that uses racing during + connection establishment. + + The amount of data that can be sent as 0-RTT data varies by protocol + and can be queried by the application using the "Maximum Message Size + Concurrent with Connection Establishment" Connection Property. An + implementation can set this property according to the protocols that + it will race based on the given Selection Properties when the + application requests to establish a connection. If the application has 0-RTT data to send in any protocol handshakes, it needs to provide this data before the handshakes have begun. When racing, this means that the data should be provided before the process of connection establishment has begun. If the application wants to send 0-RTT data, it must indicate this to the implementation by setting the "Safely Replayable" send parameter to true when sending the data. In general, 0-RTT data may be replayed (for example, if a TCP SYN contains data, and the SYN is retransmitted, the data will be retransmitted as well but may be considered as a new @@ -1331,39 +1343,60 @@ These Pooled Connections can use multiple connections or multiple streams of multi-streaming connections between endpoints, as long as all of these satisfy the requirements, and prohibitions specified in the Selection Properties of the Pooled Connection. This enables implementations to realize transparent connection coalescing, connection migration, and to perform per-message endpoint and path selection by choosing among these underlying connections. 7.2. Handling Path Changes - When a path change occurs, the Transport Services implementation is - responsible for notifying Protocol Instances in the Protocol Stack. 
- If the Protocol Stack includes a transport protocol that supports - multipath connectivity, an update to the available paths should - inform the Protocol Instance of the new set of paths that are - permissible based on the Selection Properties passed by the - application. A multipath protocol can establish new subflows over - new paths, and should tear down subflows over paths that are no - longer available. Pooled Connections Section 7.1 may add or remove - underlying transport connections in a similar manner. If the - Protocol Stack includes a transport protocol that does not support - multipath, but support migrating between paths, the update to - available paths can be used as the trigger to migrating the - connection. For protocols that do not support multipath or - migration, the Protocol Instances may be informed of the path change, - but should not be forcibly disconnected if the previously used path - becomes unavailable. An exception to this case is if the System - Policy changes to prohibit traffic from the Connection based on its - properties, in which case the Protocol Stack should be disconnected. + When a path change occurs, e.g., when the IP address of an interface + changes or a new interface becomes available, the Transport Services + implementation is responsible for notifying the application of the + change. The path change may interrupt connectivity on a path for an + active connection or provide an opportunity for a transport that + supports multipath or migration to adapt to the new paths. + + For protocols that do not support multipath or migration, the + Protocol Instances should be informed of the path change, but should + not be forcibly disconnected if the previously used path becomes + unavailable. 
+ + If the Protocol Stack includes a transport protocol that also + supports multipath connectivity with migration support, the Transport + Services implementation should also inform the Protocol Instance of + potentially new paths that become permissible based on the Selection + Properties passed by the application. A protocol can then establish + new subflows over new paths while an active path is still available + or, if migration is supported, also after a break has been detected, + and should attempt to tear down subflows over paths that are no + longer used. The Transport Services API provides an interface to set + a multipath policy that indicates when and how different paths should + be used. However, detailed handling of these policies is still + implementation-specific. The decision about when to create a new + path or to announce a new path or set of paths to the remote + endpoint, e.g., in the form of additional IP addresses, is + implementation-specific or could be supported by future API + extensions. If the Protocol Stack includes a transport protocol that + does not support multipath, but does support migrating between paths, + the update to the set of available paths can trigger the connection + to be migrated. + + In case of Pooled Connections (Section 7.1), the transport system may + add connections over new paths or different protocols to the pool if + permissible based on the multipath policy and Selection Properties. + In case a previously used path becomes unavailable, the transport + system may disconnect all connections that require this path, but + should not disconnect the pooled connection object exposed to the + application. The strategy for doing so is implementation-specific, but should + be consistent with the behavior of multipath transports. 8.
Implementing Connection Termination With TCP, when an application closes a connection, this means that it has no more data to send (but expects all data that has been handed over to be reliably delivered). However, with TCP only, "close" does not mean that the application will stop receiving data. This is related to TCP's ability to support half-closed connections. SCTP is an example of a protocol that does not support such half- @@ -1490,26 +1523,25 @@ over a relatively short time interval. For such values, the implementation should remove them from the cache more quickly, or treat older values with less confidence/weight. [I-D.ietf-tcpm-2140bis] provides guidance about sharing of TCP Control Block information between connections on initialization. 10. Specific Transport Protocol Considerations Each protocol that can run as part of a Transport Services - implementation defines both its API mapping as well as implementation - details. API mappings for a protocol apply most to Connections in - which the given protocol is the "top" of the Protocol Stack. For - example, the mapping of the "Send" function for TCP applies to - Connections in which the application directly sends over TCP. If - HTTP/2 is used on top of TCP, the HTTP/2 mappings take precendence. + implementation should have a well-defined API mapping. API mappings + for a protocol apply most to Connections in which the given protocol + is the "top" of the Protocol Stack. For example, the mapping of the + "Send" function for TCP applies to Connections in which the + application directly sends over TCP. Each protocol has a notion of Connectedness. Possible values for Connectedness are: * Unconnected. Unconnected protocols do not establish explicit state between endpoints, and do not perform a handshake during Connection establishment. * Connected. Connected protocols establish state between endpoints, and perform a handshake during Connection establishment. 
The @@ -1537,24 +1569,27 @@ sent and received either as complete or partial Messages. Maximum Message lengths can be defined, and Messages can be partially reliable. Below, terms in capitals with a dot (e.g., "CONNECT.SCTP") refer to the primitives with the same name in section 4 of [RFC8303]. For further implementation details, the description of these primitives in [RFC8303] points to section 3 of [RFC8303] and section 3 of [RFC8304], which refers back to the relevant specifications for each protocol. This back-tracking method applies to all elements of - [I-D.ietf-taps-minset] (see appendix D of [I-D.ietf-taps-interface]): - they are listed in appendix A of [I-D.ietf-taps-minset] with an - implementation hint in the same style, pointing back to section 4 of - [RFC8303]. + [RFC8923] (see appendix D of [I-D.ietf-taps-interface]): they are + listed in appendix A of [RFC8923] with an implementation hint in the + same style, pointing back to section 4 of [RFC8303]. + + This document defines the API mappings for protocols defined in + [RFC8923]. Other protocol mappings can be provided as separate + documents, following the mapping template Appendix A. 10.1. TCP Connectedness: Connected Data Unit: Byte-stream API mappings for TCP are as follows: Connection Object: TCP connections between two hosts map directly to @@ -1605,21 +1640,31 @@ Close: Calling "Close" on a TCP Connection indicates that the Connection should be gracefully closed (CLOSE.TCP) by sending a FIN to the peer and waiting for a FIN-ACK before delivering the "Closed" event. Abort: Calling "Abort" on a TCP Connection indicates that the Connection should be immediately closed by sending a RST to the peer (ABORT.TCP). -10.2. UDP +10.2. MPTCP + + Connectedness: Connected + + Data Unit: Byte-stream + + API mappings for MPTCP are identical to TCP. MPTCP adds support for + multipath properties, such as "Multi-Paths Transport" and "Policy for + using Multi-Path Transports". + +10.3. 
UDP Connectedness: Unconnected Data Unit: Datagram API mappings for UDP are as follows: Connection Object: UDP connections represent a pair of specific IP addresses and ports on two hosts. @@ -1662,21 +1707,33 @@ "Received", each of which represents a single datagram received in a UDP packet. Upon receiving a UDP datagram, the ECN flag from the IP header can be obtained (GET_ECN.UDP(-Lite)). Close: Calling "Close" on a UDP Connection (ABORT.UDP(-Lite)) releases the local port reservation. Abort: Calling "Abort" on a UDP Connection (ABORT.UDP(-Lite)) is identical to calling "Close". -10.3. UDP Multicast Receive +10.4. UDP-Lite + + Connectedness: Unconnected + + Data Unit: Datagram + + API mappings for UDP-Lite are identical to UDP. Properties that + require checksum coverage are not supported by UDP-Lite, such as + "Corruption Protection Length", "Full Checksum Coverage on Sending", + "Required Minimum Corruption Protection Coverage for Receiving", and + "Full Checksum Coverage on Receiving". + +10.5. UDP Multicast Receive Connectedness: Unconnected Data Unit: Datagram API mappings for Receiving Multicast UDP are as follows: Connection Object: Established UDP Multicast Receive connections represent a pair of specific IP addresses and ports. The "unidirectional receive" transport property is required, and the @@ -1728,212 +1785,21 @@ a UDP packet. Upon receiving a UDP datagram, the ECN flag from the IP header can be obtained (GET_ECN.UDP(-Lite)). Close: Calling "Close" on a UDP Multicast Receive Connection (ABORT.UDP(-Lite)) releases the local port reservation and leaves the group. Abort: Calling "Abort" on a UDP Multicast Receive Connection (ABORT.UDP(-Lite)) is identical to calling "Close". -10.4. TLS - - The mapping of a TLS stream abstraction into the application is - equivalent to the contract provided by TCP (see Section 10.1), and - builds upon many of the actions of TCP connections. 
- - Connectedness: Connected - - Data Unit: Byte-stream - - Connection Object: Connection objects represent a single TLS - connection running over a TCP connection between two hosts. - - Initiate: Calling "Initiate" on a TLS Connection causes it to first - initiate a TCP connection. Once the TCP protocol is Ready, the - TLS handshake will be performed as a client (starting by sending a - "client_hello", and so on). - - InitiateWithSend: Early safely replayable data is supported by TLS - 1.3, and sends encrypted application data in the first TLS message - when performing session resumption. For older versions of TLS, or - if a session is not being resumed, the initial data will be - delayed until the TLS handshake is complete. TCP Fast Open can - also be enabled automatically. - - Ready: A TLS Connection is ready once the underlying TCP connection - is Ready, and TLS handshake is also complete and keys have been - established to encrypt application data. - - InitiateError: In addition to TCP initiation errors, TLS can - generate errors during its handshake. Examples of error include a - failure of the peer to successfully authenticate, the peer - rejecting the local authentication, or a failure to match versions - or algorithms. - - ConnectionError: TLS connections will generate TCP errors, or errors - due to failures to rekey or decrypt received messages. - - Listen: Calling "Listen" for TLS listens on TCP, and sets up - received connections to perform server-side TLS handshakes. - - ConnectionReceived: TLS Listeners will deliver new connections once - they have successfully completed both TCP and TLS handshakes. - - Clone: As with TCP, calling "Clone" on a TLS Connection creates a - new Connection with equivalent parameters. The two Connections - are otherwise independent. - - Send: Like TCP, TLS does not preserve message boundaries. 
Although - application data is framed natively in TLS, there is not a general - guarantee that these TLS messages represent semantically - meaningful application stream boundaries. Rather, sending data on - a TLS Connection only guarantees that the application data will be - transmitted in an encrypted form. Marking Messages as Final - causes a "close_notify" to be generated once the data has been - written. - - Receive: Like TCP, TLS delivers a stream of bytes without any - Message delineation. The data is decrypted prior to being - delivered to the application. If a "close_notify" is received, - the stream-wide Message will be delivered with EndOfMessage set. - - Close: Calling "Close" on a TLS Connection indicates that the - Connection should be gracefully closed by sending a "close_notify" - to the peer and waiting for a corresponding "close_notify" before - delivering the "Closed" event. - - Abort: Calling "Abort" on a TCP Connection indicates that the - Connection should be immediately closed by sending a - "close_notify", optionally preceded by "user_canceled", to the - peer. Implementations do not need to wait to receive - "close_notify" before delivering the "Closed" event. - -10.5. DTLS - - DTLS follows the same behavior as TLS (Section 10.4), with the - notable exception of not inheriting behavior directly from TCP. - Differences from TLS are detailed below, and all cases not explicitly - mentioned should be considered the same as TLS. - - Connectedness: Connected - - Data Unit: Datagram - - Connection Object: Connection objects represent a single DTLS - connection running over a set of UDP ports between two hosts. - - Initiate: Calling "Initiate" on a DTLS Connection causes it reserve - a UDP local port, and begin sending handshake messages to the peer - over UDP. These messages are reliable, and will be automatically - retransmitted. 
- - Ready: A DTLS Connection is ready once the TLS handshake is complete - and keys have been established to encrypt application data. - - Send: Sending over DTLS does preserve message boundaries in the same - way that UDP datagrams do. Marking a Message as Final does send a - "close_notify" like TLS. - - Receive: Receiving over DTLS delivers one decrypted Message for each - received DTLS datagram. If a "close_notify" is received, a - Message will be delivered that is marked as Final. - -10.6. HTTP - - HTTP requests and responses map naturally into Messages, since they - are delineated chunks of data with metadata that can be sent over a - transport. To that end, HTTP can be seen as the most prevalent - framing protocol that runs on top of streams like TCP, TLS, etc. - - In order to use a transport Connection that provides HTTP Message - support, the establishment and closing of the connection can be - treated as it would without the framing protocol. Sending and - receiving of Messages, however, changes to treat each Message as a - well-delineated HTTP request or response, with the content of the - Message representing the body, and the Headers being provided in - Message metadata. - - Connectedness: Multiplexing Connected - - Data Unit: Message - Connection Object: Connection objects represent a flow of HTTP - messages between a client and a server, which may be an HTTP/1.1 - connection over TCP, or a single stream in an HTTP/2 connection. - - Initiate: Calling "Initiate" on an HTTP connection intiates a TCP or - TLS connection as a client. - - Clone: Calling "Clone" on an HTTP Connection opens a new stream on - an existing HTTP/2 connection when possible. If the underlying - version does not support multiplexed streams, calling "Clone" - simply creates a new parallel connection. 
- - Send: When an application sends an HTTP Message, it is expected to - provide HTTP header values as a MessageContext in a canonical - form, along with any associated HTTP message body as the Message - data. The HTTP header values are encoded in the specific version - format upon sending. - - Receive: HTTP Connections deliver Messages in which HTTP header - values are attached to MessageContexts, and HTTP bodies are in Message - data. - - Close: Calling "Close" on an HTTP Connection will only close the - underlying TLS or TCP connection if the HTTP version does not - support multiplexing. For HTTP/2, for example, closing the - connection only closes a specific stream. - -10.7. QUIC - - QUIC provides a multi-streaming interface to an encrypted transport. - Each stream can be viewed as equivalent to a TLS stream over TCP, so - a natural mapping is to present each QUIC stream as an individual - Connection. The protocol for the stream will be considered Ready - whenever the underlying QUIC connection is established to the point - that this stream's data can be sent. For streams after the first - stream, this will likely be an immediate operation. - - Closing a single QUIC stream, presented to the application as a - Connection, does not imply closing the underlying QUIC connection - itself. Rather, the implementation may choose to close the QUIC - connection once all streams have been closed (often after some - timeout), or after an individual stream Connection sends an Abort. - - Connectedness: Multiplexing Connected - - Data Unit: Stream - - Connection Object: Connection objects represent a single QUIC stream - on a QUIC connection. - -10.8. HTTP/2 transport - - Similar to QUIC (Section 10.7), HTTP/2 provides a multi-streaming - interface. This will generally use HTTP as the unit of Messages over - the streams, in which each stream can be represented as a transport - Connection. The lifetime of streams and the HTTP/2 connection should - be managed as described for QUIC.
- - It is possible to treat each HTTP/2 stream as a raw byte-stream - instead of a carrier for HTTP messages, in which case the Messages - over the streams can be represented similarly to the TCP stream (one - Message per direction, see Section 10.1). - - Connectedness: Multiplexing Connected - - Data Unit: Stream - - Connection Object: Connection objects represent a single HTTP/2 - stream on an HTTP/2 connection. - -10.9. SCTP +10.6. SCTP Connectedness: Connected Data Unit: Message API mappings for SCTP are as follows: Connection Object: Connection objects represent a flow of SCTP messages between a client and a server, which may be an SCTP association or a stream in an SCTP association. How to map @@ -2093,46 +1959,38 @@ Sciences Research Council under grant EP/R04144X/1. This work has been supported by the Research Council of Norway under its "Toppforsk" programme through the "OCARINA" project. Thanks to Stuart Cheshire, Josh Graessley, David Schinazi, and Eric Kinnear for their implementation and design efforts, including Happy Eyeballs, that heavily influenced this work. 14. References - 14.1. Normative References [I-D.ietf-taps-arch] Pauly, T., Trammell, B., Brunstrom, A., Fairhurst, G., Perkins, C., Tiesel, P., and C. Wood, "An Architecture for Transport Services", Work in Progress, Internet-Draft, - draft-ietf-taps-arch-07, 9 March 2020, + draft-ietf-taps-arch-08, 13 July 2020, . + 08.txt>. [I-D.ietf-taps-interface] Trammell, B., Welzl, M., Enghardt, T., Fairhurst, G., Kuehlewind, M., Perkins, C., Tiesel, P., Wood, C., and T. Pauly, "An Abstract Application Layer Interface to Transport Services", Work in Progress, Internet-Draft, - draft-ietf-taps-interface-06, 9 March 2020, - . - - [I-D.ietf-taps-minset] - Welzl, M. and S. Gjessing, "A Minimal Set of Transport - Services for End Systems", Work in Progress, Internet- - Draft, draft-ietf-taps-minset-11, 27 September 2018, + draft-ietf-taps-interface-09, 27 July 2020, . + interface-09.txt>.
[RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014, . [RFC7540] Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext Transfer Protocol Version 2 (HTTP/2)", RFC 7540, DOI 10.17487/RFC7540, May 2015, . @@ -2150,32 +2008,42 @@ [RFC8304] Fairhurst, G. and T. Jones, "Transport Features of the User Datagram Protocol (UDP) and Lightweight UDP (UDP- Lite)", RFC 8304, DOI 10.17487/RFC8304, February 2018, . [RFC8305] Schinazi, D. and T. Pauly, "Happy Eyeballs Version 2: Better Connectivity Using Concurrency", RFC 8305, DOI 10.17487/RFC8305, December 2017, . + [RFC8421] Martinsen, P., Reddy, T., and P. Patil, "Guidelines for + Multihomed and IPv4/IPv6 Dual-Stack Interactive + Connectivity Establishment (ICE)", BCP 217, RFC 8421, + DOI 10.17487/RFC8421, July 2018, + . + [RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, . + [RFC8923] Welzl, M. and S. Gjessing, "A Minimal Set of Transport + Services for End Systems", RFC 8923, DOI 10.17487/RFC8923, + October 2020, . + 14.2. Informative References [I-D.ietf-quic-transport] Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed and Secure Transport", Work in Progress, Internet-Draft, - draft-ietf-quic-transport-29, 9 June 2020, + draft-ietf-quic-transport-32, 20 October 2020, . + transport-32.txt>. [I-D.ietf-tcpm-2140bis] Touch, J., Welzl, M., and S. Islam, "TCP Control Block Interdependence", Work in Progress, Internet-Draft, draft- ietf-tcpm-2140bis-05, 29 April 2020, . [NEAT-flow-mapping] "Transparent Flow Mapping for NEAT", Workshop on Future of Internet Transport (FIT 2017) , 2017. @@ -2203,52 +2071,87 @@ (Diffserv) and Real-Time Communication", RFC 7657, DOI 10.17487/RFC7657, November 2015, . [RFC8445] Keranen, A., Holmberg, C., and J. 
Rosenberg, "Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal", RFC 8445, DOI 10.17487/RFC8445, July 2018, . -Appendix A. Additional Properties +Appendix A. API Mapping Template + + Any protocol mapping for the Transport Services API should follow a + common template. + + Connectedness: (Unconnected/Connected/Multiplexing Connected) + + Data Unit: (Byte-stream/Datagram/Message) + + Connection Object: + + Initiate: + + InitiateWithSend: + + Ready: + + InitiateError: + + ConnectionError: + + Listen: + + ConnectionReceived: + + Clone: + + Send: + + Receive: + + Close: + + Abort: + +Appendix B. Additional Properties This appendix discusses implementation considerations for additional parameters and properties that could be used to enhance transport protocol and/or path selection, or the transmission of messages given a Protocol Stack that implements them. These are not part of the interface, and may be removed from the final document, but are presented here to support discussion within the TAPS working group as to whether they should be added to a future revision of the base specification. -A.1. Properties Affecting Sorting of Branches +B.1. Properties Affecting Sorting of Branches In addition to the Protocol and Path Selection Properties discussed in Section 4.1.5, the following properties under discussion can influence branch sorting: * Bounds on Send or Receive Rate: If the application indicates a bound on the expected Send or Receive bitrate, an implementation may prefer a path that can likely provide the desired bandwidth, based on cached maximum throughput, see Section 9.2. The application may know the Send or Receive Bitrate from metadata in adaptive HTTP streaming, such as MPEG-DASH. * Cost Preferences: If the application indicates a preference to avoid expensive paths, and some paths are associated with a monetary cost, an implementation should decrease the ranking of such paths. 
If the application indicates that it prohibits using expensive paths, paths that are associated with a cost should be purged from the decision tree. -Appendix B. Reasons for errors +Appendix C. Reasons for errors The Transport Services API [I-D.ietf-taps-interface] allows several generic error types to specify a more detailed reason as to why an error occurred. This appendix lists some of the possible reasons. * InvalidConfiguration: The transport properties and endpoints provided by the application are either contradictory or incomplete. Examples include the lack of a remote endpoint on an active open or using a multicast group address while not @@ -2279,21 +2182,21 @@ contradictory to the transport properties or they cannot be satisfied by the transport system. * DeframingFailed: The data that was received by the underlying protocol stack could not be deframed. * ConnectionAborted: The connection was aborted by the peer. * Timeout: Delivery of a message was not possible after a timeout. -Appendix C. Existing Implementations +Appendix D. Existing Implementations This appendix gives an overview of existing implementations, at the time of writing, of transport systems that are (to some degree) in line with this document. * Apple's Network.framework: - Network.framework is a transport-level API built for C, Objective-C, and Swift. It is a connect-by-name API that supports transport security protocols. It provides userspace