draft-ietf-taps-minset-08.txt | draft-ietf-taps-minset-09.txt | |||
---|---|---|---|---|
TAPS M. Welzl | TAPS M. Welzl | |||
Internet-Draft S. Gjessing | Internet-Draft S. Gjessing | |||
Intended status: Informational University of Oslo | Intended status: Informational University of Oslo | |||
Expires: March 9, 2019 September 5, 2018 | Expires: March 17, 2019 September 13, 2018 | |||
A Minimal Set of Transport Services for End Systems | A Minimal Set of Transport Services for End Systems | |||
draft-ietf-taps-minset-08 | draft-ietf-taps-minset-09 | |||
Abstract | Abstract | |||
This draft recommends a minimal set of Transport Services offered by | This draft recommends a minimal set of Transport Services offered by | |||
end systems, and gives guidance on choosing among the available | end systems, and gives guidance on choosing among the available | |||
mechanisms and protocols. It is based on the set of transport | mechanisms and protocols. It is based on the set of transport | |||
features in RFC 8303. | features in RFC 8303. | |||
Status of This Memo | Status of This Memo | |||
skipping to change at page 1, line 33 ¶ | skipping to change at page 1, line 33 ¶ | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on March 9, 2019. | This Internet-Draft will expire on March 17, 2019. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2018 IETF Trust and the persons identified as the | Copyright (c) 2018 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
3. The Minimal Set of Transport Features . . . . . . . . . . . . 5 | 3. Deriving the minimal set . . . . . . . . . . . . . . . . . . 5 | |||
3.1. ESTABLISHMENT, AVAILABILITY and TERMINATION . . . . . . . 5 | 4. The Reduced Set of Transport Features . . . . . . . . . . . . 6 | |||
3.2. MAINTENANCE . . . . . . . . . . . . . . . . . . . . . . . 8 | 4.1. CONNECTION Related Transport Features . . . . . . . . . . 7 | |||
3.2.1. Connection groups . . . . . . . . . . . . . . . . . . 8 | 4.2. DATA Transfer Related Transport Features . . . . . . . . 8 | |||
3.2.2. Individual connections . . . . . . . . . . . . . . . 10 | 4.2.1. Sending Data . . . . . . . . . . . . . . . . . . . . 8 | |||
3.3. DATA Transfer . . . . . . . . . . . . . . . . . . . . . . 10 | 4.2.2. Receiving Data . . . . . . . . . . . . . . . . . . . 9 | |||
3.3.1. Sending Data . . . . . . . . . . . . . . . . . . . . 10 | 4.2.3. Errors . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
3.3.2. Receiving Data . . . . . . . . . . . . . . . . . . . 11 | 5. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
4. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 12 | 5.1. Sending Messages, Receiving Bytes . . . . . . . . . . . . 9 | |||
5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12 | 5.2. Stream Schedulers Without Streams . . . . . . . . . . . . 10 | |||
6. Security Considerations . . . . . . . . . . . . . . . . . . . 12 | 5.3. Early Data Transmission . . . . . . . . . . . . . . . . . 11 | |||
7. References . . . . . . . . . . . . . . . . . . . . . . . . . 12 | 5.4. Sender Running Dry . . . . . . . . . . . . . . . . . . . 12 | |||
7.1. Normative References . . . . . . . . . . . . . . . . . . 12 | 5.5. Capacity Profile . . . . . . . . . . . . . . . . . . . . 12 | |||
7.2. Informative References . . . . . . . . . . . . . . . . . 13 | 5.6. Security . . . . . . . . . . . . . . . . . . . . . . . . 13 | |||
Appendix A. Deriving the minimal set . . . . . . . . . . . . . . 14 | 5.7. Packet Size . . . . . . . . . . . . . . . . . . . . . . . 13 | |||
A.1. Step 1: Categorization -- The Superset of Transport | 6. The Minimal Set of Transport Features . . . . . . . . . . . . 14 | |||
Features . . . . . . . . . . . . . . . . . . . . . . . . 15 | 6.1. ESTABLISHMENT, AVAILABILITY and TERMINATION . . . . . . . 14 | |||
A.1.1. CONNECTION Related Transport Features . . . . . . . . 17 | 6.2. MAINTENANCE . . . . . . . . . . . . . . . . . . . . . . . 17 | |||
A.1.2. DATA Transfer Related Transport Features . . . . . . 33 | 6.2.1. Connection groups . . . . . . . . . . . . . . . . . . 18 | |||
A.2. Step 2: Reduction -- The Reduced Set of Transport | 6.2.2. Individual connections . . . . . . . . . . . . . . . 19 | |||
Features . . . . . . . . . . . . . . . . . . . . . . . . 38 | 6.3. DATA Transfer . . . . . . . . . . . . . . . . . . . . . . 20 | |||
A.2.1. CONNECTION Related Transport Features . . . . . . . . 39 | 6.3.1. Sending Data . . . . . . . . . . . . . . . . . . . . 20 | |||
A.2.2. DATA Transfer Related Transport Features . . . . . . 40 | 6.3.2. Receiving Data . . . . . . . . . . . . . . . . . . . 21 | |||
A.3. Step 3: Discussion . . . . . . . . . . . . . . . . . . . 41 | 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 21 | |||
A.3.1. Sending Messages, Receiving Bytes . . . . . . . . . . 41 | 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21 | |||
A.3.2. Stream Schedulers Without Streams . . . . . . . . . . 42 | 9. Security Considerations . . . . . . . . . . . . . . . . . . . 21 | |||
A.3.3. Early Data Transmission . . . . . . . . . . . . . . . 43 | 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 22 | |||
A.3.4. Sender Running Dry . . . . . . . . . . . . . . . . . 44 | 10.1. Normative References . . . . . . . . . . . . . . . . . . 22 | |||
A.3.5. Capacity Profile . . . . . . . . . . . . . . . . . . 44 | 10.2. Informative References . . . . . . . . . . . . . . . . . 22 | |||
A.3.6. Security . . . . . . . . . . . . . . . . . . . . . . 45 | Appendix A. The Superset of Transport Features . . . . . . . . . 24 | |||
A.3.7. Packet Size . . . . . . . . . . . . . . . . . . . . . 45 | A.1. CONNECTION Related Transport Features . . . . . . . . . . 25 | |||
Appendix B. Revision information . . . . . . . . . . . . . . . . 46 | A.2. DATA Transfer Related Transport Features . . . . . . . . 41 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 47 | A.2.1. Sending Data . . . . . . . . . . . . . . . . . . . . 41 | |||
A.2.2. Receiving Data . . . . . . . . . . . . . . . . . . . 45 | ||||
A.2.3. Errors . . . . . . . . . . . . . . . . . . . . . . . 46 | ||||
Appendix B. Revision information . . . . . . . . . . . . . . . . 47 | ||||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 48 | ||||
1. Introduction | 1. Introduction | |||
Currently, the set of transport services that most applications use | Currently, the set of transport services that most applications use | |||
is based on TCP and UDP (and protocols that are layered on top of | is based on TCP and UDP (and protocols that are layered on top of | |||
them); this limits the ability for the network stack to make use of | them); this limits the ability for the network stack to make use of | |||
features of other transport protocols. For example, if a protocol | features of other transport protocols. For example, if a protocol | |||
supports out-of-order message delivery but applications always assume | supports out-of-order message delivery but applications always assume | |||
that the network provides an ordered bytestream, then the network | that the network provides an ordered bytestream, then the network | |||
stack can not immediately deliver a message that arrives out-of- | stack can not immediately deliver a message that arrives out-of- | |||
skipping to change at page 3, line 15 ¶ | skipping to change at page 3, line 19 ¶ | |||
delay. | delay. | |||
By exposing the transport services of multiple transport protocols, a | By exposing the transport services of multiple transport protocols, a | |||
transport system can make it possible for applications to use these | transport system can make it possible for applications to use these | |||
services without being statically bound to a specific transport | services without being statically bound to a specific transport | |||
protocol. The first step towards the design of such a system was | protocol. The first step towards the design of such a system was | |||
taken by [RFC8095], which surveys a large number of transports, and | taken by [RFC8095], which surveys a large number of transports, and | |||
[RFC8303] as well as [RFC8304], which identify the specific transport | [RFC8303] as well as [RFC8304], which identify the specific transport | |||
features that are exposed to applications by the protocols TCP, | features that are exposed to applications by the protocols TCP, | |||
MPTCP, UDP(-Lite) and SCTP as well as the LEDBAT congestion control | MPTCP, UDP(-Lite) and SCTP as well as the LEDBAT congestion control | |||
mechanism. This memo is based on these documents and follows the | mechanism. LEDBAT was included as the only congestion control | |||
same terminology (also listed below). Because the considered | mechanism in this list because the "low extra delay background | |||
transport protocols conjointly cover a wide range of transport | transport" service that it offers is significantly different from the | |||
features, there is reason to hope that the resulting set (and the | typical service provided by other congestion control mechanisms. | |||
reasoning that led to it) will also apply to many aspects of other | This memo is based on these documents and follows the same | |||
transport protocols that may be in use today, or may be designed in | terminology (also listed below). Because the considered transport | |||
the future. | protocols conjointly cover a wide range of transport features, there | |||
is reason to hope that the resulting set (and the reasoning that led | ||||
to it) will also apply to many aspects of other transport protocols | ||||
that may be in use today, or may be designed in the future. | ||||
By decoupling applications from transport protocols, a transport | By decoupling applications from transport protocols, a transport | |||
system provides a different abstraction level than the Berkeley | system provides a different abstraction level than the Berkeley | |||
sockets interface. As with high- vs. low-level programming | sockets interface [POSIX]. As with high- vs. low-level programming | |||
languages, a higher abstraction level allows more freedom for | languages, a higher abstraction level allows more freedom for | |||
automation below the interface, yet it takes some control away from | automation below the interface, yet it takes some control away from | |||
the application programmer. This is the design trade-off that a | the application programmer. This is the design trade-off that a | |||
transport system developer is facing, and this document provides | transport system developer is facing, and this document provides | |||
guidance on the design of this abstraction level. Some transport | guidance on the design of this abstraction level. Some transport | |||
features are currently rarely offered by APIs, yet they must be | features are currently rarely offered by APIs, yet they must be | |||
offered or they can never be used. Other transport features are | offered or they can never be used. Other transport features are | |||
offered by the APIs of the protocols covered here, but not exposing | offered by the APIs of the protocols covered here, but not exposing | |||
them in an API would allow for more freedom to automate protocol | them in an API would allow for more freedom to automate protocol | |||
usage in a transport system. The minimal set presented in this | usage in a transport system. The minimal set presented here is an | |||
document is an effort to find a middle ground that can be recommended | effort to find a middle ground that can be recommended for transport | |||
for transport systems to implement, on the basis of the transport | systems to implement, on the basis of the transport features | |||
features discussed in [RFC8303]. | discussed in [RFC8303]. | |||
Applications use a wide variety of APIs today. The transport | Applications use a wide variety of APIs today. The transport | |||
features in the minimal set in this document must be reflected in | features in the minimal set in this document must be reflected in | |||
*all* network APIs in order for the underlying functionality to | *all* network APIs in order for the underlying functionality to | |||
become usable everywhere. For example, it does not help an | become usable everywhere. For example, it does not help an | |||
application that talks to a library which offers its own | application that talks to a library which offers its own | |||
communication interface if the underlying Berkeley Sockets API is | communication interface if the underlying Berkeley Sockets API is | |||
extended to offer "unordered message delivery", but the library only | extended to offer "unordered message delivery", but the library only | |||
exposes an ordered bytestream. Both the Berkeley Sockets API and the | exposes an ordered bytestream. Both the Berkeley Sockets API and the | |||
library would have to expose the "unordered message delivery" | library would have to expose the "unordered message delivery" | |||
transport feature (alternatively, there may be ways for certain types | transport feature (alternatively, there may be ways for certain types | |||
of libraries to use this transport feature without exposing it, based | of libraries to use this transport feature without exposing it, based | |||
on knowledge about the applications -- but this is not the general | on knowledge about the applications -- but this is not the general | |||
case). In most situations, in the interest of being as flexible and | case). Similarly, transport protocols such as SCTP offer multi- | |||
efficient as possible, the best choice will be for a library to | streaming, which cannot be utilized, e.g., to prioritize messages | |||
between streams, unless applications communicate the priorities and | ||||
the group of connections upon which these priorities should be | ||||
applied. In most situations, in the interest of being as flexible | ||||
and efficient as possible, the best choice will be for a library to | ||||
expose at least all of the transport features that are recommended as | expose at least all of the transport features that are recommended as | |||
a "minimal set" here. | a "minimal set" here. | |||
This "minimal set" can be implemented "one-sided" over TCP. This | This "minimal set" can be implemented "one-sided" over TCP. This | |||
means that a sender-side transport system can talk to a standard TCP | means that a sender-side transport system can talk to a standard TCP | |||
receiver, and a receiver-side transport system can talk to a standard | receiver, and a receiver-side transport system can talk to a standard | |||
TCP sender. If certain limitations are put in place, the "minimal | TCP sender. If certain limitations are put in place, the "minimal | |||
set" can also be implemented "one-sided" over UDP. | set" can also be implemented "one-sided" over UDP. While the | |||
possibility of such "one-sided" implementation may help deployment, | ||||
it comes at the cost of limiting the set to services that can also be | ||||
provided by TCP (or, with further limitations, UDP). Thus, the | ||||
minimal set of transport features here is applicable for many, but | ||||
not all, applications: some application protocols have requirements | ||||
that are not met by this "minimal set". | ||||
Note that, throughout this document, protocols are meant to be used | ||||
natively. For example, when transport features of UDP, or | ||||
"implementation over" UDP is discussed, this refers to native usage | ||||
of UDP. | ||||
2. Terminology | 2. Terminology | |||
Transport Feature: a specific end-to-end feature that the transport | Transport Feature: a specific end-to-end feature that the transport | |||
layer provides to an application. Examples include | layer provides to an application. Examples include | |||
confidentiality, reliable delivery, ordered delivery, message- | confidentiality, reliable delivery, ordered delivery, message- | |||
versus-stream orientation, etc. | versus-stream orientation, etc. | |||
Transport Service: a set of Transport Features, without an | Transport Service: a set of Transport Features, without an | |||
association to any given framing protocol, which provides a | association to any given framing protocol, which provides a | |||
complete service to an application. | complete service to an application. | |||
Transport Protocol: an implementation that provides one or more | Transport Protocol: an implementation that provides one or more | |||
different transport services using a specific framing and header | different transport services using a specific framing and header | |||
format on the wire. | format on the wire. | |||
Transport Service Instance: an arrangement of transport protocols | Application: an entity that uses a transport layer interface for | |||
with a selected set of features and configuration parameters that | end-to-end delivery of data across the network (this may also be | |||
implements a single transport service, e.g., a protocol stack (RTP | an upper layer protocol or tunnel encapsulation). | |||
over UDP). | ||||
Application: an entity that uses the transport layer for end-to-end | ||||
delivery data across the network (this may also be an upper layer | ||||
protocol or tunnel encapsulation). | ||||
Application-specific knowledge: knowledge that only applications | Application-specific knowledge: knowledge that only applications | |||
have. | have. | |||
Endpoint: an entity that communicates with one or more other | End system: an entity that communicates with one or more other end | |||
endpoints using a transport protocol. | systems using a transport protocol. An end system provides a | |||
Connection: shared state of two or more endpoints that persists | transport layer interface to applications. | |||
across messages that are transmitted between these endpoints. | Connection: shared state of two or more end systems that persists | |||
across messages that are transmitted between these end systems. | ||||
Connection Group: a set of connections which share the same | Connection Group: a set of connections which share the same | |||
configuration (configuring one of them causes all other | configuration (configuring one of them causes all other | |||
connections in the same group to be configured in the same way). | connections in the same group to be configured in the same way). | |||
We call connections that belong to a connection group "grouped", | We call connections that belong to a connection group "grouped", | |||
while "ungrouped" connections are not a part of a connection | while "ungrouped" connections are not a part of a connection | |||
group. | group. | |||
Socket: the combination of a destination IP address and a | Socket: the combination of a destination IP address and a | |||
destination port number. | destination port number. | |||
Moreover, throughout the document, the protocol name "UDP(-Lite)" is | Moreover, throughout the document, the protocol name "UDP(-Lite)" is | |||
used when discussing transport features that are equivalent for UDP | used when discussing transport features that are equivalent for UDP | |||
and UDP-Lite; similarly, the protocol name "TCP" refers to both TCP | and UDP-Lite; similarly, the protocol name "TCP" refers to both TCP | |||
and MPTCP. | and MPTCP. | |||
3. The Minimal Set of Transport Features | 3. Deriving the minimal set | |||
Based on the categorization, reduction, and discussion in Appendix A, | We assume that applications have no specific requirements that need | |||
knowledge about the network, e.g. regarding the choice of network | ||||
interface or the end-to-end path. Even with these assumptions, there | ||||
are certain requirements that are strictly kept by transport | ||||
protocols today, and these must also be kept by a transport system. | ||||
Some of these requirements relate to transport features that we call | ||||
"Functional". | ||||
Functional transport features provide functionality that cannot be | ||||
used without the application knowing about them, or else they violate | ||||
assumptions that might cause the application to fail. For example, | ||||
ordered message delivery is a functional transport feature: it cannot | ||||
be configured without the application knowing about it because the | ||||
application's assumption could be that messages always arrive in | ||||
order. Failure includes any change in application behavior that | ||||
is not performance-related, e.g., a change in security properties. | ||||
"Change DSCP" and "Disable Nagle algorithm" are examples of transport | ||||
features that we call "Optimizing": if a transport system | ||||
autonomously decides to enable or disable them, an application will | ||||
not fail, but a transport system may be able to communicate more | ||||
efficiently if the application is in control of this optimizing | ||||
transport feature. These transport features require application- | ||||
specific knowledge (e.g., about delay/bandwidth requirements or the | ||||
length of future data blocks that are to be transmitted). | ||||
The transport features of IETF transport protocols that do not | ||||
require application-specific knowledge and could therefore be | ||||
utilized by a transport system on its own without involving the | ||||
application are called "Automatable". | ||||
We approach the construction of a minimal set of transport features | ||||
in the following way: | ||||
1. Categorization (Appendix A): the superset of transport features | ||||
from [RFC8303] is presented, and transport features are | ||||
categorized as Functional, Optimizing or Automatable for later | ||||
reduction. | ||||
2. Reduction (Section 4): a shorter list of transport features is | ||||
derived from the categorization in the first step. This removes | ||||
all transport features that do not require application-specific | ||||
knowledge or would result in semantically incorrect behavior if | ||||
they were implemented over TCP or UDP. | ||||
3. Discussion (Section 5): the resulting list shows a number of | ||||
peculiarities that are discussed, to provide a basis for | ||||
constructing the minimal set. | ||||
4. Construction (Section 6): Based on the reduced set and the | ||||
discussion of the transport features therein, a minimal set is | ||||
constructed. | ||||
Following [RFC8303] and retaining its terminology, we divide the | ||||
transport features into two main groups as follows: | ||||
1. CONNECTION related transport features | ||||
- ESTABLISHMENT | ||||
- AVAILABILITY | ||||
- MAINTENANCE | ||||
- TERMINATION | ||||
2. DATA Transfer related transport features | ||||
- Sending Data | ||||
- Receiving Data | ||||
- Errors | ||||
4. The Reduced Set of Transport Features | ||||
By hiding automatable transport features from the application, a | ||||
transport system can gain opportunities to automate the usage of | ||||
network-related functionality. This can make the transport system | ||||
easier to use for the application programmer, and it allows for | ||||
optimizations that may not be possible for an application. For | ||||
instance, system-wide configurations regarding the usage of multiple | ||||
interfaces can better be exploited if the choice of the interface is | ||||
not entirely up to the application. Therefore, since they are not | ||||
strictly necessary to expose in a transport system, we do not include | ||||
automatable transport features in the reduced set of transport | ||||
features. This leaves us with only the transport features that are | ||||
either optimizing or functional. | ||||
A transport system should be able to communicate via TCP or UDP if | ||||
alternative transport protocols are found not to work. For many | ||||
transport features, this is possible -- often by simply not doing | ||||
anything when a specific request is made. For some transport | ||||
features, however, it was identified that direct usage of neither TCP | ||||
nor UDP is possible: in these cases, even not doing anything would | ||||
incur semantically incorrect behavior. Whenever an application would | ||||
make use of one of these transport features, this would eliminate the | ||||
possibility to use TCP or UDP. Thus, we only keep the functional and | ||||
optimizing transport features for which an implementation over either | ||||
TCP or UDP is possible in our reduced set. | ||||
The following list contains the transport features from Appendix A, | ||||
reduced using these rules. The "minimal set" derived in this | ||||
document is meant to be implementable "one-sided" over TCP, and, with | ||||
limitations, UDP. In the list, we therefore precede a transport | ||||
feature with "T:" if an implementation over TCP is possible, "U:" if | ||||
an implementation over UDP is possible, and "T,U:" if an | ||||
implementation over either TCP or UDP is possible. | ||||
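The two reduction rules above (drop automatable features, then drop features that neither TCP nor UDP can implement) can be sketched roughly as follows. This is only an illustration: the `Feature` record, the `reduce_features` and `prefix` helpers, and the two entries marked "hypothetical" are assumptions made for the example and are not defined by the draft. The category given to "Receive a message" is likewise assumed; the text only states that it is not automatable.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    FUNCTIONAL = "functional"
    OPTIMIZING = "optimizing"
    AUTOMATABLE = "automatable"


@dataclass(frozen=True)
class Feature:
    name: str
    category: Category
    over_tcp: bool  # an implementation over TCP is possible
    over_udp: bool  # an implementation over UDP is possible


def reduce_features(features):
    """Keep non-automatable features that TCP or UDP can implement."""
    return [
        f for f in features
        if f.category is not Category.AUTOMATABLE and (f.over_tcp or f.over_udp)
    ]


def prefix(feature):
    """Return the "T:", "U:" or "T,U:" marker used in the reduced list."""
    protocols = [
        p for p, ok in (("T", feature.over_tcp), ("U", feature.over_udp)) if ok
    ]
    return ",".join(protocols) + ":"


SAMPLE = [
    Feature("Ordered message delivery", Category.FUNCTIONAL, True, False),
    Feature("Disable Nagle algorithm", Category.OPTIMIZING, True, True),
    # Category assumed for this sketch; the text only says "non-automatable".
    Feature("Receive a message", Category.FUNCTIONAL, False, True),
    # Hypothetical entries, present only to exercise the two filters.
    Feature("Hypothetical: choose network interface", Category.AUTOMATABLE, True, True),
    Feature("Hypothetical: not implementable over TCP or UDP", Category.FUNCTIONAL, False, False),
]

for f in reduce_features(SAMPLE):
    print(prefix(f), f.name)
```

The two hypothetical entries are filtered out, one by each rule, and the remaining three are printed with the same markers that the list below uses.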
4.1. CONNECTION Related Transport Features | ||||
ESTABLISHMENT: | ||||
o T,U: Connect | ||||
o T,U: Specify number of attempts and/or timeout for the first | ||||
establishment message | ||||
o T: Configure authentication | ||||
o T: Hand over a message to reliably transfer (possibly multiple | ||||
times) before connection establishment | ||||
o T: Hand over a message to reliably transfer during connection | ||||
establishment | ||||
AVAILABILITY: | ||||
o T,U: Listen | ||||
o T: Configure authentication | ||||
MAINTENANCE: | ||||
o T: Change timeout for aborting connection (using retransmit limit | ||||
or time value) | ||||
o T: Suggest timeout to the peer | ||||
o T,U: Disable Nagle algorithm | ||||
o T,U: Notification of Excessive Retransmissions (early warning | ||||
below abortion threshold) | ||||
o T,U: Specify DSCP field | ||||
o T,U: Notification of ICMP error message arrival | ||||
o T: Change authentication parameters | ||||
o T: Obtain authentication information | ||||
o T,U: Set Cookie life value | ||||
o T,U: Choose a scheduler to operate between streams of an | ||||
association | ||||
o T,U: Configure priority or weight for a scheduler | ||||
o T,U: Disable checksum when sending | ||||
o T,U: Disable checksum requirement when receiving | ||||
o T,U: Specify checksum coverage used by the sender | ||||
o T,U: Specify minimum checksum coverage required by receiver | ||||
o T,U: Specify DF field | ||||
o T,U: Get max. transport-message size that may be sent using a non- | ||||
fragmented IP packet from the configured interface | ||||
o T,U: Get max. transport-message size that may be received from the | ||||
configured interface | ||||
o T,U: Obtain ECN field | ||||
o T,U: Enable and configure a "Low Extra Delay Background Transfer" | ||||
TERMINATION: | ||||
o T: Close after reliably delivering all remaining data, causing an | ||||
event informing the application on the other side | ||||
o T: Abort without delivering remaining data, causing an event | ||||
informing the application on the other side | ||||
o T,U: Abort without delivering remaining data, not causing an event | ||||
informing the application on the other side | ||||
o T,U: Timeout event when data could not be delivered for too long | ||||
4.2. DATA Transfer Related Transport Features | ||||
4.2.1. Sending Data | ||||
o T: Reliably transfer data, with congestion control | ||||
o T: Reliably transfer a message, with congestion control | ||||
o T,U: Unreliably transfer a message | ||||
o T: Configurable Message Reliability | ||||
o T: Ordered message delivery (potentially slower than unordered) | ||||
o T,U: Unordered message delivery (potentially faster than ordered) | ||||
o T,U: Request not to bundle messages | ||||
o T: Specifying a key id to be used to authenticate a message | ||||
o T,U: Request not to delay the acknowledgement (SACK) of a message | ||||
4.2.2. Receiving Data | ||||
o T,U: Receive data (with no message delimiting) | ||||
o U: Receive a message | ||||
o T,U: Information about partial message arrival | ||||
4.2.3. Errors | ||||
This section describes sending failures that are associated with a | ||||
specific call in the "Sending Data" category (Appendix A.2.1). | ||||
o T,U: Notification of send failures | ||||
o T,U: Notification that the stack has no more user data to send | ||||
o T,U: Notification to a receiver that a partial message delivery | ||||
has been aborted | ||||
5. Discussion | ||||
The reduced set in the previous section exhibits a number of | ||||
peculiarities, which we will discuss in the following. This section | ||||
focuses on TCP because, with the exception of one particular | ||||
transport feature ("Receive a message" -- we will discuss this in | ||||
Section 5.1), the list shows that UDP is strictly a subset of TCP. | ||||
We can first try to understand how to build a transport system that | ||||
can run over TCP, and then narrow down the result further so that | ||||
the system can always run over either TCP or UDP (which | ||||
effectively means removing everything related to reliability, | ||||
ordering, authentication and closing/aborting with a notification to | ||||
the peer). | ||||
Note that, because the functional transport features of UDP are -- | ||||
with the exception of "Receive a message" -- a subset of TCP, TCP can | ||||
be used as a replacement for UDP whenever an application does not | ||||
need message delimiting (e.g., because the application-layer protocol | ||||
already does it). This has been recognized by many applications that | ||||
already do this in practice, by trying to communicate with UDP at | ||||
first, and falling back to TCP in case of a connection failure. | ||||
5.1. Sending Messages, Receiving Bytes | ||||
For implementing a transport system over TCP, there are several | ||||
transport features related to sending, but only a single transport | ||||
feature related to receiving: "Receive data (with no message | ||||
delimiting)" (and, strangely, "information about partial message | ||||
arrival"). Notably, the transport feature "Receive a message" is | ||||
also the only non-automatable transport feature of UDP(-Lite) for | ||||
which no implementation over TCP is possible. | ||||
To support these TCP receiver semantics, we define an "Application- | ||||
Framed Bytestream" (AFra-Bytestream). AFra-Bytestreams allow senders | ||||
to operate on messages while minimizing changes to the TCP socket | ||||
API. In particular, nothing changes on the receiver side - data can | ||||
be accepted via a normal TCP socket. | ||||
In an AFra-Bytestream, the sending application can optionally inform | ||||
the transport about message boundaries and required properties per | ||||
message (configurable order and reliability, or embedding a request | ||||
not to delay the acknowledgement of a message). Whenever the sending | ||||
application specifies per-message properties that relax the notion of | ||||
reliable in-order delivery of bytes, it must assume that the | ||||
receiving application is 1) able to determine message boundaries, | ||||
provided that messages are always kept intact, and 2) able to accept | ||||
these relaxed per-message properties. Any signaling of such | ||||
information to the peer is up to an application-layer protocol and | ||||
considered out of scope of this document. | ||||
For example, if an application requests to transfer fixed-size | ||||
messages of 100 bytes with partial reliability, this needs the | ||||
receiving application to be prepared to accept data in chunks of 100 | ||||
bytes. If, then, some of these 100-byte messages are missing (e.g., | ||||
if SCTP with Configurable Reliability is used), this is the expected | ||||
application behavior. With TCP, no messages would be missing, but | ||||
this is also correct for the application, and the possible | ||||
retransmission delay is acceptable within the best-effort service | ||||
model (see [RFC7305], Section 3.5). Still, the receiving application | ||||
would separate the byte stream into 100-byte chunks. | ||||
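The fixed-size case above can be sketched as follows. This is only an illustration, not part of the draft: the helper name split_fixed and the simulated reads are assumptions, and a real receiver would obtain the chunks from a normal TCP socket.

```python
MESSAGE_SIZE = 100  # fixed message size taken from the example in the text


def split_fixed(buffer: bytes, size: int = MESSAGE_SIZE):
    """Return (complete_messages, remainder) for a received byte buffer."""
    usable = len(buffer) - (len(buffer) % size)
    messages = [buffer[i:i + size] for i in range(0, usable, size)]
    return messages, buffer[usable:]


# Two simulated TCP reads that do not line up with message boundaries.
pending = b""
received = []
for chunk in (b"a" * 150, b"a" * 50 + b"b" * 100):
    messages, pending = split_fixed(pending + chunk)
    received.extend(messages)
```

The receiver keeps any trailing partial message in pending until more bytes arrive, so the application only ever sees intact 100-byte messages, whether or not some were dropped by a partially reliable transport.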
Note that this usage of messages does not require all messages to be | ||||
equal in size. Many application protocols use some form of Type- | ||||
Length-Value (TLV) encoding, e.g. by defining a header including | ||||
length fields; another alternative is the use of byte stuffing | ||||
methods such as COBS [COBS]. If an application needs message | ||||
numbers, e.g. to restore the correct sequence of messages, these must | ||||
also be encoded by the application itself, as the sequence number | ||||
related transport features of SCTP are not provided by the "minimum | ||||
set" (in the interest of enabling usage of TCP). | ||||
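For variable-size messages, a length-prefixed framing could look roughly like the sketch below. The 2-byte big-endian length header and the names frame and deframe are hypothetical choices made for illustration; the draft leaves the framing format to the application-layer protocol.

```python
import struct

# Hypothetical header: a 2-byte big-endian length field (payloads up to 65535 bytes).
HEADER = struct.Struct("!H")


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length so the receiver can find the boundary."""
    return HEADER.pack(len(payload)) + payload


def deframe(buffer: bytes):
    """Return (complete_messages, remainder) from a received byte buffer."""
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # the last message is still incomplete
        messages.append(buffer[offset + HEADER.size:end])
        offset = end
    return messages, buffer[offset:]
```

If the application also needs message numbers, a sequence number field could be added to the same header in the same way.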
5.2. Stream Schedulers Without Streams | ||||
We have already stated that multi-streaming does not require | ||||
application-specific knowledge. Potential benefits or disadvantages | ||||
of, e.g., using two streams of an SCTP association versus using two | ||||
separate SCTP associations or TCP connections are related to | ||||
knowledge about the network and the particular transport protocol in | ||||
use, not the application. However, the transport features "Choose a | ||||
scheduler to operate between streams of an association" and | ||||
"Configure priority or weight for a scheduler" operate on streams. | ||||
Here, streams identify communication channels between which a | ||||
scheduler operates, and they can be assigned a priority. Moreover, | ||||
the transport features in the MAINTENANCE category all operate on | ||||
associations in case of SCTP, i.e. they apply to all streams in that | ||||
association. | ||||
With only these semantics necessary to represent, the interface to a | ||||
transport system becomes easier if we assume that connections may be | ||||
not only a transport protocol's connection or association, but could | ||||
also be a stream of an existing SCTP association, for example. We | ||||
only need to allow for a way to define a possible grouping of | ||||
connections. Then, all MAINTENANCE transport features can be said to | ||||
operate on connection groups, not connections, and a scheduler | ||||
operates on the connections within a group. | ||||
To be compatible with multiple transport protocols and uniformly | ||||
allow access to both transport connections and streams of a multi- | ||||
streaming protocol, the semantics of opening and closing need to be | ||||
the most restrictive subset of all of the underlying options. For | ||||
example, TCP's support of half-closed connections can be seen as a | ||||
feature on top of the more restrictive "ABORT"; this feature cannot | ||||
be supported because not all protocols used by a transport system | ||||
(including streams of an association) support half-closed | ||||
connections. | ||||
5.3. Early Data Transmission | ||||
There are two transport features related to transferring a message | ||||
early: "Hand over a message to reliably transfer (possibly multiple | ||||
times) before connection establishment", which relates to TCP Fast | ||||
Open [RFC7413], and "Hand over a message to reliably transfer during | ||||
connection establishment", which relates to SCTP's ability to | ||||
transfer data together with the COOKIE-Echo chunk. Also without TCP | ||||
Fast Open, TCP can transfer data during the handshake, together with | ||||
the SYN packet -- however, the receiver of this data may not hand it | ||||
over to the application until the handshake has completed. Also, | ||||
different from TCP Fast Open, this data is not delimited as a message | ||||
by TCP (thus, not visible as a "message"). This functionality is | ||||
commonly available in TCP and supported in several implementations, | ||||
even though the TCP specification does not explain how to provide it | ||||
to applications. | ||||
A transport system could differentiate between the cases of | ||||
transmitting data "before" (possibly multiple times) or "during" the | ||||
handshake. Alternatively, it could also assume that data that are | ||||
handed over early will be transmitted as early as possible, and | ||||
"before" the handshake would only be used for messages that are | ||||
explicitly marked as "idempotent" (i.e., it would be acceptable to | ||||
transfer them multiple times). | ||||
The amount of data that can successfully be transmitted before or | ||||
during the handshake depends on various factors: the transport | ||||
protocol, the use of header options, the choice between IPv4 and IPv6 and | ||||
the Path MTU. A transport system should therefore allow a sending | ||||
application to query the maximum amount of data it can possibly | ||||
transmit before (or, if exposed, during) connection establishment. | ||||
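One way to read the second alternative is sketched below. The function name, its parameters and the returned labels are all hypothetical; the two limits stand for whatever the transport system reports when queried for the maximum amount of early data.

```python
def choose_early_transmission(message_len: int, idempotent: bool,
                              max_before: int, max_during: int) -> str:
    """Pick the earliest point at which a message could be transmitted.

    max_before / max_during are the queried limits (in bytes) for data sent
    before or during connection establishment; zero means "not possible".
    """
    # "before" may deliver the message multiple times, so it is only used
    # for messages the application explicitly marked as idempotent.
    if idempotent and 0 < message_len <= max_before:
        return "before"
    if 0 < message_len <= max_during:
        return "during"
    return "after"
```

A message that is too large for either limit, or that is handed over when both limits are zero, simply waits until the connection is established.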
5.4. Sender Running Dry | ||||
The transport feature "Notification that the stack has no more user | ||||
data to send" relates to SCTP's "SENDER DRY" notification. Such | ||||
notifications can, in principle, be used to avoid having an | ||||
unnecessarily large send buffer, yet ensure that the transport sender | ||||
always has data available when it has an opportunity to transmit it. | ||||
This has been found to be very beneficial for some applications | ||||
[WWDC2015]. However, "SENDER DRY" truly means that the entire send | ||||
buffer (including both unsent and unacknowledged data) has emptied -- | ||||
i.e., when it notifies the sender, it is already too late, the | ||||
transport protocol already missed an opportunity to send data. Some | ||||
modern TCP implementations now include the unspecified | ||||
"TCP_NOTSENT_LOWAT" socket option that was proposed in [WWDC2015], | ||||
which limits the amount of unsent data that TCP can keep in the | ||||
socket buffer; this allows specifying the buffer fill level at which | ||||
the socket becomes writable, rather than waiting for the buffer to | ||||
run empty. | ||||
SCTP also allows configuring the sender-side buffer: the automatable | ||||
Transport Feature "Configure send buffer size" provides this | ||||
functionality, but only for the complete buffer, which includes both | ||||
unsent and unacknowledged data. SCTP does not allow controlling these | ||||
two sizes separately. It therefore makes sense for a transport | ||||
system to allow for uniform access to "TCP_NOTSENT_LOWAT" as well as | ||||
the "SENDER DRY" notification. | ||||
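The difference between the two notifications can be illustrated with a toy model of a send buffer. This is a simplified sketch, not an implementation of either protocol: the class SendBuffer and its fields are assumptions, and lowat plays the role of a limit on unsent data such as the one set via "TCP_NOTSENT_LOWAT".

```python
from dataclasses import dataclass


@dataclass
class SendBuffer:
    unsent: int = 0   # bytes handed over but not yet transmitted
    unacked: int = 0  # bytes transmitted but not yet acknowledged
    lowat: int = 0    # hypothetical low-water mark on unsent bytes

    def write(self, n: int) -> None:
        self.unsent += n

    def transmit(self, n: int) -> None:
        n = min(n, self.unsent)
        self.unsent -= n
        self.unacked += n

    def acknowledge(self, n: int) -> None:
        self.unacked -= min(n, self.unacked)

    def below_lowat(self) -> bool:
        """Early signal: little unsent data is left, so refill now."""
        return self.unsent < self.lowat

    def sender_dry(self) -> bool:
        """Late signal: both unsent and unacknowledged data are gone."""
        return self.unsent == 0 and self.unacked == 0


buf = SendBuffer(lowat=1000)
buf.write(4000)
buf.transmit(3500)
early = (buf.below_lowat(), buf.sender_dry())
buf.transmit(500)
buf.acknowledge(4000)
late = (buf.below_lowat(), buf.sender_dry())
```

In this run the low-water signal fires while 500 unsent bytes remain, whereas the "dry" condition only holds after everything has been acknowledged, which is the point at which a sending opportunity may already have been missed.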
5.5. Capacity Profile | ||||
The transport features: | ||||
o Disable Nagle algorithm | ||||
o Enable and configure a "Low Extra Delay Background Transfer" | ||||
o Specify DSCP field | ||||
all relate to a QoS-like application need such as "low latency" or | ||||
"scavenger". In the interest of flexibility of a transport system, | ||||
they could therefore be offered in a uniform, more abstract way, | ||||
where a transport system could e.g. decide by itself how to use | ||||
combinations of LEDBAT-like congestion control and certain DSCP | ||||
values, and an application would only specify a general "capacity | ||||
profile" (a description of how it wants to use the available | ||||
capacity). A need for "lowest possible latency at the expense of | ||||
overhead" could then translate into automatically disabling the Nagle | ||||
algorithm. | ||||
In some cases, the Nagle algorithm is best controlled directly by the | ||||
application because it is not only related to a general profile but | ||||
also to knowledge about the size of future messages. For fine-grain | ||||
control over Nagle-like functionality, the "Request not to bundle | ||||
messages" is available. | ||||
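A capacity profile could be mapped to concrete settings roughly as sketched below. The profile names, the TransportSettings fields and the particular mapping are hypothetical; the draft only says that a transport system could make such decisions by itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransportSettings:
    disable_nagle: bool      # send small messages without waiting to coalesce
    ledbat_like_cc: bool     # use a LEDBAT-like background congestion control
    low_priority_dscp: bool  # mark packets with a lower-priority DSCP value


# Hypothetical mapping from abstract profiles to protocol-level settings.
PROFILES = {
    "default": TransportSettings(False, False, False),
    "low_latency": TransportSettings(True, False, False),
    "scavenger": TransportSettings(False, True, True),
}


def settings_for(profile: str) -> TransportSettings:
    """Fall back to the default settings for unknown profile names."""
    return PROFILES.get(profile, PROFILES["default"])
```

The application states only the profile; which DSCP value or congestion control variant is used remains a decision of the transport system.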
5.6. Security | ||||
Both TCP and SCTP offer authentication. TCP authenticates complete | ||||
segments. SCTP allows configuring which of SCTP's chunk types must | ||||
always be authenticated -- if this is exposed as such, it creates an | ||||
undesirable dependency on the transport protocol. For compatibility | ||||
with TCP, a transport system should only allow configuring | ||||
authentication for complete transport layer packets, including | ||||
headers, IP pseudo-header (if any) and payload. | ||||
Security is discussed in a separate document | ||||
[I-D.ietf-taps-transport-security]. The minimal set presented in the | ||||
present document excludes all security related transport features | ||||
from Appendix A: "Configure authentication", "Change authentication | ||||
parameters", "Obtain authentication information" and "Set Cookie | ||||
life value" as well as "Specifying a key id to be used to | ||||
authenticate a message". | ||||
5.7. Packet Size | ||||
UDP(-Lite) has a transport feature called "Specify DF field". This | ||||
yields an error message in case of sending a message that exceeds the | ||||
Path MTU, which is necessary for a UDP-based application to be able | ||||
to implement Path MTU Discovery (a function that UDP-based | ||||
applications must do by themselves). The "Get max. transport-message | ||||
size that may be sent using a non-fragmented IP packet from the | ||||
configured interface" transport feature yields an upper limit for the | ||||
Path MTU (minus headers) and can therefore help to implement Path MTU | ||||
Discovery more efficiently. | ||||
6. The Minimal Set of Transport Features | ||||
Based on the categorization, reduction, and discussion in Section 3, | ||||
this section describes a minimal set of transport features that end | this section describes a minimal set of transport features that end | |||
systems should offer. The described transport system can be | systems should offer. Any configuration based on the described minimum | |||
implemented over TCP. Elements of the system that are not marked | set of transport features can always be realized over TCP but also | |||
with "!UDP" can also be implemented over UDP. | gives the transport system flexibility to choose another transport if | |||
implemented. In the text of this section, "not UDP" is used to | ||||
indicate elements of the system that cannot be implemented over UDP. | ||||
Conversely, all elements of the system that are not marked with "not | ||||
UDP" can also be implemented over UDP. | ||||
The arguments laid out in Appendix A.3 ("discussion") were used to | The arguments laid out in Section 5 ("discussion") were used to make | |||
make the final representation of the minimal set as short, simple and | the final representation of the minimal set as short, simple and | |||
general as possible. There may be situations where these arguments | general as possible. There may be situations where these arguments | |||
do not apply -- e.g., implementers may have specific reasons to | do not apply -- e.g., implementers may have specific reasons to | |||
expose multi-streaming as a visible functionality to applications, or | expose multi-streaming as a visible functionality to applications, or | |||
the restrictive open / close semantics may be problematic under some | the restrictive open / close semantics may be problematic under some | |||
circumstances. In such cases, the representation in Appendix A.2 | circumstances. In such cases, the representation in Section 4 | |||
("reduction") should be considered. | ("reduction") should be considered. | |||
As in Appendix A, Appendix A.2 and [RFC8303], we categorize the | As in Section 3, Section 4 and [RFC8303], we categorize the minimal | |||
minimal set of transport features as 1) CONNECTION related | set of transport features as 1) CONNECTION related (ESTABLISHMENT, | |||
(ESTABLISHMENT, AVAILABILITY, MAINTENANCE, TERMINATION) and 2) DATA | AVAILABILITY, MAINTENANCE, TERMINATION) and 2) DATA Transfer related | |||
Transfer related (Sending Data, Receiving Data, Errors). Here, the | (Sending Data, Receiving Data, Errors). Here, the focus is on | |||
focus is on connections that the transport system offers as an | connections that the transport system offers as an abstraction to the | |||
abstraction to the application, as opposed to connections of | application, as opposed to connections of transport protocols that | |||
transport protocols that the transport system uses. | the transport system uses. | |||
3.1. ESTABLISHMENT, AVAILABILITY and TERMINATION | 6.1. ESTABLISHMENT, AVAILABILITY and TERMINATION | |||
A connection must first be "created" to allow for some initial | A connection must first be "created" to allow for some initial | |||
configuration to be carried out before the transport system can | configuration to be carried out before the transport system can | |||
actively or passively establish communication with a remote endpoint. | actively or passively establish communication with a remote end | |||
All configuration parameters in Section 3.2 can be used initially, | system. All configuration parameters in Section 6.2 can be used | |||
although some of them may only take effect when a connection has been | initially, although some of them may only take effect when a | |||
established with a chosen transport protocol. Configuring a | connection has been established with a chosen transport protocol. | |||
connection early helps a transport system make the right decisions. | Configuring a connection early helps a transport system make the | |||
For example, grouping information can influence the transport system | right decisions. For example, grouping information can influence the | |||
to implement a connection as a stream of a multi-streaming protocol's | transport system to implement a connection as a stream of a multi- | |||
existing association or not. | streaming protocol's existing association or not. | |||
For ungrouped connections, early configuration is necessary because | For ungrouped connections, early configuration is necessary because | |||
it allows the transport system to know which protocols it should try | it allows the transport system to know which protocols it should try | |||
to use. In particular, a transport system that only makes a one-time | to use. In particular, a transport system that only makes a one-time | |||
choice for a particular protocol must know early about strict | choice for a particular protocol must know early about strict | |||
requirements that must be kept, or it can end up in a deadlock | requirements that must be kept, or it can end up in a deadlock | |||
situation (e.g., having chosen UDP and later be asked to support | situation (e.g., having chosen UDP and later be asked to support | |||
reliable transfer). As an example description of how to correctly | reliable transfer). As an example description of how to correctly | |||
handle these cases, we provide the following decision tree (this is | handle these cases, we provide the following decision tree (this is | |||
derived from Appendix A.2.1 excluding authentication, as explained in | derived from Section 4.1 excluding authentication, as explained in | |||
Section 6): | Section 9): | |||
- Will it ever be necessary to offer any of the following? | - Will it ever be necessary to offer any of the following? | |||
* Reliably transfer data | * Reliably transfer data | |||
* Notify the peer of closing/aborting | * Notify the peer of closing/aborting | |||
* Preserve data ordering | * Preserve data ordering | |||
Yes: SCTP or TCP can be used. | Yes: SCTP or TCP can be used. | |||
- Is any of the following useful to the application? | - Is any of the following useful to the application? | |||
* Choosing a scheduler to operate between connections | * Choosing a scheduler to operate between connections | |||
in a group, with the possibility to configure a priority | in a group, with the possibility to configure a priority | |||
skipping to change at page 7, line 4 ¶ | skipping to change at page 16, line 5 ¶ | |||
Yes: UDP-Lite is preferred. | Yes: UDP-Lite is preferred. | |||
No: UDP is preferred. | No: UDP is preferred. | |||
Note that this decision tree is not optimal for all cases. For | Note that this decision tree is not optimal for all cases. For | |||
example, if an application wants to use "Specify checksum coverage | example, if an application wants to use "Specify checksum coverage | |||
used by the sender", which is only offered by UDP-Lite, and | used by the sender", which is only offered by UDP-Lite, and | |||
"Configure priority or weight for a scheduler", which is only offered | "Configure priority or weight for a scheduler", which is only offered | |||
by SCTP, the above decision tree will always choose UDP-Lite, making | by SCTP, the above decision tree will always choose UDP-Lite, making | |||
it impossible to use SCTP's schedulers with priorities between | it impossible to use SCTP's schedulers with priorities between | |||
grouped connections. We caution implementers to be aware of the full | grouped connections. Also, several other factors may influence the | |||
set of trade-offs, for which we recommend consulting the list in | decisions for or against a protocol -- e.g. penetration rates, the | |||
Appendix A.2.1 when deciding how to initialize a connection. | ability to work through NATs, etc. We caution implementers to be | |||
aware of the full set of trade-offs, for which we recommend | ||||
consulting the list in Section 4.1 when deciding how to initialize a | ||||
connection. | ||||
To summarize, the following parameters serve as input for the | To summarize, the following parameters serve as input for the | |||
transport system to help it choose and configure a suitable protocol: | transport system to help it choose and configure a suitable protocol: | |||
o Reliability: a boolean that should be set to true when any of the | o Reliability: a boolean that should be set to true when any of the | |||
following will be useful to the application: reliably transfer | following will be useful to the application: reliably transfer | |||
data; notify the peer of closing/aborting; preserve data ordering. | data; notify the peer of closing/aborting; preserve data ordering. | |||
o Checksum coverage: a boolean to specify whether it will be useful | o Checksum coverage: a boolean to specify whether it will be useful | |||
to the application to specify checksum coverage when sending or | to the application to specify checksum coverage when sending or | |||
receiving. | receiving. | |||
skipping to change at page 7, line 36 ¶ | skipping to change at page 16, line 40 ¶ | |||
application: hand over a message to reliably transfer (possibly | application: hand over a message to reliably transfer (possibly | |||
multiple times) before connection establishment; suggest timeout | multiple times) before connection establishment; suggest timeout | |||
to the peer; notification of excessive retransmissions (early | to the peer; notification of excessive retransmissions (early | |||
warning below abortion threshold); notification of ICMP error | warning below abortion threshold); notification of ICMP error | |||
message arrival. | message arrival. | |||
Once a connection is created, it can be queried for the maximum | Once a connection is created, it can be queried for the maximum | |||
amount of data that an application can possibly expect to have | amount of data that an application can possibly expect to have | |||
reliably transmitted before or during transport connection | reliably transmitted before or during transport connection | |||
establishment (with zero being a possible answer) (see | establishment (with zero being a possible answer) (see | |||
Section 3.2.1). An application can also give the connection a | Section 6.2.1). An application can also give the connection a | |||
message for reliable transmission before or during connection | message for reliable transmission before or during connection | |||
establishment (!UDP); the transport system will then try to transmit | establishment (not UDP); the transport system will then try to | |||
it as early as possible. An application can facilitate sending a | transmit it as early as possible. An application can facilitate | |||
message particularly early by marking it as "idempotent" (see | sending a message particularly early by marking it as "idempotent" | |||
Section 3.3.1); in this case, the receiving application must be | (see Section 6.3.1); in this case, the receiving application must be | |||
prepared to potentially receive multiple copies of the message | prepared to potentially receive multiple copies of the message | |||
(because idempotent messages are reliably transferred, asking for | (because idempotent messages are reliably transferred, asking for | |||
idempotence is not necessary for systems that support UDP). | idempotence is not necessary for systems that support UDP). | |||
After creation, a transport system can actively establish | After creation, a transport system can actively establish | |||
communication with a peer, or it can passively listen for incoming | communication with a peer, or it can passively listen for incoming | |||
connection requests. Note that active establishment may or may not | connection requests. Note that active establishment may or may not | |||
trigger a notification on the listening side. It is possible that | trigger a notification on the listening side. It is possible that | |||
the first notification on the listening side is the arrival of the | the first notification on the listening side is the arrival of the | |||
first data that the active side sends (a receiver-side transport | first data that the active side sends (a receiver-side transport | |||
system could handle this by continuing to block a "Listen" call, | system could handle this by continuing to block a "Listen" call, | |||
immediately followed by issuing "Receive", for example; callback- | immediately followed by issuing "Receive", for example; callback- | |||
based implementations could simply skip the equivalent of "Listen"). | based implementations could simply skip the equivalent of "Listen"). | |||
This also means that the active opening side is assumed to be the | This also means that the active opening side is assumed to be the | |||
first side sending data. | first side sending data. | |||
A transport system can actively close a connection, i.e. terminate it | A transport system can actively close a connection, i.e. terminate it | |||
after reliably delivering all remaining data to the peer (if reliable | after reliably delivering all remaining data to the peer (if reliable | |||
data delivery was requested earlier (!UDP)), in which case the peer | data delivery was requested earlier (not UDP)), in which case the | |||
is notified that the connection is closed. Alternatively, a | peer is notified that the connection is closed. Alternatively, a | |||
connection can be aborted without delivering outstanding data to the | connection can be aborted without delivering outstanding data to the | |||
peer. In case reliable or partially reliable data delivery was | peer. In case reliable or partially reliable data delivery was | |||
requested earlier (!UDP), the peer is notified that the connection is | requested earlier (not UDP), the peer is notified that the connection | |||
aborted. A timeout can be configured to abort a connection when data | is aborted. A timeout can be configured to abort a connection when | |||
could not be delivered for too long (!UDP); however, timeout-based | data could not be delivered for too long (not UDP); however, timeout- | |||
abortion does not notify the peer application that the connection has | based abortion does not notify the peer application that the | |||
been aborted. Because half-closed connections are not supported, | connection has been aborted. Because half-closed connections are not | |||
when a host implementing a transport system receives a notification | supported, when a host implementing a transport system receives a | |||
that the peer is closing or aborting the connection (!UDP), its peer | notification that the peer is closing or aborting the connection (not | |||
may not be able to read outstanding data. This means that | UDP), its peer may not be able to read outstanding data. This means | |||
unacknowledged data residing a transport system's send buffer may | that unacknowledged data residing in a transport system's send buffer | |||
have to be dropped from that buffer upon arrival of a "close" or | may have to be dropped from that buffer upon arrival of a "close" or | |||
"abort" notification from the peer. | "abort" notification from the peer. | |||
3.2. MAINTENANCE | 6.2. MAINTENANCE | |||
A transport system must offer means to group connections, but it | A transport system must offer means to group connections, but it | |||
cannot guarantee truly grouping them using the transport protocols | cannot guarantee truly grouping them using the transport protocols | |||
that it uses (e.g., it cannot be guaranteed that connections become | that it uses (e.g., it cannot be guaranteed that connections become | |||
multiplexed as streams on a single SCTP association when SCTP may not | multiplexed as streams on a single SCTP association when SCTP may not | |||
be available). The transport system must therefore ensure that | be available). The transport system must therefore ensure that | |||
group- versus non-group-configurations are handled correctly in some | group- versus non-group-configurations are handled correctly in some | |||
way (e.g., by applying the configuration to all grouped connections | way (e.g., by applying the configuration to all grouped connections | |||
even when they are not multiplexed, or informing the application | even when they are not multiplexed, or informing the application | |||
about grouping success or failure). | about grouping success or failure). | |||
As a general rule, any configuration described below should be | As a general rule, any configuration described below should be | |||
carried out as early as possible to aid the transport system's | carried out as early as possible to aid the transport system's | |||
decision making. | decision making. | |||
3.2.1. Connection groups | 6.2.1. Connection groups | |||
The following transport features and notifications (some directly | The following transport features and notifications (some directly | |||
from Appendix A.2, some new or changed, based on the discussion in | from Section 4, some new or changed, based on the discussion in | |||
Appendix A.3) automatically apply to all grouped connections: | Section 5) automatically apply to all grouped connections: | |||
(!UDP) Configure a timeout: this can be done with the following | (not UDP) Configure a timeout: this can be done with the following | |||
parameters: | parameters: | |||
o A timeout value for aborting connections, in seconds | o A timeout value for aborting connections, in seconds | |||
o A timeout value to be suggested to the peer (if possible), in | o A timeout value to be suggested to the peer (if possible), in | |||
seconds | seconds | |||
o The number of retransmissions after which the application should | o The number of retransmissions after which the application should | |||
be notified of "Excessive Retransmissions" | be notified of "Excessive Retransmissions" | |||
Configure urgency: this can be done with the following parameters: | Configure urgency: this can be done with the following parameters: | |||
skipping to change at page 9, line 31 ¶ | skipping to change at page 18, line 40 ¶ | |||
[I-D.ietf-tsvwg-rtcweb-qos]). | [I-D.ietf-tsvwg-rtcweb-qos]). | |||
o A buffer limit (in bytes); when the sender has less than the | o A buffer limit (in bytes); when the sender has less than the | |||
provided limit of bytes in the buffer, the application may be | provided limit of bytes in the buffer, the application may be | |||
notified. Notifications are not guaranteed, and it is optional | notified. Notifications are not guaranteed, and it is optional | |||
for a transport system to support buffer limit values greater than | for a transport system to support buffer limit values greater than | |||
0. Note that this limit and its notification should operate | 0. Note that this limit and its notification should operate | |||
across the buffers of the whole transport system, i.e. also any | across the buffers of the whole transport system, i.e. also any | |||
potential buffers that the transport system itself may use on top | potential buffers that the transport system itself may use on top | |||
of the transport's send buffer. | of the transport's send buffer. | |||
Following Appendix A.3.7, these properties can be queried: | Following Section 5.7, these properties can be queried: | |||
o The maximum message size that may be sent without fragmentation | o The maximum message size that may be sent without fragmentation | |||
via the configured interface. This is optional for a transport | via the configured interface. This is optional for a transport | |||
system to offer, and may return an error ("not available"). It | system to offer, and may return an error ("not available"). It | |||
can aid applications implementing Path MTU Discovery. | can aid applications implementing Path MTU Discovery. | |||
o The maximum transport message size that can be sent, in bytes. | o The maximum transport message size that can be sent, in bytes. | |||
Irrespective of fragmentation, there is a size limit for the | Irrespective of fragmentation, there is a size limit for the | |||
messages that can be handed over to SCTP or UDP(-Lite); because | messages that can be handed over to SCTP or UDP(-Lite); because | |||
the service provided by a transport system is independent of the | the service provided by a transport system is independent of the | |||
transport protocol, it must allow an application to query this | transport protocol, it must allow an application to query this | |||
value -- the maximum size of a message in an Application-Framed- | value -- the maximum size of a message in an Application-Framed- | |||
Bytestream (see Appendix A.3.1). This may also return an error | Bytestream (see Section 5.1). This may also return an error when | |||
when data is not delimited ("not available"). | data is not delimited ("not available"). | |||
o The maximum transport message size that can be received from the | o The maximum transport message size that can be received from the | |||
configured interface, in bytes (or "not available"). | configured interface, in bytes (or "not available"). | |||
o The maximum amount of data that can possibly be sent before or | o The maximum amount of data that can possibly be sent before or | |||
during connection establishment, in bytes. | during connection establishment, in bytes. | |||
In addition to the already mentioned closing / aborting notifications | In addition to the already mentioned closing / aborting notifications | |||
and possible send errors, the following notifications can occur: | and possible send errors, the following notifications can occur: | |||
o Excessive Retransmissions: the configured (or a default) number of | o Excessive Retransmissions: the configured (or a default) number of | |||
retransmissions has been reached, yielding this early warning | retransmissions has been reached, yielding this early warning | |||
skipping to change at page 10, line 19 ¶ | skipping to change at page 19, line 28 ¶ | |||
the conveyed ICMP message has arrived. | the conveyed ICMP message has arrived. | |||
o ECN Arrival (parameter: ECN value): a packet carrying the conveyed | o ECN Arrival (parameter: ECN value): a packet carrying the conveyed | |||
ECN value has arrived. This can be useful for applications | ECN value has arrived. This can be useful for applications | |||
implementing congestion control. | implementing congestion control. | |||
o Timeout (parameter: s seconds): data could not be delivered for s | o Timeout (parameter: s seconds): data could not be delivered for s | |||
seconds. | seconds. | |||
o Drain: the send buffer has either drained below the configured | o Drain: the send buffer has either drained below the configured | |||
buffer limit or it has become completely empty. This is a generic | buffer limit or it has become completely empty. This is a generic | |||
notification that tries to enable uniform access to | notification that tries to enable uniform access to | |||
"TCP_NOTSENT_LOWAT" as well as the "SENDER DRY" notification (as | "TCP_NOTSENT_LOWAT" as well as the "SENDER DRY" notification (as | |||
discussed in Appendix A.3.4 -- SCTP's "SENDER DRY" is a special | discussed in Section 5.4 -- SCTP's "SENDER DRY" is a special case | |||
case where the threshold (for unsent data) is 0 and there is also | where the threshold (for unsent data) is 0 and there is also no | |||
no more unacknowledged data in the send buffer). | more unacknowledged data in the send buffer). | |||
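The Drain notification described above can be illustrated with a small sketch. It is not taken from the draft: the class name "SendBuffer", its methods and the callback signature are hypothetical, and the sketch assumes the notification fires once when the buffered amount crosses below the configured limit and once when the buffer becomes empty. A limit of 0 then corresponds to the "SENDER DRY" special case.

```python
from typing import Callable


class SendBuffer:
    """Hypothetical send buffer that raises a Drain notification.

    Assumption: the notification is edge-triggered, i.e. it fires when the
    buffered amount crosses below `limit`, or when the buffer becomes empty.
    """

    def __init__(self, limit: int, on_drain: Callable[[int], None]) -> None:
        self.limit = limit        # configured buffer limit in bytes (0 = empty only)
        self.on_drain = on_drain  # called with the number of bytes still buffered
        self.buffered = 0

    def enqueue(self, data: bytes) -> None:
        """The application hands over data to be sent."""
        self.buffered += len(data)

    def mark_sent(self, nbytes: int) -> None:
        """The transport reports that nbytes have left the buffer."""
        before = self.buffered
        self.buffered = max(0, before - nbytes)
        crossed_limit = self.limit > 0 and before >= self.limit > self.buffered
        became_empty = before > 0 and self.buffered == 0
        if crossed_limit or became_empty:
            self.on_drain(self.buffered)


events = []
buf = SendBuffer(limit=100, on_drain=events.append)
buf.enqueue(b"x" * 150)
buf.mark_sent(40)   # 110 bytes left, still at or above the limit
buf.mark_sent(20)   # 90 bytes left, crossed below the limit
buf.mark_sent(90)   # buffer is now empty
```

Since notifications are not guaranteed, an application using such an interface would still need to work when no Drain event ever arrives.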
3.2.2. Individual connections | 6.2.2. Individual connections | |||
Configure priority or weight for a scheduler, as described in | Configure priority or weight for a scheduler, as described in | |||
[RFC8260]. | [RFC8260]. | |||
Configure checksum usage: this can be done with the following | Configure checksum usage: this can be done with the following | |||
parameters, but there is no guarantee that any checksum limitations | parameters, but there is no guarantee that any checksum limitations | |||
will indeed be enforced (the default behavior is "full coverage, | will indeed be enforced (the default behavior is "full coverage, | |||
checksum enabled"): | checksum enabled"): | |||
o A boolean to enable / disable usage of a checksum when sending | o A boolean to enable / disable usage of a checksum when sending | |||
o The desired coverage (in bytes) of the checksum used when sending | o The desired coverage (in bytes) of the checksum used when sending | |||
o A boolean to enable / disable requiring a checksum when receiving | o A boolean to enable / disable requiring a checksum when receiving | |||
o The required minimum coverage (in bytes) of the checksum when | o The required minimum coverage (in bytes) of the checksum when | |||
receiving | receiving | |||
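As an illustration only, the four checksum parameters listed above could be grouped as follows. The names "ChecksumConfig" and "acceptable" are hypothetical, and representing "full coverage" as None is an assumption of this sketch; the draft only states that the default is "full coverage, checksum enabled" and that enforcement is not guaranteed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChecksumConfig:
    """Hypothetical container for the four checksum parameters."""
    send_checksum: bool = True               # use a checksum when sending
    send_coverage: Optional[int] = None      # bytes covered; None = full coverage
    require_checksum: bool = True            # require a checksum when receiving
    min_recv_coverage: Optional[int] = None  # bytes required; None = full coverage

    def __post_init__(self) -> None:
        for value in (self.send_coverage, self.min_recv_coverage):
            if value is not None and value < 0:
                raise ValueError("coverage must be a non-negative byte count")


def acceptable(cfg: ChecksumConfig, has_checksum: bool,
               covered_bytes: int, message_len: int) -> bool:
    """Return True if a received message satisfies the receive-side settings.

    Assumption: a required coverage larger than the message is treated as
    "the whole message must be covered".
    """
    if not cfg.require_checksum:
        return True
    if not has_checksum:
        return False
    if cfg.min_recv_coverage is None:
        needed = message_len
    else:
        needed = min(cfg.min_recv_coverage, message_len)
    return covered_bytes >= needed
```

A transport system that cannot enforce these limits (for example, one running over plain UDP or TCP) would simply keep the default behavior.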
3.3. DATA Transfer | 6.3. DATA Transfer | |||
3.3.1. Sending Data | 6.3.1. Sending Data | |||
When sending a message, no guarantees are given about the | When sending a message, no guarantees are given about the | |||
preservation of message boundaries to the peer; if message boundaries | preservation of message boundaries to the peer; if message boundaries | |||
are needed, the receiving application at the peer must know about | are needed, the receiving application at the peer must know about | |||
them beforehand (or the transport system cannot use TCP). Note that | them beforehand (or the transport system cannot use TCP). Note that | |||
an application should already be able to hand over data before the | an application should already be able to hand over data before the | |||
transport system establishes a connection with a chosen transport | transport system establishes a connection with a chosen transport | |||
protocol. Regarding the message that is being handed over, the | protocol. Regarding the message that is being handed over, the | |||
following parameters can be used: | following parameters can be used: | |||
o Reliability: this parameter is used to convey a choice of: fully | o Reliability: this parameter is used to convey a choice of: fully | |||
reliable with congestion control (!UDP), unreliable without | reliable with congestion control (not UDP), unreliable without | |||
congestion control, unreliable with congestion control (!UDP), | congestion control, unreliable with congestion control (not UDP), | |||
partially reliable with congestion control (see [RFC3758] and | partially reliable with congestion control (see [RFC3758] and | |||
[RFC7496] for details on how to specify partial reliability) | [RFC7496] for details on how to specify partial reliability) (not | |||
(!UDP). The latter two choices are optional for a transport | UDP). The latter two choices are optional for a transport system | |||
system to offer and may result in full reliability. Note that | to offer and may result in full reliability. Note that | |||
applications sending unreliable data without congestion control | applications sending unreliable data without congestion control | |||
should themselves perform congestion control in accordance with | should themselves perform congestion control in accordance with | |||
[RFC2914]. | [RFC8085]. | |||
o (!UDP) Ordered: this boolean parameter lets an application choose | o (not UDP) Ordered: this boolean parameter lets an application | |||
between ordered message delivery (true) and possibly unordered, | choose between ordered message delivery (true) and possibly | |||
potentially faster message delivery (false). | unordered, potentially faster message delivery (false). | |||
o Bundle: a boolean that expresses a preference for allowing | o Bundle: a boolean that expresses a preference for allowing | |||
messages to be bundled (true) or not (false). No guarantees are given. | messages to be bundled (true) or not (false). No guarantees are given. | |||
o DelAck: a boolean that, if false, lets an application request that | o DelAck: a boolean that, if false, lets an application request that | |||
the peer not delay the acknowledgement for this message. | the peer not delay the acknowledgement for this message. | |||
o Fragment: a boolean that expresses a preference for allowing | o Fragment: a boolean that expresses a preference for allowing | |||
messages to be fragmented at the IP level (true) or not (false). No | messages to be fragmented at the IP level (true) or not (false). No | |||
guarantees are given. | guarantees are given. | |||
o (!UDP) Idempotent: a boolean that expresses whether a message is | o (not UDP) Idempotent: a boolean that expresses whether a message | |||
idempotent (true) or not (false). Idempotent messages may arrive | is idempotent (true) or not (false). Idempotent messages may | |||
multiple times at the receiver (but they will arrive at least | arrive multiple times at the receiver (but they will arrive at | |||
once). When data is idempotent it can be used by the receiver | least once). When data is idempotent it can be used by the | |||
immediately on a connection establishment attempt. Thus, if data | receiver immediately on a connection establishment attempt. Thus, | |||
is handed over before the transport system establishes a | if data is handed over before the transport system establishes a | |||
connection with a chosen transport protocol, stating that a | connection with a chosen transport protocol, stating that a | |||
message is idempotent facilitates transmitting it to the peer | message is idempotent facilitates transmitting it to the peer | |||
application particularly early. | application particularly early. | |||
An application can be notified of a failure to send a specific | An application can be notified of a failure to send a specific | |||
message. There is no guarantee of such notifications, i.e. send | message. There is no guarantee of such notifications, i.e. send | |||
failures can also silently occur. | failures can also silently occur. | |||
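The per-message parameters listed above could be represented as in the following sketch. All names ("Reliability", "SendParameters", "usable_over_udp") and all default values are assumptions made for illustration; the draft does not define defaults. The helper reflects one possible reading of the "(!UDP)" / "(not UDP)" markers, namely that only unreliable, unordered transfer without congestion control can be mapped to UDP.

```python
from dataclasses import dataclass
from enum import Enum


class Reliability(Enum):
    FULL_WITH_CC = "fully reliable with congestion control"
    UNRELIABLE_NO_CC = "unreliable without congestion control"
    UNRELIABLE_WITH_CC = "unreliable with congestion control"       # optional to offer
    PARTIAL_WITH_CC = "partially reliable with congestion control"  # optional to offer


@dataclass
class SendParameters:
    """Hypothetical per-message parameters; the defaults are assumptions."""
    reliability: Reliability = Reliability.FULL_WITH_CC
    ordered: bool = True      # not available over UDP
    bundle: bool = True       # preference only, no guarantee
    del_ack: bool = True      # False asks the peer not to delay the ACK
    fragment: bool = True     # preference only, no guarantee
    idempotent: bool = False  # not available over UDP


def usable_over_udp(p: SendParameters) -> bool:
    """One reading of the UDP markers: UDP can only carry messages that are
    unreliable, not congestion controlled, and not required to be ordered."""
    return p.reliability is Reliability.UNRELIABLE_NO_CC and not p.ordered


datagram = SendParameters(reliability=Reliability.UNRELIABLE_NO_CC, ordered=False)
```

An application that selects UNRELIABLE_NO_CC remains responsible for its own congestion control, as noted in the Reliability bullet.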
3.3.2. Receiving Data | 6.3.2. Receiving Data | |||
A receiving application obtains an "Application-Framed Bytestream" | A receiving application obtains an "Application-Framed Bytestream" | |||
(AFra-Bytestream); this concept is further described in | (AFra-Bytestream); this concept is further described in Section 5.1. | |||
Appendix A.3.1. In line with TCP's receiver semantics, an AFra- | In line with TCP's receiver semantics, an AFra-Bytestream is just a | |||
Bytestream is just a stream of bytes to the receiver. If message | stream of bytes to the receiver. If message boundaries were | |||
boundaries were specified by the sender, a receiver-side transport | specified by the sender, a receiver-side transport system | |||
system implementing only the minimum set of transport services | implementing only the minimum set of transport services defined here | |||
defined here will still not inform the receiving application about | will still not inform the receiving application about them (this | |||
them (this limitation is only needed for transport systems that are | limitation is only needed for transport systems that are implemented | |||
implemented to directly use TCP). | to directly use TCP). | |||
Different from TCP's semantics, if the sending application has | Different from TCP's semantics, if the sending application has | |||
allowed that messages are not fully reliably transferred, or | allowed that messages are not fully reliably transferred, or | |||
delivered out of order, then such re-ordering or unreliability may be | delivered out of order, then such re-ordering or unreliability may be | |||
reflected per message in the arriving data. Messages will always | reflected per message in the arriving data. Messages will always | |||
stay intact - i.e. if an incomplete message is contained at the end | stay intact - i.e. if an incomplete message is contained at the end | |||
of the arriving data block, this message is guaranteed to continue in | of the arriving data block, this message is guaranteed to continue in | |||
the next arriving data block. | the next arriving data block. | |||
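Because the receiver only sees a stream of bytes, an application that needs message boundaries has to recover them itself. The sketch below shows one common way to do this, a 4-byte big-endian length prefix; this framing format is an assumption of the example and is not prescribed by the draft (other schemes, such as the byte stuffing described in [COBS], would serve equally well). The names "frame" and "Deframer" are hypothetical.

```python
import struct
from typing import List


def frame(message: bytes) -> bytes:
    """Sender side: prefix the message with its length (4 bytes, network order)."""
    return struct.pack("!I", len(message)) + message


class Deframer:
    """Receiver side: rebuild messages from arbitrarily split data blocks."""

    def __init__(self) -> None:
        self._pending = bytearray()

    def feed(self, block: bytes) -> List[bytes]:
        """Add an arriving data block and return all messages completed by it."""
        self._pending += block
        complete = []
        while len(self._pending) >= 4:
            (length,) = struct.unpack_from("!I", self._pending, 0)
            if len(self._pending) < 4 + length:
                break  # incomplete message: it continues in the next block
            complete.append(bytes(self._pending[4:4 + length]))
            del self._pending[:4 + length]
        return complete


stream = frame(b"hello") + frame(b"world!")
deframer = Deframer()
first = deframer.feed(stream[:7])    # header plus a partial message
second = deframer.feed(stream[7:])   # remainder of the stream
```

This relies on the property stated above that an incomplete message at the end of one data block continues in the next block; it works unchanged if unordered or unreliable delivery rearranges or drops whole messages.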
4. Acknowledgements | 7. Acknowledgements | |||
The authors would like to thank all the participants of the TAPS | The authors would like to thank all the participants of the TAPS | |||
Working Group and the NEAT and MAMI research projects for valuable | Working Group and the NEAT and MAMI research projects for valuable | |||
input to this document. We especially thank Michael Tuexen for help | input to this document. We especially thank Michael Tuexen for help | |||
with connection establishment/teardown, Gorry Fairhurst | with connection establishment/teardown, Gorry Fairhurst | |||
for his suggestions regarding fragmentation and packet sizes, and | for his suggestions regarding fragmentation and packet sizes, and | |||
Spencer Dawkins for his extremely detailed and constructive review. | Spencer Dawkins for his extremely detailed and constructive review. | |||
This work has received funding from the European Union's Horizon 2020 | This work has received funding from the European Union's Horizon 2020 | |||
research and innovation programme under grant agreement No. 644334 | research and innovation programme under grant agreement No. 644334 | |||
(NEAT). | (NEAT). | |||
5. IANA Considerations | 8. IANA Considerations | |||
This memo includes no request to IANA. | This memo includes no request to IANA. | |||
6. Security Considerations | 9. Security Considerations | |||
Authentication, confidentiality protection, and integrity protection | Authentication, confidentiality protection, and integrity protection | |||
are identified as transport features by [RFC8095]. As currently | are identified as transport features by [RFC8095]. As currently | |||
deployed in the Internet, these features are generally provided by a | deployed in the Internet, these features are generally provided by a | |||
protocol or layer on top of the transport protocol; no current full- | protocol or layer on top of the transport protocol; no current full- | |||
featured standards-track transport protocol provides all of these | featured standards-track transport protocol provides all of these | |||
transport features on its own. Therefore, these transport features | transport features on its own. Therefore, these transport features | |||
are not considered in this document, with the exception of native | are not considered in this document, with the exception of native | |||
authentication capabilities of TCP and SCTP for which the security | authentication capabilities of TCP and SCTP for which the security | |||
considerations in [RFC5925] and [RFC4895] apply. The minimum | considerations in [RFC5925] and [RFC4895] apply. The minimum | |||
requirements for a secure transport system are discussed in a | requirements for a secure transport system are discussed in a | |||
separate document (Section 5 on Security Features and Transport | separate document (Section 5 on Security Features and Transport | |||
Dependencies of [I-D.ietf-taps-transport-security]). | Dependencies of [I-D.ietf-taps-transport-security]). | |||
7. References | 10. References | |||
7.1. Normative References | 10.1. Normative References | |||
[I-D.ietf-taps-transport-security] | [I-D.ietf-taps-transport-security] | |||
Pauly, T., Perkins, C., Rose, K., and C. Wood, "A Survey | Pauly, T., Perkins, C., Rose, K., and C. Wood, "A Survey | |||
of Transport Security Protocols", draft-ietf-taps- | of Transport Security Protocols", draft-ietf-taps- | |||
transport-security-02 (work in progress), June 2018. | transport-security-02 (work in progress), June 2018. | |||
[RFC8095] Fairhurst, G., Ed., Trammell, B., Ed., and M. Kuehlewind, | [RFC8095] Fairhurst, G., Ed., Trammell, B., Ed., and M. Kuehlewind, | |||
Ed., "Services Provided by IETF Transport Protocols and | Ed., "Services Provided by IETF Transport Protocols and | |||
Congestion Control Mechanisms", RFC 8095, | Congestion Control Mechanisms", RFC 8095, | |||
DOI 10.17487/RFC8095, March 2017, | DOI 10.17487/RFC8095, March 2017, | |||
<https://www.rfc-editor.org/info/rfc8095>. | <https://www.rfc-editor.org/info/rfc8095>. | |||
[RFC8303] Welzl, M., Tuexen, M., and N. Khademi, "On the Usage of | [RFC8303] Welzl, M., Tuexen, M., and N. Khademi, "On the Usage of | |||
Transport Features Provided by IETF Transport Protocols", | Transport Features Provided by IETF Transport Protocols", | |||
RFC 8303, DOI 10.17487/RFC8303, February 2018, | RFC 8303, DOI 10.17487/RFC8303, February 2018, | |||
<https://www.rfc-editor.org/info/rfc8303>. | <https://www.rfc-editor.org/info/rfc8303>. | |||
7.2. Informative References | 10.2. Informative References | |||
[COBS] Cheshire, S. and M. Baker, "Consistent Overhead Byte | [COBS] Cheshire, S. and M. Baker, "Consistent Overhead Byte | |||
Stuffing", IEEE/ACM Transactions on Networking Vol. 7, No. | Stuffing", IEEE/ACM Transactions on Networking Vol. 7, No. | |||
2, April 1999. | 2, April 1999. | |||
[I-D.ietf-tsvwg-rtcweb-qos] | [I-D.ietf-tsvwg-rtcweb-qos] | |||
Jones, P., Dhesikan, S., Jennings, C., and D. Druta, "DSCP | Jones, P., Dhesikan, S., Jennings, C., and D. Druta, "DSCP | |||
Packet Markings for WebRTC QoS", draft-ietf-tsvwg-rtcweb- | Packet Markings for WebRTC QoS", draft-ietf-tsvwg-rtcweb- | |||
qos-18 (work in progress), August 2016. | qos-18 (work in progress), August 2016. | |||
[LBE-draft] | [LBE-draft] | |||
Bless, R., "A Lower Effort Per-Hop Behavior (LE PHB)", | Bless, R., "A Lower Effort Per-Hop Behavior (LE PHB)", | |||
Internet-draft draft-tsvwg-le-phb-03, February 2018. | Internet-draft draft-tsvwg-le-phb-03, February 2018. | |||
[RFC2914] Floyd, S., "Congestion Control Principles", BCP 41, | [POSIX] "IEEE Standard for Information Technology--Portable | |||
RFC 2914, DOI 10.17487/RFC2914, September 2000, | Operating System Interface (POSIX(R)) Base Specifications, | |||
<https://www.rfc-editor.org/info/rfc2914>. | Issue 7", IEEE Std 1003.1-2017 (Revision of IEEE Std | |||
1003.1-2008), January 2018, | ||||
<http://www.opengroup.org/onlinepubs/9699919799/functions/ | ||||
contents.html>. | ||||
[RFC3758] Stewart, R., Ramalho, M., Xie, Q., Tuexen, M., and P. | [RFC3758] Stewart, R., Ramalho, M., Xie, Q., Tuexen, M., and P. | |||
Conrad, "Stream Control Transmission Protocol (SCTP) | Conrad, "Stream Control Transmission Protocol (SCTP) | |||
Partial Reliability Extension", RFC 3758, | Partial Reliability Extension", RFC 3758, | |||
DOI 10.17487/RFC3758, May 2004, | DOI 10.17487/RFC3758, May 2004, | |||
<https://www.rfc-editor.org/info/rfc3758>. | <https://www.rfc-editor.org/info/rfc3758>. | |||
[RFC4895] Tuexen, M., Stewart, R., Lei, P., and E. Rescorla, | [RFC4895] Tuexen, M., Stewart, R., Lei, P., and E. Rescorla, | |||
"Authenticated Chunks for the Stream Control Transmission | "Authenticated Chunks for the Stream Control Transmission | |||
Protocol (SCTP)", RFC 4895, DOI 10.17487/RFC4895, August | Protocol (SCTP)", RFC 4895, DOI 10.17487/RFC4895, August | |||
skipping to change at page 14, line 20 ¶ | skipping to change at page 23, line 39 ¶ | |||
[RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP | [RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP | |||
Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014, | Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014, | |||
<https://www.rfc-editor.org/info/rfc7413>. | <https://www.rfc-editor.org/info/rfc7413>. | |||
[RFC7496] Tuexen, M., Seggelmann, R., Stewart, R., and S. Loreto, | [RFC7496] Tuexen, M., Seggelmann, R., Stewart, R., and S. Loreto, | |||
"Additional Policies for the Partially Reliable Stream | "Additional Policies for the Partially Reliable Stream | |||
Control Transmission Protocol Extension", RFC 7496, | Control Transmission Protocol Extension", RFC 7496, | |||
DOI 10.17487/RFC7496, April 2015, | DOI 10.17487/RFC7496, April 2015, | |||
<https://www.rfc-editor.org/info/rfc7496>. | <https://www.rfc-editor.org/info/rfc7496>. | |||
[RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage | ||||
Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, | ||||
March 2017, <https://www.rfc-editor.org/info/rfc8085>. | ||||
[RFC8260] Stewart, R., Tuexen, M., Loreto, S., and R. Seggelmann, | [RFC8260] Stewart, R., Tuexen, M., Loreto, S., and R. Seggelmann, | |||
"Stream Schedulers and User Message Interleaving for the | "Stream Schedulers and User Message Interleaving for the | |||
Stream Control Transmission Protocol", RFC 8260, | Stream Control Transmission Protocol", RFC 8260, | |||
DOI 10.17487/RFC8260, November 2017, | DOI 10.17487/RFC8260, November 2017, | |||
<https://www.rfc-editor.org/info/rfc8260>. | <https://www.rfc-editor.org/info/rfc8260>. | |||
[RFC8304] Fairhurst, G. and T. Jones, "Transport Features of the | [RFC8304] Fairhurst, G. and T. Jones, "Transport Features of the | |||
User Datagram Protocol (UDP) and Lightweight UDP (UDP- | User Datagram Protocol (UDP) and Lightweight UDP (UDP- | |||
Lite)", RFC 8304, DOI 10.17487/RFC8304, February 2018, | Lite)", RFC 8304, DOI 10.17487/RFC8304, February 2018, | |||
<https://www.rfc-editor.org/info/rfc8304>. | <https://www.rfc-editor.org/info/rfc8304>. | |||
[SCTP-stream-1] | ||||
Weinrank, F. and M. Tuexen, "Transparent Flow Mapping for | ||||
NEAT", IFIP NETWORKING Workshop on Future of Internet | ||||
Transport (FIT 2017), June 2017. | ||||
[SCTP-stream-2] | ||||
Welzl, M., Niederbacher, F., and S. Gjessing, "Beneficial | ||||
Transparent Deployment of SCTP", IEEE GlobeCom 2011, | ||||
December 2011. | ||||
[WWDC2015] | [WWDC2015] | |||
Lakhera, P. and S. Cheshire, "Your App and Next Generation | Lakhera, P. and S. Cheshire, "Your App and Next Generation | |||
Networks", Apple Worldwide Developers Conference 2015, San | Networks", Apple Worldwide Developers Conference 2015, San | |||
Francisco, USA, June 2015, | Francisco, USA, June 2015, | |||
<https://developer.apple.com/videos/wwdc/2015/?id=719>. | <https://developer.apple.com/videos/wwdc/2015/?id=719>. | |||
Appendix A. Deriving the minimal set | Appendix A. The Superset of Transport Features | |||
We approach the construction of a minimal set of transport features | ||||
in the following way: | ||||
1. Categorization (Appendix A.1): the superset of transport features | ||||
from [RFC8303] is presented, and transport features are | ||||
categorized for later reduction. | ||||
2. Reduction (Appendix A.2): a shorter list of transport features is | ||||
derived from the categorization in the first step. This removes | ||||
all transport features that do not require application-specific | ||||
knowledge or would result in semantically incorrect behavior if | ||||
they were implemented over TCP or UDP. | ||||
3. Discussion (Appendix A.3): the resulting list shows a number of | ||||
peculiarities that are discussed, to provide a basis for | ||||
constructing the minimal set. | ||||
4. Construction (Section 3): Based on the reduced set and the | ||||
discussion of the transport features therein, a minimal set is | ||||
constructed. | ||||
A.1. Step 1: Categorization -- The Superset of Transport Features | ||||
Following [RFC8303], we divide the transport features into two main | ||||
groups as follows: | ||||
1. CONNECTION related transport features | ||||
- ESTABLISHMENT | ||||
- AVAILABILITY | ||||
- MAINTENANCE | ||||
- TERMINATION | ||||
2. DATA Transfer related transport features | ||||
- Sending Data | ||||
- Receiving Data | ||||
- Errors | ||||
We assume that applications have no specific requirements that need | ||||
knowledge about the network, e.g. regarding the choice of network | ||||
interface or the end-to-end path. Even with these assumptions, there | ||||
are certain requirements that are strictly kept by transport | ||||
protocols today, and these must also be kept by a transport system. | ||||
Some of these requirements relate to transport features that we call | ||||
"Functional". | ||||
Functional transport features provide functionality that cannot be | ||||
used without the application knowing about them, or else they violate | ||||
assumptions that might cause the application to fail. For example, | ||||
ordered message delivery is a functional transport feature: it cannot | ||||
be configured without the application knowing about it because the | ||||
application's assumption could be that messages always arrive in | ||||
order. Failure includes any change of the application behavior that | ||||
is not performance oriented, e.g. security. | ||||
"Change DSCP" and "Disable Nagle algorithm" are examples of transport | ||||
features that we call "Optimizing": if a transport system | ||||
autonomously decides to enable or disable them, an application will | ||||
not fail, but a transport system may be able to communicate more | ||||
efficiently if the application is in control of this optimizing | ||||
transport feature. These transport features require application- | ||||
specific knowledge (e.g., about delay/bandwidth requirements or the | ||||
length of future data blocks that are to be transmitted). | ||||
The transport features of IETF transport protocols that do not | ||||
require application-specific knowledge and could therefore be | ||||
utilized by a transport system on its own without involving the | ||||
application are called "Automatable". | ||||
Finally, in three cases, transport features are aggregated and/or | ||||
slightly changed from [RFC8303] in the description below. These | ||||
transport features are marked as "ADDED". These do not add any new | ||||
functionality but just represent a simple refactoring step that helps | ||||
to streamline the derivation process (e.g., by removing a choice of a | ||||
parameter for the sake of applications that may not care about this | ||||
choice). The corresponding transport features are automatable, and | ||||
they are listed immediately below the "ADDED" transport feature. | ||||
In this description, transport services are presented following the | In this description, transport features are presented following the | |||
nomenclature "CATEGORY.[SUBCATEGORY].SERVICENAME.PROTOCOL", | nomenclature "CATEGORY.[SUBCATEGORY].FEATURENAME.PROTOCOL", | |||
equivalent to "pass 2" in [RFC8303]. We also sketch how functional | equivalent to "pass 2" in [RFC8303]. We also sketch how functional | |||
or optimizing transport features can be implemented by a transport | or optimizing transport features can be implemented by a transport | |||
system. The "minimal set" derived in this document is meant to be | system. The "minimal set" derived in this document is meant to be | |||
implementable "one-sided" over TCP, and, with limitations, UDP. | implementable "one-sided" over TCP, and, with limitations, UDP. | |||
Hence, for all transport features that are categorized as | Hence, for all transport features that are categorized as | |||
"functional" or "optimizing", and for which no matching TCP and/or | "functional" or "optimizing", and for which no matching TCP and/or | |||
UDP primitive exists in "pass 2" of [RFC8303], a brief discussion on | UDP primitive exists in "pass 2" of [RFC8303], a brief discussion on | |||
how to implement them over TCP and/or UDP is included. | how to implement them over TCP and/or UDP is included. | |||
We designate some transport features as "automatable" on the basis of | We designate some transport features as "automatable" on the basis of | |||
skipping to change at page 16, line 46 ¶ | skipping to change at page 24, line 50 ¶ | |||
application-specific knowledge. This means that a connection that | application-specific knowledge. This means that a connection that | |||
is exhibited to an application could be implemented by using a | is exhibited to an application could be implemented by using a | |||
single stream of an SCTP association instead of mapping it to a | single stream of an SCTP association instead of mapping it to a | |||
complete SCTP association or TCP connection. This could be | complete SCTP association or TCP connection. This could be | |||
achieved by using more than one stream when an SCTP association is | achieved by using more than one stream when an SCTP association is | |||
first established (CONNECT.SCTP parameter "outbound stream | first established (CONNECT.SCTP parameter "outbound stream | |||
count"), maintaining an internal stream number, and using this | count"), maintaining an internal stream number, and using this | |||
stream number when sending data (SEND.SCTP parameter "stream | stream number when sending data (SEND.SCTP parameter "stream | |||
number"). Closing or aborting a connection could then simply free | number"). Closing or aborting a connection could then simply free | |||
the stream number for future use. This is discussed further in | the stream number for future use. This is discussed further in | |||
Appendix A.3.2. | Section 5.2. | |||
o All transport features that are related to using multiple paths or | o All transport features that are related to using multiple paths or | |||
the choice of the network interface were designated as | the choice of the network interface were designated as | |||
"automatable". Choosing a path or an interface does not depend on | "automatable". Choosing a path or an interface does not depend on | |||
application-specific knowledge. For example, "Listen" could | application-specific knowledge. For example, "Listen" could | |||
always listen on all available interfaces and "Connect" could use | always listen on all available interfaces and "Connect" could use | |||
the default interface for the destination IP address. | the default interface for the destination IP address. | |||
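The mapping of connections onto SCTP streams described in the first bullet can be sketched as a simple stream number allocator. The class "StreamNumberPool" and its method names are hypothetical, and the sketch deliberately does not call any SCTP API; it only shows the bookkeeping of "maintaining an internal stream number" and freeing it on close or abort.

```python
import heapq


class StreamNumberPool:
    """Hypothetical bookkeeping for mapping connections to SCTP streams."""

    def __init__(self, outbound_stream_count: int) -> None:
        if outbound_stream_count < 1:
            raise ValueError("an association needs at least one outbound stream")
        # A sorted list is already a valid min-heap.
        self._free = list(range(outbound_stream_count))
        self._in_use = set()

    def open_connection(self) -> int:
        """Return the stream number to use for a new connection."""
        if not self._free:
            # Assumption: the caller falls back to a new association.
            raise RuntimeError("no free stream in this association")
        number = heapq.heappop(self._free)
        self._in_use.add(number)
        return number

    def close_connection(self, number: int) -> None:
        """Closing or aborting a connection frees its stream number."""
        self._in_use.remove(number)  # raises KeyError if the number is not in use
        heapq.heappush(self._free, number)


pool = StreamNumberPool(outbound_stream_count=2)
a = pool.open_connection()
b = pool.open_connection()
pool.close_connection(a)
c = pool.open_connection()   # reuses the stream number freed by closing a
```

The stream number returned by open_connection would then be passed as the "stream number" parameter when sending data on that connection.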
A.1.1. CONNECTION Related Transport Features | Finally, in three cases, transport features are aggregated and/or | |||
slightly changed from [RFC8303] in the description below. These | ||||
transport features are marked as "CHANGED FROM RFC8303". These do | ||||
not add any new functionality but just represent a simple refactoring | ||||
step that helps to streamline the derivation process (e.g., by | ||||
removing a choice of a parameter for the sake of applications that | ||||
may not care about this choice). The corresponding transport | ||||
features are automatable, and they are listed immediately below the | ||||
"CHANGED FROM RFC8303" transport feature. | ||||
A.1. CONNECTION Related Transport Features | ||||
ESTABLISHMENT: | ESTABLISHMENT: | |||
o Connect | o Connect | |||
Protocols: TCP, SCTP, UDP(-Lite) | Protocols: TCP, SCTP, UDP(-Lite) | |||
Functional because the notion of a connection is often reflected | Functional because the notion of a connection is often reflected | |||
in applications as an expectation to be able to communicate after | in applications as an expectation to be able to communicate after | |||
a "Connect" succeeded, with a communication sequence relating to | a "Connect" succeeded, with a communication sequence relating to | |||
this transport feature that is defined by the application | this transport feature that is defined by the application | |||
protocol. | protocol. | |||
skipping to change at page 17, line 27 ¶ | skipping to change at page 25, line 41 ¶ | |||
Lite). | Lite). | |||
o Specify which IP Options must always be used | o Specify which IP Options must always be used | |||
Protocols: TCP, UDP(-Lite) | Protocols: TCP, UDP(-Lite) | |||
Automatable because IP Options relate to knowledge about the | Automatable because IP Options relate to knowledge about the | |||
network, not the application. | network, not the application. | |||
o Request multiple streams | o Request multiple streams | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because using multi-streaming does not require | Automatable because using multi-streaming does not require | |||
application-specific knowledge. | application-specific knowledge (example implementations of using | |||
Implementation: see Appendix A.3.2. | multi-streaming without involving the application are described in | |||
[SCTP-stream-1] and [SCTP-stream-2]). | ||||
Implementation: see Section 5.2. | ||||
o Limit the number of inbound streams | o Limit the number of inbound streams | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because using multi-streaming does not require | Automatable because using multi-streaming does not require | |||
application-specific knowledge. | application-specific knowledge. | |||
Implementation: see Appendix A.3.2. | Implementation: see Section 5.2. | |||
o Specify number of attempts and/or timeout for the first | o Specify number of attempts and/or timeout for the first | |||
establishment message | establishment message | |||
Protocols: TCP, SCTP | Protocols: TCP, SCTP | |||
Functional because this is closely related to potentially assumed | Functional because this is closely related to potentially assumed | |||
reliable data delivery for data that is sent before or during | reliable data delivery for data that is sent before or during | |||
connection establishment. | connection establishment. | |||
Implementation: Using a parameter of CONNECT.TCP and CONNECT.SCTP. | Implementation: Using a parameter of CONNECT.TCP and CONNECT.SCTP. | |||
Implementation over UDP: Do nothing (this is irrelevant in case of | Implementation over UDP: Do nothing (this is irrelevant in case of | |||
UDP because there, reliable data delivery is not assumed). | UDP because there, reliable data delivery is not assumed). | |||
skipping to change at page 19, line 10 ¶ | skipping to change at page 27, line 25 ¶ | |||
Implementation over TCP: not possible (TCP does not offer this | Implementation over TCP: not possible (TCP does not offer this | |||
functionality). | functionality). | |||
Implementation over UDP: not possible (UDP does not offer this | Implementation over UDP: not possible (UDP does not offer this | |||
functionality). | functionality). | |||
o Request to negotiate interleaving of user messages | o Request to negotiate interleaving of user messages | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because it requires using multiple streams, but | Automatable because it requires using multiple streams, but | |||
requesting multiple streams in the CONNECTION.ESTABLISHMENT | requesting multiple streams in the CONNECTION.ESTABLISHMENT | |||
category is automatable. | category is automatable. | |||
Implementation: via a parameter in CONNECT.SCTP. | Implementation: controlled via a parameter in CONNECT.SCTP. One | |||
possible implementation is to always try to enable interleaving. | ||||
o Hand over a message to reliably transfer (possibly multiple times) | o Hand over a message to reliably transfer (possibly multiple times) | |||
before connection establishment | before connection establishment | |||
Protocols: TCP | Protocols: TCP | |||
Functional because this is closely tied to properties of the data | Functional because this is closely tied to properties of the data | |||
that an application sends or expects to receive. | that an application sends or expects to receive. | |||
Implementation: via a parameter in CONNECT.TCP. | Implementation: via a parameter in CONNECT.TCP. | |||
Implementation over UDP: not possible (UDP does not provide | Implementation over UDP: not possible (UDP does not provide | |||
reliability). | reliability). | |||
skipping to change at page 20, line 6 ¶ | skipping to change at page 28, line 20 ¶ | |||
AVAILABILITY: | AVAILABILITY: | |||
o Listen | o Listen | |||
Protocols: TCP, SCTP, UDP(-Lite) | Protocols: TCP, SCTP, UDP(-Lite) | |||
Functional because the notion of accepting connection requests is | Functional because the notion of accepting connection requests is | |||
often reflected in applications as an expectation to be able to | often reflected in applications as an expectation to be able to | |||
communicate after a "Listen" succeeded, with a communication | communicate after a "Listen" succeeded, with a communication | |||
sequence relating to this transport feature that is defined by the | sequence relating to this transport feature that is defined by the | |||
application protocol. | application protocol. | |||
ADDED. This differs from the 3 automatable transport features | CHANGED FROM RFC8303. This differs from the 3 automatable | |||
below in that it leaves the choice of interfaces for listening | transport features below in that it leaves the choice of | |||
open. | interfaces for listening open. | |||
Implementation: by listening on all interfaces via LISTEN.TCP (not | Implementation: by listening on all interfaces via LISTEN.TCP (not | |||
providing a local IP address) or LISTEN.SCTP (providing SCTP port | providing a local IP address) or LISTEN.SCTP (providing SCTP port | |||
number / address pairs for all local IP addresses). LISTEN.UDP(- | number / address pairs for all local IP addresses). LISTEN.UDP(- | |||
Lite) supports both methods. | Lite) supports both methods. | |||
o Listen, 1 specified local interface | o Listen, 1 specified local interface | |||
Protocols: TCP, SCTP, UDP(-Lite) | Protocols: TCP, SCTP, UDP(-Lite) | |||
Automatable because decisions about local interfaces relate to | Automatable because decisions about local interfaces relate to | |||
knowledge about the network and the Operating System, not the | knowledge about the network and the Operating System, not the | |||
application. | application. | |||
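As a rough illustration of the two "Listen" variants above, the following minimal sketch maps them onto the Berkeley sockets API. LISTEN.TCP is an abstract primitive; the mapping to `bind()` and `listen()`, as well as the function names `listen_all_interfaces` and `listen_one_interface`, are assumptions made for this example and are not taken from the draft.

```python
import socket


def listen_all_interfaces(port=0, backlog=5):
    """Listen on all local IPv4 interfaces by not providing a local IP address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # An empty host string stands for INADDR_ANY, i.e. no specific local address.
    sock.bind(("", port))
    sock.listen(backlog)
    return sock


def listen_one_interface(address, port=0, backlog=5):
    """Listen on one specified local interface, identified by its IPv4 address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((address, port))
    sock.listen(backlog)
    return sock


if __name__ == "__main__":
    any_sock = listen_all_interfaces()
    lo_sock = listen_one_interface("127.0.0.1")
    print("all interfaces:", any_sock.getsockname())
    print("one interface: ", lo_sock.getsockname())
    any_sock.close()
    lo_sock.close()
```

In this sketch the application only ever needs the first function; choosing a specific interface is the kind of decision a transport system could make on its own.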
skipping to change at page 21, line 25 ¶ | skipping to change at page 29, line 39 ¶ | |||
should therefore only allow to authenticate all chunk types. Key | should therefore only allow to authenticate all chunk types. Key | |||
material must be provided in a way that is compatible with both | material must be provided in a way that is compatible with both | |||
[RFC4895] and [RFC5925]. | [RFC4895] and [RFC5925]. | |||
Implementation over UDP: not possible (UDP does not offer | Implementation over UDP: not possible (UDP does not offer | |||
authentication). | authentication). | |||
o Obtain requested number of streams | o Obtain requested number of streams | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because using multi-streaming does not require | Automatable because using multi-streaming does not require | |||
application-specific knowledge. | application-specific knowledge. | |||
Implementation: see Appendix A.3.2. | Implementation: see Section 5.2. | |||
o Limit the number of inbound streams | o Limit the number of inbound streams | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because using multi-streaming does not require | Automatable because using multi-streaming does not require | |||
application-specific knowledge. | application-specific knowledge. | |||
Implementation: see Appendix A.3.2. | Implementation: see Section 5.2. | |||
o Indicate (and/or obtain upon completion) an Adaptation Layer via | o Indicate (and/or obtain upon completion) an Adaptation Layer via | |||
an adaptation code point | an adaptation code point | |||
Protocols: SCTP | Protocols: SCTP | |||
Functional because it allows sending extra data for the sake of | Functional because it allows sending extra data for the sake of | |||
identifying an adaptation layer, which by itself is application- | identifying an adaptation layer, which by itself is application- | |||
specific. | specific. | |||
Implementation: via a parameter in LISTEN.SCTP. | Implementation: via a parameter in LISTEN.SCTP. | |||
Implementation over TCP: not possible (TCP does not offer this | Implementation over TCP: not possible (TCP does not offer this | |||
functionality). | functionality). | |||
skipping to change at page 25, line 36 ¶ | skipping to change at page 34, line 9 ¶ | |||
rnext_key from a previously received segment. Key material must | rnext_key from a previously received segment. Key material must | |||
be provided in a way that is compatible with both [RFC4895] and | be provided in a way that is compatible with both [RFC4895] and | |||
[RFC5925]. | [RFC5925]. | |||
Implementation over UDP: not possible (UDP does not offer | Implementation over UDP: not possible (UDP does not offer | |||
authentication). | authentication). | |||
o Reset Stream | o Reset Stream | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because using multi-streaming does not require | Automatable because using multi-streaming does not require | |||
application-specific knowledge. | application-specific knowledge. | |||
Implementation: see Appendix A.3.2. | Implementation: see Section 5.2. | |||
o Notification of Stream Reset | o Notification of Stream Reset | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because using multi-streaming does not require | Automatable because using multi-streaming does not require | |||
application-specific knowledge. | application-specific knowledge. | |||
Implementation: see Appendix A.3.2. | Implementation: see Section 5.2. | |||
o Reset Association | o Reset Association | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because deciding to reset an association does not | Automatable because deciding to reset an association does not | |||
require application-specific knowledge. | require application-specific knowledge. | |||
Implementation: via RESET_ASSOC.SCTP. | Implementation: via RESET_ASSOC.SCTP. | |||
o Notification of Association Reset | o Notification of Association Reset | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because this notification does not relate to | Automatable because this notification does not relate to | |||
application-specific knowledge. | application-specific knowledge. | |||
o Add Streams | o Add Streams | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because using multi-streaming does not require | Automatable because using multi-streaming does not require | |||
application-specific knowledge. | application-specific knowledge. | |||
Implementation: see Appendix A.3.2. | Implementation: see Section 5.2. | |||
o Notification of Added Stream | o Notification of Added Stream | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because using multi-streaming does not require | Automatable because using multi-streaming does not require | |||
application-specific knowledge. | application-specific knowledge. | |||
Implementation: see Appendix A.3.2. | Implementation: see Section 5.2. | |||
o Choose a scheduler to operate between streams of an association | o Choose a scheduler to operate between streams of an association | |||
Protocols: SCTP | Protocols: SCTP | |||
Optimizing because the scheduling decision requires application- | Optimizing because the scheduling decision requires application- | |||
specific knowledge. However, if a transport system would not use | specific knowledge. However, if a transport system would not use | |||
this, or wrongly configure it on its own, this would only affect | this, or wrongly configure it on its own, this would only affect | |||
the performance of data transfers; the outcome would still be | the performance of data transfers; the outcome would still be | |||
correct within the "best effort" service model. | correct within the "best effort" service model. | |||
Implementation: using SET_STREAM_SCHEDULER.SCTP. | Implementation: using SET_STREAM_SCHEDULER.SCTP. | |||
Implementation over TCP: do nothing (streams are not available in | Implementation over TCP: do nothing (streams are not available in | |||
skipping to change at page 27, line 19 ¶ | skipping to change at page 35, line 33 ¶ | |||
TCP, but no guarantee is given that this transport feature has any | TCP, but no guarantee is given that this transport feature has any | |||
effect). | effect). | |||
Implementation over UDP: do nothing (streams are not available in | Implementation over UDP: do nothing (streams are not available in | |||
UDP, but no guarantee is given that this transport feature has any | UDP, but no guarantee is given that this transport feature has any | |||
effect). | effect). | |||
o Configure send buffer size | o Configure send buffer size | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because this decision relates to knowledge about the | Automatable because this decision relates to knowledge about the | |||
network and the Operating System, not the application (see also | network and the Operating System, not the application (see also | |||
the discussion in Appendix A.3.4). | the discussion in Section 5.4). | |||
o Configure receive buffer (and rwnd) size | o Configure receive buffer (and rwnd) size | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because this decision relates to knowledge about the | Automatable because this decision relates to knowledge about the | |||
network and the Operating System, not the application. | network and the Operating System, not the application. | |||
o Configure message fragmentation | o Configure message fragmentation | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because fragmentation relates to knowledge about the | Automatable because this relates to knowledge about the network | |||
network and the Operating System, not the application. | and the Operating System, not the application. Note that this | |||
SCTP feature does not control IP-level fragmentation, but decides | ||||
on fragmentation of messages by SCTP, in the end system. | ||||
Implementation: by always enabling it with | Implementation: by always enabling it with | |||
CONFIG_FRAGMENTATION.SCTP and auto-setting the fragmentation size | CONFIG_FRAGMENTATION.SCTP and auto-setting the fragmentation size | |||
based on network or Operating System conditions. | based on network or Operating System conditions. | |||
o Configure PMTUD | o Configure PMTUD | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because Path MTU Discovery relates to knowledge about | Automatable because Path MTU Discovery relates to knowledge about | |||
the network, not the application. | the network, not the application. | |||
o Configure delayed SACK timer | o Configure delayed SACK timer | |||
skipping to change at page 29, line 5 ¶ | skipping to change at page 37, line 17 ¶ | |||
Functional because this is closely tied to properties of the data | Functional because this is closely tied to properties of the data | |||
that an application sends or expects to receive. | that an application sends or expects to receive. | |||
Implementation over TCP: not possible (TCP does not offer | Implementation over TCP: not possible (TCP does not offer | |||
identification of message boundaries). | identification of message boundaries). | |||
Implementation over UDP: not possible (UDP does not fragment | Implementation over UDP: not possible (UDP does not fragment | |||
messages). | messages). | |||
o Disable checksum when sending | o Disable checksum when sending | |||
Protocols: UDP | Protocols: UDP | |||
Functional because application-specific knowledge is necessary to | Functional because application-specific knowledge is necessary to | |||
decide whether it can be acceptable to lose data integrity. | decide whether it can be acceptable to lose data integrity with | |||
respect to random corruption. | ||||
Implementation: via SET_CHECKSUM_ENABLED.UDP. | Implementation: via SET_CHECKSUM_ENABLED.UDP. | |||
Implementation over TCP: do nothing (TCP does not offer to disable | Implementation over TCP: do nothing (TCP does not offer to disable | |||
the checksum, but transmitting data with an intact checksum will | the checksum, but transmitting data with an intact checksum will | |||
not yield a semantically wrong result). | not yield a semantically wrong result). | |||
o Disable checksum requirement when receiving | o Disable checksum requirement when receiving | |||
Protocols: UDP | Protocols: UDP | |||
Functional because application-specific knowledge is necessary to | Functional because application-specific knowledge is necessary to | |||
decide whether it can be acceptable to lose data integrity. | decide whether it can be acceptable to lose data integrity with | |||
respect to random corruption. | ||||
Implementation: via SET_CHECKSUM_REQUIRED.UDP. | Implementation: via SET_CHECKSUM_REQUIRED.UDP. | |||
Implementation over TCP: do nothing (TCP does not offer to disable | Implementation over TCP: do nothing (TCP does not offer to disable | |||
the checksum, but transmitting data with an intact checksum will | the checksum, but transmitting data with an intact checksum will | |||
not yield a semantically wrong result). | not yield a semantically wrong result). | |||
o Specify checksum coverage used by the sender | o Specify checksum coverage used by the sender | |||
Protocols: UDP-Lite | Protocols: UDP-Lite | |||
Functional because application-specific knowledge is necessary to | Functional because application-specific knowledge is necessary to | |||
decide for which parts of the data it can be acceptable to lose | decide for which parts of the data it can be acceptable to lose | |||
data integrity. | data integrity with respect to random corruption. | |||
Implementation: via SET_CHECKSUM_COVERAGE.UDP-Lite. | Implementation: via SET_CHECKSUM_COVERAGE.UDP-Lite. | |||
Implementation over TCP: do nothing (TCP does not offer to limit | Implementation over TCP: do nothing (TCP does not offer to limit | |||
the checksum length, but transmitting data with an intact checksum | the checksum length, but transmitting data with an intact checksum | |||
will not yield a semantically wrong result). | will not yield a semantically wrong result). | |||
Implementation over UDP: if checksum coverage is set to cover | Implementation over UDP: if checksum coverage is set to cover | |||
payload data, do nothing. Else, either do nothing (transmitting | payload data, do nothing. Else, either do nothing (transmitting | |||
data with an intact checksum will not yield a semantically wrong | data with an intact checksum will not yield a semantically wrong | |||
result), or use the transport feature "Disable checksum when | result), or use the transport feature "Disable checksum when | |||
sending". | sending". | |||
o Specify minimum checksum coverage required by receiver | o Specify minimum checksum coverage required by receiver | |||
Protocols: UDP-Lite | Protocols: UDP-Lite | |||
Functional because application-specific knowledge is necessary to | Functional because application-specific knowledge is necessary to | |||
decide for which parts of the data it can be acceptable to lose | decide for which parts of the data it can be acceptable to lose | |||
data integrity. | data integrity with respect to random corruption. | |||
Implementation: via SET_MIN_CHECKSUM_COVERAGE.UDP-Lite. | Implementation: via SET_MIN_CHECKSUM_COVERAGE.UDP-Lite. | |||
Implementation over TCP: do nothing (TCP does not offer to limit | Implementation over TCP: do nothing (TCP does not offer to limit | |||
the checksum length, but transmitting data with an intact checksum | the checksum length, but transmitting data with an intact checksum | |||
will not yield a semantically wrong result). | will not yield a semantically wrong result). | |||
Implementation over UDP: if checksum coverage is set to cover | Implementation over UDP: if checksum coverage is set to cover | |||
payload data, do nothing. Else, either do nothing (transmitting | payload data, do nothing. Else, either do nothing (transmitting | |||
data with an intact checksum will not yield a semantically wrong | data with an intact checksum will not yield a semantically wrong | |||
result), or use the transport feature "Disable checksum | result), or use the transport feature "Disable checksum | |||
requirement when receiving". | requirement when receiving". | |||
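The effect of partial checksum coverage described in the checksum-related features above can be sketched as follows. This is a simplified illustration, not the UDP-Lite algorithm: it applies a one's complement Internet checksum to the first `coverage` bytes of a byte string and ignores the pseudo-header and UDP-Lite's rules for valid coverage values. The function names are hypothetical.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement checksum over data (padded to an even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def covered_checksum(datagram: bytes, coverage: int) -> int:
    """Checksum over only the first `coverage` bytes of the datagram."""
    if not 0 <= coverage <= len(datagram):
        raise ValueError("coverage must lie within the datagram")
    return internet_checksum(datagram[:coverage])


if __name__ == "__main__":
    original = b"HEADER--" + b"payload that tolerates bit errors"
    corrupted = bytearray(original)
    corrupted[20] ^= 0x01  # flip one bit outside the first 8 bytes
    corrupted = bytes(corrupted)
    # With coverage 8 the corruption goes unnoticed; with full coverage it is detected.
    print(covered_checksum(original, 8) == covered_checksum(corrupted, 8))
    print(covered_checksum(original, len(original))
          == covered_checksum(corrupted, len(corrupted)))
```

Only the application can tell whether the uncovered part of a message may be delivered with bit errors, which is why these features are classified as functional.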
skipping to change at page 31, line 32 ¶ | skipping to change at page 40, line 8 ¶ | |||
network, not the application. | network, not the application. | |||
o Obtain IP Options | o Obtain IP Options | |||
Protocols: UDP(-Lite) | Protocols: UDP(-Lite) | |||
Automatable because IP Options relate to knowledge about the | Automatable because IP Options relate to knowledge about the | |||
network, not the application. | network, not the application. | |||
o Enable and configure a "Low Extra Delay Background Transfer" | o Enable and configure a "Low Extra Delay Background Transfer" | |||
Protocols: A protocol implementing the LEDBAT congestion control | Protocols: A protocol implementing the LEDBAT congestion control | |||
mechanism | mechanism | |||
Optimizing because whether this service is appropriate or not | Optimizing because whether this feature is appropriate or not | |||
depends on application-specific knowledge. However, wrongly using | depends on application-specific knowledge. However, wrongly using | |||
this will only affect the speed of data transfers (albeit | this will only affect the speed of data transfers (albeit | |||
including other transfers that may compete with the transport | including other transfers that may compete with the transport | |||
system's transfer in the network), so it is still correct within | system's transfer in the network), so it is still correct within | |||
the "best effort" service model. | the "best effort" service model. | |||
Implementation: via CONFIGURE.LEDBAT and/or SET_DSCP.TCP / | Implementation: via CONFIGURE.LEDBAT and/or SET_DSCP.TCP / | |||
SET_DSCP.SCTP / SET_DSCP.UDP(-Lite) [LBE-draft]. | SET_DSCP.SCTP / SET_DSCP.UDP(-Lite) [LBE-draft]. | |||
Implementation over TCP: do nothing (TCP does not support LEDBAT | Implementation over TCP: do nothing (TCP does not support LEDBAT | |||
congestion control, but not implementing this functionality will | congestion control, but not implementing this functionality will | |||
not yield a semantically wrong behavior). | not yield a semantically wrong behavior). | |||
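As a hedged sketch of the SET_DSCP part of the implementation note above, the following shows how a DSCP value could be set on a UDP socket through the `IP_TOS` socket option. The mapping of SET_DSCP.UDP to `IP_TOS`, the helper names, and the code point value 1 used in the example are assumptions for illustration only; the draft does not prescribe them.

```python
import socket


def dscp_to_tos(dscp: int) -> int:
    """Place a 6-bit DSCP in the upper bits of the former TOS byte (ECN bits zero)."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


def set_dscp(sock: socket.socket, dscp: int) -> None:
    """Request the given DSCP for IPv4 packets sent on this socket."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))


if __name__ == "__main__":
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    set_dscp(sock, 1)  # assumed example value for a lower-effort code point
    print(sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS))
    sock.close()
```

Whether the network honors the marking is outside the end system's control, which fits the classification of this feature as optimizing.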
skipping to change at page 33, line 11 ¶ | skipping to change at page 41, line 29 ¶ | |||
timeout. | timeout. | |||
o Timeout event when data could not be delivered for too long | o Timeout event when data could not be delivered for too long | |||
Protocols: TCP, SCTP | Protocols: TCP, SCTP | |||
Functional because this notifies that potentially assumed reliable | Functional because this notifies that potentially assumed reliable | |||
data delivery is no longer provided. | data delivery is no longer provided. | |||
Implementation: via TIMEOUT.TCP and TIMEOUT.SCTP. | Implementation: via TIMEOUT.TCP and TIMEOUT.SCTP. | |||
Implementation over UDP: do nothing (this event will not occur | Implementation over UDP: do nothing (this event will not occur | |||
with UDP). | with UDP). | |||
A.1.2. DATA Transfer Related Transport Features | A.2. DATA Transfer Related Transport Features | |||
A.1.2.1. Sending Data | A.2.1. Sending Data | |||
o Reliably transfer data, with congestion control | o Reliably transfer data, with congestion control | |||
Protocols: TCP, SCTP | Protocols: TCP, SCTP | |||
Functional because this is closely tied to properties of the data | Functional because this is closely tied to properties of the data | |||
that an application sends or expects to receive. | that an application sends or expects to receive. | |||
Implementation: via SEND.TCP and SEND.SCTP. | Implementation: via SEND.TCP and SEND.SCTP. | |||
Implementation over UDP: not possible (UDP is unreliable). | Implementation over UDP: not possible (UDP is unreliable). | |||
o Reliably transfer a message, with congestion control | o Reliably transfer a message, with congestion control | |||
Protocols: SCTP | Protocols: SCTP | |||
skipping to change at page 33, line 38 ¶ | skipping to change at page 42, line 17 ¶ | |||
boundaries will not be identifiable by the receiver, because TCP | boundaries will not be identifiable by the receiver, because TCP | |||
provides a byte stream service. | provides a byte stream service. | |||
Implementation over UDP: not possible (UDP is unreliable). | Implementation over UDP: not possible (UDP is unreliable). | |||
o Unreliably transfer a message | o Unreliably transfer a message | |||
Protocols: SCTP, UDP(-Lite) | Protocols: SCTP, UDP(-Lite) | |||
Optimizing because only applications know about the time | Optimizing because only applications know about the time | |||
criticality of their communication, and reliably transferring a | criticality of their communication, and reliably transferring a | |||
message is never incorrect for the receiver of a potentially | message is never incorrect for the receiver of a potentially | |||
unreliable data transfer, it is just slower. | unreliable data transfer, it is just slower. | |||
ADDED. This differs from the 2 automatable transport features | CHANGED FROM RFC8303. This differs from the 2 automatable | |||
below in that it leaves the choice of congestion control open. | transport features below in that it leaves the choice of | |||
congestion control open. | ||||
Implementation: via SEND.SCTP or SEND.UDP(-Lite). | Implementation: via SEND.SCTP or SEND.UDP(-Lite). | |||
Implementation over TCP: use SEND.TCP. With SEND.TCP, messages | Implementation over TCP: use SEND.TCP. With SEND.TCP, messages | |||
will be sent reliably, and message boundaries will not be | will be sent reliably, and message boundaries will not be | |||
identifiable by the receiver. | identifiable by the receiver. | |||
o Unreliably transfer a message, with congestion control | o Unreliably transfer a message, with congestion control | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because congestion control relates to knowledge about | Automatable because congestion control relates to knowledge about | |||
the network, not the application. | the network, not the application. | |||
skipping to change at page 34, line 33 ¶ | skipping to change at page 43, line 12 ¶ | |||
configuration: based on the assumption of the best-effort service | configuration: based on the assumption of the best-effort service | |||
model, unnecessarily delivering data does not violate application | model, unnecessarily delivering data does not violate application | |||
expectations. Moreover, it is not possible to associate the | expectations. Moreover, it is not possible to associate the | |||
requested reliability to a "message" in TCP anyway. | requested reliability to a "message" in TCP anyway. | |||
Implementation over UDP: not possible (UDP is unreliable). | Implementation over UDP: not possible (UDP is unreliable). | |||
o Choice of stream | o Choice of stream | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because it requires using multiple streams, but | Automatable because it requires using multiple streams, but | |||
requesting multiple streams in the CONNECTION.ESTABLISHMENT | requesting multiple streams in the CONNECTION.ESTABLISHMENT | |||
category is automatable. Implementation: see Appendix A.3.2. | category is automatable. Implementation: see Section 5.2. | |||
o Choice of path (destination address) | o Choice of path (destination address) | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because it requires using multiple sockets, but | Automatable because it requires using multiple sockets, but | |||
obtaining multiple sockets in the CONNECTION.ESTABLISHMENT | obtaining multiple sockets in the CONNECTION.ESTABLISHMENT | |||
category is automatable. | category is automatable. | |||
o Ordered message delivery (potentially slower than unordered) | o Ordered message delivery (potentially slower than unordered) | |||
Protocols: SCTP | Protocols: SCTP | |||
Functional because this is closely tied to properties of the data | Functional because this is closely tied to properties of the data | |||
skipping to change at page 36, line 27 ¶ | skipping to change at page 45, line 4 ¶ | |||
authentication). | authentication). | |||
o Request not to delay the acknowledgement (SACK) of a message | o Request not to delay the acknowledgement (SACK) of a message | |||
Protocols: SCTP | Protocols: SCTP | |||
Optimizing because only an application knows for which message it | Optimizing because only an application knows for which message it | |||
wants to quickly be informed about success / failure of its | wants to quickly be informed about success / failure of its | |||
delivery. | delivery. | |||
Implementation over TCP: do nothing (TCP does not offer this | Implementation over TCP: do nothing (TCP does not offer this | |||
functionality, but ignoring this request from the application will | functionality, but ignoring this request from the application will | |||
not yield a semantically wrong behavior). | not yield a semantically wrong behavior). | |||
Implementation over UDP: do nothing (UDP does not offer this | Implementation over UDP: do nothing (UDP does not offer this | |||
functionality, but ignoring this request from the application will | functionality, but ignoring this request from the application will | |||
not yield a semantically wrong behavior). | not yield a semantically wrong behavior). | |||
A.1.2.2. Receiving Data | A.2.2. Receiving Data | |||
o Receive data (with no message delimiting) | o Receive data (with no message delimiting) | |||
Protocols: TCP | Protocols: TCP | |||
Functional because a transport system must be able to send and | Functional because a transport system must be able to send and | |||
receive data. | receive data. | |||
Implementation: via RECEIVE.TCP. | Implementation: via RECEIVE.TCP. | |||
Implementation over UDP: do nothing (UDP only works on messages; | Implementation over UDP: do nothing (UDP only works on messages; | |||
these can be handed over, the application can still ignore the | these can be handed over, the application can still ignore the | |||
message boundaries). | message boundaries). | |||
skipping to change at page 37, line 15 ¶ | skipping to change at page 45, line 33 ¶ | |||
that an application sends or expects to receive. | that an application sends or expects to receive. | |||
Implementation: via RECEIVE.SCTP and RECEIVE.UDP(-Lite). | Implementation: via RECEIVE.SCTP and RECEIVE.UDP(-Lite). | |||
Implementation over TCP: not possible (TCP does not support | Implementation over TCP: not possible (TCP does not support | |||
identification of message boundaries). | identification of message boundaries). | |||
o Choice of stream to receive from | o Choice of stream to receive from | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because it requires using multiple streams, but | Automatable because it requires using multiple streams, but | |||
requesting multiple streams in the CONNECTION.ESTABLISHMENT | requesting multiple streams in the CONNECTION.ESTABLISHMENT | |||
category is automatable. | category is automatable. | |||
Implementation: see Appendix A.3.2. | Implementation: see Section 5.2. | |||
o Information about partial message arrival | o Information about partial message arrival | |||
Protocols: SCTP | Protocols: SCTP | |||
Functional because this is closely tied to properties of the data | Functional because this is closely tied to properties of the data | |||
that an application sends or expects to receive. | that an application sends or expects to receive. | |||
Implementation: via RECEIVE.SCTP. | Implementation: via RECEIVE.SCTP. | |||
Implementation over TCP: do nothing (this information is not | Implementation over TCP: do nothing (this information is not | |||
available with TCP). | available with TCP). | |||
Implementation over UDP: do nothing (this information is not | Implementation over UDP: do nothing (this information is not | |||
available with UDP). | available with UDP). | |||
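The "Implementation over TCP/UDP" notes in this appendix distinguish between features that a fallback protocol can safely ignore ("do nothing") and features that rule the protocol out ("not possible"). A minimal sketch of how a transport system might use that distinction is given below. The table holds three entries copied from this appendix; the data structure and the function `eligible_protocols` are hypothetical and only illustrate the idea.

```python
NATIVE, DO_NOTHING, NOT_POSSIBLE = "native", "do nothing", "not possible"

RELIABLE = "Reliably transfer data, with congestion control"
SCHEDULER = "Choose a scheduler to operate between streams of an association"
PARTIAL = "Information about partial message arrival"

# Per-protocol implementation notes, as stated for these three features.
FEATURES = {
    RELIABLE: {"TCP": NATIVE, "SCTP": NATIVE, "UDP": NOT_POSSIBLE},
    SCHEDULER: {"TCP": DO_NOTHING, "SCTP": NATIVE, "UDP": DO_NOTHING},
    PARTIAL: {"TCP": DO_NOTHING, "SCTP": NATIVE, "UDP": DO_NOTHING},
}


def eligible_protocols(requested, protocols=("TCP", "SCTP", "UDP")):
    """Return the protocols on which none of the requested features is 'not possible'."""
    return [
        proto
        for proto in protocols
        if all(FEATURES[feature][proto] != NOT_POSSIBLE for feature in requested)
    ]


if __name__ == "__main__":
    print(eligible_protocols([SCHEDULER, PARTIAL]))
    print(eligible_protocols([RELIABLE, SCHEDULER]))
```

A request that only involves "do nothing" fallbacks leaves all three protocols available, whereas asking for reliable transfer removes UDP from the candidate set.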
A.1.2.3. Errors | A.2.3. Errors | |||
This section describes sending failures that are associated with a | This section describes sending failures that are associated with a | |||
specific call in the "Sending Data" category (Appendix A.1.2.1). | specific call in the "Sending Data" category (Appendix A.2.1). | |||
o Notification of send failures | o Notification of send failures | |||
Protocols: SCTP, UDP(-Lite) | Protocols: SCTP, UDP(-Lite) | |||
Functional because this notifies that potentially assumed reliable | Functional because this notifies that potentially assumed reliable | |||
data delivery is no longer provided. | data delivery is no longer provided. | |||
ADDED. This differs from the 2 automatable transport features | CHANGED FROM RFC8303. This differs from the 2 automatable | |||
below in that it does not distinguish between unsent and | transport features below in that it does not distinguish between | |||
unacknowledged messages. | unsent and unacknowledged messages. | |||
Implementation: via SENDFAILURE-EVENT.SCTP and SEND_FAILURE.UDP(- | Implementation: via SENDFAILURE-EVENT.SCTP and SEND_FAILURE.UDP(- | |||
Lite). | Lite). | |||
Implementation over TCP: do nothing (this notification is not | Implementation over TCP: do nothing (this notification is not | |||
available and will therefore not occur with TCP). | available and will therefore not occur with TCP). | |||
o Notification of an unsent (part of a) message | o Notification of an unsent (part of a) message | |||
Protocols: SCTP, UDP(-Lite) | Protocols: SCTP, UDP(-Lite) | |||
Automatable because the distinction between unsent and | Automatable because the distinction between unsent and | |||
unacknowledged is network-specific. | unacknowledged does not relate to application-specific knowledge. | |||
o Notification of an unacknowledged (part of a) message | o Notification of an unacknowledged (part of a) message | |||
Protocols: SCTP | Protocols: SCTP | |||
Automatable because the distinction between unsent and | Automatable because the distinction between unsent and | |||
unacknowledged is network-specific. | unacknowledged does not relate to application-specific knowledge. | |||
o Notification that the stack has no more user data to send | o Notification that the stack has no more user data to send | |||
Protocols: SCTP | Protocols: SCTP | |||
Optimizing because reacting to this notification requires the | Optimizing because reacting to this notification requires the | |||
application to be involved, and ensuring that the stack does not | application to be involved, and ensuring that the stack does not | |||
run dry of data (for too long) can improve performance. | run dry of data (for too long) can improve performance. | |||
Implementation over TCP: do nothing (see the discussion in | Implementation over TCP: do nothing (see the discussion in | |||
Appendix A.3.4). | Section 5.4). | |||
Implementation over UDP: do nothing (this notification is not | Implementation over UDP: do nothing (this notification is not | |||
available and will therefore not occur with UDP). | available and will therefore not occur with UDP). | |||
o Notification to a receiver that a partial message delivery has | o Notification to a receiver that a partial message delivery has | |||
been aborted | been aborted | |||
Protocols: SCTP | Protocols: SCTP | |||
Functional because this is closely tied to properties of the data | Functional because this is closely tied to properties of the data | |||
that an application sends or expects to receive. | that an application sends or expects to receive. | |||
Implementation over TCP: do nothing (this notification is not | Implementation over TCP: do nothing (this notification is not | |||
available and will therefore not occur with TCP). | available and will therefore not occur with TCP). | |||
Implementation over UDP: do nothing (this notification is not | Implementation over UDP: do nothing (this notification is not | |||
available and will therefore not occur with UDP). | available and will therefore not occur with UDP). | |||
A.2. Step 2: Reduction -- The Reduced Set of Transport Features | ||||
By hiding automatable transport features from the application, a | ||||
transport system can gain opportunities to automate the usage of | ||||
network-related functionality. This can make the transport system | ||||
easier to use for the application programmer, and it allows for | ||||
optimizations that may not be possible for an application. For | ||||
instance, system-wide configurations regarding the usage of multiple | ||||
interfaces can better be exploited if the choice of the interface is | ||||
not entirely up to the application. Therefore, since they are not | ||||
strictly necessary to expose in a transport system, we do not include | ||||
automatable transport features in the reduced set of transport | ||||
features. This leaves us with only the transport features that are | ||||
either optimizing or functional. | ||||
A transport system should be able to communicate via TCP or UDP if | ||||
alternative transport protocols are found not to work. For many | ||||
transport features, this is possible -- often by simply not doing | ||||
anything when a specific request is made. For some transport | ||||
features, however, it was identified that direct usage of neither TCP | ||||
nor UDP is possible: in these cases, even not doing anything would | ||||
incur semantically incorrect behavior. Whenever an application would | ||||
make use of one of these transport features, this would eliminate the | ||||
possibility to use TCP or UDP. Thus, we only keep the functional and | ||||
optimizing transport features for which an implementation over either | ||||
TCP or UDP is possible in our reduced set. | ||||
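The two reduction rules above can be sketched as a simple filter. This is only an illustration: the `Feature` record, the helper names and the sample entries are assumptions made for this sketch (including the classification of "Connect" as functional and the clearly marked hypothetical entry), and the prefix helper mirrors the "T", "U" and "T,U" markers used in the list that follows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    category: str      # "functional", "optimizing" or "automatable"
    over_tcp: bool     # an implementation over TCP is possible
    over_udp: bool     # an implementation over UDP is possible


def reduce_features(features):
    """Keep non-automatable features that can run over TCP or UDP."""
    return [
        f for f in features
        if f.category in ("functional", "optimizing")
        and (f.over_tcp or f.over_udp)
    ]


def marker(feature):
    """Return the list prefix for a kept feature: "T", "U" or "T,U"."""
    parts = []
    if feature.over_tcp:
        parts.append("T")
    if feature.over_udp:
        parts.append("U")
    return ",".join(parts)


FEATURES = [
    # Category of "Connect" is an assumption of this sketch.
    Feature("Connect", "functional", True, True),
    Feature("Receive a message", "functional", False, True),
    Feature("Notification that the stack has no more user data to send",
            "optimizing", True, True),
    # Automatable, so it is dropped by the first rule.
    Feature("Choice of stream to receive from", "automatable", False, False),
    # Hypothetical entry, present only to exercise the second rule.
    Feature("(hypothetical) feature with no TCP/UDP mapping",
            "functional", False, False),
]

REDUCED = [(marker(f), f.name) for f in reduce_features(FEATURES)]
```

Only the first three sample entries survive the filter; the automatable one and the hypothetical one without a TCP or UDP mapping are dropped.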
The "minimal set" derived in this document is meant to be | ||||
implementable "one-sided" over TCP, and, with limitations, UDP. In | ||||
the following list, we therefore precede a transport feature with | ||||
"T:" if an implementation over TCP is possible, "U:" if an | ||||
implementation over UDP is possible, and "T,U:" if an implementation | ||||
over either TCP or UDP is possible. | ||||
A.2.1. CONNECTION Related Transport Features | ||||
ESTABLISHMENT: | ||||
o T,U: Connect | ||||
o T,U: Specify number of attempts and/or timeout for the first | ||||
establishment message | ||||
o T: Configure authentication | ||||
o T: Hand over a message to reliably transfer (possibly multiple | ||||
times) before connection establishment | ||||
o T: Hand over a message to reliably transfer during connection | ||||
establishment | ||||
AVAILABILITY: | ||||
o T,U: Listen | ||||
o T: Configure authentication | ||||
MAINTENANCE: | ||||
o T: Change timeout for aborting connection (using retransmit limit | ||||
or time value) | ||||
o T: Suggest timeout to the peer | ||||
o T,U: Disable Nagle algorithm | ||||
o T,U: Notification of Excessive Retransmissions (early warning | ||||
below abortion threshold) | ||||
o T,U: Specify DSCP field | ||||
o T,U: Notification of ICMP error message arrival | ||||
o T: Change authentication parameters | ||||
o T: Obtain authentication information | ||||
o T,U: Set Cookie life value | ||||
o T,U: Choose a scheduler to operate between streams of an | ||||
association | ||||
o T,U: Configure priority or weight for a scheduler | ||||
o T,U: Disable checksum when sending | ||||
o T,U: Disable checksum requirement when receiving | ||||
o T,U: Specify checksum coverage used by the sender | ||||
o T,U: Specify minimum checksum coverage required by receiver | ||||
o T,U: Specify DF field | ||||
o T,U: Get max. transport-message size that may be sent using a non- | ||||
fragmented IP packet from the configured interface | ||||
o T,U: Get max. transport-message size that may be received from the | ||||
configured interface | ||||
o T,U: Obtain ECN field | ||||
o T,U: Enable and configure a "Low Extra Delay Background Transfer" | ||||
TERMINATION: | ||||
o T: Close after reliably delivering all remaining data, causing an | ||||
event informing the application on the other side | ||||
o T: Abort without delivering remaining data, causing an event | ||||
informing the application on the other side | ||||
o T,U: Abort without delivering remaining data, not causing an event | ||||
informing the application on the other side | ||||
o T,U: Timeout event when data could not be delivered for too long | ||||
A.2.2. DATA Transfer Related Transport Features | ||||
A.2.2.1. Sending Data | ||||
o T: Reliably transfer data, with congestion control | ||||
o T: Reliably transfer a message, with congestion control | ||||
o T,U: Unreliably transfer a message | ||||
o T: Configurable Message Reliability | ||||
o T: Ordered message delivery (potentially slower than unordered) | ||||
o T,U: Unordered message delivery (potentially faster than ordered) | ||||
o T,U: Request not to bundle messages | ||||
o T: Specifying a key id to be used to authenticate a message | ||||
o T,U: Request not to delay the acknowledgement (SACK) of a message | ||||
A.2.2.2. Receiving Data | ||||
o T,U: Receive data (with no message delimiting) | ||||
o U: Receive a message | ||||
o T,U: Information about partial message arrival | ||||
A.2.2.3. Errors | ||||
This section describes sending failures that are associated with a | ||||
specific call in the "Sending Data" category (Appendix A.1.2.1). | ||||
o T,U: Notification of send failures | ||||
o T,U: Notification that the stack has no more user data to send | ||||
o T,U: Notification to a receiver that a partial message delivery | ||||
has been aborted | ||||
A.3. Step 3: Discussion | ||||
The reduced set in the previous section exhibits a number of | ||||
peculiarities, which we will discuss in the following. This section | ||||
focuses on TCP because, with the exception of one particular | ||||
transport feature ("Receive a message" -- we will discuss this in | ||||
Appendix A.3.1), the list shows that UDP is strictly a subset of TCP. | ||||
We can first try to understand how to build a transport system that | ||||
can run over TCP, and then narrow down the result further so that | ||||
the system can always run over either TCP or UDP (which | ||||
effectively means removing everything related to reliability, | ||||
ordering, authentication and closing/aborting with a notification to | ||||
the peer). | ||||
Note that, because the functional transport features of UDP are -- | ||||
with the exception of "Receive a message" -- a subset of TCP, TCP can | ||||
be used as a replacement for UDP whenever an application does not | ||||
need message delimiting (e.g., because the application-layer protocol | ||||
already does it). This has been recognized by many applications that | ||||
already do this in practice, by trying to communicate with UDP at | ||||
first, and falling back to TCP in case of a connection failure. | ||||
A.3.1. Sending Messages, Receiving Bytes | ||||
For implementing a transport system over TCP, there are several | ||||
transport features related to sending, but only a single transport | ||||
feature related to receiving: "Receive data (with no message | ||||
delimiting)" (and, strangely, "information about partial message | ||||
arrival"). Notably, the transport feature "Receive a message" is | ||||
also the only non-automatable transport feature of UDP(-Lite) for | ||||
which no implementation over TCP is possible. | ||||
To support these TCP receiver semantics, we define an "Application- | ||||
Framed Bytestream" (AFra-Bytestream). AFra-Bytestreams allow senders | ||||
to operate on messages while minimizing changes to the TCP socket | ||||
API. In particular, nothing changes on the receiver side -- data can | ||||
be accepted via a normal TCP socket. | ||||
In an AFra-Bytestream, the sending application can optionally inform | ||||
the transport about message boundaries and required properties per | ||||
message (configurable order and reliability, or embedding a request | ||||
not to delay the acknowledgement of a message). Whenever the sending | ||||
application specifies per-message properties that relax the notion of | ||||
reliable in-order delivery of bytes, it must assume that the | ||||
receiving application is 1) able to determine message boundaries, | ||||
provided that messages are always kept intact, and 2) able to accept | ||||
these relaxed per-message properties. Any signaling of such | ||||
information to the peer is up to an application-layer protocol and | ||||
considered out of scope of this document. | ||||
For example, if an application requests to transfer fixed-size | ||||
messages of 100 bytes with partial reliability, this needs the | ||||
receiving application to be prepared to accept data in chunks of 100 | ||||
bytes. If, then, some of these 100-byte messages are missing (e.g., | ||||
if SCTP with Configurable Reliability is used), this is the expected | ||||
application behavior. With TCP, no messages would be missing, but | ||||
this is also correct for the application, and the possible | ||||
retransmission delay is acceptable within the best effort service | ||||
model (see [RFC7305], Section 3.5). Still, the receiving application | ||||
would separate the byte stream into 100-byte chunks. | ||||
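As a rough illustration of the receiving side in this example, the following sketch cuts an incoming byte stream into fixed-size messages. The function name `split_fixed` and its parameters are assumptions made for this sketch; the input stands for whatever arbitrary chunks successive reads on a TCP socket happen to return.

```python
def split_fixed(stream_chunks, size=100):
    """Yield complete fixed-size messages from arbitrarily sized reads.

    stream_chunks: iterable of bytes objects, e.g. the results of
    successive recv() calls on a TCP socket.
    """
    buf = bytearray()
    for chunk in stream_chunks:
        buf += chunk
        # Emit every complete message currently held in the buffer.
        while len(buf) >= size:
            yield bytes(buf[:size])
            del buf[:size]
    # Any trailing partial message stays undelivered.
```

The same receiver code works whether or not the underlying protocol dropped some messages, as long as the messages that do arrive are kept intact.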
Note that this usage of messages does not require all messages to be | ||||
equal in size. Many application protocols use some form of Type- | ||||
Length-Value (TLV) encoding, e.g. by defining a header including | ||||
length fields; another alternative is the use of byte stuffing | ||||
methods such as COBS [COBS]. If an application needs message | ||||
numbers, e.g. to restore the correct sequence of messages, these must | ||||
also be encoded by the application itself, as the sequence number | ||||
related transport features of SCTP are not provided by the "minimal | ||||
set" (in the interest of enabling usage of TCP). | ||||
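One possible application-level framing for variable-size messages is sketched below: a small header carrying a message number and a length field. The header layout (4-byte sequence number, 2-byte length, network byte order) and the names `encode` and `decode` are assumptions chosen for this sketch, not something the draft prescribes.

```python
import struct

# Assumed header: 32-bit message number, 16-bit payload length.
HEADER = struct.Struct("!IH")


def encode(seq, payload):
    """Prefix a payload with its message number and length."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for 16-bit length field")
    return HEADER.pack(seq, len(payload)) + payload


def decode(buf):
    """Remove and return all complete (seq, payload) pairs from buf.

    buf is a bytearray holding received bytes; incomplete trailing
    data is left in place for the next call.
    """
    messages = []
    while len(buf) >= HEADER.size:
        seq, length = HEADER.unpack_from(buf)
        end = HEADER.size + length
        if len(buf) < end:
            break  # wait for the rest of this message
        messages.append((seq, bytes(buf[HEADER.size:end])))
        del buf[:end]
    return messages
```

With the message number in the header, a receiver can restore the sender's order itself if messages arrive unordered.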
A.3.2. Stream Schedulers Without Streams | ||||
We have already stated that multi-streaming does not require | ||||
application-specific knowledge. Potential benefits or disadvantages | ||||
of, e.g., using two streams of an SCTP association versus using two | ||||
separate SCTP associations or TCP connections are related to | ||||
knowledge about the network and the particular transport protocol in | ||||
use, not the application. However, the transport features "Choose a | ||||
scheduler to operate between streams of an association" and | ||||
"Configure priority or weight for a scheduler" operate on streams. | ||||
Here, streams identify communication channels between which a | ||||
scheduler operates, and they can be assigned a priority. Moreover, | ||||
the transport features in the MAINTENANCE category all operate on | ||||
associations in case of SCTP, i.e. they apply to all streams in that | ||||
association. | ||||
With only these semantics necessary to represent, the interface to a | ||||
transport system becomes easier if we assume that connections may be | ||||
a transport protocol's connection or association, but could also be a | ||||
stream of an existing SCTP association, for example. We only need to | ||||
allow for a way to define a possible grouping of connections. Then, | ||||
all MAINTENANCE transport features can be said to operate on | ||||
connection groups, not connections, and a scheduler operates on the | ||||
connections within a group. | ||||
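A minimal sketch of the grouping idea follows. All class, method and property names are hypothetical; a "connection" here could be backed by a TCP connection, an SCTP association or a stream of an existing association, and the strict-priority pick merely stands in for whatever scheduler a transport system might offer.

```python
class ConnectionGroup:
    """Holds settings shared by all connections in the group."""

    def __init__(self):
        self.properties = {}    # MAINTENANCE settings apply group-wide
        self.connections = []

    def set_property(self, key, value):
        self.properties[key] = value

    def next_to_send(self):
        """Pick the ready connection with the lowest priority number."""
        ready = [c for c in self.connections if c.queue]
        return min(ready, key=lambda c: c.priority) if ready else None


class Connection:
    def __init__(self, group=None, priority=0):
        # A connection always belongs to a group, possibly of size one.
        self.group = group if group is not None else ConnectionGroup()
        self.priority = priority
        self.queue = []         # messages waiting to be sent
        self.group.connections.append(self)

    def clone(self, priority=0):
        """Create another connection in the same group."""
        return Connection(self.group, priority)

    def get_property(self, key):
        return self.group.properties.get(key)
```

Because properties live on the group, changing one through any member is visible to all members, which matches the way MAINTENANCE features act on a whole SCTP association.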
To be compatible with multiple transport protocols and uniformly | ||||
allow access to both transport connections and streams of a multi- | ||||
streaming protocol, the semantics of opening and closing need to be | ||||
the most restrictive subset of all of the underlying options. For | ||||
example, TCP's support of half-closed connections can be seen as a | ||||
feature on top of the more restrictive "ABORT"; this feature cannot | ||||
be supported because not all protocols used by a transport system | ||||
(including streams of an association) support half-closed | ||||
connections. | ||||
A.3.3. Early Data Transmission | ||||
There are two transport features related to transferring a message | ||||
early: "Hand over a message to reliably transfer (possibly multiple | ||||
times) before connection establishment", which relates to TCP Fast | ||||
Open [RFC7413], and "Hand over a message to reliably transfer during | ||||
connection establishment", which relates to SCTP's ability to | ||||
transfer data together with the COOKIE-Echo chunk. Even without TCP | ||||
Fast Open, TCP can transfer data during the handshake, together with | ||||
the SYN packet -- however, the receiver of this data may not hand it | ||||
over to the application until the handshake has completed. Also, | ||||
different from TCP Fast Open, this data is not delimited as a message | ||||
by TCP (thus, not visible as a "message"). This functionality is | ||||
commonly available in TCP and supported in several implementations, | ||||
even though the TCP specification does not explain how to provide it | ||||
to applications. | ||||
A transport system could differentiate between the cases of | ||||
transmitting data "before" (possibly multiple times) or "during" the | ||||
handshake. Alternatively, it could also assume that data that are | ||||
handed over early will be transmitted as early as possible, and | ||||
"before" the handshake would only be used for messages that are | ||||
explicitly marked as "idempotent" (i.e., it would be acceptable to | ||||
transfer them multiple times). | ||||
The amount of data that can successfully be transmitted before or | ||||
during the handshake depends on various factors: the transport | ||||
protocol, the use of header options, the choice of IPv4 or IPv6 and | ||||
the Path MTU. A transport system should therefore allow a sending | ||||
application to query the maximum amount of data it can possibly | ||||
transmit before (or, if exposed, during) connection establishment. | ||||
A.3.4. Sender Running Dry | ||||
The transport feature "Notification that the stack has no more user | ||||
data to send" relates to SCTP's "SENDER DRY" notification. Such | ||||
notifications can, in principle, be used to avoid having an | ||||
unnecessarily large send buffer, yet ensure that the transport sender | ||||
always has data available when it has an opportunity to transmit it. | ||||
This has been found to be very beneficial for some applications | ||||
[WWDC2015]. However, "SENDER DRY" truly means that the entire send | ||||
buffer (including both unsent and unacknowledged data) has emptied -- | ||||
i.e., when it notifies the sender, it is already too late, the | ||||
transport protocol already missed an opportunity to send data. Some | ||||
modern TCP implementations now include the unspecified | ||||
"TCP_NOTSENT_LOWAT" socket option that was proposed in [WWDC2015], | ||||
which limits the amount of unsent data that TCP can keep in the | ||||
socket buffer; this allows specifying the buffer fill level at which | ||||
the socket becomes writable, rather than waiting for the buffer to | ||||
run empty. | ||||
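Where the platform exposes it, this option can be set like any other TCP socket option. The sketch below assumes a Python build whose `socket` module defines `TCP_NOTSENT_LOWAT`; the helper name `set_notsent_lowat` is made up for this example, and the function simply reports failure on systems that lack the option.

```python
import socket


def set_notsent_lowat(sock, nbytes):
    """Try to limit unsent data in the socket buffer to nbytes.

    Returns True on success and False if the option is unavailable
    on this platform or rejected by the operating system.
    """
    opt = getattr(socket, "TCP_NOTSENT_LOWAT", None)
    if opt is None:
        return False
    try:
        sock.setsockopt(socket.IPPROTO_TCP, opt, nbytes)
    except OSError:
        return False
    return True
```

An application would then wait for the socket to become writable (for example with `select`) before producing the next piece of data, so that data is generated as late as possible.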
SCTP also allows configuring the sender-side buffer: the automatable | ||||
Transport Feature "Configure send buffer size" provides this | ||||
functionality, but only for the complete buffer, which includes both | ||||
unsent and unacknowledged data. SCTP does not allow controlling these | ||||
two sizes separately. It therefore makes sense for a transport | ||||
system to allow for uniform access to "TCP_NOTSENT_LOWAT" as well as | ||||
the "SENDER DRY" notification. | ||||
A.3.5. Capacity Profile | ||||
The transport features: | ||||
o Disable Nagle algorithm | ||||
o Enable and configure a "Low Extra Delay Background Transfer" | ||||
o Specify DSCP field | ||||
all relate to a QoS-like application need such as "low latency" or | ||||
"scavenger". In the interest of flexibility of a transport system, | ||||
they could therefore be offered in a uniform, more abstract way, | ||||
where a transport system could e.g. decide by itself how to use | ||||
combinations of LEDBAT-like congestion control and certain DSCP | ||||
values, and an application would only specify a general "capacity | ||||
profile" (a description of how it wants to use the available | ||||
capacity). A need for "lowest possible latency at the expense of | ||||
overhead" could then translate into automatically disabling the Nagle | ||||
algorithm. | ||||
In some cases, the Nagle algorithm is best controlled directly by the | ||||
application because it is not only related to a general profile but | ||||
also to knowledge about the size of future messages. For fine-grain | ||||
control over Nagle-like functionality, the "Request not to bundle | ||||
messages" is available. | ||||
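The idea of a capacity profile can be sketched as a mapping from an abstract profile name to concrete transport settings. The profile names, the `TransportSettings` fields and the chosen values are all assumptions of this sketch; in particular, marking scavenger traffic with DSCP 1 (the Lower-Effort codepoint) is just one plausible policy a transport system might apply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransportSettings:
    disable_nagle: bool     # send small messages without bundling delay
    ledbat_like_cc: bool    # use a low-extra-delay background transfer
    dscp: int               # 6-bit DSCP value to request


# Hypothetical policy table; a real system could choose differently.
PROFILES = {
    "default": TransportSettings(False, False, 0),
    "low-latency": TransportSettings(True, False, 0),
    "scavenger": TransportSettings(False, True, 1),
}


def settings_for(profile):
    """Translate an abstract capacity profile into settings."""
    try:
        return PROFILES[profile]
    except KeyError:
        raise ValueError(f"unknown capacity profile: {profile}") from None


def tos_byte(dscp):
    """Place a 6-bit DSCP in the upper bits of the IP TOS byte."""
    return (dscp & 0x3F) << 2
```

The application only names the profile; how the three underlying transport features are combined remains a decision of the transport system.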
A.3.6. Security | ||||
Both TCP and SCTP offer authentication. TCP authenticates complete | ||||
segments. SCTP allows configuring which of its chunk types must | ||||
always be authenticated -- if this is exposed as such, it creates an | ||||
undesirable dependency on the transport protocol. For compatibility | ||||
with TCP, a transport system should only allow authenticating complete | ||||
transport layer packets, including headers, IP pseudo-header (if any) | ||||
and payload. | ||||
Security is discussed in a separate document | ||||
[I-D.ietf-taps-transport-security]. The minimal set presented in the | ||||
present document excludes all security related transport features: | ||||
"Configure authentication", "Change authentication parameters", | ||||
"Obtain authentication information" and "Set Cookie life value" | ||||
as well as "Specifying a key id to be used to authenticate a | ||||
message". | ||||
A.3.7. Packet Size | ||||
UDP(-Lite) has a transport feature called "Specify DF field". This | ||||
yields an error message in case of sending a message that exceeds the | ||||
Path MTU, which is necessary for a UDP-based application to be able | ||||
to implement Path MTU Discovery (a function that UDP-based | ||||
applications must perform themselves). The "Get max. transport-message | ||||
size that may be sent using a non-fragmented IP packet from the | ||||
configured interface" transport feature yields an upper limit for the | ||||
Path MTU (minus headers) and can therefore help to implement Path MTU | ||||
Discovery more efficiently. | ||||
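As a worked example of "minus headers", the sketch below derives the largest UDP payload that fits in one unfragmented IP packet for a given MTU. It assumes an IPv4 header without options (20 bytes) or an IPv6 header without extension headers (40 bytes), plus the 8-byte UDP header; the function name is made up for this example.

```python
UDP_HEADER = 8      # bytes
IPV4_HEADER = 20    # bytes, assuming no IP options
IPV6_HEADER = 40    # bytes, assuming no extension headers


def max_udp_payload(mtu, ipv6=False):
    """Largest UDP payload that fits in one packet of size mtu."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    size = mtu - ip_header - UDP_HEADER
    if size < 0:
        raise ValueError("MTU smaller than the headers")
    return size
```

For an interface MTU of 1500 bytes this gives 1472 bytes over IPv4 and 1452 bytes over IPv6. Since the Path MTU can be smaller than the interface MTU, such a value is only an upper limit from which Path MTU Discovery can start probing.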
Appendix B. Revision information | Appendix B. Revision information | |||
XXX RFC-Ed please remove this section prior to publication. | XXX RFC-Ed please remove this section prior to publication. | |||
-02: implementation suggestions added, discussion section added, | -02: implementation suggestions added, discussion section added, | |||
terminology extended, DELETED category removed, various other fixes; | terminology extended, DELETED category removed, various other fixes; | |||
list of Transport Features adjusted to -01 version of [RFC8303] | list of Transport Features adjusted to -01 version of [RFC8303] | |||
except that MPTCP is not included. | except that MPTCP is not included. | |||
-03: updated to be consistent with -02 version of [RFC8303]. | -03: updated to be consistent with -02 version of [RFC8303]. | |||
skipping to change at page 47, line 32 ¶ | skipping to change at page 48, line 42 ¶ | |||
..). | ..). | |||
WG -05: addressed comments from Spencer Dawkins. | WG -05: addressed comments from Spencer Dawkins. | |||
WG -06: Fixed nits. | WG -06: Fixed nits. | |||
WG -07: Addressed Genart comments from Robert Sparks. | WG -07: Addressed Genart comments from Robert Sparks. | |||
WG -08: Addressed one more Genart comment from Robert Sparks. | WG -08: Addressed one more Genart comment from Robert Sparks. | |||
Authors' Addresses | WG -09: Addressed comments from Mirja Kuehlewind, Alvaro Retana, Ben | |||
Campbell, Benjamin Kaduk and Eric Rescorla. | ||||
Authors' Addresses | ||||
Michael Welzl | Michael Welzl | |||
University of Oslo | University of Oslo | |||
PO Box 1080 Blindern | PO Box 1080 Blindern | |||
Oslo N-0316 | Oslo N-0316 | |||
Norway | Norway | |||
Phone: +47 22 85 24 20 | Phone: +47 22 85 24 20 | |||
Email: michawe@ifi.uio.no | Email: michawe@ifi.uio.no | |||
Stein Gjessing | Stein Gjessing | |||
End of changes. 87 change blocks. | ||||
612 lines changed or deleted | 670 lines changed or added | |||