draft-ietf-tcpm-accurate-ecn-10.txt   draft-ietf-tcpm-accurate-ecn-11.txt 
TCP Maintenance & Minor Extensions (tcpm) B. Briscoe TCP Maintenance & Minor Extensions (tcpm) B. Briscoe
Internet-Draft Independent Internet-Draft Independent
Intended status: Experimental M. Kuehlewind Updates: 3168 (if approved) M. Kuehlewind
Expires: September 6, 2020 Ericsson Intended status: Standards Track Ericsson
R. Scheffenegger Expires: September 6, 2020 R. Scheffenegger
NetApp NetApp
March 5, 2020 March 5, 2020
More Accurate ECN Feedback in TCP More Accurate ECN Feedback in TCP
draft-ietf-tcpm-accurate-ecn-10 draft-ietf-tcpm-accurate-ecn-11
Abstract Abstract
Explicit Congestion Notification (ECN) is a mechanism where network Explicit Congestion Notification (ECN) is a mechanism where network
nodes can mark IP packets instead of dropping them to indicate nodes can mark IP packets instead of dropping them to indicate
incipient congestion to the end-points. Receivers with an ECN- incipient congestion to the end-points. Receivers with an ECN-
capable transport protocol feed back this information to the sender. capable transport protocol feed back this information to the sender.
ECN is specified for TCP in such a way that only one feedback signal ECN is specified for TCP in such a way that only one feedback signal
can be transmitted per Round-Trip Time (RTT). Recent new TCP can be transmitted per Round-Trip Time (RTT). Recent new TCP
mechanisms like Congestion Exposure (ConEx), Data Center TCP (DCTCP) mechanisms like Congestion Exposure (ConEx), Data Center TCP (DCTCP)
or Low Latency Low Loss Scalable Throughput (L4S) need more accurate or Low Latency Low Loss Scalable Throughput (L4S) need more accurate
ECN feedback information whenever more than one marking is received ECN feedback information whenever more than one marking is received
in one RTT. This document specifies an experimental scheme to in one RTT. This document specifies a scheme to provide more than
provide more than one feedback signal per RTT in the TCP header. one feedback signal per RTT in the TCP header. Given TCP header
Given TCP header space is scarce, it allocates a reserved header bit, space is scarce, it allocates a reserved header bit, that was
that was previously used for the ECN-Nonce which has now been previously used for the ECN-Nonce which has now been declared
declared historic. It also overloads the two existing ECN flags in historic. It also overloads the two existing ECN flags in the TCP
the TCP header. The resulting extra space is exploited to feed back header. The resulting extra space is exploited to feed back the IP-
the IP-ECN field received during the 3-way handshake as well. ECN field received during the 3-way handshake as well. Supplementary
Supplementary feedback information can optionally be provided in a feedback information can optionally be provided in a new TCP option,
new TCP option, which is never used on the TCP SYN. which is never used on the TCP SYN.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
skipping to change at page 2, line 24 skipping to change at page 2, line 24
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Document Roadmap . . . . . . . . . . . . . . . . . . . . 4 1.1. Document Roadmap . . . . . . . . . . . . . . . . . . . . 5
1.2. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.2. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3. Experiment Goals . . . . . . . . . . . . . . . . . . . . 5 1.3. Terminology . . . . . . . . . . . . . . . . . . . . . . . 5
1.4. Terminology . . . . . . . . . . . . . . . . . . . . . . . 6 1.4. Recap of Existing ECN feedback in IP/TCP . . . . . . . . 6
1.5. Recap of Existing ECN feedback in IP/TCP . . . . . . . . 7
2. AccECN Protocol Overview and Rationale . . . . . . . . . . . 8 2. AccECN Protocol Overview and Rationale . . . . . . . . . . . 8
2.1. Capability Negotiation . . . . . . . . . . . . . . . . . 9 2.1. Capability Negotiation . . . . . . . . . . . . . . . . . 9
2.2. Feedback Mechanism . . . . . . . . . . . . . . . . . . . 9 2.2. Feedback Mechanism . . . . . . . . . . . . . . . . . . . 9
2.3. Delayed ACKs and Resilience Against ACK Loss . . . . . . 10 2.3. Delayed ACKs and Resilience Against ACK Loss . . . . . . 9
2.4. Feedback Metrics . . . . . . . . . . . . . . . . . . . . 11 2.4. Feedback Metrics . . . . . . . . . . . . . . . . . . . . 10
2.5. Generic (Dumb) Reflector . . . . . . . . . . . . . . . . 11 2.5. Generic (Dumb) Reflector . . . . . . . . . . . . . . . . 11
3. AccECN Protocol Specification . . . . . . . . . . . . . . . . 12 3. AccECN Protocol Specification . . . . . . . . . . . . . . . . 12
3.1. Negotiating to use AccECN . . . . . . . . . . . . . . . . 12 3.1. Negotiating to use AccECN . . . . . . . . . . . . . . . . 12
3.1.1. Negotiation during the TCP handshake . . . . . . . . 12 3.1.1. Negotiation during the TCP handshake . . . . . . . . 12
3.1.2. Backward Compatibility . . . . . . . . . . . . . . . 13 3.1.2. Backward Compatibility . . . . . . . . . . . . . . . 13
3.1.3. Forward Compatibility . . . . . . . . . . . . . . . . 15 3.1.3. Forward Compatibility . . . . . . . . . . . . . . . . 15
3.1.4. Retransmission of the SYN . . . . . . . . . . . . . . 16 3.1.4. Retransmission of the SYN . . . . . . . . . . . . . . 15
3.1.5. Implications of AccECN Mode . . . . . . . . . . . . . 17 3.1.5. Implications of AccECN Mode . . . . . . . . . . . . . 16
3.2. AccECN Feedback . . . . . . . . . . . . . . . . . . . . . 18 3.2. AccECN Feedback . . . . . . . . . . . . . . . . . . . . . 18
3.2.1. Initialization of Feedback Counters . . . . . . . . . 19 3.2.1. Initialization of Feedback Counters . . . . . . . . . 18
3.2.2. The ACE Field . . . . . . . . . . . . . . . . . . . . 19 3.2.2. The ACE Field . . . . . . . . . . . . . . . . . . . . 19
3.2.3. The AccECN Option . . . . . . . . . . . . . . . . . . 27 3.2.3. The AccECN Option . . . . . . . . . . . . . . . . . . 26
3.3. Requirements for TCP Proxies, Offload Engines and other 3.3. Requirements for TCP Proxies, Offload Engines and other
Middleboxes on AccECN Compliance . . . . . . . . . . . . 36 Middleboxes on AccECN Compliance . . . . . . . . . . . . 35
4. Interaction with Other TCP Variants . . . . . . . . . . . . . 37 4. Updates to RFC 3168 . . . . . . . . . . . . . . . . . . . . . 36
4.1. Compatibility with SYN Cookies . . . . . . . . . . . . . 37 5. Interaction with TCP Variants . . . . . . . . . . . . . . . . 37
4.2. Compatibility with Other TCP Options and Experiments . . 38 5.1. Compatibility with SYN Cookies . . . . . . . . . . . . . 37
4.3. Compatibility with Feedback Integrity Mechanisms . . . . 38 5.2. Compatibility with TCP Experiments and Common TCP Options 38
5.3. Compatibility with Feedback Integrity Mechanisms . . . . 39
5. Protocol Properties . . . . . . . . . . . . . . . . . . . . . 40 6. Protocol Properties . . . . . . . . . . . . . . . . . . . . . 40
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 42 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 42
7. Security Considerations . . . . . . . . . . . . . . . . . . . 43 8. Security Considerations . . . . . . . . . . . . . . . . . . . 43
8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 43 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 44
9. Comments Solicited . . . . . . . . . . . . . . . . . . . . . 44 10. Comments Solicited . . . . . . . . . . . . . . . . . . . . . 44
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 44 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 44
10.1. Normative References . . . . . . . . . . . . . . . . . . 44 11.1. Normative References . . . . . . . . . . . . . . . . . . 44
10.2. Informative References . . . . . . . . . . . . . . . . . 45 11.2. Informative References . . . . . . . . . . . . . . . . . 45
Appendix A. Example Algorithms . . . . . . . . . . . . . . . . . 47 Appendix A. Example Algorithms . . . . . . . . . . . . . . . . . 48
A.1. Example Algorithm to Encode/Decode the AccECN Option . . 47 A.1. Example Algorithm to Encode/Decode the AccECN Option . . 48
A.2. Example Algorithm for Safety Against Long Sequences of A.2. Example Algorithm for Safety Against Long Sequences of
ACK Loss . . . . . . . . . . . . . . . . . . . . . . . . 48 ACK Loss . . . . . . . . . . . . . . . . . . . . . . . . 49
A.2.1. Safety Algorithm without the AccECN Option . . . . . 48 A.2.1. Safety Algorithm without the AccECN Option . . . . . 49
A.2.2. Safety Algorithm with the AccECN Option . . . . . . . 50 A.2.2. Safety Algorithm with the AccECN Option . . . . . . . 51
A.3. Example Algorithm to Estimate Marked Bytes from Marked A.3. Example Algorithm to Estimate Marked Bytes from Marked
Packets . . . . . . . . . . . . . . . . . . . . . . . . . 52 Packets . . . . . . . . . . . . . . . . . . . . . . . . . 53
A.4. Example Algorithm to Beacon AccECN Options . . . . . . . 52 A.4. Example Algorithm to Beacon AccECN Options . . . . . . . 53
A.5. Example Algorithm to Count Not-ECT Bytes . . . . . . . . 53 A.5. Example Algorithm to Count Not-ECT Bytes . . . . . . . . 54
Appendix B. Rationale for Usage of TCP Header Flags . . . . . . 54 Appendix B. Rationale for Usage of TCP Header Flags . . . . . . 55
B.1. Three TCP Header Flags in the SYN-SYN/ACK Handshake . . . 54 B.1. Three TCP Header Flags in the SYN-SYN/ACK Handshake . . . 55
B.2. Four Codepoints in the SYN/ACK . . . . . . . . . . . . . 55 B.2. Four Codepoints in the SYN/ACK . . . . . . . . . . . . . 56
B.3. Space for Future Evolution . . . . . . . . . . . . . . . 55 B.3. Space for Future Evolution . . . . . . . . . . . . . . . 56
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 57 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 58
1. Introduction 1. Introduction
Explicit Congestion Notification (ECN) [RFC3168] is a mechanism where Explicit Congestion Notification (ECN) [RFC3168] is a mechanism where
network nodes can mark IP packets instead of dropping them to network nodes can mark IP packets instead of dropping them to
indicate incipient congestion to the end-points. Receivers with an indicate incipient congestion to the end-points. Receivers with an
ECN-capable transport protocol feed back this information to the ECN-capable transport protocol feed back this information to the
sender. ECN is specified for TCP in such a way that only one sender. In RFC 3168, ECN was specified for TCP in such a way that
feedback signal can be transmitted per Round-Trip Time (RTT). only one feedback signal could be transmitted per Round-Trip Time
Recently, proposed mechanisms like Congestion Exposure (ConEx (RTT). Recently, proposed mechanisms like Congestion Exposure (ConEx
[RFC7713]), DCTCP [RFC8257] or L4S [I-D.ietf-tsvwg-l4s-arch] need to [RFC7713]), DCTCP [RFC8257] or L4S [I-D.ietf-tsvwg-l4s-arch] need to
know when more than one marking is received in one RTT which is know when more than one marking is received in one RTT which is
information that cannot be provided by the feedback scheme as information that cannot be provided by the feedback scheme as
specified in [RFC3168]. This document specifies an alternative specified in [RFC3168]. This document specifies an update to the ECN
feedback scheme that provides more accurate information and could be feedback scheme of RFC 3168 that provides more accurate information
used by these new TCP extensions. A fuller treatment of the and could be used by these and potentially other future TCP
motivation for this specification is given in the associated extensions. A fuller treatment of the motivation for this
requirements document [RFC7560]. specification is given in the associated requirements document
[RFC7560].
This document specifies an experimental scheme for ECN feedback in This document specifies a standards track scheme for ECN feedback in
the TCP header to provide more than one feedback signal per RTT. It the TCP header to provide more than one feedback signal per RTT. It
will be called the more accurate ECN feedback scheme, or AccECN for will be called the more accurate ECN feedback scheme, or AccECN for
short. If AccECN progresses from experimental to the standards short. This document updates RFC 3168 with respect to negotiation
track, it is intended to be a complete replacement for classic TCP/ and use of the feedback scheme for TCP. All aspects of RFC 3168
ECN feedback, not a fork in the design of TCP. AccECN feedback other than the TCP feedback scheme, in particular the definition of
complements TCP's loss feedback and it supplements classic TCP/ECN ECN at the IP layer, remain unchanged by this specification.
feedback, so its applicability is intended to include all public and Section 4 gives a more detailed specification of exactly which
private IP networks (and even any non-IP networks over which TCP is aspects of RFC 3168 this document updates.
used today), whether or not any nodes on the path support ECN of
whatever flavour.
Until the AccECN experiment succeeds, [RFC3168] will remain as the AccECN is intended to be a complete replacement for classic TCP/ECN
only standards track specification for adding ECN to TCP. To avoid feedback, not a fork in the design of TCP. AccECN feedback
confusion, in this document we use the term 'classic ECN' for the complements TCP's loss feedback and it can coexist alongside
pre-existing ECN specification [RFC3168]. 'classic' [RFC3168] TCP/ECN feedback. So its applicability is
intended to include all public and private IP networks (and even any
non-IP networks over which TCP is used today), whether or not any
nodes on the path support ECN, of whatever flavour. This document
uses the term Classic ECN when it needs to distinguish the RFC 3168
ECN TCP feedback scheme from the AccECN TCP feedback scheme.
AccECN feedback overloads the two existing ECN flags and allocates AccECN feedback overloads the two existing ECN flags in the TCP
the currently reserved flag (previously called NS) in the TCP header, header and allocates the currently reserved flag (previously called
to be used as one field indicating the number of congestion NS) in the TCP header, to be used as one three-bit counter field
experienced marked packets. Given the new definitions of these three indicating the number of congestion experienced marked packets.
bits, both ends have to support the new wire protocol before it can Given the new definitions of these three bits, both ends have to
be used. Therefore during the TCP handshake the two ends use these support the new wire protocol before it can be used. Therefore
three bits in the TCP header to negotiate the most advanced feedback during the TCP handshake the two ends use these three bits in the TCP
protocol that they can both support, in a way that is backward header to negotiate the most advanced feedback protocol that they can
compatible with [RFC3168]. both support, in a way that is backward compatible with [RFC3168].
AccECN is solely an (experimental) change to the TCP wire protocol; AccECN is solely a change to the TCP wire protocol; it covers the
it only specifies the negotiation and signaling of more accurate ECN negotiation and signaling of more accurate ECN feedback from a TCP
feedback from a TCP Data Receiver to a Data Sender. It is completely Data Receiver to a Data Sender. It is completely independent of how
independent of how TCP might respond to congestion feedback, which is TCP might respond to congestion feedback, which is out of scope, but
out of scope. For that we refer to [RFC3168] or any RFC that ultimately the motivation for accurate ECN feedback. Like Classic
specifies a different response to TCP ECN feedback, for example: ECN feedback, AccECN can be used by standard Reno congestion control
[RFC8257]; or ECN experiments such as those referred to in [RFC8311], [RFC5681] to respond to the existence of at least one congestion
namely: a TCP-based Low Latency Low Loss Scalable (L4S) congestion notification within a round trip. Or, unlike Reno, AccECN can be
control [I-D.ietf-tsvwg-l4s-arch]; ECN-capable TCP control packets used to respond to the extent of congestion notification over a round
[I-D.ietf-tcpm-generalized-ecn], or Alternative Backoff with ECN trip, as for example DCTCP does in controlled environments [RFC8257].
(ABE) [RFC8511]. For congestion response, this specification refers to RFC 3168, or
ECN experiments such as those referred to in [RFC8311], namely: a
TCP-based Low Latency Low Loss Scalable (L4S) congestion control
[I-D.ietf-tsvwg-l4s-arch]; or Alternative Backoff with ECN (ABE)
[RFC8511].
It is recommended that the AccECN protocol is implemented alongside It is recommended that the AccECN protocol is implemented alongside
SACK [RFC2018] and the experimental ECN++ protocol SACK [RFC2018] and the experimental ECN++ protocol
[I-D.ietf-tcpm-generalized-ecn], which allows the ECN capability to [I-D.ietf-tcpm-generalized-ecn], which allows the ECN capability to
be used on TCP control packets. Therefore, this specification does be used on TCP control packets. Therefore, this specification does
not discuss implementing AccECN alongside [RFC5562], which was an not discuss implementing AccECN alongside [RFC5562], which was an
earlier experimental protocol with narrower scope than ECN++. earlier experimental protocol with narrower scope than ECN++.
1.1. Document Roadmap 1.1. Document Roadmap
The following introductory sections outline the goals of AccECN The following introductory section outlines the goals of AccECN
(Section 1.2) and the goal of experiments with ECN (Section 1.3) so (Section 1.2). Then terminology is defined (Section 1.3) and a recap
that it is clear what success would look like. Then terminology is of existing prerequisite technology is given (Section 1.4).
defined (Section 1.4) and a recap of existing prerequisite technology
is given (Section 1.5).
Section 2 gives an informative overview of the AccECN protocol. Then Section 2 gives an informative overview of the AccECN protocol. Then
Section 3 gives the normative protocol specification. Section 4 Section 3 gives the normative protocol specification, and Section 4
assesses the interaction of AccECN with commonly used variants of clarifies which aspects of RFC 3168 are updated by this
TCP, whether standardized or not. Section 5 summarizes the features specification. Section 5 assesses the interaction of AccECN with
and properties of AccECN. commonly used variants of TCP, whether standardized or not.
Section 6 summarizes the features and properties of AccECN.
Section 6 summarizes the protocol fields and numbers that IANA will Section 7 summarizes the protocol fields and numbers that IANA will
need to assign and Section 7 points to the aspects of the protocol need to assign and Section 8 points to the aspects of the protocol
that will be of interest to the security community. that will be of interest to the security community.
Appendix A gives pseudocode examples for the various algorithms that Appendix A gives pseudocode examples for the various algorithms that
AccECN uses. AccECN uses and Appendix B explains why AccECN uses flags in the main
TCP header and quantifies the space left for future use.
1.2. Goals 1.2. Goals
[RFC7560] enumerates requirements that a candidate feedback scheme [RFC7560] enumerates requirements that a candidate feedback scheme
will need to satisfy, under the headings: resilience, timeliness, will need to satisfy, under the headings: resilience, timeliness,
integrity, accuracy (including ordering and lack of bias), integrity, accuracy (including ordering and lack of bias),
complexity, overhead and compatibility (both backward and forward). complexity, overhead and compatibility (both backward and forward).
It recognizes that a perfect scheme that fully satisfies all the It recognizes that a perfect scheme that fully satisfies all the
requirements is unlikely and trade-offs between requirements are requirements is unlikely and trade-offs between requirements are
likely. Section 5 presents the properties of AccECN against these likely. Section 6 presents the properties of AccECN against these
requirements and discusses the trade-offs made. requirements and discusses the trade-offs made.
The requirements document recognizes that a protocol as ubiquitous as The requirements document recognizes that a protocol as ubiquitous as
TCP needs to be able to serve as-yet-unspecified requirements. TCP needs to be able to serve as-yet-unspecified requirements.
Therefore an AccECN receiver aims to act as a generic (dumb) Therefore an AccECN receiver aims to act as a generic (dumb)
reflector of congestion information so that in future new sender reflector of congestion information so that in future new sender
behaviours can be deployed unilaterally. behaviours can be deployed unilaterally.
1.3. Experiment Goals 1.3. Terminology
TCP is critical to the robust functioning of the Internet, therefore
any proposed modifications to TCP need to be thoroughly tested. The
present specification describes an experimental protocol that adds
more accurate ECN feedback to the TCP protocol. The intention is to
specify the protocol sufficiently so that more than one
implementation can be built in order to test its function, robustness
and interoperability (with itself and with previous versions of ECN
and TCP).
The experimental protocol will be considered successful if testing
confirms that the proposed mechanism can be deployed at large scale.
Testing will mostly focus on fall-back strategies in case of
middlebox interference. Current recommended strategies are specified
in Sections 3.1.4, 3.2.2.3, 3.2.2.4 and 3.2.3.2. The effectiveness
of these strategies depends on the actual deployment situation of
middleboxes. Therefore experimental verification to confirm large-
scale path traversal in the Internet is needed before finalizing this
specification on the Standards Track.
Another experimentation focus is the implementation feasibility of
change-triggered ACKs as described in section 3.2.3.3. While on
average this should not lead to a higher ACK rate, it changes the ACK
pattern which can particularly have an impact on hardware offload.
It is currently specified as a hard requirement, because the sender
can exploit the predictability of the receiver's behaviour. However,
further experimentation is needed to advise if it will have to become
just preferred behavior.
1.4. Terminology
AccECN: The more accurate ECN feedback scheme will be called AccECN AccECN: The more accurate ECN feedback scheme will be called AccECN
for short. for short.
Classic ECN: the ECN protocol specified in [RFC3168]. Classic ECN: the ECN protocol specified in [RFC3168].
Classic ECN feedback: the feedback aspect of the ECN protocol Classic ECN feedback: the feedback aspect of the ECN protocol
specified in [RFC3168], including generation, encoding, specified in [RFC3168], including generation, encoding,
transmission and decoding of feedback, but not the Data Sender's transmission and decoding of feedback, but not the Data Sender's
subsequent response to that feedback. subsequent response to that feedback.
skipping to change at page 7, line 5 skipping to change at page 6, line 30
Data Sender: The endpoint of a TCP half-connection that sends data Data Sender: The endpoint of a TCP half-connection that sends data
and receives AccECN feedback. and receives AccECN feedback.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in BCP 14 [RFC2119] document are to be interpreted as described in BCP 14 [RFC2119]
[RFC8174] when, and only when, they appear in all capitals, as shown [RFC8174] when, and only when, they appear in all capitals, as shown
here. here.
1.5. Recap of Existing ECN feedback in IP/TCP 1.4. Recap of Existing ECN feedback in IP/TCP
ECN [RFC3168] uses two bits in the IP header. Once ECN has been ECN [RFC3168] uses two bits in the IP header. Once ECN has been
negotiated with the receiver at the transport layer, an ECN sender negotiated with the receiver at the transport layer, an ECN sender
can set two possible codepoints (ECT(0) or ECT(1)) in the IP header can set two possible codepoints (ECT(0) or ECT(1)) in the IP header
to indicate an ECN-capable transport (ECT). If both ECN bits are to indicate an ECN-capable transport (ECT). If both ECN bits are
zero, the packet is considered to have been sent by a Not-ECN-capable zero, the packet is considered to have been sent by a Not-ECN-capable
Transport (Not-ECT). When a network node experiences congestion, it Transport (Not-ECT). When a network node experiences congestion, it
will occasionally either drop or mark a packet, with the choice will occasionally either drop or mark a packet, with the choice
depending on the packet's ECN codepoint. If the codepoint is Not- depending on the packet's ECN codepoint. If the codepoint is Not-
ECT, only drop is appropriate. If the codepoint is ECT(0) or ECT(1), ECT, only drop is appropriate. If the codepoint is ECT(0) or ECT(1),
skipping to change at page 12, line 27 skipping to change at page 12, line 14
3. AccECN Protocol Specification 3. AccECN Protocol Specification
3.1. Negotiating to use AccECN 3.1. Negotiating to use AccECN
3.1.1. Negotiation during the TCP handshake 3.1.1. Negotiation during the TCP handshake
Given the ECN Nonce [RFC3540] has been reclassified as historic Given the ECN Nonce [RFC3540] has been reclassified as historic
[RFC8311], the present specification re-allocates the TCP flag at bit [RFC8311], the present specification re-allocates the TCP flag at bit
7 of the TCP header, which was previously called NS (Nonce Sum), as 7 of the TCP header, which was previously called NS (Nonce Sum), as
the AE (Accurate ECN) flag (see IANA Considerations in Section 6) as the AE (Accurate ECN) flag (see IANA Considerations in Section 7) as
shown below. shown below.
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| | | A | C | E | U | A | P | R | S | F | | | | A | C | E | U | A | P | R | S | F |
| Header Length | Reserved | E | W | C | R | C | S | S | Y | I | | Header Length | Reserved | E | W | C | R | C | S | S | Y | I |
| | | | R | E | G | K | H | T | N | N | | | | | R | E | G | K | H | T | N | N |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
Figure 2: The (post-AccECN) definition of the TCP header flags during Figure 2: The (post-AccECN) definition of the TCP header flags during
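
The bit positions in the figure above can be sketched in code. This is a minimal illustration, not part of either draft revision: the function name and mask names are hypothetical, and it assumes the 16-bit word is taken from the TCP header in network byte order, so that bit 0 of the figure is the most significant bit.

```python
# Bit n in the figure (numbered from the most significant bit of the
# 16-bit word) corresponds to the mask 1 << (15 - n).
AE_MASK = 1 << (15 - 7)    # 0x0100, the flag previously called NS
CWR_MASK = 1 << (15 - 8)   # 0x0080
ECE_MASK = 1 << (15 - 9)   # 0x0040
SYN_MASK = 1 << (15 - 14)  # 0x0002


def ecn_flags(word):
    """Return (AE, CWR, ECE) as 0/1 values from the 16-bit flags word."""
    return (
        1 if word & AE_MASK else 0,
        1 if word & CWR_MASK else 0,
        1 if word & ECE_MASK else 0,
    )


# Example: header length 5 (0x5000) on a SYN carrying CWR and ECE,
# i.e. the AE,CWR,ECE = 0,1,1 pattern of a classic ECN client.
classic_syn = 0x5000 | SYN_MASK | CWR_MASK | ECE_MASK
print(ecn_flags(classic_syn))
```

Keeping the three flags together as a tuple mirrors the AE,CWR,ECE notation the specification uses when it lists handshake codepoints.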
skipping to change at page 14, line 47 skipping to change at page 14, line 21
variant of TCP feedback, indicated in its SYN/ACK. Therefore, as variant of TCP feedback, indicated in its SYN/ACK. Therefore, as
soon as an AccECN-capable TCP client (A) receives the SYN/ACK soon as an AccECN-capable TCP client (A) receives the SYN/ACK
shown it MUST set both its half connections into the feedback shown it MUST set both its half connections into the feedback
mode shown in the rightmost column. If it has set itself into mode shown in the rightmost column. If it has set itself into
classic ECN feedback mode it MUST then comply with [RFC3168]. classic ECN feedback mode it MUST then comply with [RFC3168].
The server response called 'Nonce' in the table is now historic. The server response called 'Nonce' in the table is now historic.
For an AccECN implementation, there is no need to recognize or For an AccECN implementation, there is no need to recognize or
support ECN Nonce feedback [RFC3540], which has been reclassified support ECN Nonce feedback [RFC3540], which has been reclassified
as historic [RFC8311]. AccECN is compatible with alternative ECN as historic [RFC8311]. AccECN is compatible with alternative ECN
feedback integrity approaches (see Section 4.3). feedback integrity approaches (see Section 5.3).
3. The third block shows the cases where the TCP server (B) supports 3. The third block shows the cases where the TCP server (B) supports
AccECN but the TCP client (A) supports some earlier variant of AccECN but the TCP client (A) supports some earlier variant of
TCP feedback, indicated in its SYN. TCP feedback, indicated in its SYN.
When an AccECN-enabled TCP server (B) receives a SYN with When an AccECN-enabled TCP server (B) receives a SYN with
AE,CWR,ECE = 0,1,1 it MUST do one of the following: AE,CWR,ECE = 0,1,1 it MUST do one of the following:
* set both its half connections into the classic ECN feedback * set both its half connections into the classic ECN feedback
mode and return a SYN/ACK with AE, CWR, ECE = 0,0,1 as shown. mode and return a SYN/ACK with AE, CWR, ECE = 0,0,1 as shown.
skipping to change at page 19, line 34 skipping to change at page 19, line 8
When a host first enters AccECN mode, in its role as a Data Receiver When a host first enters AccECN mode, in its role as a Data Receiver
it initializes its counters to r.cep = 5 and r.ceb = 0. The initial it initializes its counters to r.cep = 5 and r.ceb = 0. The initial
values of the other two byte counters depend on the Data Receiver's values of the other two byte counters depend on the Data Receiver's
choice of the order of fields it will use in the AccECN TCP Option choice of the order of fields it will use in the AccECN TCP Option
(see Section 3.2.3). If field order 0, it will initialize the (see Section 3.2.3). If field order 0, it will initialize the
remaining counters to r.e0b = 1; r.e1b = 0. If field order 1, it remaining counters to r.e0b = 1; r.e1b = 0. If field order 1, it
will initialize them to r.e0b = 0 and r.e1b = 0x800001. will initialize them to r.e0b = 0 and r.e1b = 0x800001.
Non-zero initial values are used to support a stateless handshake Non-zero initial values are used to support a stateless handshake
(see Section 4.1) and to be distinct from cases where the fields are (see Section 5.1) and to be distinct from cases where the fields are
incorrectly zeroed (e.g. by middleboxes - see Section 3.2.3.2.4). incorrectly zeroed (e.g. by middleboxes - see Section 3.2.3.2.4).
When a host enters AccECN mode, in its role as a Data Sender it When a host enters AccECN mode, in its role as a Data Sender it
initializes its counters to s.cep = 5 and s.ceb = 0. The initial initializes its counters to s.cep = 5 and s.ceb = 0. The initial
values of the other two byte counters depend on the peer's choice of values of the other two byte counters depend on the peer's choice of
the order of fields it will use in the AccECN TCP Option (see the order of fields it will use in the AccECN TCP Option (see
Section 3.2.3). If field order 0, it will initialize the remaining Section 3.2.3). If field order 0, it will initialize the remaining
counters to s.e0b = 1; s.e1b = 0. If field order 1, it will counters to s.e0b = 1; s.e1b = 0. If field order 1, it will
initialize them to s.e0b = 0 and s.e1b = 0x800001. initialize them to s.e0b = 0 and s.e1b = 0x800001.
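The initial values described above can be summarized in a short sketch. The function name and dictionary keys are illustrative assumptions, not part of the draft; only the numeric values come from the text. The same values apply to the Data Receiver's r.* counters and to the Data Sender's s.* counters.

```python
def initial_counters(field_order):
    """Return the initial AccECN counter values for one half-connection.

    Hypothetical helper: the name and the dict layout are assumptions
    made for illustration. The values are those stated in the text.
    """
    if field_order == 0:
        e0b, e1b = 1, 0
    elif field_order == 1:
        e0b, e1b = 0, 0x800001
    else:
        raise ValueError("field order must be 0 or 1")
    # cep counts CE-marked packets; ceb, e0b and e1b are byte counters.
    return {"cep": 5, "ceb": 0, "e0b": e0b, "e1b": e1b}
```

Because none of the resulting sets of values is all-zero, a peer can tell these apart from fields that were incorrectly zeroed in transit.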
skipping to change at page 21, line 5 skipping to change at page 20, line 28
Table 2). This shall be called the handshake encoding of the ACE Table 2). This shall be called the handshake encoding of the ACE
field, and it is the only exception to the rule that the ACE field field, and it is the only exception to the rule that the ACE field
carries the 3 least significant bits of the r.cep counter on packets carries the 3 least significant bits of the r.cep counter on packets
with SYN=0. with SYN=0.
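As a rough illustration of the general rule mentioned above: outside the handshake encoding, the ACE field is simply the low 3 bits of r.cep. A Data Sender can therefore recover the number of newly CE-marked packets only modulo 8. The function names below are hypothetical, and the sketch deliberately ignores the case where the counter cycled more than once between ACKs, which is the subject of the safety procedures in Section 3.2.2.5.2.

```python
def encode_ace(r_cep):
    """ACE value on a packet with SYN=0 (not the handshake encoding)."""
    return r_cep & 0b111  # 3 least significant bits of r.cep


def min_new_ce_packets(ace, s_cep):
    """Smallest increase in the CE packet count consistent with `ace`.

    Assumption for illustration: the true increase may be this value
    plus any multiple of 8 if ACKs were lost or delayed.
    """
    return (ace - s_cep) % 8
```

For example, if the sender's s.cep is 5 and the receiver's r.cep has advanced to 9, the ACE field carries 0b001 and the minimum inferred increase is 4.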
Normally, a TCP client acknowledges a SYN/ACK with an ACK that Normally, a TCP client acknowledges a SYN/ACK with an ACK that
satisfies the above conditions anyway (SYN=0, no data, no SACK satisfies the above conditions anyway (SYN=0, no data, no SACK
blocks). If an AccECN TCP client intends to acknowledge the SYN/ACK blocks). If an AccECN TCP client intends to acknowledge the SYN/ACK
with a packet that does not satisfy these conditions (e.g. it has with a packet that does not satisfy these conditions (e.g. it has
data to include on the ACK), it SHOULD first send a pure ACK that data to include on the ACK), it SHOULD first send a pure ACK that
does satisfy these conditions (see Section 4.2), so that it can feed does satisfy these conditions (see Section 5.2), so that it can feed
back which of the four values of the IP-ECN field arrived on the SYN/ back which of the four values of the IP-ECN field arrived on the SYN/
ACK. A valid exception to this "SHOULD" would be where the ACK. A valid exception to this "SHOULD" would be where the
implementation will only be used in an environment where mangling of implementation will only be used in an environment where mangling of
the ECN field is unlikely. the ECN field is unlikely.
+---------------------+---------------------+-----------------------+ +---------------------+---------------------+-----------------------+
| IP-ECN codepoint on | ACE on pure ACK of | r.cep of client in | | IP-ECN codepoint on | ACE on pure ACK of | r.cep of client in |
| SYN/ACK | SYN/ACK | AccECN mode | | SYN/ACK | SYN/ACK | AccECN mode |
+---------------------+---------------------+-----------------------+ +---------------------+---------------------+-----------------------+
| Not-ECT | 0b010 | 5 | | Not-ECT | 0b010 | 5 |
skipping to change at page 22, line 18 skipping to change at page 21, line 39
{Note 2}: If the server is in AccECN mode, these values are Currently {Note 2}: If the server is in AccECN mode, these values are Currently
Unused but the AccECN server's behaviour is still defined for forward Unused but the AccECN server's behaviour is still defined for forward
compatibility. Then the designer of a future protocol can know for compatibility. Then the designer of a future protocol can know for
certain what AccECN servers will do with these codepoints. certain what AccECN servers will do with these codepoints.
{Note 3}: In the case where a server that implements AccECN is also {Note 3}: In the case where a server that implements AccECN is also
using a stateless handshake (termed a SYN cookie) it will not using a stateless handshake (termed a SYN cookie) it will not
remember whether it entered AccECN mode. The values 0b000 or 0b001 remember whether it entered AccECN mode. The values 0b000 or 0b001
will remind it that it did not enter AccECN mode, because AccECN does will remind it that it did not enter AccECN mode, because AccECN does
not use them (see Section 4.1 for details). If a stateless server not use them (see Section 5.1 for details). If a stateless server
that implements AccECN receives either of these two values in the that implements AccECN receives either of these two values in the
ACK, its action is implementation-dependent and outside the scope of ACK, its action is implementation-dependent and outside the scope of
this spec. It will certainly not take the action in the third column this spec. It will certainly not take the action in the third column
because, after it receives either of these values, it is not in because, after it receives either of these values, it is not in
AccECN mode. I.e., it will not disable ECN (at least not just AccECN mode. I.e., it will not disable ECN (at least not just
because ACE is 0b000) and it will not set s.cep. because ACE is 0b000) and it will not set s.cep.
3.2.2.2. Encoding and Decoding Feedback in the ACE Field 3.2.2.2. Encoding and Decoding Feedback in the ACE Field
Whenever the Data Receiver sends an ACK with SYN=0 (with or without Whenever the Data Receiver sends an ACK with SYN=0 (with or without
skipping to change at page 25, line 27 skipping to change at page 24, line 43
The server can compare this with how it originally set the IP/ECN The server can compare this with how it originally set the IP/ECN
field on the SYN/ACK. If this comparison implies an unsafe field on the SYN/ACK. If this comparison implies an unsafe
transition of the IP/ECN field, for the remainder of the connection transition of the IP/ECN field, for the remainder of the connection
the server MUST NOT send ECN-capable packets, but it MUST continue to the server MUST NOT send ECN-capable packets, but it MUST continue to
feed back any ECN markings on arriving packets. feed back any ECN markings on arriving packets.
The ACK of the SYN/ACK is not reliably delivered (nonetheless, the The ACK of the SYN/ACK is not reliably delivered (nonetheless, the
count of CE marks is still eventually delivered reliably). If this count of CE marks is still eventually delivered reliably). If this
ACK does not arrive, the server can continue to send ECN-capable ACK does not arrive, the server can continue to send ECN-capable
packets without having tested for mangling of the IP/ECN field on the packets without having tested for mangling of the IP/ECN field on the
SYN/ACK. Experiments with AccECN deployment will assess whether this SYN/ACK.
limitation has any effect in practice.
Invalid transitions of the IP/ECN field are defined in [RFC3168] and Invalid transitions of the IP/ECN field are defined in [RFC3168] and
repeated here for convenience: repeated here for convenience:
o the not-ECT codepoint changes; o the not-ECT codepoint changes;
o either ECT codepoint transitions to not-ECT; o either ECT codepoint transitions to not-ECT;
o the CE codepoint changes. o the CE codepoint changes.
RFC 3168 says that a router that changes ECT to not-ECT is invalid RFC 3168 says that a router that changes ECT to not-ECT is invalid
but safe. However, from a host's viewpoint, this transition is but safe. However, from a host's viewpoint, this transition is
unsafe because it could be the result of two transitions at different unsafe because it could be the result of two transitions at different
routers on the path: ECT to CE (safe) then CE to not-ECT (unsafe). routers on the path: ECT to CE (safe) then CE to not-ECT (unsafe).
This scenario could well happen where an ECN-enabled home router This scenario could well happen where an ECN-enabled home router
congests its upstream mobile broadband bottleneck link, then the congests its upstream mobile broadband bottleneck link, then the
ingress to the mobile network clears the ECN field [Mandalari18]. ingress to the mobile network clears the ECN field [Mandalari18].
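The list of invalid transitions above can be expressed as a small check that a server might apply when comparing the IP/ECN value it sent on the SYN/ACK with the value fed back by the client. This is only a sketch: the function name and the string labels for codepoints are assumptions made for illustration. Following the host's viewpoint described above, every listed transition is treated as a reason to stop sending ECN-capable packets.

```python
NOT_ECT, ECT0, ECT1, CE = "Not-ECT", "ECT(0)", "ECT(1)", "CE"


def is_invalid_transition(sent, received):
    """True if sent -> received is one of the invalid transitions listed."""
    if sent == received:
        return False          # unchanged: nothing to report
    if sent == NOT_ECT:
        return True           # the not-ECT codepoint changes
    if sent in (ECT0, ECT1) and received == NOT_ECT:
        return True           # either ECT codepoint transitions to not-ECT
    if sent == CE:
        return True           # the CE codepoint changes
    # Remaining cases (e.g. ECT to CE) are not in the list above.
    return False
```

For instance, `is_invalid_transition(ECT0, CE)` is False because congestion marking is the expected behaviour, whereas `is_invalid_transition(ECT0, NOT_ECT)` is True.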
skipping to change at page 26, line 36 skipping to change at page 26, line 4
For the avoidance of doubt, the change-triggered ACK mechanism is For the avoidance of doubt, the change-triggered ACK mechanism is
deliberately worded to solely apply to data packets, and to ignore deliberately worded to solely apply to data packets, and to ignore
the arrival of a control packet with no payload, because it is the arrival of a control packet with no payload, because it is
important that TCP does not acknowledge pure ACKs. The change- important that TCP does not acknowledge pure ACKs. The change-
triggered ACK approach can lead to some additional ACKs but it feeds triggered ACK approach can lead to some additional ACKs but it feeds
back the timing and the order in which ECN marks are received with back the timing and the order in which ECN marks are received with
minimal additional complexity. If only CE marks are infrequent, or minimal additional complexity. If only CE marks are infrequent, or
there are multiple marks in a row, the additional load will be low. there are multiple marks in a row, the additional load will be low.
Other marking patterns could increase the load significantly. Other marking patterns could increase the load significantly.
Investigating the additional load is a goal of the proposed
experiment.
Even though the first bullet is stated as a "SHOULD", it is important Even though the first bullet is stated as a "SHOULD", it is important
for a transition to immediately trigger an ACK if at all possible, so for a transition to immediately trigger an ACK if at all possible, so
that the Data Sender can rely on change-triggered ACKs to detect that the Data Sender can rely on change-triggered ACKs to detect
queue growth as soon as possible, e.g. at the start of a flow. This queue growth as soon as possible, e.g. at the start of a flow. This
requirement can only be relaxed if certain offload hardware needed requirement can only be relaxed if certain offload hardware needed
for high performance cannot support change-triggered ACKs (although for high performance cannot support change-triggered ACKs (although
high performance protocols such as DCTCP already successfully use high performance protocols such as DCTCP already successfully use
change-triggered ACKs). One possible experimental compromise would change-triggered ACKs). One possible compromise would be for the
be for the receiver to heuristically detect whether the sender is in receiver to heuristically detect whether the sender is in slow-start,
slow-start, then to implement change-triggered ACKs while the sender then to implement change-triggered ACKs while the sender is in slow-
is in slow-start, and offload otherwise. start, and offload otherwise.
3.2.2.5.2. Data Sender Safety Procedures 3.2.2.5.2. Data Sender Safety Procedures
If the Data Sender has not received AccECN TCP Options to give it If the Data Sender has not received AccECN TCP Options to give it
more dependable information, and it detects that the ACE field could more dependable information, and it detects that the ACE field could
have cycled, it SHOULD deem whether it cycled by taking the safest have cycled, it SHOULD deem whether it cycled by taking the safest
likely case under the prevailing conditions. It can detect if the likely case under the prevailing conditions. It can detect if the
counter could have cycled by using the jump in the acknowledgement counter could have cycled by using the jump in the acknowledgement
number since the last ACK to calculate or estimate how many segments number since the last ACK to calculate or estimate how many segments
could have been acknowledged. An example algorithm to implement this could have been acknowledged. An example algorithm to implement this
skipping to change at page 28, line 28 skipping to change at page 27, line 28
| Kind = TBD1 | Length = 11 | EE1B field | | Kind = TBD1 | Length = 11 | EE1B field |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| EE1B (cont'd) | ECEB field | | EE1B (cont'd) | ECEB field |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| EE0B field | Order 1 | EE0B field | Order 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 4: The AccECN TCP Option Figure 4: The AccECN TCP Option
When a Data Receiver sends an AccECN Option, it MUST set the Kind When a Data Receiver sends an AccECN Option, it MUST set the Kind
field to TBD1, which is registered in Section 6 as a new TCP option field to TBD1, which is registered in Section 7 as a new TCP option
Kind called AccECN. An experimental TCP option with Kind=254 MAY be Kind called AccECN.
used for initial experiments, with magic number 0xACCE.
Figure 4 shows two option field orders; order 0 and order 1. They Figure 4 shows two option field orders; order 0 and order 1. They
both consist of three 24-bit fields. Order 0 provides the 24 least both consist of three 24-bit fields. Order 0 provides the 24 least
significant bits of the r.e0b, r.ceb and r.e1b counters, significant bits of the r.e0b, r.ceb and r.e1b counters,
respectively. Order 1 provides the same fields, but in the opposite respectively. Order 1 provides the same fields, but in the opposite
order. Each half-connection can use a different field order, but a order. Each half-connection can use a different field order, but a
Data Receiver MUST consistently send the same field order within the Data Receiver MUST consistently send the same field order within the
same half-connection. same half-connection.
The field order to use for each half-connection is up to the Data The field order to use for each half-connection is up to the Data
skipping to change at page 29, line 37 skipping to change at page 28, line 34
| Length | Type 0 | Type 1 | | Length | Type 0 | Type 1 |
+--------+------------------+------------------+ +--------+------------------+------------------+
| 11 | EE0B, ECEB, EE1B | EE1B, ECEB, EE0B | | 11 | EE0B, ECEB, EE1B | EE1B, ECEB, EE0B |
| 8 | EE0B, ECEB | EE1B, ECEB | | 8 | EE0B, ECEB | EE1B, ECEB |
| 5 | EE0B | EE1B | | 5 | EE0B | EE1B |
| 2 | (empty) | (empty) | | 2 | (empty) | (empty) |
+--------+------------------+------------------+ +--------+------------------+------------------+
The empty option of Length=2 is provided to allow for a case where an The empty option of Length=2 is provided to allow for a case where an
AccECN Option has to be sent (e.g. on the SYN/ACK to test the path), AccECN Option has to be sent (e.g. on the SYN/ACK to test the path),
but there is very limited space for the option. For initial but there is very limited space for the option.
experiments, the Length field MUST be 2 greater to accommodate the
16-bit magic number.
All implementations of a Data Sender that read any AccECN Option MUST All implementations of a Data Sender that read any AccECN Option MUST
be able to read in AccECN Options of any of the above lengths. For be able to read in AccECN Options of any of the above lengths. For
forward compatibility, if the AccECN Option is of any other length, forward compatibility, if the AccECN Option is of any other length,
implementations MUST use those whole 3 octet fields that fit within implementations MUST use those whole 3 octet fields that fit within
the length and ignore the remainder of the option. the length and ignore the remainder of the option.
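The length handling above might be sketched as follows. The function and field names are hypothetical, the field order is assumed to be already known to the caller, and the Kind octet is not checked because its value is still TBD1. Capping the result at the three defined fields is also an assumption of this sketch.

```python
FIELD_NAMES = {
    0: ("ee0b", "eceb", "ee1b"),   # field order 0
    1: ("ee1b", "eceb", "ee0b"),   # field order 1
}


def parse_accecn_option(option, field_order):
    """Parse an AccECN Option given as bytes starting at the Kind octet.

    Returns a dict of the 24-bit fields that are fully present.
    """
    if len(option) < 2:
        raise ValueError("option too short for Kind and Length")
    length = option[1]
    if length < 2 or length > len(option):
        raise ValueError("bad Length field")
    payload = option[2:length]
    names = FIELD_NAMES[field_order]
    # Use only whole 3-octet fields that fit; ignore any remainder.
    count = min(len(payload) // 3, len(names))
    return {
        names[i]: int.from_bytes(payload[3 * i:3 * i + 3], "big")
        for i in range(count)
    }
```

With this approach an option of Length=9 yields the same two fields as one of Length=8, and the empty option of Length=2 yields no fields at all.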
The AccECN Option has to be optional to implement, because both The AccECN Option has to be optional to implement, because both
sender and receiver have to be able to cope without the option anyway sender and receiver have to be able to cope without the option anyway
- in cases where it does not traverse a network path. It is - in cases where it does not traverse a network path. It is
skipping to change at page 33, line 50 skipping to change at page 32, line 48
AccECN Option is also available. AccECN Option is also available.
If the AccECN option is present, the s.cep counter might increase If the AccECN option is present, the s.cep counter might increase
while the s.ceb counter does not (e.g. due to a CE-marked control while the s.ceb counter does not (e.g. due to a CE-marked control
packet). The sender's response to such a situation is out of scope, packet). The sender's response to such a situation is out of scope,
and needs to be dealt with in a specification that uses ECN-capable and needs to be dealt with in a specification that uses ECN-capable
control packets. Theoretically, this situation could also occur if a control packets. Theoretically, this situation could also occur if a
middlebox mangled the AccECN Option but not the ACE field. However, middlebox mangled the AccECN Option but not the ACE field. However,
the Data Sender has to assume that the integrity of the AccECN Option the Data Sender has to assume that the integrity of the AccECN Option
is sound, based on the above test of the well-known initial values is sound, based on the above test of the well-known initial values
and optionally other integrity tests (Section 4.3). and optionally other integrity tests (Section 5.3).
If either end-point detects that the s.ceb counter has increased but If either end-point detects that the s.ceb counter has increased but
the s.cep has not (and by testing ACK coverage it is certain how much the s.cep has not (and by testing ACK coverage it is certain how much
the ACE field has wrapped), this invalid protocol transition has to the ACE field has wrapped), this invalid protocol transition has to
be due to some form of feedback mangling. So, the Data Sender MUST be due to some form of feedback mangling. So, the Data Sender MUST
disable sending ECN-capable packets for the remainder of the half- disable sending ECN-capable packets for the remainder of the half-
connection by setting the IP/ECN field in all subsequent packets to connection by setting the IP/ECN field in all subsequent packets to
Not-ECT. Not-ECT.
3.2.3.3. Usage of the AccECN TCP Option 3.2.3.3. Usage of the AccECN TCP Option
skipping to change at page 37, line 35 skipping to change at page 36, line 32
move beyond step marking. Before this can happen, offload hardware move beyond step marking. Before this can happen, offload hardware
will have to explicitly address the variability of ECN feedback. will have to explicitly address the variability of ECN feedback.
ECN encodes a varying signal in the ACK stream, so it is inevitable ECN encodes a varying signal in the ACK stream, so it is inevitable
that offload hardware will ultimately need to handle any form of ECN that offload hardware will ultimately need to handle any form of ECN
feedback exceptionally. The purpose of working towards standardized feedback exceptionally. The purpose of working towards standardized
TCP ECN feedback is to reduce the risk for hardware developers, who TCP ECN feedback is to reduce the risk for hardware developers, who
would otherwise have to guess which scheme is likely to become would otherwise have to guess which scheme is likely to become
dominant. dominant.
4. Interaction with Other TCP Variants 4. Updates to RFC 3168
Normative statements in the following sections of RFC3168 are updated
by the present AccECN specification:
o The whole of "6.1.1 TCP Initialization" of [RFC3168] is updated by
Section 3.1 of the present specification.
o In "6.1.2. The TCP Sender" of [RFC3168], all mentions of a
congestion response to an ECN-Echo (ECE) ACK packet are updated by
Section 3.2 of the present specification to mean an increment to
the sender's count of CE-marked packets, s.cep. And the
requirements to set the CWR flag no longer apply, as specified in
Section 3.1.5 of the present specification. Otherwise, the
remaining requirements in "6.1.2. The TCP Sender" still stand.
It will be noted that RFC 8311 already updates, or potentially
updates, a number of the requirements in "6.1.2. The TCP Sender".
Section 6.1.2 of RFC 3168 extended standard TCP congestion control
[RFC5681] to cover ECN marking as well as packet drop. Whereas,
RFC 8311 enables experimentation with alternative responses to ECN
marking, if specified for instance by an experimental RFC on the
IETF document stream. RFC 8311 also strengthened the statement
that "ECT(0) SHOULD be used" to a "MUST" (see [RFC8311] for the
details).
o The whole of "6.1.3. The TCP Receiver" of [RFC3168] is updated by
Section 3.2 of the present specification, with the exception of
the last paragraph (about congestion response to drop and ECN in
the same round trip), which still stands. Incidentally, this last
paragraph is in the wrong section, because it relates to TCP
sender behaviour.
o The following text within "6.1.5. Retransmitted TCP packets":
"the TCP data receiver SHOULD ignore the ECN field on arriving
data packets that are outside of the receiver's current
window."
is updated by more stringent acceptability tests for any packet
(not just data packets) in the present specification.
Specifically, in the normative specification of AccECN (Section 3)
only 'Acceptable' packets contribute to the ECN counters at the
AccECN receiver and Section 1.3 defines an Acceptable packet as
one that passes the acceptability tests in both [RFC0793] and
[RFC5961].
o Sections 5.2, 6.1.1, 6.1.4, 6.1.5 and 6.1.6 of [RFC3168] prohibit
use of ECN on TCP control packets and retransmissions. The
present specification does not update that aspect of RFC 3168, but
it does say what feedback an AccECN Data Receiver should provide
if it receives an ECN-capable control packet or retransmission.
This ensures AccECN is forward compatible with any future scheme
that allows ECN on these packets, as provided for in section 4.3
of [RFC8311] and as proposed in [I-D.ietf-tcpm-generalized-ecn].
5. Interaction with TCP Variants
This section is informative, not normative. This section is informative, not normative.
4.1. Compatibility with SYN Cookies 5.1. Compatibility with SYN Cookies
A TCP server can use SYN Cookies (see Appendix A of [RFC4987]) to A TCP server can use SYN Cookies (see Appendix A of [RFC4987]) to
protect itself from SYN flooding attacks. It places minimal commonly protect itself from SYN flooding attacks. It places minimal commonly
used connection state in the SYN/ACK, and deliberately does not hold used connection state in the SYN/ACK, and deliberately does not hold
any state while waiting for the subsequent ACK (e.g. it closes the any state while waiting for the subsequent ACK (e.g. it closes the
thread). Therefore it cannot record the fact that it entered AccECN thread). Therefore it cannot record the fact that it entered AccECN
mode for both half-connections. Indeed, it cannot even remember mode for both half-connections. Indeed, it cannot even remember
whether it negotiated the use of classic ECN [RFC3168]. whether it negotiated the use of classic ECN [RFC3168].
Nonetheless, such a server can determine that it negotiated AccECN as Nonetheless, such a server can determine that it negotiated AccECN as
skipping to change at page 38, line 23 skipping to change at page 38, line 29
earlier. earlier.
If the pure ACK that acknowledges a SYN cookie contains an ACE field If the pure ACK that acknowledges a SYN cookie contains an ACE field
with the value 0b000 or 0b001, these values indicate that the client with the value 0b000 or 0b001, these values indicate that the client
did not request support for AccECN and therefore the server does not did not request support for AccECN and therefore the server does not
enter AccECN mode for this connection. Further, 0b001 on the ACK enter AccECN mode for this connection. Further, 0b001 on the ACK
implies that the server sent an ECN-capable SYN/ACK, which was marked implies that the server sent an ECN-capable SYN/ACK, which was marked
CE in the network, and the non-AccECN client fed this back by setting CE in the network, and the non-AccECN client fed this back by setting
ECE on the ACK of the SYN/ACK. ECE on the ACK of the SYN/ACK.
4.2. Compatibility with Other TCP Options and Experiments 5.2. Compatibility with TCP Experiments and Common TCP Options
AccECN is compatible (at least on paper) with the most commonly used AccECN is compatible (at least on paper) with the most commonly used
TCP options: MSS, time-stamp, window scaling, SACK and TCP-AO. It is TCP options: MSS, time-stamp, window scaling, SACK and TCP-AO. It is
also compatible with the recent promising experimental TCP options also compatible with the recent promising experimental TCP options
TCP Fast Open (TFO [RFC7413]) and Multipath TCP (MPTCP [RFC6824]). TCP Fast Open (TFO [RFC7413]) and Multipath TCP (MPTCP [RFC6824]).
AccECN is friendly to all these protocols, because space for TCP AccECN is friendly to all these protocols, because space for TCP
options is particularly scarce on the SYN, where AccECN consumes zero options is particularly scarce on the SYN, where AccECN consumes zero
additional header space. additional header space.
When option space is under pressure from other options, When option space is under pressure from other options,
Section 3.2.3.3 provides guidance on how important it is to send an Section 3.2.3.3 provides guidance on how important it is to send an
AccECN Option and whether it needs to be a full-length option. AccECN Option and whether it needs to be a full-length option.
Implementers of TFO need to take careful note of the recommendation Implementers of TFO need to take careful note of the recommendation
in Section 3.2.2.1. That section recommends that, if the client has in Section 3.2.2.1. That section recommends that, if the client has
successfully negotiated AccECN, when acknowledging the SYN/ACK, even successfully negotiated AccECN, when acknowledging the SYN/ACK, even
if it has data to send, it sends a pure ACK immediately before the if it has data to send, it sends a pure ACK immediately before the
data. Then it can reflect the IP-ECN field of the SYN/ACK on this data. Then it can reflect the IP-ECN field of the SYN/ACK on this
pure ACK, which allows the server to detect ECN mangling. pure ACK, which allows the server to detect ECN mangling.
4.3. Compatibility with Feedback Integrity Mechanisms 5.3. Compatibility with Feedback Integrity Mechanisms
Three alternative mechanisms are available to assure the integrity of Three alternative mechanisms are available to assure the integrity of
ECN and/or loss signals. AccECN is compatible with any of these ECN and/or loss signals. AccECN is compatible with any of these
approaches: approaches:
o The Data Sender can test the integrity of the receiver's ECN (or o The Data Sender can test the integrity of the receiver's ECN (or
loss) feedback by occasionally setting the IP-ECN field to a value loss) feedback by occasionally setting the IP-ECN field to a value
normally only set by the network (and/or deliberately leaving a normally only set by the network (and/or deliberately leaving a
sequence number gap). Then it can test whether the Data sequence number gap). Then it can test whether the Data
Receiver's feedback faithfully reports what it expects (similar to Receiver's feedback faithfully reports what it expects (similar to
skipping to change at page 40, line 5 skipping to change at page 40, line 8
resegmentation or shifting the sequence space. resegmentation or shifting the sequence space.
Originally the ECN Nonce [RFC3540] was proposed to ensure integrity Originally the ECN Nonce [RFC3540] was proposed to ensure integrity
of congestion feedback. With minor changes AccECN could be optimized of congestion feedback. With minor changes AccECN could be optimized
for the possibility that the ECT(1) codepoint might be used as an ECN for the possibility that the ECT(1) codepoint might be used as an ECN
Nonce. However, given RFC 3540 has been reclassified as historic, Nonce. However, given RFC 3540 has been reclassified as historic,
the AccECN design has been generalized so that it ought to be able to the AccECN design has been generalized so that it ought to be able to
support other possible uses of the ECT(1) codepoint, such as a lower support other possible uses of the ECT(1) codepoint, such as a lower
severity or a more instant congestion signal than CE. severity or a more instant congestion signal than CE.
5. Protocol Properties 6. Protocol Properties
This section is informative, not normative. It describes how well the This section is informative, not normative. It describes how well the
protocol satisfies the agreed requirements for a more accurate ECN protocol satisfies the agreed requirements for a more accurate ECN
feedback protocol [RFC7560]. feedback protocol [RFC7560].
Accuracy: From each ACK, the Data Sender can infer the number of new Accuracy: From each ACK, the Data Sender can infer the number of new
CE marked segments since the previous ACK. This provides better CE marked segments since the previous ACK. This provides better
accuracy on CE feedback than classic ECN. In addition if the accuracy on CE feedback than classic ECN. In addition if the
AccECN Option is present (not blocked by the network path) the AccECN Option is present (not blocked by the network path) the
number of bytes marked with CE, ECT(1) and ECT(0) are provided. number of bytes marked with CE, ECT(1) and ECT(0) are provided.
skipping to change at page 42, line 9 skipping to change at page 42, line 15
Forward Compatibility: The behaviour of endpoints and middleboxes is Forward Compatibility: The behaviour of endpoints and middleboxes is
carefully defined for all reserved or currently unused codepoints carefully defined for all reserved or currently unused codepoints
in the scheme. Then, the designers of security devices can in the scheme. Then, the designers of security devices can
understand which currently unused values might appear in future. understand which currently unused values might appear in future.
So, even if they choose to treat such values as anomalous while So, even if they choose to treat such values as anomalous while
they are not widely used, any blocking will at least be under they are not widely used, any blocking will at least be under
policy control not hard-coded. Then, if previously unused values policy control not hard-coded. Then, if previously unused values
start to appear on the Internet (or in standards), such policies start to appear on the Internet (or in standards), such policies
could be quickly reversed. could be quickly reversed.
6. IANA Considerations 7. IANA Considerations
This document reassigns bit 7 of the TCP header flags to the AccECN This document reassigns bit 7 of the TCP header flags to the AccECN
experiment. This bit was previously called the Nonce Sum (NS) flag experiment. This bit was previously called the Nonce Sum (NS) flag
[RFC3540], but RFC 3540 has been reclassified as historic [RFC8311]. [RFC3540], but RFC 3540 has been reclassified as historic [RFC8311].
The flag will now be defined as: The flag will now be defined as:
+-----+-------------------+-----------+ +-----+-------------------+-----------+
| Bit | Name | Reference | | Bit | Name | Reference |
+-----+-------------------+-----------+ +-----+-------------------+-----------+
| 7 | AE (Accurate ECN) | RFC XXXX | | 7 | AE (Accurate ECN) | RFC XXXX |
skipping to change at page 42, line 44 skipping to change at page 42, line 50
+------+--------+-----------------------+-----------+ +------+--------+-----------------------+-----------+
| Kind | Length | Meaning | Reference | | Kind | Length | Meaning | Reference |
+------+--------+-----------------------+-----------+ +------+--------+-----------------------+-----------+
| TBD1 | N | Accurate ECN (AccECN) | RFC XXXX | | TBD1 | N | Accurate ECN (AccECN) | RFC XXXX |
+------+--------+-----------------------+-----------+ +------+--------+-----------------------+-----------+
[TO BE REMOVED: This registration should take place at the following [TO BE REMOVED: This registration should take place at the following
location: http://www.iana.org/assignments/tcp-parameters/tcp- location: http://www.iana.org/assignments/tcp-parameters/tcp-
parameters.xhtml#tcp-parameters-1 ] parameters.xhtml#tcp-parameters-1 ]
Early implementation before the IANA allocation MUST follow [RFC6994] Early implementations using experimental option 254 per [RFC6994]
and use experimental option 254 and magic number 0xACCE (16 bits), with magic number 0xACCE (16 bits), as allocated in the IANA "TCP
then migrate to the new option after the allocation. Experimental Option Experiment Identifiers (TCP ExIDs)" registry,
SHOULD migrate to use this new option kind (TBD1).
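
As an illustration of the experimental encoding referred to above, the following minimal sketch walks a raw TCP options block and looks for an option of kind 254 whose first two payload octets carry the 16-bit magic number 0xACCE, in the style of [RFC6994]. The function name and the decision to return the remaining octets uninterpreted are assumptions made for this example; the layout of the AccECN fields after the magic number is defined elsewhere in the draft and is not decoded here.

```python
import struct

TCPOPT_EOL = 0          # End of Option List
TCPOPT_NOP = 1          # No-Operation (single octet, no length field)
TCPOPT_EXP_254 = 254    # experimental option kind named in the text
ACCECN_EXID = 0xACCE    # 16-bit magic number named in the text


def find_experimental_accecn(options: bytes):
    """Return the octets that follow the 0xACCE magic number, or None.

    Hypothetical helper for illustration only: it locates the option
    but does not interpret the AccECN fields that follow the ExID.
    """
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == TCPOPT_EOL:
            break
        if kind == TCPOPT_NOP:
            i += 1
            continue
        if i + 1 >= len(options):
            break  # truncated: no length octet
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break  # malformed length, stop parsing
        if kind == TCPOPT_EXP_254 and length >= 4:
            (exid,) = struct.unpack_from("!H", options, i + 2)
            if exid == ACCECN_EXID:
                return options[i + 4 : i + length]
        i += length
    return None
```

A receiver that supports both encodings could run a check like this alongside a check for the assigned kind (TBD1) during the migration period.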
[TO BE REMOVED: The description of the 0xACCE value in the TCP ExIDs
registry should be changed to "AccECN (current and new
implementations SHOULD use option kind TBD1)" at the following
location: https://www.iana.org/assignments/tcp-parameters/tcp-parameters.xhtml#tcp-exids ]
7. Security Considerations 8. Security Considerations
If ever the supplementary part of AccECN based on the new AccECN TCP If ever the supplementary part of AccECN based on the new AccECN TCP
Option is unusable (due for example to middlebox interference) the Option is unusable (due for example to middlebox interference) the
essential part of AccECN's congestion feedback offers only limited essential part of AccECN's congestion feedback offers only limited
resilience to long runs of ACK loss (see Section 3.2.2.5). These resilience to long runs of ACK loss (see Section 3.2.2.5). These
problems are unlikely to be due to malicious intervention (because if problems are unlikely to be due to malicious intervention (because if
an attacker could strip a TCP option or discard a long run of ACKs it an attacker could strip a TCP option or discard a long run of ACKs it
could wreak other arbitrary havoc). However, it would be of concern could wreak other arbitrary havoc). However, it would be of concern
if AccECN's resilience could be indirectly compromised during a if AccECN's resilience could be indirectly compromised during a
flooding attack. AccECN is still considered safe though, because if flooding attack. AccECN is still considered safe though, because if
the option is not presented, the AccECN Data Sender is then required the option is not presented, the AccECN Data Sender is then required
to switch to more conservative assumptions about wrap of congestion to switch to more conservative assumptions about wrap of congestion
indication counters (see Section 3.2.2.5 and Appendix A.2). indication counters (see Section 3.2.2.5 and Appendix A.2).
Section 4.1 describes how a TCP server can negotiate AccECN and use Section 5.1 describes how a TCP server can negotiate AccECN and use
the SYN cookie method for mitigating SYN flooding attacks. the SYN cookie method for mitigating SYN flooding attacks.
There is concern that ECN markings could be altered or suppressed, There is concern that ECN markings could be altered or suppressed,
particularly because a misbehaving Data Receiver could increase its particularly because a misbehaving Data Receiver could increase its
own throughput at the expense of others. AccECN is compatible with own throughput at the expense of others. AccECN is compatible with
the three schemes known to assure the integrity of ECN feedback (see the three schemes known to assure the integrity of ECN feedback (see
Section 4.3 for details). If the AccECN Option is stripped by an Section 5.3 for details). If the AccECN Option is stripped by an
incorrectly implemented middlebox, the resolution of the feedback incorrectly implemented middlebox, the resolution of the feedback
will be degraded, but the integrity of this degraded information can will be degraded, but the integrity of this degraded information can
still be assured. still be assured.
There is a potential concern that a receiver could deliberately omit There is a potential concern that a receiver could deliberately omit
the AccECN Option pretending that it had been stripped by a the AccECN Option pretending that it had been stripped by a
middlebox. No known way can yet be contrived to take advantage of middlebox. No known way can yet be contrived to take advantage of
this downgrade attack, but it is mentioned here in case someone else this downgrade attack, but it is mentioned here in case someone else
can contrive one. can contrive one.
The AccECN protocol is not believed to introduce any new privacy The AccECN protocol is not believed to introduce any new privacy
concerns, because it merely counts and feeds back signals at the concerns, because it merely counts and feeds back signals at the
transport layer that had already been visible at the IP layer. transport layer that had already been visible at the IP layer.
8. Acknowledgements 9. Acknowledgements
We want to thank Koen De Schepper, Praveen Balasubramanian, Michael We want to thank Koen De Schepper, Praveen Balasubramanian, Michael
Welzl, Gorry Fairhurst, David Black, Spencer Dawkins, Michael Scharf, Welzl, Gorry Fairhurst, David Black, Spencer Dawkins, Michael Scharf,
Michael Tuexen, Yuchung Cheng, Kenjiro Cho, Olivier Tilmans and Ilpo Michael Tuexen, Yuchung Cheng, Kenjiro Cho, Olivier Tilmans and Ilpo
Jaervinen for their input and discussion. The idea of using the Jaervinen for their input and discussion. The idea of using the
three ECN-related TCP flags as one field for more accurate TCP-ECN three ECN-related TCP flags as one field for more accurate TCP-ECN
feedback was first introduced in the re-ECN protocol that was the feedback was first introduced in the re-ECN protocol that was the
ancestor of ConEx. ancestor of ConEx.
Bob Briscoe was part-funded by the Comcast Innovation Fund, the Bob Briscoe was part-funded by the Comcast Innovation Fund, the
skipping to change at page 44, line 18 skipping to change at page 44, line 28
through the Trilogy 2 project (ICT-317756), and the Research Council through the Trilogy 2 project (ICT-317756), and the Research Council
of Norway through the TimeIn project. The views expressed here are of Norway through the TimeIn project. The views expressed here are
solely those of the authors. solely those of the authors.
Mirja Kuehlewind was partly supported by the European Commission Mirja Kuehlewind was partly supported by the European Commission
under Horizon 2020 grant agreement no. 688421 Measurement and under Horizon 2020 grant agreement no. 688421 Measurement and
Architecture for a Middleboxed Internet (MAMI), and by the Swiss Architecture for a Middleboxed Internet (MAMI), and by the Swiss
State Secretariat for Education, Research, and Innovation under State Secretariat for Education, Research, and Innovation under
contract no. 15.0268. This support does not imply endorsement. contract no. 15.0268. This support does not imply endorsement.
9. Comments Solicited 10. Comments Solicited
Comments and questions are encouraged and very welcome. They can be Comments and questions are encouraged and very welcome. They can be
addressed to the IETF TCP maintenance and minor modifications working addressed to the IETF TCP maintenance and minor modifications working
group mailing list <tcpm@ietf.org>, and/or to the authors. group mailing list <tcpm@ietf.org>, and/or to the authors.
10. References 11. References
10.1. Normative References 11.1. Normative References
[RFC0793] Postel, J., "Transmission Control Protocol", STD 7, [RFC0793] Postel, J., "Transmission Control Protocol", STD 7,
RFC 793, DOI 10.17487/RFC0793, September 1981, RFC 793, DOI 10.17487/RFC0793, September 1981,
<https://www.rfc-editor.org/info/rfc793>. <https://www.rfc-editor.org/info/rfc793>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
skipping to change at page 45, line 5 skipping to change at page 45, line 13
<https://www.rfc-editor.org/info/rfc3168>. <https://www.rfc-editor.org/info/rfc3168>.
[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, Control", RFC 5681, DOI 10.17487/RFC5681, September 2009,
<https://www.rfc-editor.org/info/rfc5681>. <https://www.rfc-editor.org/info/rfc5681>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>. May 2017, <https://www.rfc-editor.org/info/rfc8174>.
10.2. Informative References 11.2. Informative References
[I-D.ietf-tcpm-2140bis] [I-D.ietf-tcpm-2140bis]
Touch, J., Welzl, M., and S. Islam, "TCP Control Block Touch, J., Welzl, M., and S. Islam, "TCP Control Block
Interdependence", draft-ietf-tcpm-2140bis-02 (work in Interdependence", draft-ietf-tcpm-2140bis-02 (work in
progress), February 2020. progress), February 2020.
[I-D.ietf-tcpm-generalized-ecn] [I-D.ietf-tcpm-generalized-ecn]
Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit
Congestion Notification (ECN) to TCP Control Packets", Congestion Notification (ECN) to TCP Control Packets",
draft-ietf-tcpm-generalized-ecn-05 (work in progress), draft-ietf-tcpm-generalized-ecn-05 (work in progress),
 End of changes. 54 change blocks. 
173 lines changed or deleted 207 lines changed or added
