draft-ietf-tcpm-accurate-ecn-05.txt   draft-ietf-tcpm-accurate-ecn-06.txt 
TCP Maintenance & Minor Extensions (tcpm) B. Briscoe TCP Maintenance & Minor Extensions (tcpm) B. Briscoe
Internet-Draft CableLabs Internet-Draft CableLabs
Intended status: Experimental M. Kuehlewind Intended status: Experimental M. Kuehlewind
Expires: May 15, 2018 ETH Zurich Expires: September 6, 2018 ETH Zurich
R. Scheffenegger R. Scheffenegger
November 11, 2017 March 5, 2018
More Accurate ECN Feedback in TCP More Accurate ECN Feedback in TCP
draft-ietf-tcpm-accurate-ecn-05 draft-ietf-tcpm-accurate-ecn-06
Abstract Abstract
Explicit Congestion Notification (ECN) is a mechanism where network Explicit Congestion Notification (ECN) is a mechanism where network
nodes can mark IP packets instead of dropping them to indicate nodes can mark IP packets instead of dropping them to indicate
incipient congestion to the end-points. Receivers with an ECN- incipient congestion to the end-points. Receivers with an ECN-
capable transport protocol feed back this information to the sender. capable transport protocol feed back this information to the sender.
ECN is specified for TCP in such a way that only one feedback signal ECN is specified for TCP in such a way that only one feedback signal
can be transmitted per Round-Trip Time (RTT). Recently, new TCP can be transmitted per Round-Trip Time (RTT). Recently, new TCP
mechanisms like Congestion Exposure (ConEx) or Data Center TCP mechanisms like Congestion Exposure (ConEx) or Data Center TCP
(DCTCP) need more accurate ECN feedback information whenever more (DCTCP) need more accurate ECN feedback information whenever more
than one marking is received in one RTT. This document specifies an than one marking is received in one RTT. This document specifies an
experimental scheme to provide more than one feedback signal per RTT experimental scheme to provide more than one feedback signal per RTT
in the TCP header. Given TCP header space is scarce, it overloads in the TCP header. Given TCP header space is scarce, it overloads
the three existing ECN-related flags in the TCP header and provides the three existing ECN-related flags in the TCP header and provides
additional information in a new TCP option. additional information in a new TCP option.
Status of This Memo Status of This Memo
skipping to change at page 1, line 44 skipping to change at page 1, line 44
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 15, 2018. This Internet-Draft will expire on September 6, 2018.
Copyright Notice Copyright Notice
Copyright (c) 2017 IETF Trust and the persons identified as the Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Document Roadmap . . . . . . . . . . . . . . . . . . . . 4 1.1. Document Roadmap . . . . . . . . . . . . . . . . . . . . 4
1.2. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.2. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3. Experiment Goals . . . . . . . . . . . . . . . . . . . . 5 1.3. Experiment Goals . . . . . . . . . . . . . . . . . . . . 5
1.4. Terminology . . . . . . . . . . . . . . . . . . . . . . . 6 1.4. Terminology . . . . . . . . . . . . . . . . . . . . . . . 6
1.5. Recap of Existing ECN feedback in IP/TCP . . . . . . . . 6 1.5. Recap of Existing ECN feedback in IP/TCP . . . . . . . . 7
2. AccECN Protocol Overview and Rationale . . . . . . . . . . . 7 2. AccECN Protocol Overview and Rationale . . . . . . . . . . . 8
2.1. Capability Negotiation . . . . . . . . . . . . . . . . . 9 2.1. Capability Negotiation . . . . . . . . . . . . . . . . . 9
2.2. Feedback Mechanism . . . . . . . . . . . . . . . . . . . 9 2.2. Feedback Mechanism . . . . . . . . . . . . . . . . . . . 9
2.3. Delayed ACKs and Resilience Against ACK Loss . . . . . . 9 2.3. Delayed ACKs and Resilience Against ACK Loss . . . . . . 10
2.4. Feedback Metrics . . . . . . . . . . . . . . . . . . . . 10 2.4. Feedback Metrics . . . . . . . . . . . . . . . . . . . . 10
2.5. Generic (Dumb) Reflector . . . . . . . . . . . . . . . . 11 2.5. Generic (Dumb) Reflector . . . . . . . . . . . . . . . . 11
3. AccECN Protocol Specification . . . . . . . . . . . . . . . . 12 3. AccECN Protocol Specification . . . . . . . . . . . . . . . . 12
3.1. Negotiating to use AccECN . . . . . . . . . . . . . . . . 12 3.1. Negotiating to use AccECN . . . . . . . . . . . . . . . . 12
3.1.1. Negotiation during the TCP handshake . . . . . . . . 12 3.1.1. Negotiation during the TCP handshake . . . . . . . . 12
3.1.2. Retransmission of the SYN . . . . . . . . . . . . . . 14 3.1.2. Retransmission of the SYN . . . . . . . . . . . . . . 14
3.2. AccECN Feedback . . . . . . . . . . . . . . . . . . . . . 15 3.2. AccECN Feedback . . . . . . . . . . . . . . . . . . . . . 15
3.2.1. Initialization of Feedback Counters at the Data 3.2.1. Initialization of Feedback Counters at the Data
Sender . . . . . . . . . . . . . . . . . . . . . . . 15 Sender . . . . . . . . . . . . . . . . . . . . . . . 15
3.2.2. The ACE Field . . . . . . . . . . . . . . . . . . . . 16 3.2.2. The ACE Field . . . . . . . . . . . . . . . . . . . . 16
3.2.3. Testing for Zeroing of the ACE Field . . . . . . . . 17 3.2.3. Testing for Zeroing of the ACE Field . . . . . . . . 18
3.2.4. Testing for Mangling of the IP/ECN Field . . . . . . 18 3.2.4. Testing for Mangling of the IP/ECN Field . . . . . . 18
3.2.5. Safety against Ambiguity of the ACE Field . . . . . . 19 3.2.5. Safety against Ambiguity of the ACE Field . . . . . . 19
3.2.6. The AccECN Option . . . . . . . . . . . . . . . . . . 20 3.2.6. The AccECN Option . . . . . . . . . . . . . . . . . . 20
3.2.7. Path Traversal of the AccECN Option . . . . . . . . . 21 3.2.7. Path Traversal of the AccECN Option . . . . . . . . . 21
3.2.8. Usage of the AccECN TCP Option . . . . . . . . . . . 24 3.2.8. Usage of the AccECN TCP Option . . . . . . . . . . . 24
3.3. AccECN Compliance by TCP Proxies, Offload Engines and 3.3. Requirements for TCP Proxies, Offload Engines and other
other Middleboxes . . . . . . . . . . . . . . . . . . . . 26 Middleboxes on AccECN Compliance . . . . . . . . . . . . 26
4. Interaction with Other TCP Variants . . . . . . . . . . . . . 26 4. Interaction with Other TCP Variants . . . . . . . . . . . . . 27
4.1. Compatibility with SYN Cookies . . . . . . . . . . . . . 27 4.1. Compatibility with SYN Cookies . . . . . . . . . . . . . 27
4.2. Compatibility with Other TCP Options and Experiments . . 27 4.2. Compatibility with Other TCP Options and Experiments . . 28
4.3. Compatibility with Feedback Integrity Mechanisms . . . . 28 4.3. Compatibility with Feedback Integrity Mechanisms . . . . 28
5. Protocol Properties . . . . . . . . . . . . . . . . . . . . . 29 5. Protocol Properties . . . . . . . . . . . . . . . . . . . . . 29
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 31 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 31
7. Security Considerations . . . . . . . . . . . . . . . . . . . 31 7. Security Considerations . . . . . . . . . . . . . . . . . . . 32
8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 32 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 32
9. Comments Solicited . . . . . . . . . . . . . . . . . . . . . 33 9. Comments Solicited . . . . . . . . . . . . . . . . . . . . . 33
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 33 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 33
10.1. Normative References . . . . . . . . . . . . . . . . . . 33 10.1. Normative References . . . . . . . . . . . . . . . . . . 33
10.2. Informative References . . . . . . . . . . . . . . . . . 33 10.2. Informative References . . . . . . . . . . . . . . . . . 33
Appendix A. Example Algorithms . . . . . . . . . . . . . . . . . 36 Appendix A. Example Algorithms . . . . . . . . . . . . . . . . . 36
A.1. Example Algorithm to Encode/Decode the AccECN Option . . 36 A.1. Example Algorithm to Encode/Decode the AccECN Option . . 36
A.2. Example Algorithm for Safety Against Long Sequences of A.2. Example Algorithm for Safety Against Long Sequences of
ACK Loss . . . . . . . . . . . . . . . . . . . . . . . . 37 ACK Loss . . . . . . . . . . . . . . . . . . . . . . . . 37
A.2.1. Safety Algorithm without the AccECN Option . . . . . 37 A.2.1. Safety Algorithm without the AccECN Option . . . . . 37
skipping to change at page 3, line 35 skipping to change at page 3, line 35
1. Introduction 1. Introduction
Explicit Congestion Notification (ECN) [RFC3168] is a mechanism where Explicit Congestion Notification (ECN) [RFC3168] is a mechanism where
network nodes can mark IP packets instead of dropping them to network nodes can mark IP packets instead of dropping them to
indicate incipient congestion to the end-points. Receivers with an indicate incipient congestion to the end-points. Receivers with an
ECN-capable transport protocol feed back this information to the ECN-capable transport protocol feed back this information to the
sender. ECN is specified for TCP in such a way that only one sender. ECN is specified for TCP in such a way that only one
feedback signal can be transmitted per Round-Trip Time (RTT). feedback signal can be transmitted per Round-Trip Time (RTT).
Recently, proposed mechanisms like Congestion Exposure (ConEx Recently, proposed mechanisms like Congestion Exposure (ConEx
[RFC7713]), DCTCP [RFC8257] or L4S [I-D.ietf-tsvwg-l4s-arch] need [RFC7713]), DCTCP [RFC8257] or L4S [I-D.ietf-tsvwg-l4s-arch] need
more accurate ECN feedback information whenever more than one marking more accurate ECN feedback information than provided by the feedback
is received in one RTT. A fuller treatment of the motivation for scheme as specified in [RFC3168] whenever more than one marking is
received in one RTT. This document specifies an alternative feedback
scheme that provides more accurate information and could be used by
these new TCP extensions. A fuller treatment of the motivation for
this specification is given in the associated requirements document this specification is given in the associated requirements document
[RFC7560]. [RFC7560].
This document specifies an experimental scheme for ECN feedback in This document specifies an experimental scheme for ECN feedback in
the TCP header to provide more than one feedback signal per RTT. It the TCP header to provide more than one feedback signal per RTT. It
will be called the more accurate ECN feedback scheme, or AccECN for will be called the more accurate ECN feedback scheme, or AccECN for
short. If AccECN progresses from experimental to the standards short. If AccECN progresses from experimental to the standards
track, it is intended to be a complete replacement for classic TCP/ track, it is intended to be a complete replacement for classic TCP/
ECN feedback, not a fork in the design of TCP. AccECN feedback ECN feedback, not a fork in the design of TCP. AccECN feedback
complements TCP's loss feedback and it supplements classic TCP/ECN complements TCP's loss feedback and it supplements classic TCP/ECN
feedback, so its applicability is intended to include all public and feedback, so its applicability is intended to include all public and
private IP networks (and even any non-IP networks over which TCP is private IP networks (and even any non-IP networks over which TCP is
used today), whether or not any nodes on the path support ECN of used today), whether or not any nodes on the path support ECN of
whatever flavour. whatever flavour.
Until the AccECN experiment succeeds, [RFC3168] will remain as the Until the AccECN experiment succeeds, [RFC3168] will remain as the
standards track specification for adding ECN to TCP. To avoid only standards track specification for adding ECN to TCP. To avoid
confusion, in this document we use the term 'classic ECN' for the confusion, in this document we use the term 'classic ECN' for the
pre-existing ECN specification [RFC3168]. pre-existing ECN specification [RFC3168].
AccECN feedback overloads flags and fields in the main TCP header AccECN feedback overloads the two existing ECN flags as well as the
with new definitions, so both ends have to support the new wire currently reserved and previously called NS flag in the main TCP
protocol before it can be used. Therefore during the TCP handshake header with new definitions, so both ends have to support the new
the two ends use the three ECN-related flags in the TCP header to wire protocol before it can be used. Therefore during the TCP
negotiate the most advanced feedback protocol that they can both handshake the two ends use the three ECN-related flags in the TCP
support. header to negotiate the most advanced feedback protocol that they can
both support.
AccECN is solely an (experimental) change to the TCP wire protocol; AccECN is solely an (experimental) change to the TCP wire protocol;
it only specifies the negotiation and signaling of more accurate ECN it only specifies the negotiation and signaling of more accurate ECN
feedback from a TCP Data Receiver to a Data Sender. It is completely feedback from a TCP Data Receiver to a Data Sender. It is completely
independent of how TCP might respond to congestion feedback, which is independent of how TCP might respond to congestion feedback, which is
out of scope. For that we refer to [RFC3168] or any RFC that out of scope. For that we refer to [RFC3168] or any RFC that
specifies a different response to TCP ECN feedback, for example: specifies a different response to TCP ECN feedback, for example:
[RFC8257]; or the ECN experiments referred to in [RFC8257]; or the ECN experiments referred to in [RFC8311], namely: a
[I-D.ietf-tsvwg-ecn-experimentation], namely: a TCP-based Low Latency TCP-based Low Latency Low Loss Scalable (L4S) congestion control
Low Loss Scalable (L4S) congestion control [I-D.ietf-tsvwg-l4s-arch]; [I-D.ietf-tsvwg-l4s-arch]; ECN-capable TCP control packets
ECN-capable TCP control packets [I-D.ietf-tcpm-generalized-ecn], or [I-D.ietf-tcpm-generalized-ecn], or Alternative Backoff with ECN
Alternative Backoff with ECN (ABE) (ABE) [I-D.ietf-tcpm-alternativebackoff-ecn].
[I-D.ietf-tcpm-alternativebackoff-ecn].
It is likely (but not required) that the AccECN protocol will be It is likely (but not required) that the AccECN protocol will be
implemented along with the following experimental additions to the implemented along with the following experimental additions to the
TCP-ECN protocol: ECN-capable TCP control packets and retransmissions TCP-ECN protocol: ECN-capable TCP control packets and retransmissions
[I-D.ietf-tcpm-generalized-ecn], which includes the ECN-capable SYN/ [I-D.ietf-tcpm-generalized-ecn], which includes the ECN-capable SYN/
ACK experiment [RFC5562]; and testing receiver non-compliance ACK experiment [RFC5562]; and testing receiver non-compliance
[I-D.moncaster-tcpm-rcv-cheat]. [I-D.moncaster-tcpm-rcv-cheat].
1.1. Document Roadmap 1.1. Document Roadmap
skipping to change at page 6, line 10 skipping to change at page 6, line 10
middleboxes), whether or not they comply with standards. middleboxes), whether or not they comply with standards.
Testing will mostly focus on fall-back strategies in case of Testing will mostly focus on fall-back strategies in case of
middlebox interference. Current recommended strategies are specified middlebox interference. Current recommended strategies are specified
in Sections 3.1.2, 3.2.3, 3.2.4 and 3.2.7. The effectiveness of in Sections 3.1.2, 3.2.3, 3.2.4 and 3.2.7. The effectiveness of
these strategies depends on the actual deployment situation of these strategies depends on the actual deployment situation of
middleboxes. Therefore experimental verification to confirm large- middleboxes. Therefore experimental verification to confirm large-
scale path traversal in the Internet is needed before finalizing this scale path traversal in the Internet is needed before finalizing this
specification on the Standards Track. specification on the Standards Track.
Another experimentation focus is the implementation feasibility of
change-triggered ACKs as described in Section 3.2.8. While on
average this should not lead to a higher ACK rate, it changes the ACK
pattern, which can especially have an impact on hardware offload.
Further experimentation is needed to advise whether this should be a
hard requirement or just preferred behavior.
1.4. Terminology 1.4. Terminology
AccECN: The more accurate ECN feedback scheme will be called AccECN AccECN: The more accurate ECN feedback scheme will be called AccECN
for short. for short.
Classic ECN: the ECN protocol specified in [RFC3168]. Classic ECN: the ECN protocol specified in [RFC3168].
Classic ECN feedback: the feedback aspect of the ECN protocol Classic ECN feedback: the feedback aspect of the ECN protocol
specified in [RFC3168], including generation, encoding, specified in [RFC3168], including generation, encoding,
transmission and decoding of feedback, but not the Data Sender's transmission and decoding of feedback, but not the Data Sender's
skipping to change at page 6, line 38 skipping to change at page 6, line 45
TCP server: The TCP stack that responds to a connection request. TCP server: The TCP stack that responds to a connection request.
Data Receiver: The endpoint of a TCP half-connection that receives Data Receiver: The endpoint of a TCP half-connection that receives
data and sends AccECN feedback. data and sends AccECN feedback.
Data Sender: The endpoint of a TCP half-connection that sends data Data Sender: The endpoint of a TCP half-connection that sends data
and receives AccECN feedback. and receives AccECN feedback.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119]. document are to be interpreted as described in BCP 14 [RFC2119]
[RFC8174] when, and only when, they appear in all capitals, as shown
here.
1.5. Recap of Existing ECN feedback in IP/TCP 1.5. Recap of Existing ECN feedback in IP/TCP
ECN [RFC3168] uses two bits in the IP header. Once ECN has been ECN [RFC3168] uses two bits in the IP header. Once ECN has been
negotiated with the receiver at the transport layer, an ECN sender negotiated with the receiver at the transport layer, an ECN sender
can set two possible codepoints (ECT(0) or ECT(1)) in the IP header can set two possible codepoints (ECT(0) or ECT(1)) in the IP header
to indicate an ECN-capable transport (ECT). If both ECN bits are to indicate an ECN-capable transport (ECT). If both ECN bits are
zero, the packet is considered to have been sent by a Not-ECN-capable zero, the packet is considered to have been sent by a Not-ECN-capable
Transport (Not-ECT). When a network node experiences congestion, it Transport (Not-ECT). When a network node experiences congestion, it
will occasionally either drop or mark a packet, with the choice will occasionally either drop or mark a packet, with the choice
skipping to change at page 7, line 34 skipping to change at page 7, line 47
Data Receiver starts to set the Echo Congestion Experienced (ECE) Data Receiver starts to set the Echo Congestion Experienced (ECE)
flag continuously in the TCP header of ACKs, which ensures the signal flag continuously in the TCP header of ACKs, which ensures the signal
is received reliably even if ACKs are lost. The TCP sender confirms is received reliably even if ACKs are lost. The TCP sender confirms
that it has received at least one ECE signal by responding with the that it has received at least one ECE signal by responding with the
congestion window reduced (CWR) flag, which allows the TCP receiver congestion window reduced (CWR) flag, which allows the TCP receiver
to stop repeating the ECN-Echo flag. This always leads to a full RTT to stop repeating the ECN-Echo flag. This always leads to a full RTT
of ACKs with ECE set. Thus any additional CE markings arriving of ACKs with ECE set. Thus any additional CE markings arriving
within this RTT cannot be fed back. within this RTT cannot be fed back.
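
The limitation described above can be illustrated with a small sketch. It is a simplified model, not text from either draft: it assumes one ACK per received packet (no delayed ACKs), and the function name and the (ce_marked, cwr_flag) packet representation are hypothetical.

```python
def classic_ecn_receiver(packets):
    """Simplified model of classic ECN feedback at the Data Receiver.

    packets: iterable of (ce_marked, cwr_flag) tuples, one per arriving
    data packet. Returns the ECE flag carried on the ACK for each packet.
    Assumption: one ACK per packet; delayed ACKs are ignored.
    """
    ece_latched = False
    acks = []
    for ce_marked, cwr_flag in packets:
        if cwr_flag:
            # The sender has confirmed the signal, so stop echoing.
            ece_latched = False
        if ce_marked:
            # Any CE mark (re)starts continuous echoing.
            ece_latched = True
        acks.append(ece_latched)
    return acks


# Two flights within one RTT: one CE mark versus two CE marks.
one_mark = [(False, False), (True, False), (False, False),
            (False, False), (False, False), (False, False)]
two_marks = [(False, False), (True, False), (False, False),
             (True, False), (False, False), (False, False)]

same_feedback = classic_ecn_receiver(one_mark) == classic_ecn_receiver(two_marks)
```

In this model both flights produce the same sequence of ECE flags, so the Data Sender cannot tell whether one or two packets were CE-marked before its CWR reaches the receiver.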
The last bit in byte 13 of the TCP header was defined as the Nonce The last bit in byte 13 of the TCP header was defined as the Nonce
Sum (NS) for the ECN Nonce [RFC3540]. RFC 3540 was never deployed so Sum (NS) for the ECN Nonce [RFC3540]. In the absence of widespread
it is being reclassified as historic, making this TCP flag available deployment RFC 3540 has been reclassified as historic [RFC8311] and
for use by the AccECN experiment instead. the respective flag has been marked as "reserved", making this TCP
flag available for use by the AccECN experiment instead.
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| | | N | C | E | U | A | P | R | S | F | | | | N | C | E | U | A | P | R | S | F |
| Header Length | Reserved | S | W | C | R | C | S | S | Y | I | | Header Length | Reserved | S | W | C | R | C | S | S | Y | I |
| | | | R | E | G | K | H | T | N | N | | | | | R | E | G | K | H | T | N | N |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
Figure 1: The (post-ECN Nonce) definition of the TCP header flags Figure 1: The (post-ECN Nonce) definition of the TCP header flags
skipping to change at page 10, line 26 skipping to change at page 10, line 36
their size has been chosen such that a whole cycle of the field would their size has been chosen such that a whole cycle of the field would
never occur between ACKs unless there had been an infeasibly long never occur between ACKs unless there had been an infeasibly long
sequence of ACK losses. Therefore, as long as the AccECN Option is sequence of ACK losses. Therefore, as long as the AccECN Option is
available, it can be treated as a dependable feedback channel. available, it can be treated as a dependable feedback channel.
If the AccECN Option is not available, e.g. it is being stripped by a If the AccECN Option is not available, e.g. it is being stripped by a
middlebox, the AccECN protocol will only feed back information on CE middlebox, the AccECN protocol will only feed back information on CE
markings (using the ACE field). Although not ideal, this will be markings (using the ACE field). Although not ideal, this will be
sufficient, because it is envisaged that neither ECT(0) nor ECT(1) sufficient, because it is envisaged that neither ECT(0) nor ECT(1)
will ever indicate more severe congestion than CE, even though future will ever indicate more severe congestion than CE, even though future
uses for ECT(0) or ECT(1) are still unclear uses for ECT(0) or ECT(1) are still unclear [RFC8311]. Because the
[I-D.ietf-tsvwg-ecn-experimentation]. Because the 3-bit ACE field is 3-bit ACE field is so small, when it is the only field available the
so small, when it is the only field available the Data Sender has to Data Sender has to interpret it conservatively assuming the worst
interpret it conservatively assuming the worst possible wrap. possible wrap.
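
As an illustration of why a 3-bit counter has to be read conservatively, the following sketch computes the smallest and the worst-case number of CE-marked packets consistent with two successive ACE values. It is only a sketch under stated assumptions: the function names and the newly_acked_pkts parameter are hypothetical, and the draft's own example algorithm is the one in Appendix A.2.

```python
ACE_BITS = 3
ACE_MOD = 1 << ACE_BITS  # the ACE field wraps every 8 CE-marked packets


def ace_delta(prev_ace, new_ace):
    """Smallest increase in the CE packet count consistent with the field."""
    return (new_ace - prev_ace) % ACE_MOD


def worst_case_ce_count(prev_ace, new_ace, newly_acked_pkts):
    """Largest CE count that matches the ACE change and does not exceed
    the number of packets newly acknowledged (hypothetical helper)."""
    d = ace_delta(prev_ace, new_ace)
    if newly_acked_pkts < d:
        # Assumption: never report fewer marks than the field itself shows.
        return d
    # Step down from newly_acked_pkts to the nearest value congruent to d mod 8.
    return newly_acked_pkts - ((newly_acked_pkts - d) % ACE_MOD)


minimum = ace_delta(6, 1)                    # counter went 6 -> 1
worst = worst_case_ce_count(6, 1, 12)        # 12 packets newly acknowledged
```

With the ACE field moving from 6 to 1, the minimum explanation is 3 newly marked packets, but if 12 packets were newly acknowledged the field may have wrapped once, so 11 marks cannot be ruled out.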
Certain specified events trigger the Data Receiver to include an Certain specified events trigger the Data Receiver to include an
AccECN Option on an ACK. The rules are designed to ensure that the AccECN Option on an ACK. The rules are designed to ensure that the
order in which different markings arrive at the receiver is order in which different markings arrive at the receiver is
communicated to the sender (as long as there is no ACK loss). communicated to the sender (as long as there is no ACK loss).
Implementations are encouraged to send an AccECN Option more Implementations are encouraged to send an AccECN Option more
frequently, but this is left up to the implementer. frequently, but this is left up to the implementer.
2.4. Feedback Metrics 2.4. Feedback Metrics
skipping to change at page 11, line 30 skipping to change at page 11, line 39
It is also useful to be able to rely on generic reflection behaviour It is also useful to be able to rely on generic reflection behaviour
when senders need to test for unexpected interference with markings when senders need to test for unexpected interference with markings
(for instance [I-D.kuehlewind-tcpm-ecn-fallback] and (for instance [I-D.kuehlewind-tcpm-ecn-fallback] and
[I-D.moncaster-tcpm-rcv-cheat]). [I-D.moncaster-tcpm-rcv-cheat]).
The initial SYN is the most critical control packet, so AccECN The initial SYN is the most critical control packet, so AccECN
provides feedback on whether it is CE marked. Although RFC 3168 provides feedback on whether it is CE marked. Although RFC 3168
prohibits an ECN-capable SYN, providing feedback of CE marking on the prohibits an ECN-capable SYN, providing feedback of CE marking on the
SYN supports future scenarios in which SYNs might be ECN-enabled SYN supports future scenarios in which SYNs might be ECN-enabled
(without prejudging whether they ought to be). For instance, (without prejudging whether they ought to be). For instance,
[I-D.ietf-tsvwg-ecn-experimentation] updates this aspect of RFC 3168 [RFC8311] updates this aspect of RFC 3168 to allow experimentation
to allow experimentation with ECN-capable TCP control packets. with ECN-capable TCP control packets.
Even if the TCP client (or server) has set the SYN (or SYN/ACK) to Even if the TCP client (or server) has set the SYN (or SYN/ACK) to
not-ECT in compliance with RFC 3168, feedback on the state of the ECN not-ECT in compliance with RFC 3168, feedback on the state of the ECN
field when it arrives at the receiver could still be useful, because field when it arrives at the receiver could still be useful, because
middleboxes have been known to overwrite the ECN IP field as if it is middleboxes have been known to overwrite the ECN IP field as if it is
still part of the old Type of Service (ToS) field [Mandalari18]. If still part of the old Type of Service (ToS) field [Mandalari18]. If
a TCP client has set the SYN to Not-ECT, but receives CE feedback, it a TCP client has set the SYN to Not-ECT, but receives CE feedback, it
can detect such middlebox interference and send Not-ECT for the rest can detect such middlebox interference and send Not-ECT for the rest
of the connection (see [I-D.kuehlewind-tcpm-ecn-fallback]). Today, of the connection (see [I-D.kuehlewind-tcpm-ecn-fallback]). Today,
if a TCP server receives ECT or CE on a SYN, it cannot know whether if a TCP server receives ECT or CE on a SYN, it cannot know whether
skipping to change at page 12, line 11 skipping to change at page 12, line 17
to feed back the received ECN field to the client, which then has all to feed back the received ECN field to the client, which then has all
the information to decide whether the connection has to fall-back the information to decide whether the connection has to fall-back
from supporting ECN (or not). from supporting ECN (or not).
3. AccECN Protocol Specification 3. AccECN Protocol Specification
3.1. Negotiating to use AccECN 3.1. Negotiating to use AccECN
3.1.1. Negotiation during the TCP handshake 3.1.1. Negotiation during the TCP handshake
Given the ECN Nonce [RFC3540] is being reclassified as historic, the Given the ECN Nonce [RFC3540] has been reclassified as historic
present specification renames the TCP flag at bit 7 of the TCP header [RFC8311], the present specification renames the TCP flag at bit 7 of
flags from NS (Nonce Sum) to AE (Accurate ECN) (see IANA the TCP header flags from NS (Nonce Sum) to AE (Accurate ECN) (see
Considerations in Section 6). IANA Considerations in Section 6).
During the TCP handshake at the start of a connection, to request During the TCP handshake at the start of a connection, to request
more accurate ECN feedback the TCP client (host A) MUST set the TCP more accurate ECN feedback the TCP client (host A) MUST set the TCP
flags AE=1, CWR=1 and ECE=1 in the initial SYN segment. flags AE=1, CWR=1 and ECE=1 in the initial SYN segment.
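
A minimal sketch of this flag setting follows. It assumes the flags are handled as the 16-bit word of TCP header bytes 12 and 13 shown in Figure 1, where bit 0 is the most significant bit, so the AE flag at bit 7 corresponds to mask 0x100. The function names are hypothetical and do not come from any TCP stack.

```python
# Masks within the 16-bit word formed by TCP header bytes 12 and 13.
AE = 0x100   # bit 7 in Figure 1 (formerly NS)
CWR = 0x080
ECE = 0x040
ACK = 0x010
SYN = 0x002

ACCECN_REQUEST = AE | CWR | ECE


def accecn_syn_flags():
    """Flags a TCP client sets on the initial SYN to request AccECN."""
    return SYN | ACCECN_REQUEST


def syn_requests_accecn(flags):
    """True if 'flags' is an initial SYN (no ACK) with AE=1, CWR=1, ECE=1."""
    is_initial_syn = bool(flags & SYN) and not (flags & ACK)
    return is_initial_syn and (flags & ACCECN_REQUEST) == ACCECN_REQUEST


request = accecn_syn_flags()
classic_request = SYN | CWR | ECE  # classic ECN setup SYN per RFC 3168
```

Under these assumptions syn_requests_accecn(request) is true, while the classic ECN setup SYN, which leaves the AE bit clear, is not treated as an AccECN request.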
If a TCP server (B) that is AccECN-enabled receives a SYN with the If a TCP server (B) that is AccECN-enabled receives a SYN with the
above three flags set, it MUST set both its half connections into above three flags set, it MUST set both its half connections into
AccECN mode. Then it MUST set the TCP flags on the SYN/ACK to one of AccECN mode. Then it MUST set the TCP flags on the SYN/ACK to one of
the 4 values shown in the top block of Table 2 to confirm that it the 4 values shown in the top block of Table 2 to confirm that it
supports AccECN. The TCP server MUST NOT set one of these 4 supports AccECN. The TCP server MUST NOT set one of these 4
skipping to change at page 14, line 14 skipping to change at page 14, line 21
4. The fourth block displays a combination labelled `Broken' . Some 4. The fourth block displays a combination labelled `Broken' . Some
older TCP server implementations incorrectly set the reserved older TCP server implementations incorrectly set the reserved
flags in the SYN/ACK by reflecting those in the SYN. Such broken flags in the SYN/ACK by reflecting those in the SYN. Such broken
TCP servers (B) cannot support ECN, so as soon as an AccECN- TCP servers (B) cannot support ECN, so as soon as an AccECN-
capable TCP client (A) receives such a broken SYN/ACK it MUST capable TCP client (A) receives such a broken SYN/ACK it MUST
fall-back to Not ECN mode for both its half connections. fall-back to Not ECN mode for both its half connections.
The following exceptional cases need some explanation: The following exceptional cases need some explanation:
ECN Nonce: An AccECN implementation, whether client or server, ECN Nonce: With AccECN implementation, there is no need for the ECN
sender or receiver, does not need to implement the ECN Nonce Nonce feedback mode [RFC3540], which has also been reclassified as
feedback mode [RFC3540], which is being reclassified as historic historic [RFC8311], as AccECN is compatible with an alternative
[I-D.ietf-tsvwg-ecn-experimentation]. AccECN is compatible with ECN feedback integrity approach that does not use up the ECT(1)
an alternative ECN feedback integrity approach that does not use codepoint and can be implemented solely at the sender (see
up the ECT(1) codepoint and can be implemented solely at the Section 4.3).
sender (see Section 4.3).
Simultaneous Open: An originating AccECN Host (A), having sent a SYN Simultaneous Open: An originating AccECN Host (A), having sent a SYN
with AE=1, CWR=1 and ECE=1, might receive another SYN from host B. with AE=1, CWR=1 and ECE=1, might receive another SYN from host B.
Host A MUST then enter the same feedback mode as it would have Host A MUST then enter the same feedback mode as it would have
entered had it been a responding host and received the same SYN. entered had it been a responding host and received the same SYN.
Then host A MUST send the same SYN/ACK as it would have sent had Then host A MUST send the same SYN/ACK as it would have sent had
it been a responding host. it been a responding host.
3.1.2. Retransmission of the SYN 3.1.2. Retransmission of the SYN
skipping to change at page 21, line 15 skipping to change at page 21, line 21
AccECN Option has to be sent (e.g. on the SYN/ACK to test the path), AccECN Option has to be sent (e.g. on the SYN/ACK to test the path),
but there is very limited space for the option. For initial but there is very limited space for the option. For initial
experiments, the Length field MUST be 2 greater to accommodate the experiments, the Length field MUST be 2 greater to accommodate the
16-bit magic number. 16-bit magic number.
All implementations of a Data Sender MUST be able to read in AccECN All implementations of a Data Sender MUST be able to read in AccECN
Options of any of the above lengths. If the AccECN Option is of any Options of any of the above lengths. If the AccECN Option is of any
other length, implementations MUST use those whole 3 octet fields other length, implementations MUST use those whole 3 octet fields
that fit within the length and ignore the remainder of the option. that fit within the length and ignore the remainder of the option.
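The length-tolerant reading rule above can be sketched as follows. This is a hedged illustration, not text from the draft: it assumes the caller has already stripped the Kind and Length octets (and the 16-bit magic number used for initial experiments), and the cap of three fields is an assumption expressed as the hypothetical parameter max_fields.

```python
from typing import List


def parse_accecn_fields(payload: bytes, max_fields: int = 3) -> List[int]:
    """Return the whole 3-octet fields found in an AccECN Option payload.

    `payload` is assumed to be the option content after Kind, Length and
    any experimental magic number have been removed. Trailing octets that
    do not make up a whole 3-octet field are ignored.
    """
    count = min(len(payload) // 3, max_fields)
    return [
        int.from_bytes(payload[3 * i:3 * i + 3], "big")  # network byte order
        for i in range(count)
    ]
```

For example, a 7-octet payload yields two fields and the seventh octet is ignored, which matches the requirement to use the whole fields that fit and discard the remainder.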
The use of the AccECN option is optional for the Data Receiver. If
the Data Receiver intends to use the AccECN option at any time during
the rest of the connection it is strongly recommended to also test its
path traversal by including it in the SYN/ACK as specified in the
next section. By default the use of the AccECN option is
RECOMMENDED.
3.2.7. Path Traversal of the AccECN Option 3.2.7. Path Traversal of the AccECN Option
3.2.7.1. Testing the AccECN Option during the Handshake 3.2.7.1. Testing the AccECN Option during the Handshake
The TCP client MUST NOT include the AccECN TCP Option on the SYN. The TCP client MUST NOT include the AccECN TCP Option on the SYN.
Nonetheless, if the AccECN negotiation using the ECN flags in the Nonetheless, if the AccECN negotiation using the ECN flags in the
main TCP header (Section 3.1) is successful, it implicitly declares main TCP header (Section 3.1) is successful, it implicitly declares
that the endpoints also support the AccECN TCP Option. A fall-back that the endpoints also support the AccECN TCP Option. A fall-back
strategy for the loss of the SYN (possibly due to middlebox strategy for the loss of the SYN (possibly due to middlebox
interference) is specified in Section 3.1.2. interference) is specified in Section 3.1.2.
A TCP server that confirms its support for AccECN (in response to an A TCP server that confirms its support for AccECN (in response to an
AccECN SYN from the client as described in Section 3.1) SHOULD also AccECN SYN from the client as described in Section 3.1) SHOULD
include an AccECN TCP Option in the SYN/ACK. include an AccECN TCP Option in the SYN/ACK.
A TCP client that has successfully negotiated AccECN SHOULD include A TCP client that has successfully negotiated AccECN SHOULD include
an AccECN Option in the first ACK at the end of the 3WHS. However, an AccECN Option in the first ACK at the end of the 3WHS. However,
this first ACK is not delivered reliably, so the TCP client SHOULD this first ACK is not delivered reliably, so the TCP client SHOULD
also include an AccECN Option on the first data segment it sends (if also include an AccECN Option on the first data segment it sends (if
it ever sends one). it ever sends one).
A host MAY NOT include an AccECN Option in any of these three cases A host MAY NOT include an AccECN Option in any of these three cases
if it has cached knowledge that the packet would be likely to be if it has cached knowledge that the packet would be likely to be
skipping to change at page 26, line 9 skipping to change at page 26, line 23
available in each ACK (in total and in the option space). available in each ACK (in total and in the option space).
Appendix A.3 gives an example algorithm to estimate the number of Appendix A.3 gives an example algorithm to estimate the number of
marked bytes from the ACE field alone, if the AccECN Option is not marked bytes from the ACE field alone, if the AccECN Option is not
available. available.
If a host has determined that segments with the AccECN Option always If a host has determined that segments with the AccECN Option always
seem to be discarded somewhere along the path, it is no longer seem to be discarded somewhere along the path, it is no longer
obliged to follow the above rules. obliged to follow the above rules.
3.3. AccECN Compliance by TCP Proxies, Offload Engines and other 3.3. Requirements for TCP Proxies, Offload Engines and other
Middleboxes Middleboxes on AccECN Compliance
A large class of middleboxes split TCP connections. Such a middlebox A large class of middleboxes split TCP connections. Such a middlebox
would be compliant with the AccECN protocol if the TCP implementation would be compliant with the AccECN protocol if the TCP implementation
on each side complied with the present AccECN specification and each on each side complied with the present AccECN specification and each
side negotiated AccECN independently of the other side. side negotiated AccECN independently of the other side.
Another large class of middleboxes intervenes to some degree at the Another large class of middleboxes intervenes to some degree at the
transport layer, but attempts to be transparent (invisible) to the transport layer, but attempts to be transparent (invisible) to the
end-to-end connection. A subset of this class of middleboxes end-to-end connection. A subset of this class of middleboxes
attempts to `normalise' the TCP wire protocol by checking that all attempts to `normalise' the TCP wire protocol by checking that all
values in header fields comply with a rather narrow interpretation of values in header fields comply with a rather narrow interpretation of
the TCP specifications. To comply with the present AccECN the TCP specifications. To comply with the present AccECN
specification, such a middlebox MUST NOT change the ACE field or the specification, such a middlebox MUST NOT change the ACE field or the
AccECN Option and it MUST attempt to preserve the timing of each ACK AccECN Option and it SHOULD preserve the timing of each ACK (for
(for example, if it coalesced ACKs it would not be AccECN-compliant). example, if it coalesced ACKs it would not be AccECN-compliant) as
A middlebox claiming to be transparent at the transport layer MUST these can be used by the Data Sender to infer further information
forward the AccECN TCP Option unaltered, whether or not the length about the path congestion level. A middlebox claiming to be
value matches one of those specified in Section 3.2.6, and whether or transparent at the transport layer MUST forward the AccECN TCP Option
not the initial values of the byte-counter fields are correct. This unaltered, whether or not the length value matches one of those
is because blocking apparently invalid values does not improve specified in Section 3.2.6, and whether or not the initial values of
security (because AccECN hosts are required to ignore invalid values the byte-counter fields are correct. This is because blocking
anyway), while it prevents the standardised set of values being apparently invalid values does not improve security (because AccECN
extended in future (because outdated normalisers would block updated hosts are required to ignore invalid values anyway), while it
hosts from using the extended AccECN standard). prevents the standardised set of values being extended in future
(because outdated normalisers would block updated hosts from using
the extended AccECN standard).
Hardware to offload certain TCP processing represents another large Hardware to offload certain TCP processing represents another large
class of middleboxes, even though it is often a function of a host's class of middleboxes, even though it is often a function of a host's
network interface and rarely in its own 'box'. Leeway has been network interface and rarely in its own 'box'. Leeway has been
allowed in the present AccECN specification in the expectation that allowed in the present AccECN specification in the expectation that
offload hardware could comply and still serve its function. offload hardware could comply and still serve its function.
Nonetheless, such hardware MUST attempt to preserve the timing of Nonetheless, such hardware SHOULD also preserve the timing of each
each ACK (for example, if it coalesced ACKs it would not be AccECN- ACK (for example, if it coalesced ACKs it would not be AccECN-
compliant). compliant).
4. Interaction with Other TCP Variants 4. Interaction with Other TCP Variants
This section is informative, not normative. This section is informative, not normative.
4.1. Compatibility with SYN Cookies 4.1. Compatibility with SYN Cookies
A TCP server can use SYN Cookies (see Appendix A of [RFC4987]) to A TCP server can use SYN Cookies (see Appendix A of [RFC4987]) to
protect itself from SYN flooding attacks. It places minimal commonly protect itself from SYN flooding attacks. It places minimal commonly
skipping to change at page 28, line 52 skipping to change at page 29, line 18
The AccECN fields are immutable end-to-end, so they are amenable The AccECN fields are immutable end-to-end, so they are amenable
to TCP-AO protection, which covers TCP options by default. to TCP-AO protection, which covers TCP options by default.
However, TCP-AO is often too brittle to use on many end-to-end However, TCP-AO is often too brittle to use on many end-to-end
paths, where middleboxes can make verification fail in their paths, where middleboxes can make verification fail in their
attempts to improve performance or security, e.g. by attempts to improve performance or security, e.g. by
resegmentation or shifting the sequence space. resegmentation or shifting the sequence space.
Originally the ECN Nonce [RFC3540] was proposed to ensure integrity Originally the ECN Nonce [RFC3540] was proposed to ensure integrity
of congestion feedback. With minor changes AccECN could be optimised of congestion feedback. With minor changes AccECN could be optimised
for the possibility that the ECT(1) codepoint might be used as an ECN for the possibility that the ECT(1) codepoint might be used as an ECN
Nonce . However, given RFC 3540 is being reclassified as historic, Nonce. However, given RFC 3540 has been reclassified as historic,
the AccECN design has been generalised so that it ought to be able to the AccECN design has been generalised so that it ought to be able to
support other possible uses of the ECT(1) codepoint, such as a lower support other possible uses of the ECT(1) codepoint, such as a lower
severity or a more instant congestion signal than CE. severity or a more instant congestion signal than CE.
5. Protocol Properties 5. Protocol Properties
This section is informative not normative. It describes how well the This section is informative not normative. It describes how well the
protocol satisfies the agreed requirements for a more accurate ECN protocol satisfies the agreed requirements for a more accurate ECN
feedback protocol [RFC7560]. feedback protocol [RFC7560].
skipping to change at page 31, line 9 skipping to change at page 31, line 23
Forward Compatibility: The behaviour of endpoints and middleboxes is Forward Compatibility: The behaviour of endpoints and middleboxes is
carefully defined for all reserved or currently unused codepoints carefully defined for all reserved or currently unused codepoints
in the scheme, to ensure that any blocking of anomalous values is in the scheme, to ensure that any blocking of anomalous values is
always at least under reversible policy control. always at least under reversible policy control.
6. IANA Considerations 6. IANA Considerations
This document reassigns bit 7 of the TCP header flags to the AccECN This document reassigns bit 7 of the TCP header flags to the AccECN
experiment. This bit was previously called the Nonce Sum (NS) flag experiment. This bit was previously called the Nonce Sum (NS) flag
[RFC3540], but RFC 3540 is being reclassified as historic [RFC3540], but RFC 3540 is being reclassified as historic [RFC8311].
[I-D.ietf-tsvwg-ecn-experimentation]. The flag will now be defined The flag will now be defined as:
as:
+-----+-------------------+-----------+ +-----+-------------------+-----------+
| Bit | Name | Reference | | Bit | Name | Reference |
+-----+-------------------+-----------+ +-----+-------------------+-----------+
| 7 | AE (Accurate ECN) | RFC XXXX | | 7 | AE (Accurate ECN) | RFC XXXX |
+-----+-------------------+-----------+ +-----+-------------------+-----------+
[TO BE REMOVED: This registration should take place at the following [TO BE REMOVED: This registration should take place at the following
location: https://www.iana.org/assignments/tcp-header-flags/tcp- location: https://www.iana.org/assignments/tcp-header-flags/tcp-
header-flags.xhtml#tcp-header-flags-1 ] header-flags.xhtml#tcp-header-flags-1 ]
skipping to change at page 33, line 33 skipping to change at page 33, line 43
<https://www.rfc-editor.org/info/rfc3168>. <https://www.rfc-editor.org/info/rfc3168>.
[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, Control", RFC 5681, DOI 10.17487/RFC5681, September 2009,
<https://www.rfc-editor.org/info/rfc5681>. <https://www.rfc-editor.org/info/rfc5681>.
[RFC6994] Touch, J., "Shared Use of Experimental TCP Options", [RFC6994] Touch, J., "Shared Use of Experimental TCP Options",
RFC 6994, DOI 10.17487/RFC6994, August 2013, RFC 6994, DOI 10.17487/RFC6994, August 2013,
<https://www.rfc-editor.org/info/rfc6994>. <https://www.rfc-editor.org/info/rfc6994>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>.
10.2. Informative References 10.2. Informative References
[I-D.ietf-tcpm-alternativebackoff-ecn] [I-D.ietf-tcpm-alternativebackoff-ecn]
Khademi, N., Welzl, M., Armitage, G., and G. Fairhurst, Khademi, N., Welzl, M., Armitage, G., and G. Fairhurst,
"TCP Alternative Backoff with ECN (ABE)", draft-ietf-tcpm- "TCP Alternative Backoff with ECN (ABE)", draft-ietf-tcpm-
alternativebackoff-ecn-03 (work in progress), October alternativebackoff-ecn-06 (work in progress), February
2017. 2018.
[I-D.ietf-tcpm-generalized-ecn] [I-D.ietf-tcpm-generalized-ecn]
Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit
Congestion Notification (ECN) to TCP Control Packets", Congestion Notification (ECN) to TCP Control Packets",
draft-ietf-tcpm-generalized-ecn-02 (work in progress), draft-ietf-tcpm-generalized-ecn-02 (work in progress),
October 2017. October 2017.
[I-D.ietf-tsvwg-ecn-experimentation]
Black, D., "Relaxing Restrictions on Explicit Congestion
Notification (ECN) Experimentation", draft-ietf-tsvwg-ecn-
experimentation-07 (work in progress), October 2017.
[I-D.ietf-tsvwg-l4s-arch] [I-D.ietf-tsvwg-l4s-arch]
Briscoe, B., Schepper, K., and M. Bagnulo, "Low Latency, Briscoe, B., Schepper, K., and M. Bagnulo, "Low Latency,
Low Loss, Scalable Throughput (L4S) Internet Service: Low Loss, Scalable Throughput (L4S) Internet Service:
Architecture", draft-ietf-tsvwg-l4s-arch-01 (work in Architecture", draft-ietf-tsvwg-l4s-arch-01 (work in
progress), October 2017. progress), October 2017.
[I-D.kuehlewind-tcpm-ecn-fallback] [I-D.kuehlewind-tcpm-ecn-fallback]
Kuehlewind, M. and B. Trammell, "A Mechanism for ECN Path Kuehlewind, M. and B. Trammell, "A Mechanism for ECN Path
Probing and Fallback", draft-kuehlewind-tcpm-ecn- Probing and Fallback", draft-kuehlewind-tcpm-ecn-
fallback-01 (work in progress), September 2013. fallback-01 (work in progress), September 2013.
skipping to change at page 36, line 5 skipping to change at page 35, line 40
[RFC7713] Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx) [RFC7713] Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx)
Concepts, Abstract Mechanism, and Requirements", RFC 7713, Concepts, Abstract Mechanism, and Requirements", RFC 7713,
DOI 10.17487/RFC7713, December 2015, DOI 10.17487/RFC7713, December 2015,
<https://www.rfc-editor.org/info/rfc7713>. <https://www.rfc-editor.org/info/rfc7713>.
[RFC8257] Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L., [RFC8257] Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L.,
and G. Judd, "Data Center TCP (DCTCP): TCP Congestion and G. Judd, "Data Center TCP (DCTCP): TCP Congestion
Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257, Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257,
October 2017, <https://www.rfc-editor.org/info/rfc8257>. October 2017, <https://www.rfc-editor.org/info/rfc8257>.
[RFC8311] Black, D., "Relaxing Restrictions on Explicit Congestion
Notification (ECN) Experimentation", RFC 8311,
DOI 10.17487/RFC8311, January 2018,
<https://www.rfc-editor.org/info/rfc8311>.
Appendix A. Example Algorithms Appendix A. Example Algorithms
This appendix is informative, not normative. It gives example This appendix is informative, not normative. It gives example
algorithms that would satisfy the normative requirements of the algorithms that would satisfy the normative requirements of the
AccECN protocol. However, implementers are free to choose other ways AccECN protocol. However, implementers are free to choose other ways
to implement the requirements. to implement the requirements.
A.1. Example Algorithm to Encode/Decode the AccECN Option A.1. Example Algorithm to Encode/Decode the AccECN Option
The example algorithms below show how a Data Receiver in AccECN mode The example algorithms below show how a Data Receiver in AccECN mode
 End of changes. 34 change blocks. 
78 lines changed or deleted 102 lines changed or added

This html diff was produced by rfcdiff 1.46. The latest version is available from http://tools.ietf.org/tools/rfcdiff/