draft-ietf-tcpm-accurate-ecn-06.txt   draft-ietf-tcpm-accurate-ecn-07.txt 
TCP Maintenance & Minor Extensions (tcpm) B. Briscoe TCP Maintenance & Minor Extensions (tcpm) B. Briscoe
Internet-Draft CableLabs Internet-Draft CableLabs
Intended status: Experimental M. Kuehlewind Intended status: Experimental M. Kuehlewind
Expires: September 6, 2018 ETH Zurich Expires: January 3, 2019 ETH Zurich
R. Scheffenegger R. Scheffenegger
March 5, 2018 July 2, 2018
More Accurate ECN Feedback in TCP More Accurate ECN Feedback in TCP
draft-ietf-tcpm-accurate-ecn-06 draft-ietf-tcpm-accurate-ecn-07
Abstract Abstract
Explicit Congestion Notification (ECN) is a mechanism where network Explicit Congestion Notification (ECN) is a mechanism where network
nodes can mark IP packets instead of dropping them to indicate nodes can mark IP packets instead of dropping them to indicate
incipient congestion to the end-points. Receivers with an ECN- incipient congestion to the end-points. Receivers with an ECN-
capable transport protocol feed back this information to the sender. capable transport protocol feed back this information to the sender.
ECN is specified for TCP in such a way that only one feedback signal ECN is specified for TCP in such a way that only one feedback signal
can be transmitted per Round-Trip Time (RTT). Recently, new TCP can be transmitted per Round-Trip Time (RTT). Recently, new TCP
mechanisms like Congestion Exposure (ConEx) or Data Center TCP mechanisms like Congestion Exposure (ConEx), Data Center TCP (DCTCP)
(DCTCP) need more accurate ECN feedback information whenever more or Low Latency Low Loss Scalable Throughput (L4S) need more accurate
than one marking is received in one RTT. This document specifies an ECN feedback information whenever more than one marking is received
experimental scheme to provide more than one feedback signal per RTT in one RTT. This document specifies an experimental scheme to
in the TCP header. Given TCP header space is scarce, it overloads provide more than one feedback signal per RTT in the TCP header.
the three existing ECN-related flags in the TCP header and provides Given TCP header space is scarce, it allocates a reserved header bit,
additional information in a new TCP option. that was previously used for the ECN-Nonce which has now been
declared historic. It also overloads the two existing ECN flags in
the TCP header. Supplementary feedback information can optionally be
provided in a new TCP option, which is never used on the TCP SYN.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 6, 2018. This Internet-Draft will expire on January 3, 2019.
Copyright Notice Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 40 skipping to change at page 2, line 40
2.2. Feedback Mechanism . . . . . . . . . . . . . . . . . . . 9 2.2. Feedback Mechanism . . . . . . . . . . . . . . . . . . . 9
2.3. Delayed ACKs and Resilience Against ACK Loss . . . . . . 10 2.3. Delayed ACKs and Resilience Against ACK Loss . . . . . . 10
2.4. Feedback Metrics . . . . . . . . . . . . . . . . . . . . 10 2.4. Feedback Metrics . . . . . . . . . . . . . . . . . . . . 10
2.5. Generic (Dumb) Reflector . . . . . . . . . . . . . . . . 11 2.5. Generic (Dumb) Reflector . . . . . . . . . . . . . . . . 11
3. AccECN Protocol Specification . . . . . . . . . . . . . . . . 12 3. AccECN Protocol Specification . . . . . . . . . . . . . . . . 12
3.1. Negotiating to use AccECN . . . . . . . . . . . . . . . . 12 3.1. Negotiating to use AccECN . . . . . . . . . . . . . . . . 12
3.1.1. Negotiation during the TCP handshake . . . . . . . . 12 3.1.1. Negotiation during the TCP handshake . . . . . . . . 12
3.1.2. Retransmission of the SYN . . . . . . . . . . . . . . 14 3.1.2. Retransmission of the SYN . . . . . . . . . . . . . . 14
3.2. AccECN Feedback . . . . . . . . . . . . . . . . . . . . . 15 3.2. AccECN Feedback . . . . . . . . . . . . . . . . . . . . . 15
3.2.1. Initialization of Feedback Counters at the Data 3.2.1. Initialization of Feedback Counters at the Data
Sender . . . . . . . . . . . . . . . . . . . . . . . 15 Sender . . . . . . . . . . . . . . . . . . . . . . . 16
3.2.2. The ACE Field . . . . . . . . . . . . . . . . . . . . 16 3.2.2. The ACE Field . . . . . . . . . . . . . . . . . . . . 16
3.2.3. Testing for Zeroing of the ACE Field . . . . . . . . 18 3.2.3. Testing for Zeroing of the ACE Field . . . . . . . . 18
3.2.4. Testing for Mangling of the IP/ECN Field . . . . . . 18 3.2.4. Testing for Mangling of the IP/ECN Field . . . . . . 18
3.2.5. Safety against Ambiguity of the ACE Field . . . . . . 19 3.2.5. Safety against Ambiguity of the ACE Field . . . . . . 19
3.2.6. The AccECN Option . . . . . . . . . . . . . . . . . . 20 3.2.6. The AccECN Option . . . . . . . . . . . . . . . . . . 20
3.2.7. Path Traversal of the AccECN Option . . . . . . . . . 21 3.2.7. Path Traversal of the AccECN Option . . . . . . . . . 21
3.2.8. Usage of the AccECN TCP Option . . . . . . . . . . . 24 3.2.8. Usage of the AccECN TCP Option . . . . . . . . . . . 25
3.3. Requirements for TCP Proxies, Offload Engines and other 3.3. Requirements for TCP Proxies, Offload Engines and other
Middleboxes on AccECN Compliance . . . . . . . . . . . . 26 Middleboxes on AccECN Compliance . . . . . . . . . . . . 26
4. Interaction with Other TCP Variants . . . . . . . . . . . . . 27 4. Interaction with Other TCP Variants . . . . . . . . . . . . . 27
4.1. Compatibility with SYN Cookies . . . . . . . . . . . . . 27 4.1. Compatibility with SYN Cookies . . . . . . . . . . . . . 27
4.2. Compatibility with Other TCP Options and Experiments . . 28 4.2. Compatibility with Other TCP Options and Experiments . . 28
4.3. Compatibility with Feedback Integrity Mechanisms . . . . 28 4.3. Compatibility with Feedback Integrity Mechanisms . . . . 28
5. Protocol Properties . . . . . . . . . . . . . . . . . . . . . 29 5. Protocol Properties . . . . . . . . . . . . . . . . . . . . . 29
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 31 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 31
7. Security Considerations . . . . . . . . . . . . . . . . . . . 32 7. Security Considerations . . . . . . . . . . . . . . . . . . . 32
8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 32 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 33
9. Comments Solicited . . . . . . . . . . . . . . . . . . . . . 33 9. Comments Solicited . . . . . . . . . . . . . . . . . . . . . 33
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 33 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 33
10.1. Normative References . . . . . . . . . . . . . . . . . . 33 10.1. Normative References . . . . . . . . . . . . . . . . . . 33
10.2. Informative References . . . . . . . . . . . . . . . . . 33 10.2. Informative References . . . . . . . . . . . . . . . . . 34
Appendix A. Example Algorithms . . . . . . . . . . . . . . . . . 36 Appendix A. Example Algorithms . . . . . . . . . . . . . . . . . 36
A.1. Example Algorithm to Encode/Decode the AccECN Option . . 36 A.1. Example Algorithm to Encode/Decode the AccECN Option . . 36
A.2. Example Algorithm for Safety Against Long Sequences of A.2. Example Algorithm for Safety Against Long Sequences of
ACK Loss . . . . . . . . . . . . . . . . . . . . . . . . 37 ACK Loss . . . . . . . . . . . . . . . . . . . . . . . . 37
A.2.1. Safety Algorithm without the AccECN Option . . . . . 37 A.2.1. Safety Algorithm without the AccECN Option . . . . . 37
A.2.2. Safety Algorithm with the AccECN Option . . . . . . . 39 A.2.2. Safety Algorithm with the AccECN Option . . . . . . . 39
A.3. Example Algorithm to Estimate Marked Bytes from Marked A.3. Example Algorithm to Estimate Marked Bytes from Marked
Packets . . . . . . . . . . . . . . . . . . . . . . . . . 40 Packets . . . . . . . . . . . . . . . . . . . . . . . . . 40
A.4. Example Algorithm to Beacon AccECN Options . . . . . . . 41 A.4. Example Algorithm to Beacon AccECN Options . . . . . . . 41
A.5. Example Algorithm to Count Not-ECT Bytes . . . . . . . . 42 A.5. Example Algorithm to Count Not-ECT Bytes . . . . . . . . 42
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 42 Appendix B. Rationale for Usage of TCP Header Flags . . . . . . 42
B.1. Three TCP Header Flags in the SYN-SYN/ACK Handshake . . . 42
B.2. Four Codepoints in the SYN/ACK . . . . . . . . . . . . . 43
B.3. Space for Future Evolution . . . . . . . . . . . . . . . 44
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 44
1. Introduction 1. Introduction
Explicit Congestion Notification (ECN) [RFC3168] is a mechanism where Explicit Congestion Notification (ECN) [RFC3168] is a mechanism where
network nodes can mark IP packets instead of dropping them to network nodes can mark IP packets instead of dropping them to
indicate incipient congestion to the end-points. Receivers with an indicate incipient congestion to the end-points. Receivers with an
ECN-capable transport protocol feed back this information to the ECN-capable transport protocol feed back this information to the
sender. ECN is specified for TCP in such a way that only one sender. ECN is specified for TCP in such a way that only one
feedback signal can be transmitted per Round-Trip Time (RTT). feedback signal can be transmitted per Round-Trip Time (RTT).
Recently, proposed mechanisms like Congestion Exposure (ConEx Recently, proposed mechanisms like Congestion Exposure (ConEx
[RFC7713]), DCTCP [RFC8257] or L4S [I-D.ietf-tsvwg-l4s-arch] need [RFC7713]), DCTCP [RFC8257] or L4S [I-D.ietf-tsvwg-l4s-arch] need to
more accurate ECN feedback information than provided by the feedback know when more than one marking is received in one RTT which is
scheme as specified in [RFC3168] whenever more than one marking is information that cannot be provided by the feedback scheme as
received in one RTT. This document specifies an alternative feedback specified in [RFC3168]. This document specifies an alternative
scheme that provides more accurate information and could be used by feedback scheme that provides more accurate information and could be
these new TCP extensions. A fuller treatment of the motivation for used by these new TCP extensions. A fuller treatment of the
this specification is given in the associated requirements document motivation for this specification is given in the associated
[RFC7560]. requirements document [RFC7560].
This document specifies an experimental scheme for ECN feedback in This document specifies an experimental scheme for ECN feedback in
the TCP header to provide more than one feedback signal per RTT. It the TCP header to provide more than one feedback signal per RTT. It
will be called the more accurate ECN feedback scheme, or AccECN for will be called the more accurate ECN feedback scheme, or AccECN for
short. If AccECN progresses from experimental to the standards short. If AccECN progresses from experimental to the standards
track, it is intended to be a complete replacement for classic TCP/ track, it is intended to be a complete replacement for classic TCP/
ECN feedback, not a fork in the design of TCP. AccECN feedback ECN feedback, not a fork in the design of TCP. AccECN feedback
complements TCP's loss feedback and it supplements classic TCP/ECN complements TCP's loss feedback and it supplements classic TCP/ECN
feedback, so its applicability is intended to include all public and feedback, so its applicability is intended to include all public and
private IP networks (and even any non-IP networks over which TCP is private IP networks (and even any non-IP networks over which TCP is
used today), whether or not any nodes on the path support ECN of used today), whether or not any nodes on the path support ECN of
whatever flavour. whatever flavour.
Until the AccECN experiment succeeds, [RFC3168] will remain as the Until the AccECN experiment succeeds, [RFC3168] will remain as the
only standards track specification for adding ECN to TCP. To avoid only standards track specification for adding ECN to TCP. To avoid
confusion, in this document we use the term 'classic ECN' for the confusion, in this document we use the term 'classic ECN' for the
pre-existing ECN specification [RFC3168]. pre-existing ECN specification [RFC3168].
AccECN feedback overloads the two existing ECN flags as well as the AccECN feedback overloads the two existing ECN flags and allocates
currently reserved and previously called NS flag in the main TCP the currently reserved flag (previously called NS) in the TCP header,
header with new definitions, so both ends have to support the new to be used as one field indicating the number of congestion
wire protocol before it can be used. Therefore during the TCP experienced marked packets. Given the new definitions of these three
handshake the two ends use the three ECN-related flags in the TCP bits, both ends have to support the new wire protocol before it can
header to negotiate the most advanced feedback protocol that they can be used. Therefore during the TCP handshake the two ends use these
both support. three bits in the TCP header to negotiate the most advanced feedback
protocol that they can both support, in a way that is backward
compatible with [RFC3168].
AccECN is solely an (experimental) change to the TCP wire protocol; AccECN is solely an (experimental) change to the TCP wire protocol;
it only specifies the negotiation and signaling of more accurate ECN it only specifies the negotiation and signaling of more accurate ECN
feedback from a TCP Data Receiver to a Data Sender. It is completely feedback from a TCP Data Receiver to a Data Sender. It is completely
independent of how TCP might respond to congestion feedback, which is independent of how TCP might respond to congestion feedback, which is
out of scope. For that we refer to [RFC3168] or any RFC that out of scope. For that we refer to [RFC3168] or any RFC that
specifies a different response to TCP ECN feedback, for example: specifies a different response to TCP ECN feedback, for example:
[RFC8257]; or the ECN experiments referred to in [RFC8311], namely: a [RFC8257]; or the ECN experiments referred to in [RFC8311], namely: a
TCP-based Low Latency Low Loss Scalable (L4S) congestion control TCP-based Low Latency Low Loss Scalable (L4S) congestion control
[I-D.ietf-tsvwg-l4s-arch]; ECN-capable TCP control packets [I-D.ietf-tsvwg-l4s-arch]; ECN-capable TCP control packets
skipping to change at page 5, line 40 skipping to change at page 5, line 46
TCP is critical to the robust functioning of the Internet, therefore TCP is critical to the robust functioning of the Internet, therefore
any proposed modifications to TCP need to be thoroughly tested. The any proposed modifications to TCP need to be thoroughly tested. The
present specification describes an experimental protocol that adds present specification describes an experimental protocol that adds
more accurate ECN feedback to the TCP protocol. The intention is to more accurate ECN feedback to the TCP protocol. The intention is to
specify the protocol sufficiently so that more than one specify the protocol sufficiently so that more than one
implementation can be built in order to test its function, robustness implementation can be built in order to test its function, robustness
and interoperability (with itself and with previous version of ECN and interoperability (with itself and with previous version of ECN
and TCP). and TCP).
The experimental protocol will be considered successful if it is The experimental protocol will be considered successful if testing
deployed and if it satisfies the requirements of [RFC7560] in the confirms that the proposed mechanism can be deployed at large scale.
consensus opinion of the IETF tcpm working group. In short, this
requires that it improves the accuracy and timeliness of TCP's ECN
feedback, as claimed in Section 5, while striking a balance between
the conflicting requirements of resilience, integrity and
minimisation of overhead. It also requires that it is not unduly
complex, and that it is compatible with prevalent equipment
behaviours in the current Internet (e.g. hardware offloading and
middleboxes), whether or not they comply with standards.
Testing will mostly focus on fall-back strategies in case of Testing will mostly focus on fall-back strategies in case of
middlebox interference. Current recommended strategies are specified middlebox interference. Current recommended strategies are specified
in Sections 3.1.2, 3.2.3, 3.2.4 and 3.2.7. The effectiveness of in Sections 3.1.2, 3.2.3, 3.2.4 and 3.2.7. The effectiveness of
these strategies depends on the actual deployment situation of these strategies depends on the actual deployment situation of
middleboxes. Therefore experimental verification to confirm large- middleboxes. Therefore experimental verification to confirm large-
scale path traversal in the Internet is needed before finalizing this scale path traversal in the Internet is needed before finalizing this
specification on the Standards Track. specification on the Standards Track.
Another experimentation focus is the implementation feasibility of Another experimentation focus is the implementation feasibility of
change-triggered ACKs as described in section 3.2.8. While on change-triggered ACKs as described in section 3.2.8. While on
average this should not lead to a higher ACK rate, it changes the ACK average this should not lead to a higher ACK rate, it changes the ACK
patter which especially can have an impact on hardware offload. pattern which can particularly have an impact on hardware offload.
Further experimentation is needed to advise if this should a hard It is currently specified as a hard requirement, because the sender
requirement or just prefer behavior. can exploit the predictability of the receiver's behaviour. However,
further experimentation is needed to advise if it will have to become
just preferred behavior.
1.4. Terminology 1.4. Terminology
AccECN: The more accurate ECN feedback scheme will be called AccECN AccECN: The more accurate ECN feedback scheme will be called AccECN
for short. for short.
Classic ECN: the ECN protocol specified in [RFC3168]. Classic ECN: the ECN protocol specified in [RFC3168].
Classic ECN feedback: the feedback aspect of the ECN protocol Classic ECN feedback: the feedback aspect of the ECN protocol
specified in [RFC3168], including generation, encoding, specified in [RFC3168], including generation, encoding,
skipping to change at page 11, line 47 skipping to change at page 11, line 47
SYN supports future scenarios in which SYNs might be ECN-enabled SYN supports future scenarios in which SYNs might be ECN-enabled
(without prejudging whether they ought to be). For instance, (without prejudging whether they ought to be). For instance,
[RFC8311] updates this aspect of RFC 3168 to allow experimentation [RFC8311] updates this aspect of RFC 3168 to allow experimentation
with ECN-capable TCP control packets. with ECN-capable TCP control packets.
Even if the TCP client (or server) has set the SYN (or SYN/ACK) to Even if the TCP client (or server) has set the SYN (or SYN/ACK) to
not-ECT in compliance with RFC 3168, feedback on the state of the ECN not-ECT in compliance with RFC 3168, feedback on the state of the ECN
field when it arrives at the receiver could still be useful, because field when it arrives at the receiver could still be useful, because
middleboxes have been known to overwrite the ECN IP field as if it is middleboxes have been known to overwrite the ECN IP field as if it is
still part of the old Type of Service (ToS) field [Mandalari18]. If still part of the old Type of Service (ToS) field [Mandalari18]. If
a TCP client has set the SYN to Not-ECT, but receives CE feedback, it a TCP client has set the SYN to Not-ECT, but receives feedback that
can detect such middlebox interference and send Not-ECT for the rest the ECN field on the SYN arrived with a different codepoint, it can
of the connection (see [I-D.kuehlewind-tcpm-ecn-fallback]). Today, detect such middlebox interference and send Not-ECT for the rest of
if a TCP server receives ECT or CE on a SYN, it cannot know whether the connection (see [I-D.kuehlewind-tcpm-ecn-fallback]). Today, if a
it is invalid (or valid) because only the TCP client knows whether it TCP server receives ECT or CE on a SYN, it cannot know whether it is
invalid (or valid) because only the TCP client knows whether it
originally marked the SYN as Not-ECT (or ECT). Therefore, prior to originally marked the SYN as Not-ECT (or ECT). Therefore, prior to
AccECN, the server's only safe course of action was to disable ECN AccECN, the server's only safe course of action was to disable ECN
for the connection. Instead, the AccECN protocol allows the server for the connection. Instead, the AccECN protocol allows the server
to feed back the received ECN field to the client, which then has all to feed back the received ECN field to the client, which then has all
the information to decide whether the connection has to fall-back the information to decide whether the connection has to fall-back
from supporting ECN (or not). from supporting ECN (or not).
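
As an illustration only, the client-side check described above can be sketched as follows. The codepoint values are those of the ECN field in RFC 3168; the helper name and the idea of receiving the fed-back codepoint as an integer are assumptions of this sketch, not part of the specification. It covers only the case where the client sent its SYN as Not-ECT.

```python
# IP-ECN codepoints (2-bit field) as defined in RFC 3168.
NOT_ECT = 0b00
ECT_1 = 0b01
ECT_0 = 0b10
CE = 0b11


def not_ect_syn_was_mangled(fed_back_codepoint: int) -> bool:
    """Hypothetical helper for a client that sent its SYN as Not-ECT.

    The server feeds back the ECN codepoint it saw on the SYN. Any value
    other than Not-ECT means something on the path rewrote the field, so
    the client would send Not-ECT for the rest of the connection.
    """
    if fed_back_codepoint not in (NOT_ECT, ECT_1, ECT_0, CE):
        raise ValueError("the ECN field is only 2 bits wide")
    return fed_back_codepoint != NOT_ECT
```

A client that sent the SYN as ECT would need a different test, because a transition to CE is then a legitimate congestion mark rather than interference.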
3. AccECN Protocol Specification 3. AccECN Protocol Specification
3.1. Negotiating to use AccECN 3.1. Negotiating to use AccECN
3.1.1. Negotiation during the TCP handshake 3.1.1. Negotiation during the TCP handshake
Given the ECN Nonce [RFC3540] has been reclassified as historic Given the ECN Nonce [RFC3540] has been reclassified as historic
[RFC8311], the present specification renames the TCP flag at bit 7 of [RFC8311], the present specification re-allocates the TCP flag at bit
the TCP header flags from NS (Nonce Sum) to AE (Accurate ECN) (see 7 of the TCP header, which was previously called NS (Nonce Sum), as
IANA Considerations in Section 6). the AE (Accurate ECN) flag (see IANA Considerations in Section 6).
During the TCP handshake at the start of a connection, to request During the TCP handshake at the start of a connection, to request
more accurate ECN feedback the TCP client (host A) MUST set the TCP more accurate ECN feedback the TCP client (host A) MUST set the TCP
flags AE=1, CWR=1 and ECE=1 in the initial SYN segment. flags AE=1, CWR=1 and ECE=1 in the initial SYN segment.
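
A minimal sketch of the flag setting described above, assuming the flags are handled as the 16-bit word formed by bytes 12 and 13 of the TCP header (data offset, reserved bits, flags), with bits numbered from the most significant end. The function names are hypothetical and chosen for this example only.

```python
# Masks within the 16-bit word formed by TCP header bytes 12-13.
# Bit numbering starts at the most significant bit of that word.
AE = 0x0100   # bit 7, previously called NS
CWR = 0x0080  # bit 8
ECE = 0x0040  # bit 9
ACK = 0x0010  # bit 11
SYN = 0x0002  # bit 14

ACCECN_REQUEST = AE | CWR | ECE


def accecn_syn_flags() -> int:
    """Flags a client would set on the initial SYN to request AccECN."""
    return SYN | ACCECN_REQUEST


def syn_requests_accecn(flags: int) -> bool:
    """True if an initial SYN (no ACK) has AE, CWR and ECE all set."""
    is_initial_syn = (flags & SYN) != 0 and (flags & ACK) == 0
    return is_initial_syn and (flags & ACCECN_REQUEST) == ACCECN_REQUEST
```

A classic ECN SYN sets only CWR and ECE, so the same check distinguishes it from an AccECN request.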
If a TCP server (B) that is AccECN-enabled receives a SYN with the If a TCP server (B) that is AccECN-enabled receives a SYN with the
above three flags set, it MUST set both its half connections into above three flags set, it MUST set both its half connections into
AccECN mode. Then it MUST set the TCP flags on the SYN/ACK to one of AccECN mode. Then it MUST set the TCP flags on the SYN/ACK to one of
the 4 values shown in the top block of Table 2 to confirm that it the 4 values shown in the top block of Table 2 to confirm that it
supports AccECN. The TCP server MUST NOT set one of these 4 supports AccECN. The TCP server MUST NOT set one of these 4
skipping to change at page 26, line 5 skipping to change at page 26, line 14
The following example series of arriving IP/ECN fields illustrates The following example series of arriving IP/ECN fields illustrates
when a Data Receiver will emit an ACK if it is using a delayed ACK when a Data Receiver will emit an ACK if it is using a delayed ACK
factor of 2 segments and change-triggered ACKs: 01 -> ACK, 01, 01 -> factor of 2 segments and change-triggered ACKs: 01 -> ACK, 01, 01 ->
ACK, 10 -> ACK, 10, 01 -> ACK, 01, 11 -> ACK, 01 -> ACK. ACK, 10 -> ACK, 10, 01 -> ACK, 01, 11 -> ACK, 01 -> ACK.
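
The example series above can be reproduced with a short simulation. This is a sketch under stated assumptions, not the normative rule: it compares the IP/ECN field of each data packet with that of the previous one (standing in for "a different byte counter was incremented"), it treats the very first packet as a change so that the first ACK in the example is produced, and it ignores the delayed ACK timer. The function name is hypothetical.

```python
def acks_emitted(ecn_fields, delack_factor=2):
    """Return one boolean per arriving data packet: True if an ACK is
    sent immediately after that packet.

    An ACK is sent when the ECN field differs from the previous packet's
    (change-triggered) or when delack_factor packets are unacknowledged.
    """
    decisions = []
    previous = None  # assumption: the first packet counts as a change
    unacked = 0
    for field in ecn_fields:
        unacked += 1
        changed = field != previous
        send_ack = changed or unacked >= delack_factor
        if send_ack:
            unacked = 0  # an ACK covers everything received so far
        decisions.append(send_ack)
        previous = field
    return decisions


arrivals = ["01", "01", "01", "10", "10", "01", "01", "11", "01"]
for field, acked in zip(arrivals, acks_emitted(arrivals)):
    print(field, "-> ACK" if acked else "")
```

With the nine arrivals listed, ACKs follow packets 1, 3, 4, 6, 8 and 9, which matches the series given in the text.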
For the avoidance of doubt, the change-triggered ACK mechanism is For the avoidance of doubt, the change-triggered ACK mechanism is
deliberately worded to ignore the arrival of a control packet with no deliberately worded to ignore the arrival of a control packet with no
payload, which therefore does not alter any byte counters, because it payload, which therefore does not alter any byte counters, because it
is important that TCP does not acknowledge pure ACKs. The change- is important that TCP does not acknowledge pure ACKs. The change-
triggered ACK approach will lead to some additional ACKs but it feeds triggered ACK approach can lead to some additional ACKs but it feeds
back the timing and the order in which ECN marks are received with back the timing and the order in which ECN marks are received with
minimal additional complexity. minimal additional complexity. If CE marks are infrequent, or
there are multiple marks in a row, the additional load will be low.
Other marking patterns could increase the load significantly.
Investigating the additional load is a goal of the proposed
experiment.
Implementation note: sending an AccECN Option each time a different Implementation note: sending an AccECN Option each time a different
counter changes and including a full-length AccECN Option on every counter changes and including a full-length AccECN Option on every
delayed ACK will satisfy the requirements described above and might delayed ACK will satisfy the requirements described above and might
be the easiest implementation, as long as sufficient space is be the easiest implementation, as long as sufficient space is
available in each ACK (in total and in the option space). available in each ACK (in total and in the option space).
Appendix A.3 gives an example algorithm to estimate the number of Appendix A.3 gives an example algorithm to estimate the number of
marked bytes from the ACE field alone, if the AccECN Option is not marked bytes from the ACE field alone, if the AccECN Option is not
available. available.
skipping to change at page 31, line 32 skipping to change at page 31, line 38
experiment. This bit was previously called the Nonce Sum (NS) flag experiment. This bit was previously called the Nonce Sum (NS) flag
[RFC3540], but RFC 3540 is being reclassified as historic [RFC8311]. [RFC3540], but RFC 3540 is being reclassified as historic [RFC8311].
The flag will now be defined as: The flag will now be defined as:
+-----+-------------------+-----------+ +-----+-------------------+-----------+
| Bit | Name | Reference | | Bit | Name | Reference |
+-----+-------------------+-----------+ +-----+-------------------+-----------+
| 7 | AE (Accurate ECN) | RFC XXXX | | 7 | AE (Accurate ECN) | RFC XXXX |
+-----+-------------------+-----------+ +-----+-------------------+-----------+
[TO BE REMOVED: This registration should take place at the following [TO BE REMOVED: IANA is requested to update the existing entry in the
location: https://www.iana.org/assignments/tcp-header-flags/tcp- Transmission Control Protocol (TCP) Header Flags registration
header-flags.xhtml#tcp-header-flags-1 ] (https://www.iana.org/assignments/tcp-header-flags/tcp-header-
flags.xhtml#tcp-header-flags-1) for Bit 7 to "AE (Accurate ECN),
previously used as NS (Nonce Sum) by [RFC3540], which is now Historic
[RFC8311]" and change the reference to this RFC-to-be instead of
RFC8311.]
This document also defines a new TCP option for AccECN, assigned a This document also defines a new TCP option for AccECN, assigned a
value of TBD1 (decimal) from the TCP option space. This value is value of TBD1 (decimal) from the TCP option space. This value is
defined as: defined as:
+------+--------+-----------------------+-----------+ +------+--------+-----------------------+-----------+
| Kind | Length | Meaning | Reference | | Kind | Length | Meaning | Reference |
+------+--------+-----------------------+-----------+ +------+--------+-----------------------+-----------+
| TBD1 | N | Accurate ECN (AccECN) | RFC XXXX | | TBD1 | N | Accurate ECN (AccECN) | RFC XXXX |
+------+--------+-----------------------+-----------+ +------+--------+-----------------------+-----------+
skipping to change at page 34, line 8 skipping to change at page 34, line 22
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>. May 2017, <https://www.rfc-editor.org/info/rfc8174>.
10.2. Informative References
[I-D.ietf-tcpm-alternativebackoff-ecn]
Khademi, N., Welzl, M., Armitage, G., and G. Fairhurst,
"TCP Alternative Backoff with ECN (ABE)", draft-ietf-tcpm-
alternativebackoff-ecn-07 (work in progress), March 2018.
[I-D.ietf-tcpm-generalized-ecn]
Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit
Congestion Notification (ECN) to TCP Control Packets",
draft-ietf-tcpm-generalized-ecn-02 (work in progress),
October 2017.
[I-D.ietf-tsvwg-l4s-arch]
Briscoe, B., Schepper, K., and M. Bagnulo, "Low Latency,
Low Loss, Scalable Throughput (L4S) Internet Service:
Architecture", draft-ietf-tsvwg-l4s-arch-02 (work in
progress), March 2018.
[I-D.kuehlewind-tcpm-ecn-fallback]
Kuehlewind, M. and B. Trammell, "A Mechanism for ECN Path
Probing and Fallback", draft-kuehlewind-tcpm-ecn-
fallback-01 (work in progress), September 2013.
[I-D.moncaster-tcpm-rcv-cheat]
Moncaster, T., Briscoe, B., and A. Jacquet, "A TCP Test to
Allow Senders to Identify Receiver Non-Compliance", draft-
moncaster-tcpm-rcv-cheat-03 (work in progress), July 2014.
skipping to change at page 42, line 37
under-counting.
However, such precision is unlikely to be necessary. The only known
use of a count of Not-ECT marked bytes is to test whether equipment
on the path is clearing the ECN field (perhaps due to an out-dated
attempt to clear, or bleach, what used to be the ToS field). To
detect bleaching it will be sufficient to detect whether nearly all
bytes arrive marked as Not-ECT. Therefore there should be no need to
keep track of the details of retransmissions.
Appendix B. Rationale for Usage of TCP Header Flags
B.1. Three TCP Header Flags in the SYN-SYN/ACK Handshake
AccECN uses a rather unorthodox but justified approach to negotiate
the highest version TCP ECN feedback scheme that both ends support.
It follows from the original TCP ECN capability negotiation
[RFC3168], in which the client set the 2 least significant reserved
flags in the TCP header, and fell back to no ECN support if the
server responded with the 2 flags cleared, which had previously been
the default. It is not recorded why ECN originally used this
approach instead of the more orthodox use of a TCP option.
In order to be backward compatible with RFC 3168, AccECN continues
this approach, using the 3rd least significant TCP header flag that
had previously been allocated for the ECN nonce (now historic).
Then, whatever form of server an AccECN client encounters, the
connection can fall back to the highest version of feedback protocol
that both ends support, as explained in Section 3.1.
If AccECN had used the more orthodox approach of a TCP option, it
would still have had to set the two ECN flags in the main TCP header,
in order to be able to fall back to Classic RFC 3168 ECN, or to
disable ECN support, without another round of negotiation. Then
AccECN would also have had to handle all the different ways that
servers currently respond to settings of the ECN flags in the main
TCP header, including all the conflicting cases where a server might
have said it supported one approach in the flags and another approach
in the new TCP option. And AccECN would have had to deal with all
the additional possibilities where a middlebox might have mangled the
ECN flags, or removed the TCP option. Thus, usage of the 3rd
reserved TCP header flag simplified the protocol.
The third flag was used in a way that could be distinguished from the
ECN nonce, in case any nonce deployment was encountered. Previous
usage of this flag for the ECN nonce was integrated into the original
ECN negotiation. This further justified the 3rd flag's use for
AccECN, because a non-ECN usage of this flag would have had to use it
as a separate single bit, rather than in combination with the other 2
ECN flags.
Indeed, having overloaded the original uses of these three flags for
its handshake, AccECN overloads all three bits again as a 3-bit
counter.
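As an illustration of the three flags discussed above, the following minimal Python sketch extracts them from a raw TCP header and packs them into one 3-bit value in the order (AE, CWR, ECE). The function name ecn_flags is hypothetical. The bit positions assume the standard TCP header layout, in which the former nonce (NS) flag is the least significant bit of byte 12 and CWR and ECE are the two most significant bits of byte 13.

```python
def ecn_flags(tcp_header: bytes) -> int:
    """Return the three ECN-related TCP flags as a 3-bit integer.

    Bit order of the result is (AE, CWR, ECE), most significant first.
    Assumes tcp_header starts at the first byte of the TCP header.
    """
    if len(tcp_header) < 14:
        raise ValueError("need at least 14 bytes of TCP header")
    ae = tcp_header[12] & 0x01           # former NS flag, low bit of byte 12
    cwr = (tcp_header[13] >> 7) & 0x01   # top bit of byte 13
    ece = (tcp_header[13] >> 6) & 0x01   # next bit of byte 13
    return (ae << 2) | (cwr << 1) | ece


# Hypothetical 20-byte header with data offset 5 and AE, CWR, ECE, SYN set.
syn = bytearray(20)
syn[12] = 0x51   # data offset 5 in the high nibble, AE in the low bit
syn[13] = 0xC2   # CWR | ECE | SYN
print(format(ecn_flags(bytes(syn)), "03b"))
```

The same 3-bit value serves both readings described above: a negotiation codepoint during the handshake and a counter afterwards.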
B.2. Four Codepoints in the SYN/ACK
Of the 8 possible codepoints that the 3 TCP header flags can indicate
on the SYN/ACK, 4 already indicated earlier (or broken) versions of
ECN support. In the early design of AccECN, an AccECN server could
use only 2 of the 4 remaining codepoints. They both indicated AccECN
support, but one fed back that the SYN had arrived marked as CE.
Even though ECN support on a SYN is not yet on the standards track,
the idea is for either end to act as a dumb reflector, so that future
capabilities can be unilaterally deployed without requiring 2-ended
deployment (justified in Section 2.5).
During traversal testing it was discovered that the ECN field in the
SYN was mangled on a non-negligible proportion of paths. Therefore
it was necessary to allow the SYN/ACK to feed all four IP/ECN
codepoints that the SYN could arrive with back to the client.
Without this, the client could not know whether to disable ECN for
the connection due to mangling of the IP/ECN field (also explained in
Section 2.5). This development consumed the remaining 2 codepoints
on the SYN/ACK that had been reserved for future use by AccECN in
earlier versions.
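The split of the 8 SYN/ACK codepoints can be sketched as a lookup table. This appendix does not list the individual values, so the assignments below are an assumption taken from the negotiation table in Section 3.1 and should be checked against it. The names SYNACK_FEEDBACK, EARLIER_VERSIONS and interpret_synack are hypothetical.

```python
# Codepoints are (AE, CWR, ECE) on the SYN/ACK, written as 3-bit integers.
# ASSUMPTION: value assignments recalled from the negotiation table in
# Section 3.1; they are not stated in this appendix.

# The 4 codepoints that feed back the IP/ECN field the SYN arrived with.
SYNACK_FEEDBACK = {
    0b010: "Not-ECT",
    0b011: "ECT(1)",
    0b100: "ECT(0)",
    0b110: "CE",
}

# The 4 codepoints that already indicated earlier (or broken) versions.
EARLIER_VERSIONS = {
    0b000: "no ECN support",
    0b001: "Classic RFC 3168 ECN",
    0b101: "ECN nonce",
    0b111: "broken (flags reflected)",
}


def interpret_synack(flags: int) -> tuple[str, str]:
    """Classify the 3-bit flag value seen by an AccECN client on a SYN/ACK."""
    if flags in SYNACK_FEEDBACK:
        return ("AccECN", SYNACK_FEEDBACK[flags])
    return ("not AccECN", EARLIER_VERSIONS[flags])


print(interpret_synack(0b110))
```

With all four IP/ECN codepoints fed back, the client can compare what it sent on the SYN with what arrived and decide whether the field was mangled on the path.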
B.3. Space for Future Evolution
Despite availability of usable TCP header space being extremely
scarce, the AccECN protocol has taken all possible steps to ensure
that there is space to negotiate possible future variants of the
protocol, either if the experiment proves that a variant of AccECN is
required, or if a completely different ECN feedback approach is
needed:
Future AccECN variants: The requirement not to reject unexpected
initial values of the ACE counter (in the main TCP header) in the
last para of Section 3.2.3 ensures that 5 unused codepoints on the
final ACK of the 3-way handshake and 7 unused values on the first
data packet from the server could be used to negotiate future
variants of the AccECN protocol between the endpoints. Also, a
similar requirement not to reject unexpected initial values in the
TCP option is for the same purpose. If traversal of the TCP
option were reliable, this would have enabled a far wider range of
future variation.
Future non-AccECN variants: Five codepoints out of the 8 possible in
the 3 TCP header flags used by AccECN are unused on the initial
SYN (in the order AE,CWR,ECE): 001, 010, 100, 101, 110. All
possible combinations of SYN/ACK could be used in response except
000 and reflection of the same values sent on the SYN. These
would not allow fall-back to Classic ECN support for a server that
did not understand them, but they are available, perhaps for uses
other than ECN in future.
Of course, other ways could be resorted to in order to extend
AccECN or ECN in future, although their traversal properties are
likely to be inferior. They include a new TCP option; using the
remaining reserved flags in the main TCP header (preferably
extending the 3-bit combinations used by AccECN to 4-bit
combinations, rather than burning one bit for just one state); a
non-zero urgent pointer in combination with the URG flag cleared;
or some other unexpected combination of fields yet to be invented.
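The codepoint arithmetic in the "Future non-AccECN variants" item can be checked with a short Python sketch. The set USED_ON_SYN is simply the complement of the five values listed above. The helper names unused_syn_codepoints and candidate_synack_responses are hypothetical.

```python
# Codepoints are (AE, CWR, ECE) written as 3-bit integers.
# USED_ON_SYN is inferred as the complement of the five unused values
# listed in the text: 001, 010, 100, 101, 110.
USED_ON_SYN = {0b000, 0b011, 0b111}


def unused_syn_codepoints() -> list[int]:
    """Codepoints of the 3 flags that are unused on the initial SYN."""
    return [c for c in range(8) if c not in USED_ON_SYN]


def candidate_synack_responses(syn: int) -> list[int]:
    """SYN/ACK values usable in response to a given SYN codepoint.

    Per the text, everything except 000 and a plain reflection of the
    value sent on the SYN.
    """
    return [c for c in range(8) if c != 0b000 and c != syn]


for syn in unused_syn_codepoints():
    responses = [format(c, "03b") for c in candidate_synack_responses(syn)]
    print(format(syn, "03b"), "->", " ".join(responses))
```

Each of the five unused SYN codepoints therefore leaves six possible SYN/ACK responses under the two exclusions stated in the text.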
Authors' Addresses
Bob Briscoe
CableLabs
UK
EMail: ietf@bobbriscoe.net
URI: http://bobbriscoe.net/
Mirja Kuehlewind
ETH Zurich
Zurich