draft-ietf-tsvwg-l4s-arch-13.txt
Transport Area Working Group                                 B. Briscoe, Ed.
Internet-Draft                                                   Independent
Intended status: Informational                                K. De Schepper
Expires: 11 May 2022                                         Nokia Bell Labs
                                                            M. Bagnulo Braun
                                            Universidad Carlos III de Madrid
                                                                    G. White
                                                                   CableLabs
                                                             7 November 2021

   Low Latency, Low Loss, Scalable Throughput (L4S) Internet Service:
                               Architecture
                       draft-ietf-tsvwg-l4s-arch-13

Abstract
   This document describes the L4S architecture, which enables Internet
   applications to achieve Low queuing Latency, Low Loss, and Scalable
   throughput (L4S).  The insight on which L4S is based is that the root
   cause of queuing delay is in the congestion controllers of senders,
   not in the queue itself.  With the L4S architecture all Internet
   applications could (but do not have to) transition away from
   congestion control algorithms that cause substantial queuing delay,
   to a new class of congestion controls that induce very little
   queuing, aided by explicit congestion signalling from the network.
   This new class of congestion controls can provide low latency for
   capacity-seeking flows, so applications can achieve both high
   bandwidth and low latency.

   The architecture primarily concerns incremental deployment.  It
   defines mechanisms that allow the new class of L4S congestion
   controls to coexist with 'Classic' congestion controls in a shared
   network.  These mechanisms aim to ensure that the latency and
   throughput performance using an L4S-compliant congestion controller
   is usually much better (and rarely worse) than performance would have
   been using a 'Classic' congestion controller, and that competing
   flows continuing to use 'Classic' controllers are typically not
   impacted by the presence of L4S.  These characteristics are important
   to encourage adoption of L4S congestion control algorithms and L4S
   compliant network elements.

   The L4S architecture consists of three components: network support to
   isolate L4S traffic from classic traffic; protocol features that
   allow network elements to identify L4S traffic; and host support for
   L4S congestion controls.
skipping to change at page 2, line 20
   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 11 May 2022.
Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
skipping to change at page 2, line 48
   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
     1.1.  Document Roadmap  . . . . . . . . . . . . . . . . . . . .   5
   2.  L4S Architecture Overview . . . . . . . . . . . . . . . . . .   5
   3.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   7
   4.  L4S Architecture Components . . . . . . . . . . . . . . . . .   9
     4.1.  Protocol Mechanisms . . . . . . . . . . . . . . . . . . .   9
     4.2.  Network Components  . . . . . . . . . . . . . . . . . . .  10
     4.3.  Host Mechanisms . . . . . . . . . . . . . . . . . . . . .  13
   5.  Rationale . . . . . . . . . . . . . . . . . . . . . . . . . .  14
     5.1.  Why These Primary Components? . . . . . . . . . . . . . .  14
     5.2.  What L4S adds to Existing Approaches  . . . . . . . . . .  18
   6.  Applicability . . . . . . . . . . . . . . . . . . . . . . . .  21
     6.1.  Applications  . . . . . . . . . . . . . . . . . . . . . .  21
     6.2.  Use Cases . . . . . . . . . . . . . . . . . . . . . . . .  22
     6.3.  Applicability with Specific Link Technologies . . . . . .  23
     6.4.  Deployment Considerations . . . . . . . . . . . . . . . .  24
       6.4.1.  Deployment Topology . . . . . . . . . . . . . . . . .  24
       6.4.2.  Deployment Sequences  . . . . . . . . . . . . . . . .  26
       6.4.3.  L4S Flow but Non-ECN Bottleneck . . . . . . . . . . .  28
       6.4.4.  L4S Flow but Classic ECN Bottleneck . . . . . . . . .  29
       6.4.5.  L4S AQM Deployment within Tunnels . . . . . . . . . .  30
   7.  IANA Considerations (to be removed by RFC Editor) . . . . . .  30
   8.  Security Considerations . . . . . . . . . . . . . . . . . . .  30
     8.1.  Traffic Rate (Non-)Policing . . . . . . . . . . . . . . .  30
     8.2.  'Latency Friendliness'  . . . . . . . . . . . . . . . . .  31
     8.3.  Interaction between Rate Policing and L4S . . . . . . . .  33
     8.4.  ECN Integrity . . . . . . . . . . . . . . . . . . . . . .  34
     8.5.  Privacy Considerations  . . . . . . . . . . . . . . . . .  34
   9.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  35
   10. Informative References  . . . . . . . . . . . . . . . . . . .  35
   Appendix A.  Standardization items  . . . . . . . . . . . . . . .  45
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  48
1.  Introduction
   At any one time, it is increasingly common for all of the traffic in
   a bottleneck link (e.g. a household's Internet access) to come from
   applications that prefer low delay: interactive Web, Web services,
   voice, conversational video, interactive video, interactive remote
   presence, instant messaging, online gaming, remote desktop, cloud-
   based applications and video-assisted remote control of machinery and
   industrial processes.  In the last decade or so, much has been done
   to reduce propagation delay by placing caches or servers closer to
   users.  However, queuing remains a major, albeit intermittent,
   component of latency.  For instance spikes of hundreds of
   milliseconds are not uncommon, even with state-of-the-art active
   queue management (AQM) [COBALT], [DOCSIS3AQM].  Queuing in access
   network bottlenecks is typically configured to cause overall network
   delay to roughly double during a long-running flow, relative to
   expected base (unloaded) path delay [BufferSize].  Low loss is also
   important because, for interactive applications, losses translate
   into even longer retransmission delays.
   It has been demonstrated that, once access network bit rates reach
   levels now common in the developed world, increasing capacity offers
   diminishing returns if latency (delay) is not addressed
   [Dukkipati15], [Rajiullah15].  Therefore, the goal is an Internet
   service with very Low queueing Latency, very Low Loss and Scalable
   throughput (L4S).  Very low queuing latency means less than
   1 millisecond (ms) on average and less than about 2 ms at the 99th
   percentile.  This document describes the L4S architecture for
   achieving these goals.
   Differentiated services (Diffserv) offers Expedited Forwarding
   (EF [RFC3246]) for some packets at the expense of others, but this
   makes no difference when all (or most) of the traffic at a bottleneck
   at any one time requires low latency.  In contrast, L4S still works
   well when all traffic is L4S - a service that gives without taking
   needs none of the configuration or management baggage (traffic
   policing, traffic contracts) associated with favouring some traffic
   flows over others.
   Queuing delay degrades performance intermittently [Hohlfeld14].  It
   occurs when a large enough capacity-seeking (e.g. TCP) flow is
   running alongside the user's traffic in the bottleneck link, which is
   typically in the access network.  Or when the low latency application
   is itself a large capacity-seeking or adaptive rate (e.g. interactive
   video) flow.  At these times, the performance improvement from L4S
skipping to change at page 9, line 39
   a.  An essential aspect of a scalable congestion control is the use
       of explicit congestion signals.  'Classic' ECN [RFC3168] requires
       an ECN signal to be treated as equivalent to drop, both when it
       is generated in the network and when it is responded to by hosts.
       L4S needs networks and hosts to support a more fine-grained
       meaning for each ECN signal that is less severe than a drop, so
       that the L4S signals:

       *  can be much more frequent;

       *  can be signalled immediately, without the significant delay
          required to smooth out fluctuations in the queue.

       To enable L4S, the standards track [RFC3168] has had to be
       updated to allow L4S packets to depart from the 'equivalent to
       drop' constraint.  [RFC8311] is a standards track update to relax
       specific requirements in RFC 3168 (and certain other standards
       track RFCs), which clears the way for the experimental changes
       proposed for L4S.  [RFC8311] also reclassifies the original
       experimental assignment of the ECT(1) codepoint as an ECN
       nonce [RFC3540] as historic.
   b.  [I-D.ietf-tsvwg-ecn-l4s-id] specifies that ECT(1) is used as the
       identifier to classify L4S packets into a separate treatment from
       Classic packets.  This satisfies the requirement for identifying
       an alternative ECN treatment in [RFC4774].
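       As a rough illustration (this sketch is not part of the
       specification; only the two-bit ECN codepoint values from
       RFC 3168 are standard, and the function name is invented here), a
       classifier that steers ECT(1) and CE packets into the L4S
       treatment could look like:

```python
# Illustrative sketch: classify packets into the L4S or Classic
# treatment by the 2-bit ECN field in the IP header (RFC 3168 values).
NOT_ECT = 0b00  # Not ECN-Capable Transport
ECT1    = 0b01  # ECT(1): the L4S identifier
ECT0    = 0b10  # ECT(0): 'Classic' ECN-capable
CE      = 0b11  # Congestion Experienced

def is_l4s(ecn_field: int) -> bool:
    """ECT(1) identifies L4S; CE is also classified into the L4S
    queue, even though it can have been set by a Classic AQM earlier
    on the path (the rare misclassification case discussed in this
    section)."""
    return ecn_field in (ECT1, CE)
```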
       The CE codepoint is used to indicate Congestion Experienced by
       both L4S and Classic treatments.  This raises the concern that a
       Classic AQM earlier on the path might have marked some ECT(0)
       packets as CE.  Then these packets will be erroneously classified
       into the L4S queue.  Appendix B of [I-D.ietf-tsvwg-ecn-l4s-id]
       explains why five unlikely eventualities all have to coincide for
       this to have any detrimental effect, which even then would only
       involve a vanishingly small likelihood of a spurious
skipping to change at page 10, line 42
   The L4S architecture aims to provide low latency without the _need_
   for per-flow operations in network components.  Nonetheless, the
   architecture does not preclude per-flow solutions.  The following
   bullets describe the known arrangements: a) the DualQ Coupled AQM
   with an L4S AQM in one queue coupled from a Classic AQM in the other;
   b) Per-Flow Queues with an instance of a Classic and an L4S AQM in
   each queue; c) Dual queues with per-flow AQMs, but no per-flow
   queues:
   a.  The Dual Queue Coupled AQM (illustrated in Figure 1) achieves the
       'semi-permeable' membrane property mentioned earlier as follows:

       *  Latency isolation: Two separate queues are used to isolate L4S
          queuing delay from the larger queue that Classic traffic needs
          to maintain full utilization.

       *  Bandwidth pooling: The two queues act as if they are a single
          pool of bandwidth in which flows of either type get roughly
          equal throughput without the scheduler needing to identify any
          flows.  This is achieved by having an AQM in each queue, but
          the Classic AQM provides a congestion signal to both queues in
          a manner that ensures a consistent response from the two
          classes of congestion control.  Specifically, the Classic AQM
          generates a drop/mark probability based on congestion in its
          own queue, which it uses both to drop/mark packets in its own
          queue and to affect the marking probability in the L4S queue.
          The strength of the coupling of the congestion signalling
          between the two queues is enough to make the L4S flows slow
          down to leave the right amount of capacity for the Classic
          flows (as they would if they were the same type of traffic
          sharing the same queue).

       Then the scheduler can serve the L4S queue with priority (denoted
       by the '1' on the higher priority input), because the L4S traffic
       isn't offering up enough traffic to use all the priority that it
       is given.  Therefore:

       *  for latency isolation on short time-scales (sub-round-trip)
          the prioritization of the L4S queue protects its low latency
          by allowing bursts to dissipate quickly;

       *  but for bandwidth pooling on longer time-scales (round-trip
          and longer) the Classic queue creates an equal and opposite
          pressure against the L4S traffic to ensure that neither has
          priority when it comes to bandwidth - the tension between
          prioritizing L4S and coupling the marking from the Classic AQM
          results in approximate per-flow fairness.

       To protect against unresponsive traffic taking advantage of the
       prioritization of the L4S queue and starving the Classic queue,
       it is advisable for the priority to be conditional, not strict
       (see Appendix A of [I-D.ietf-tsvwg-aqm-dualq-coupled]).

       When there is no Classic traffic, the L4S queue's own AQM comes
       into play.  It starts congestion marking with a very shallow
       queue, so L4S traffic maintains very low queuing delay.
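       The coupled signalling described above can be summarized with a
       much-simplified sketch.  This is an illustration, not an
       implementation of [I-D.ietf-tsvwg-aqm-dualq-coupled]: the
       Classic AQM derives a base probability p' from its own queue,
       Classic packets see the squared probability p' squared, and the
       L4S queue is marked with the larger of its own native
       shallow-queue marking and the coupled probability k * p'.  The
       square relationship and a coupling factor around 2 follow the
       DualQ spec, but the function names and structure here are
       invented for illustration:

```python
K = 2.0  # example coupling factor (see the DualQ Coupled AQM spec)

def classic_drop_prob(p_base: float) -> float:
    """Classic drop/mark probability: the square of the base
    probability p' derived from the Classic queue."""
    return p_base ** 2

def l4s_mark_prob(p_base: float, p_native_l4s: float) -> float:
    """L4S marking probability: the larger of the L4S queue's own
    (shallow-threshold) marking and the probability coupled across
    from the Classic AQM, capped at 1."""
    p_coupled = min(K * p_base, 1.0)
    return max(p_native_l4s, p_coupled)
```

       For example, with p' = 10%, the Classic queue drops/marks about
       1% of its packets while the L4S queue is marked at about 20%;
       this square-vs-linear asymmetry is what lets the two classes of
       congestion control converge to roughly equal flow rates.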
       If either queue becomes persistently overloaded, ECN marking is
       disabled, as recommended in Section 7 of [RFC3168] and
       Section 4.2.1 of [RFC7567].  Then both queues introduce the same
       level of drop (not shown in the figure).

       The Dual Queue Coupled AQM has been specified as generically as
       possible [I-D.ietf-tsvwg-aqm-dualq-coupled] without specifying
       the particular AQMs to use in the two queues so that designers
       are free to implement diverse ideas.  Informational appendices in
skipping to change at page 12, line 28
       Figure 1: Components of an L4S DualQ Coupled AQM Solution: 1)
          Scalable Sending Host; 2) Isolation in separate network
             queues; and 3) Packet Identification Protocol
   b.  Per-Flow Queues and AQMs: A scheduler with per-flow queues such
       as FQ-CoDel or FQ-PIE can be used for L4S.  For instance within
       each queue of an FQ-CoDel system, as well as a CoDel AQM, there
       is typically also the option of ECN marking at an immediate
       (unsmoothed) shallow threshold to support use in data centres
       (see Sec.5.2.7 of [RFC8290]).  In Linux, this has been modified
       so that the shallow threshold can be solely applied to ECT(1)
       packets [FQ_CoDel_Thresh].  Then if there is a flow of non-ECN or
       ECT(0) packets in the per-flow-queue, the Classic AQM (e.g.
       CoDel) is applied; while if there is a flow of ECT(1) packets in
       the queue, the shallower (typically sub-millisecond) threshold is
       applied.  In addition, ECT(0) and not-ECT packets could
       potentially be classified into a separate flow-queue from ECT(1)
       and CE packets to avoid them mixing if they share a common flow-
       identifier (e.g. in a VPN).
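       The per-flow-queue decision just described can be sketched as
       follows.  This is a hypothetical fragment only loosely modelled
       on the Linux mechanism; the threshold value and all names are
       invented for illustration:

```python
CE_THRESHOLD_US = 1000  # shallow marking threshold (typically sub-ms)
ECT1 = 0b01             # the L4S identifier codepoint

def handle_packet(ecn_field: int, sojourn_us: float, codel_wants_mark) -> str:
    """Per-queue decision: ECT(1) packets get immediate, unsmoothed
    ECN marking once their sojourn time exceeds the shallow threshold;
    all other packets fall through to the normal (smoothed) Classic
    AQM, e.g. CoDel, passed in here as a callback."""
    if ecn_field == ECT1:
        return "mark" if sojourn_us > CE_THRESHOLD_US else "forward"
    return "drop_or_mark" if codel_wants_mark(sojourn_us) else "forward"
```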
   c.  Dual-queues, but per-flow AQMs: It should also be possible to use
       dual queues for isolation, but with per-flow marking to control
       flow-rates (instead of the coupled per-queue marking of the Dual
       Queue Coupled AQM).  One of the two queues would be for isolating
       L4S packets, which would be classified by the ECN codepoint.
       Flow rates could be controlled by flow-specific marking.  The
       policy goal of the marking could be to differentiate flow rates
       (e.g. [Nadas20], which requires additional signalling of a per-
       flow 'value'), or to equalize flow-rates (perhaps in a similar
skipping to change at page 17, line 7
      960 Mb/s it enters true Cubic mode, with a recovery time of
      12.2 s.  From then on, each further scaling by 8x doubles Cubic's
      recovery time (because the cube root of 8 is 2), e.g. at 7.68 Gb/s
      the recovery time is 24.3 s.  In contrast a scalable congestion
      control like DCTCP or TCP Prague induces 2 congestion signals per
      round trip on average, which remains invariant for any flow rate,
      keeping dynamic control very tight.
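      The scaling stated above can be checked with a one-line model.
      This is an illustration anchored to the figures quoted in the
      text, not Cubic's full response function: in true Cubic mode,
      recovery time grows with the cube root of flow rate, so every 8x
      increase in rate doubles it:

```python
def cubic_recovery_time_s(rate_bps: float) -> float:
    """Recovery time of a Cubic flow in true Cubic mode, anchored to
    the 12.2 s at 960 Mb/s figure quoted in the text; it grows with
    the cube root of the flow rate."""
    return 12.2 * (rate_bps / 960e6) ** (1 / 3)
```

      At 7.68 Gb/s this gives about 24.4 s, matching the text's 24.3 s
      up to rounding of the 12.2 s anchor value.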
      For a feel of where the global average lone-flow download sits on
      this scale at the time of writing (2021), according to [BDPdata]
      globally averaged fixed access capacity was 103 Mb/s in 2020 and
      averaged base RTT to a CDN was 25-34 ms in 2019.  Averaging of
      per-country data was weighted by Internet user population (data
      collected globally is necessarily of variable quality, but the
      paper does double-check that the outcome compares well against a
      second source).  So a lone CUBIC flow would at best take about 200
      round trips (5 s) to recover from each of its sawtooth reductions,
      if the flow even lasted that long.  This is described as 'at best'
      because it assumes everyone uses an AQM, whereas in reality most
      users still have a (probably bloated) tail-drop buffer.  In the
      tail-drop case, likely average recovery time would be at least
      4x 5 s, if not more, because RTT under load would be at least
      double that of an AQM, and recovery time depends on the square of
      RTT.
      Although work on scaling congestion controls tends to start with
      TCP as the transport, the above is not intended to exclude other
      transports (e.g. SCTP, QUIC) or less elastic algorithms
      (e.g. RMCAT), which all tend to adopt the same or similar
      developments.
5.2. What L4S adds to Existing Approaches 5.2. What L4S adds to Existing Approaches
All the following approaches address some part of the same problem All the following approaches address some part of the same problem
space as L4S. In each case, it is shown that L4S complements them or space as L4S. In each case, it is shown that L4S complements them or
improves on them, rather than being a mutually exclusive alternative: improves on them, rather than being a mutually exclusive alternative:
Diffserv: Diffserv addresses the problem of bandwidth apportionment Diffserv: Diffserv addresses the problem of bandwidth apportionment
for important traffic as well as queuing latency for delay- for important traffic as well as queuing latency for delay-
sensitive traffic. Of these, L4S solely addresses the problem of sensitive traffic. Of these, L4S solely addresses the problem of
queuing latency. Diffserv will still be necessary where important queuing latency. Diffserv will still be necessary where important
traffic requires priority (e.g. for commercial reasons, or for traffic requires priority (e.g. for commercial reasons, or for
protection of critical infrastructure traffic) - see protection of critical infrastructure traffic) - see
[I-D.briscoe-tsvwg-l4s-diffserv]. Nonetheless, the L4S approach [I-D.briscoe-tsvwg-l4s-diffserv]. Nonetheless, the L4S approach
can provide low latency for _all_ traffic within each Diffserv can provide low latency for all traffic within each Diffserv class
class (including the case where there is only the one default (including the case where there is only the one default Diffserv
Diffserv class). class).
Also, Diffserv can only provide a latency benefit if a small Also, Diffserv can only provide a latency benefit if a small
subset of the traffic on a bottleneck link requests low latency. subset of the traffic on a bottleneck link requests low latency.
As already explained, it has no effect when all the applications As already explained, it has no effect when all the applications
in use at one time at a single site (home, small business or in use at one time at a single site (home, small business or
mobile device) require low latency. In contrast, because L4S mobile device) require low latency. In contrast, because L4S
works for all traffic, it needs none of the management baggage works for all traffic, it needs none of the management baggage
(traffic policing, traffic contracts) associated with favouring (traffic policing, traffic contracts) associated with favouring
some packets over others. This baggage has probably held Diffserv some packets over others. This lack of management baggage ought
back from widespread end-to-end deployment. to give L4S a better chance of end-to-end deployment.
In particular, because networks tend not to trust end systems to In particular, because networks tend not to trust end systems to
identify which packets should be favoured over others, where identify which packets should be favoured over others, where
networks assign packets to Diffserv classes they tend to use networks assign packets to Diffserv classes they tend to use
packet inspection of application flow identifiers or deeper packet inspection of application flow identifiers or deeper
inspection of application signatures. Thus, nowadays, Diffserv inspection of application signatures. Thus, nowadays, Diffserv
doesn't always sit well with encryption of the layers above IP doesn't always sit well with encryption of the layers above IP
[RFC8404]. So users have to choose between privacy and QoS. [RFC8404]. So users have to choose between privacy and QoS.
As with Diffserv, the L4S identifier is in the IP header. But, in As with Diffserv, the L4S identifier is in the IP header. But, in
skipping to change at page 18, line 29 skipping to change at page 19, line 7
depends on network behaviour. depends on network behaviour.
State-of-the-art AQMs: AQMs such as PIE and FQ-CoDel give a
significant reduction in queuing delay relative to no AQM at all.
L4S is intended to complement these AQMs, and should not distract
from the need to deploy them as widely as possible. Nonetheless,
AQMs alone cannot reduce queuing delay too far without
significantly reducing link utilization, because the root cause of
the problem is on the host - where Classic congestion controls use
large saw-toothing rate variations. The L4S approach resolves
this tension between delay and utilization by enabling hosts to
minimize the amplitude of their sawteeth. A single-queue Classic
AQM is not sufficient to allow hosts to use small sawteeth for two
reasons: i) smaller sawteeth would not get lower delay in an AQM
designed for larger amplitude Classic sawteeth, because a queue
can only have one length at a time; and ii) much smaller sawteeth
imply much more frequent sawteeth, so L4S flows would drive a
Classic AQM into a high level of ECN-marking, which would appear
as heavy congestion to Classic flows, which in turn would greatly
reduce their rate as a result (see Section 6.4.4).
Per-flow queuing or marking: Similarly, per-flow approaches such as
FQ-CoDel or Approx Fair CoDel [AFCD] are not incompatible with the
L4S approach. However, per-flow queuing alone is not enough - it
only isolates the queuing of one flow from others; not from
itself. Per-flow implementations need to have support for
scalable congestion control added, which has already been done for
FQ-CoDel in Linux (see Sec.5.2.7 of [RFC8290] and
[FQ_CoDel_Thresh]). Without this simple modification, per-flow
AQMs like FQ-CoDel would still not be able to support applications
that need both very low delay and high bandwidth, e.g. video-based
control of remote procedures, or interactive cloud-based video
(see Note 1 below).

Although per-flow techniques are not incompatible with L4S, it is
important to have the DualQ alternative. This is because handling
end-to-end (layer 4) flows in the network (layer 3 or 2) precludes
some important end-to-end functions. For instance:

a. Per-flow forms of L4S like FQ-CoDel are incompatible with full
end-to-end encryption of transport layer identifiers for
privacy and confidentiality (e.g. IPSec or encrypted VPN
tunnels, as opposed to TLS over UDP), because they require

skipping to change at page 26, line 5
Deployment in mesh topologies depends on how overbooked the core is.
If the core is non-blocking, or at least generously provisioned so
that the edges are nearly always the bottlenecks, it would only be
necessary to deploy an L4S AQM at the edge bottlenecks. For example,
some data-centre networks are designed with the bottleneck in the
hypervisor or host NICs, while others bottleneck at the top-of-rack
switch (both the output ports facing hosts and those facing the
core).

An L4S AQM would often next be needed where the WiFi links in a home
sometimes become the bottleneck. And an L4S AQM would eventually
also need to be deployed at any other persistent bottlenecks such as
network interconnections, e.g. some public Internet exchange points
and the ingress and egress to WAN links interconnecting data-centres.

6.4.2. Deployment Sequences

For any one L4S flow to provide benefit, it requires 3 parts to have
been deployed. This was the same deployment problem that ECN
faced [RFC8170] so we have learned from that experience.

skipping to change at page 28, line 7

seen, but limited to a controlled trial or controlled deployment.
In this example downstream deployment is first, but in other
scenarios the upstream might be deployed first. If no AQM at all
was previously deployed for the downstream access, an L4S AQM
greatly improves the Classic service (as well as adding the L4S
service). If an AQM was already deployed, the Classic service
will be unchanged (and L4S will add an improvement on top).
2. In this stage, the name 'TCP
Prague' [I-D.briscoe-iccrg-prague-congestion-control] is used to
represent a variant of DCTCP that is designed to be used in a
production Internet environment. If the application is primarily
unidirectional, 'TCP Prague' at one end will provide all the
benefit needed. For TCP transports, Accurate ECN feedback
(AccECN) [I-D.ietf-tcpm-accurate-ecn] is needed at the other end,
but it is a generic ECN feedback facility that is already planned
to be deployed for other purposes, e.g. DCTCP, BBR. The two ends
can be deployed in either order, because, in TCP, an L4S
congestion control only enables itself if it has negotiated the
use of AccECN feedback with the other end during the connection
handshake. Thus, deployment of TCP Prague on a server enables
L4S trials to move to a production service in one direction,

skipping to change at page 29, line 41

L4S deployment scenarios that minimize these issues (e.g. over
wireline networks) can proceed in parallel to this research, in the
expectation that research success could continually widen L4S
applicability.
6.4.4. L4S Flow but Classic ECN Bottleneck

Classic ECN support is starting to materialize on the Internet as an
increased level of CE marking. It is hard to detect whether this is
all due to the addition of support for ECN in implementations of FQ-
CoDel and/or FQ-COBALT, which is not generally problematic, because
flow-queue (FQ) scheduling inherently prevents a flow from exceeding
the 'fair' rate irrespective of its aggressiveness. However, some of
this Classic ECN marking might be due to single-queue ECN deployment.
This case is discussed in Section 4.3 of [I-D.ietf-tsvwg-ecn-l4s-id].

6.4.5. L4S AQM Deployment within Tunnels

An L4S AQM uses the ECN field to signal congestion. So, in common
with Classic ECN, if the AQM is within a tunnel or at a lower layer,
correct functioning of ECN signalling requires correct propagation of
the ECN field up the layers [RFC6040],
[I-D.ietf-tsvwg-rfc6040update-shim],
[I-D.ietf-tsvwg-ecn-encap-guidelines].

7. IANA Considerations (to be removed by RFC Editor)

This specification contains no IANA considerations.

8. Security Considerations

8.1. Traffic Rate (Non-)Policing
In the current Internet, scheduling usually enforces separation
between 'sites' (e.g. households, businesses or mobile users
[RFC0970]) and various techniques like redirection to traffic
scrubbing facilities deal with flooding attacks. However, there has
never been a universal need to police the rate of individual
application flows - the Internet has generally always relied on self-
restraint of congestion controls at senders for sharing intra-'site'
capacity.
As explained in Section 5.2, the DualQ variant of L4S provides low
delay without prejudging the issue of flow-rate control. Then, if
flow-rate control is needed, per-flow-queuing (FQ) can be used
instead, or flow rate policing can be added as a modular addition to
a DualQ.
Because the L4S service reduces delay without increasing the delay of
Classic traffic, it should not be necessary to rate-police access to
the L4S service. In contrast, Section 5.2 explains how Diffserv only
makes a difference if some packets get less favourable treatment than
others, which typically requires traffic rate policing, which can, in
turn, lead to further complexity such as traffic contracts at trust
boundaries. Because L4S avoids this management complexity, it is
more likely to work end-to-end.
During early deployment (and perhaps always), some networks will not
offer the L4S service. In general, these networks should not need to
police L4S traffic. They are required (by both [RFC3168] and
[I-D.ietf-tsvwg-ecn-l4s-id]) not to change the L4S identifier, which
would interfere with end-to-end congestion control. Instead they can
merely treat L4S traffic as Not-ECT, as they might already treat all
ECN traffic today. At a bottleneck, such networks will introduce
some queuing and dropping. When a scalable congestion control
detects a drop it will have to respond safely with respect to Classic
congestion controls (as required in Section 4.3 of
[I-D.ietf-tsvwg-ecn-l4s-id]). This will degrade the L4S service to
be no better (but never worse) than Classic best efforts, whenever a
non-ECN bottleneck is encountered on a path (see Section 6.4.3).

In cases that are expected to be rare, networks that solely support
Classic ECN [RFC3168] in a single queue bottleneck might opt to
police L4S traffic so as to protect competing Classic ECN traffic
(for instance, see Section 6.1.3 of [I-D.ietf-tsvwg-l4sops]).
However, Section 4.3 of [I-D.ietf-tsvwg-ecn-l4s-id] recommends that
the sender adapts its congestion response to properly coexist with
Classic ECN flows, i.e. reverting to the self-restraint approach.
Certain network operators might choose to restrict access to the L4S
class, perhaps only to selected premium customers as a value-added
service. Their packet classifier (item 2 in Figure 1) could identify
such customers against some other field (e.g. source address range)
as well as classifying on the ECN field. If only the ECN L4S
identifier matched, but not the source address (say), the classifier
could direct these packets (from non-premium customers) into the
Classic queue. Explaining clearly how operators can use additional
local classifiers (see section 5.4 of [I-D.ietf-tsvwg-ecn-l4s-id])
is intended to remove any motivation to clear the L4S identifier.
Then at least the L4S ECN identifier will be more likely to survive
end-to-end even though the service may not be supported at every
hop. Such local arrangements would only require simple registered/
not-registered packet classification, rather than the managed,
application-specific traffic policing against customer-specific
traffic contracts that Diffserv uses.
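As a concrete illustration of such a classifier, the following is a
minimal sketch (the function and codepoint names are ours, not the
draft's): a packet only enters the low-latency queue if its ECN field
carries the L4S identifier and its source matches the registered
(premium) range; everything else falls through to the Classic queue.

```python
# ECN field codepoints from RFC 3168.
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11

def classify(ecn, src_registered):
    """Return the queue ('L4S' or 'Classic') for one packet.

    ECT(1) is the L4S identifier; CE is shown here as also matching,
    as in the DualQ Coupled AQM, though CE handling is a local choice.
    `src_registered` stands in for whatever local check the operator
    uses (e.g. a source-address-range lookup).
    """
    if ecn in (ECT1, CE) and src_registered:
        return 'L4S'
    return 'Classic'

classify(ECT1, True)    # 'L4S'
classify(ECT1, False)   # 'Classic': L4S marking, but non-premium source
classify(ECT0, True)    # 'Classic': Classic ECN traffic
```

Note that the non-matching cases are merely reclassified, never
dropped or re-marked, which is what allows the L4S identifier to
survive end-to-end.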
8.2. 'Latency Friendliness'

Like the Classic service, the L4S service relies on self-constraint -
limiting rate in response to congestion. In addition, the L4S

skipping to change at page 32, line 35

Local bottleneck per-flow scheduling: Per-flow scheduling should
inherently isolate non-bursty flows from bursty (see Section 5.2
for discussion of the merits of per-flow scheduling relative to
per-flow policing).

Distributed access subnet queue protection: Per-flow queue
protection could be arranged for a queue structure distributed
across a subnet inter-communicating using lower layer control
messages (see Section 2.1.4 of [QDyn]). For instance, in a radio
access network, user equipment already sends regular buffer status
reports to a radio network controller, which could use this
information to remotely police individual flows.
Distributed Congestion Exposure to Ingress Policers: The Congestion
Exposure (ConEx) architecture [RFC7713] uses egress audit to
motivate senders to truthfully signal path congestion in-band,
where it can be used by ingress policers. An edge-to-edge variant
of this architecture is also possible.
Distributed Domain-edge traffic conditioning: An architecture
similar to Diffserv [RFC2475] may be preferred, where traffic is
proactively conditioned on entry to a domain, rather than
reactively policed only if it leads to queuing once combined with
other traffic at a bottleneck.

Distributed core network queue protection: The policing function
could be divided between per-flow mechanisms at the network
ingress that characterize the burstiness of each flow into a
signal carried with the traffic, and per-class mechanisms at
bottlenecks that act on these signals if queuing actually occurs
once the traffic converges. This would be somewhat similar to
[Nadas20], which is in turn similar to the idea behind core
stateless fair queuing.

None of these possible queue protection capabilities are considered a
necessary part of the L4S architecture, which works without them (in
a similar way to how the Internet works without per-flow rate
policing). Indeed, even where latency policers are deployed, under
normal circumstances they would not intervene, and if operators found
they were not necessary they could disable them. Part of the L4S
experiment will be to see whether such a function is necessary, and
which arrangements are most appropriate to the size of the problem.
8.3. Interaction between Rate Policing and L4S

As mentioned in Section 5.2, L4S should remove the need for low
latency Diffserv classes. However, those Diffserv classes that give
certain applications or users priority over capacity would still be
applicable in certain scenarios (e.g. corporate networks). Then,
within such Diffserv classes, L4S would often be applicable to give
traffic low latency and low loss as well. Within such a Diffserv
class, the bandwidth available to a user or application is often
limited by a rate policer. Similarly, in the default Diffserv class,
rate policers are used to partition shared capacity.
A classic rate policer drops any packets exceeding a set rate,
usually also giving a burst allowance (variants exist where the
policer re-marks non-compliant traffic to a discard-eligible Diffserv
codepoint, so they can be dropped elsewhere during contention).
Whenever L4S traffic encounters one of these rate policers, it will
experience drops and the source will have to fall back to a Classic
congestion control, thus losing the benefits of L4S (Section 6.4.3).
So, in networks that already use rate policers and plan to deploy
L4S, it will be preferable to redesign these rate policers to be more
friendly to the L4S service.
L4S-friendly rate policing is currently a research area (note that
this is not the same as latency policing). It might be achieved by
setting a threshold where ECN marking is introduced, such that it is
just under the policed rate or just under the burst allowance where
drop is introduced. For instance, the two-rate three-colour marker
[RFC2698] or a PCN threshold and excess-rate marker [RFC5670] could
mark ECN at the lower rate and drop at the higher. Or an existing
rate policer could have congestion-rate policing added, e.g. using
the 'local' (non-ConEx) variant of the ConEx aggregate congestion
policer [I-D.briscoe-conex-policing]. It might also be possible to
design scalable congestion controls to respond less catastrophically
to loss that has not been preceded by a period of increasing delay.
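To make the first option concrete, here is a rough sketch (our own
construction, not a specification) of a two-rate policer in the
spirit of the two-rate three-colour marker: traffic between a lower
'marking' rate and the policed rate is CE-marked, and only traffic
exceeding the policed rate is dropped.

```python
class L4SFriendlyPolicer:
    """Hypothetical two-rate, two-token-bucket policer: ECN-mark
    between mark_rate and drop_rate, drop only above drop_rate."""

    def __init__(self, mark_rate_bps, drop_rate_bps, burst_bytes):
        self.mark_rate = mark_rate_bps / 8.0   # bytes/s
        self.drop_rate = drop_rate_bps / 8.0   # bytes/s
        self.burst = burst_bytes
        self.mark_tokens = burst_bytes
        self.drop_tokens = burst_bytes
        self.last = 0.0

    def police(self, size, now):
        """Return 'forward', 'mark' (set CE) or 'drop' for one packet."""
        dt = now - self.last
        self.last = now
        self.mark_tokens = min(self.burst, self.mark_tokens + dt * self.mark_rate)
        self.drop_tokens = min(self.burst, self.drop_tokens + dt * self.drop_rate)
        if self.drop_tokens < size:
            return 'drop'          # exceeds the policed rate
        self.drop_tokens -= size
        if self.mark_tokens < size:
            return 'mark'          # between the two rates: CE-mark only
        self.mark_tokens -= size
        return 'forward'
```

A scalable congestion control sending through such a policer would
see CE marks before any loss, so it could settle just under the
marking rate without ever reaching the drop threshold.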
The design of L4S-friendly rate policers will require a separate
dedicated document. For further discussion of the interaction
between L4S and Diffserv, see [I-D.briscoe-tsvwg-l4s-diffserv].

8.4. ECN Integrity

Receiving hosts can fool a sender into downloading faster by
suppressing feedback of ECN marks (or of losses if retransmissions
are not necessary or available otherwise). Various ways to protect

skipping to change at page 34, line 46

congestion feedback, but it has been reclassified as
historic [RFC8311].

Appendix C.1 of [I-D.ietf-tsvwg-ecn-l4s-id] gives more details of
these techniques including their applicability and pros and cons.

8.5. Privacy Considerations
As discussed in Section 5.2, the L4S architecture does not preclude
approaches that inspect end-to-end transport layer identifiers. For
instance, L4S support has been added to FQ-CoDel, which classifies by
application flow ID in the network. However, the main innovation of
L4S is the DualQ AQM framework that does not need to inspect any
deeper than the outermost IP header, because the L4S identifier is in
the IP-ECN field.
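The classification rule implied by the paragraph above is small enough to
show directly: an L4S node need only read the 2-bit ECN field of the
outermost IP header. A minimal sketch follows; the codepoint values are the
standard ECN codepoints, but the function and queue names are invented for
illustration.

```python
# Standard 2-bit ECN codepoints (RFC 3168; ECT(1) is the L4S identifier):
NOT_ECT = 0b00   # Not ECN-Capable Transport
ECT_1   = 0b01   # ECT(1): L4S-capable
ECT_0   = 0b10   # ECT(0): 'Classic' ECN-capable
CE      = 0b11   # Congestion Experienced

def dualq_classify(ecn_bits):
    """Steer a packet to the Low latency ('L') or Classic ('C') queue.

    ECT(1) and CE select the L queue; Not-ECT and ECT(0) go to the C
    queue.  No transport-layer header inspection is needed.
    """
    return 'L' if ecn_bits in (ECT_1, CE) else 'C'
```

Because the decision depends only on the outermost IP header, it works
unchanged for encrypted transports and (with ECN propagation) for tunnelled
traffic.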
Thus, the L4S architecture enables very low queuing delay without
_requiring_ inspection of information above the IP layer. This means
that users who want to encrypt application flow identifiers, e.g. in
IPSec or other encrypted VPN tunnels, don't have to sacrifice low
delay [RFC8404].
Because L4S can provide low delay for a broad set of applications
that choose to use it, there is no need for individual applications
or classes within that broad set to be distinguishable in any way
while traversing networks. This removes much of the ability to
correlate between the delay requirements of traffic and other
identifying features [RFC6973]. There may be some types of traffic
that prefer not to use L4S, but the coarse binary categorization of
traffic reveals very little that could be exploited to compromise
privacy.
9. Acknowledgements
Thanks to Richard Scheffenegger, Wes Eddy, Karen Nielsen, David
Black, Jake Holland, Vidhi Goel, Ermin Sakic, Praveen
Balasubramanian, Gorry Fairhurst, Mirja Kuehlewind, Philip Eardley,
Neal Cardwell and Pete Heist for their useful review comments.
Bob Briscoe and Koen De Schepper were part-funded by the European
Community under its Seventh Framework Programme through the Reducing
Internet Transport Latency (RITE) project (ICT-317700). Bob Briscoe
was also part-funded by the Research Council of Norway through the
TimeIn project, partly by CableLabs and partly by the Comcast
Innovation Fund. The views expressed here are solely those of the
authors.
10. Informative References
skipping to change at page 36, line 44
content/uploads/2013/11/
Active_Queue_Management_Algorithms_DOCSIS_3_0.pdf>.
[DualPI2Linux]
Albisser, O., De Schepper, K., Briscoe, B., Tilmans, O.,
and H. Steen, "DUALPI2 - Low Latency, Low Loss and
Scalable (L4S) AQM", Proc. Linux Netdev 0x13, March 2019,
<https://www.netdevconf.org/0x13/session.html?talk-
DUALPI2-AQM>.
[Dukkipati06]
Dukkipati, N. and N. McKeown, "Why Flow-Completion Time is
the Right Metric for Congestion Control", ACM CCR
36(1):59--62, January 2006,
<https://dl.acm.org/doi/10.1145/1111322.1111336>.
[FQ_CoDel_Thresh]
Høiland-Jørgensen, T., "fq_codel: generalise ce_threshold
marking for subset of traffic", Linux Patch Commit ID:
dfcb63ce1de6b10b, 20 October 2021,
<https://git.kernel.org/pub/scm/linux/kernel/git/netdev/
net-next.git/commit/?id=dfcb63ce1de6b10b>.
[Hohlfeld14]
Hohlfeld, O., Pujol, E., Ciucu, F., Feldmann, A., and P.
Barford, "A QoE Perspective on Sizing Network Buffers",
Proc. ACM Internet Measurement Conf (IMC'14), November
2014, <http://doi.acm.org/10.1145/2663716.2663730>.
skipping to change at page 38, line 44
ecn-encap-guidelines-16>.
[I-D.ietf-tsvwg-ecn-l4s-id]
Schepper, K. D. and B. Briscoe, "Explicit Congestion
Notification (ECN) Protocol for Very Low Queuing Delay
(L4S)", Work in Progress, Internet-Draft, draft-ietf-
tsvwg-ecn-l4s-id-19, 26 July 2021,
<https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-
ecn-l4s-id-19>.
[I-D.ietf-tsvwg-l4sops]
White, G., "Operational Guidance for Deployment of L4S in
the Internet", Work in Progress, Internet-Draft, draft-
ietf-tsvwg-l4sops-01, 12 July 2021,
<https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-
l4sops-01>.
[I-D.ietf-tsvwg-nqb]
White, G. and T. Fossati, "A Non-Queue-Building Per-Hop
Behavior (NQB PHB) for Differentiated Services", Work in
Progress, Internet-Draft, draft-ietf-tsvwg-nqb-07, 28 July
2021, <https://datatracker.ietf.org/doc/html/draft-ietf-
tsvwg-nqb-07>.
[I-D.ietf-tsvwg-rfc6040update-shim]
Briscoe, B., "Propagating Explicit Congestion Notification
Across IP Tunnel Headers Separated by a Shim", Work in
skipping to change at page 40, line 44
the `TCP Prague' Requirements for Low Latency Low Loss
Scalable Throughput (L4S)", Proc. Linux Netdev 0x13,
March 2019, <https://www.netdevconf.org/0x13/
session.html?talk-tcp-prague-l4s>.
[QDyn] Briscoe, B., "Rapid Signalling of Queue Dynamics",
bobbriscoe.net Technical Report TR-BB-2017-001;
arXiv:1904.07044 [cs.NI], September 2017,
<https://arxiv.org/abs/1904.07044>.
[Rajiullah15]
Rajiullah, M., "Towards a Low Latency Internet:
Understanding and Solutions", Masters Thesis; Karlstad
Uni, Dept of Maths & CS 2015:41, 2015, <https://www.diva-
portal.org/smash/get/diva2:846109/FULLTEXT01.pdf>.
[RFC0970] Nagle, J., "On Packet Switches With Infinite Storage",
RFC 970, DOI 10.17487/RFC0970, December 1985,
<https://www.rfc-editor.org/info/rfc970>.
[RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
and W. Weiss, "An Architecture for Differentiated
Services", RFC 2475, DOI 10.17487/RFC2475, December 1998,
<https://www.rfc-editor.org/info/rfc2475>.
[RFC2697] Heinanen, J. and R. Guerin, "A Single Rate Three Color
Marker", RFC 2697, DOI 10.17487/RFC2697, September 1999,
<https://www.rfc-editor.org/info/rfc2697>.
[RFC2698] Heinanen, J. and R. Guerin, "A Two Rate Three Color
Marker", RFC 2698, DOI 10.17487/RFC2698, September 1999,
<https://www.rfc-editor.org/info/rfc2698>.
[RFC2884] Hadi Salim, J. and U. Ahmed, "Performance Evaluation of
Explicit Congestion Notification (ECN) in IP Networks",
RFC 2884, DOI 10.17487/RFC2884, July 2000,
<https://www.rfc-editor.org/info/rfc2884>.
[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
skipping to change at page 42, line 19
[RFC5033] Floyd, S. and M. Allman, "Specifying New Congestion
Control Algorithms", BCP 133, RFC 5033,
DOI 10.17487/RFC5033, August 2007,
<https://www.rfc-editor.org/info/rfc5033>.
[RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP
Friendly Rate Control (TFRC): Protocol Specification",
RFC 5348, DOI 10.17487/RFC5348, September 2008,
<https://www.rfc-editor.org/info/rfc5348>.
[RFC5670] Eardley, P., Ed., "Metering and Marking Behaviour of PCN-
Nodes", RFC 5670, DOI 10.17487/RFC5670, November 2009,
<https://www.rfc-editor.org/info/rfc5670>.
[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
Control", RFC 5681, DOI 10.17487/RFC5681, September 2009,
<https://www.rfc-editor.org/info/rfc5681>.
[RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP
Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
June 2010, <https://www.rfc-editor.org/info/rfc5925>.
[RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion
Notification", RFC 6040, DOI 10.17487/RFC6040, November
End of changes. 45 change blocks. 162 lines changed or deleted, 232 lines changed or added.

This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/