draft-ietf-tsvwg-aqm-dualq-coupled-11.txt   draft-ietf-tsvwg-aqm-dualq-coupled-12.txt 
Transport Area working group (tsvwg) K. De Schepper Transport Area working group (tsvwg) K. De Schepper
Internet-Draft Nokia Bell Labs Internet-Draft Nokia Bell Labs
Intended status: Experimental B. Briscoe, Ed. Intended status: Experimental B. Briscoe, Ed.
Expires: September 10, 2020 Independent Expires: January 28, 2021 Independent
G. White G. White
CableLabs CableLabs
March 9, 2020 July 27, 2020
DualQ Coupled AQMs for Low Latency, Low Loss and Scalable Throughput DualQ Coupled AQMs for Low Latency, Low Loss and Scalable Throughput
(L4S) (L4S)
draft-ietf-tsvwg-aqm-dualq-coupled-11 draft-ietf-tsvwg-aqm-dualq-coupled-12
Abstract Abstract
The Low Latency Low Loss Scalable Throughput (L4S) architecture The Low Latency Low Loss Scalable Throughput (L4S) architecture
allows data flows over the public Internet to achieve consistent low allows data flows over the public Internet to achieve consistent low
queuing latency, generally zero congestion loss and scaling of per- queuing latency, generally zero congestion loss and scaling of per-
flow throughput without the scaling problems of standard TCP Reno- flow throughput without the scaling problems of standard TCP Reno-
friendly congestion controls. To achieve this, L4S data flows have friendly congestion controls. To achieve this, L4S data flows have
to use one of the family of 'Scalable' congestion controls (TCP to use one of the family of 'Scalable' congestion controls (TCP
Prague and Data Center TCP are examples) and a form of Explicit Prague and Data Center TCP are examples) and a form of Explicit
skipping to change at page 2, line 20 skipping to change at page 2, line 20
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 10, 2020. This Internet-Draft will expire on January 28, 2021.
Copyright Notice Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 25, line 32 skipping to change at page 25, line 32
implementations were tested. implementations were tested.
7. References 7. References
7.1. Normative References 7.1. Normative References
[I-D.ietf-tsvwg-ecn-l4s-id] [I-D.ietf-tsvwg-ecn-l4s-id]
Schepper, K. and B. Briscoe, "Identifying Modified Schepper, K. and B. Briscoe, "Identifying Modified
Explicit Congestion Notification (ECN) Semantics for Explicit Congestion Notification (ECN) Semantics for
Ultra-Low Queuing Delay (L4S)", draft-ietf-tsvwg-ecn-l4s- Ultra-Low Queuing Delay (L4S)", draft-ietf-tsvwg-ecn-l4s-
id-09 (work in progress), February 2020. id-10 (work in progress), March 2020.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
of Explicit Congestion Notification (ECN) to IP", of Explicit Congestion Notification (ECN) to IP",
RFC 3168, DOI 10.17487/RFC3168, September 2001, RFC 3168, DOI 10.17487/RFC3168, September 2001,
<https://www.rfc-editor.org/info/rfc3168>. <https://www.rfc-editor.org/info/rfc3168>.
skipping to change at page 27, line 31 skipping to change at page 27, line 31
[I-D.briscoe-tsvwg-l4s-diffserv] [I-D.briscoe-tsvwg-l4s-diffserv]
Briscoe, B., "Interactions between Low Latency, Low Loss, Briscoe, B., "Interactions between Low Latency, Low Loss,
Scalable Throughput (L4S) and Differentiated Services", Scalable Throughput (L4S) and Differentiated Services",
draft-briscoe-tsvwg-l4s-diffserv-02 (work in progress), draft-briscoe-tsvwg-l4s-diffserv-02 (work in progress),
November 2018. November 2018.
[I-D.ietf-tsvwg-l4s-arch] [I-D.ietf-tsvwg-l4s-arch]
Briscoe, B., Schepper, K., Bagnulo, M., and G. White, "Low Briscoe, B., Schepper, K., Bagnulo, M., and G. White, "Low
Latency, Low Loss, Scalable Throughput (L4S) Internet Latency, Low Loss, Scalable Throughput (L4S) Internet
Service: Architecture", draft-ietf-tsvwg-l4s-arch-05 (work Service: Architecture", draft-ietf-tsvwg-l4s-arch-06 (work
in progress), February 2020. in progress), March 2020.
[I-D.ietf-tsvwg-nqb] [I-D.ietf-tsvwg-nqb]
White, G. and T. Fossati, "A Non-Queue-Building Per-Hop White, G. and T. Fossati, "A Non-Queue-Building Per-Hop
Behavior (NQB PHB) for Differentiated Services", draft- Behavior (NQB PHB) for Differentiated Services", draft-
ietf-tsvwg-nqb-00 (work in progress), November 2019. ietf-tsvwg-nqb-01 (work in progress), March 2020.
[L4Sdemo16] [L4Sdemo16]
Bondarenko, O., De Schepper, K., Tsang, I., and B. Bondarenko, O., De Schepper, K., Tsang, I., and B.
Briscoe, "Ultra-Low Delay for All: Live Experience, Live Briscoe, "Ultra-Low Delay for All: Live Experience, Live
Analysis", Proc. MMSYS'16 pp33:1--33:4, May 2016, Analysis", Proc. MMSYS'16 pp33:1--33:4, May 2016,
<http://dl.acm.org/citation.cfm?doid=2910017.2910633 <http://dl.acm.org/citation.cfm?doid=2910017.2910633
(videos of demos: (videos of demos:
https://riteproject.eu/dctth/#1511dispatchwg )>. https://riteproject.eu/dctth/#1511dispatchwg )>.
[LLD] White, G., Sundaresan, K., and B. Briscoe, "Low Latency [LLD] White, G., Sundaresan, K., and B. Briscoe, "Low Latency
skipping to change at page 46, line 9 skipping to change at page 46, line 9
18: S_L = S_C - k' % L4S ramp scaling factor as a power of 2 18: S_L = S_C - k' % L4S ramp scaling factor as a power of 2
19: range_L = 2^S_L % Range of L4S ramp 19: range_L = 2^S_L % Range of L4S ramp
20: } 20: }
Figure 9: Example Header Pseudocode for DualQ Coupled Curvy RED AQM Figure 9: Example Header Pseudocode for DualQ Coupled Curvy RED AQM
1: cred_dequeue(lq, cq, pkt) { % Couples L4S & Classic queues 1: cred_dequeue(lq, cq, pkt) { % Couples L4S & Classic queues
2: while ( lq.len() + cq.len() > 0 ) { 2: while ( lq.len() + cq.len() > 0 ) {
3: if ( scheduler() == lq ) { 3: if ( scheduler() == lq ) {
4: lq.dequeue(pkt) % L4S scheduled 4: lq.dequeue(pkt) % L4S scheduled
5a: p_CL = (cq.time() - minTh) / range_L 5a: p_CL = (Q_C - minTh) / range_L
5b: if ( ( lq.time() > T ) 5b: if ( ( lq.time() > T )
5c: OR ( p_CL > maxrand(U) ) ) 5c: OR ( p_CL > maxrand(U) ) )
6: mark(pkt) 6: mark(pkt)
7: } else { 7: } else {
8: cq.dequeue(pkt) % Classic scheduled 8: cq.dequeue(pkt) % Classic scheduled
9a: Q_C = gamma * qc.time() + (1-gamma) * Q_C % Classic Q EWMA 9a: Q_C = gamma * cq.time() + (1-gamma) * Q_C % Classic Q EWMA
10a: sqrt_p_C = (Q_C - minTh) / range_C 10a: sqrt_p_C = (Q_C - minTh) / range_C
10b: if ( sqrt_p_C > maxrand(2*U) ) { 10b: if ( sqrt_p_C > maxrand(2*U) ) {
11: if ( (ecn(pkt) == 0) { % ECN field = not-ECT 11: if ( (ecn(pkt) == 0) { % ECN field = not-ECT
12: drop(pkt) % Squared drop, redo loop 12: drop(pkt) % Squared drop, redo loop
13: continue % continue to the top of the while loop 13: continue % continue to the top of the while loop
14: } 14: }
15: mark(pkt) 15: mark(pkt)
16: } 16: }
17: } 17: }
18: return(pkt) % return the packet and stop here 18: return(pkt) % return the packet and stop here
skipping to change at page 47, line 27 skipping to change at page 47, line 27
Within each queue, the decision whether to forward, drop or mark is Within each queue, the decision whether to forward, drop or mark is
taken as follows (to simplify the explanation, it is assumed that taken as follows (to simplify the explanation, it is assumed that
U=1): U=1):
L4S: If the test at line 3 determines there is an L4S packet to L4S: If the test at line 3 determines there is an L4S packet to
dequeue, the tests at lines 5b and 5c determine whether to mark dequeue, the tests at lines 5b and 5c determine whether to mark
it. The first is a simple test of whether the L4S queue delay it. The first is a simple test of whether the L4S queue delay
(lq.time()) is greater than a step threshold T (Note 3). The (lq.time()) is greater than a step threshold T (Note 3). The
second test is similar to the random ECN marking in RED, but with second test is similar to the random ECN marking in RED, but with
the following differences: ii) marking depends on queuing time, the following differences: i) marking depends on queuing time, not
not bytes, in order to scale for any link rate without being bytes, in order to scale for any link rate without being
reconfigured; ii) marking of the L4S queue does not depend on reconfigured; ii) marking of the L4S queue depends on a logical OR
itself, it depends on the queuing time of the _other_ (Classic) of two tests; one against its own queuing time and one against the
queue, where cq.time() is the queuing time of the packet at the queuing time of the _other_ (Classic) queue; iii) the tests are
head of the Classic queue (zero if empty); iii) marking depends on against the instantaneous queuing time of the L4S queue, but a
the instantaneous queuing time (of the other Classic queue), not a smoothed average of the other (Classic) queue; iv) the queue is
smoothed average; iv) the queue is compared with the maximum of U compared with the maximum of U random numbers (but if U=1, this is
random numbers (but if U=1, this is the same as the single random the same as the single random number used in RED).
number used in RED).
Specifically, in line 5a the coupled marking probability p_CL is Specifically, in line 5a the coupled marking probability p_CL is
set to the excess of the Classic queueing delay qc.time() above set to the amount by which the averaged Classic queueing delay Q_C
the minimum queuing delay threshold (minTh) all divided by the L4S exceeds the minimum queuing delay threshold (minTh) all divided by
scaling parameter range_L. range_L represents the queuing delay the L4S scaling parameter range_L. range_L represents the queuing
(in seconds) added to minTh at which marking probability would hit delay (in seconds) added to minTh at which marking probability
100%. Then in line 5c (if U=1) the result is compared with a would hit 100%. Then in line 5c (if U=1) the result is compared
uniformly distributed random number between 0 and 1, which ensures with a uniformly distributed random number between 0 and 1, which
that marking probability will linearly increase with queueing ensures that, over range_L, marking probability will linearly
time. increase with queueing time.
Classic: If the scheduler at line 3 chooses to dequeue a Classic Classic: If the scheduler at line 3 chooses to dequeue a Classic
packet and jumps to line 7, the test at line 10b determines packet and jumps to line 7, the test at line 10b determines
whether to drop or mark it. But before that, line 9a updates Q_C, whether to drop or mark it. But before that, line 9a updates Q_C,
which is an exponentially weighted moving average (Note 4) of the which is an exponentially weighted moving average (Note 4) of the
queuing time in the Classic queue, where qc.time() is the current queuing time of the Classic queue, where cq.time() is the current
instantaneous queueing time of the Classic queue and gamma is the instantaneous queueing time of the packet at the head of the
EWMA constant (default 1/32, see line 12 of the initialization Classic queue (zero if empty) and gamma is the EWMA constant
function). (default 1/32, see line 12 of the initialization function).
Lines 10a and 10b implement the Classic AQM. In line 10a the Lines 10a and 10b implement the Classic AQM. In line 10a the
averaged queuing time Q_C is divided by the Classic scaling averaged queuing time Q_C is divided by the Classic scaling
parameter range_C, in the same way that queuing time was scaled parameter range_C, in the same way that queuing time was scaled
for L4S marking. This scaled queuing time will be squared to for L4S marking. This scaled queuing time will be squared to
compute Classic drop probability so, before it is squared, it is compute Classic drop probability so, before it is squared, it is
effectively the square root of the drop probability, hence it is effectively the square root of the drop probability, hence it is
given the variable name sqrt_p_C. The squaring is done by given the variable name sqrt_p_C. The squaring is done by
comparing it with the maximum out of two random numbers (assuming comparing it with the maximum out of two random numbers (assuming
U=1). Comparing it with the maximum out of two is the same as the U=1). Comparing it with the maximum out of two is the same as the
skipping to change at page 48, line 38 skipping to change at page 48, line 37
rather than queuing becomes the dominant cause of delay for short rather than queuing becomes the dominant cause of delay for short
flows, due to timeouts and tail losses. flows, due to timeouts and tail losses.
Curvy RED constrains delay with a softened target that allows some Curvy RED constrains delay with a softened target that allows some
increase in delay as load increases. This is achieved by increasing increase in delay as load increases. This is achieved by increasing
drop probability on a convex curve relative to queue growth (the drop probability on a convex curve relative to queue growth (the
square curve in the Classic queue, if U=1). Like RED, the curve hugs square curve in the Classic queue, if U=1). Like RED, the curve hugs
the zero axis while the queue is shallow. Then, as load increases, the zero axis while the queue is shallow. Then, as load increases,
it introduces a growing barrier to higher delay. But, unlike RED, it it introduces a growing barrier to higher delay. But, unlike RED, it
requires only two parameters, not three. The disadvantage of Curvy requires only two parameters, not three. The disadvantage of Curvy
RED is that it is not adapted to a wide range of RTTs. Curvy RED can RED (compared to a PI controller for example) is that it is not
be used as is when the RTT range to be supported is limited, adapted to a wide range of RTTs. Curvy RED can be used as is when
otherwise an adaptation mechanism is required. the RTT range to be supported is limited, otherwise an adaptation
mechanism is required.
From our limited experiments with Curvy RED so far, recommended From our limited experiments with Curvy RED so far, recommended
values of these parameters are: S_C = -1; g_C = 5; T = 5 * MTU at the values of these parameters are: S_C = -1; g_C = 5; T = 5 * MTU at the
link rate (about 1ms at 60Mb/s) for the range of base RTTs typical on link rate (about 1ms at 60Mb/s) for the range of base RTTs typical on
the public Internet. [CRED_Insights] explains why these parameters the public Internet. [CRED_Insights] explains why these parameters
are applicable whatever rate link this AQM implementation is deployed are applicable whatever rate link this AQM implementation is deployed
on and how the parameters would need to be adjusted for a scenario on and how the parameters would need to be adjusted for a scenario
with a different range of RTTs (e.g. a data centre). The setting of with a different range of RTTs (e.g. a data centre). The setting of
k depends on policy (see Section 2.5 and Appendix C.2 respectively k depends on policy (see Section 2.5 and Appendix C.2 respectively
for its recommended setting and guidance on alternatives). for its recommended setting and guidance on alternatives).
skipping to change at page 50, line 19 skipping to change at page 50, line 19
in parallel out of band. For instance, if U=1, the Classic queue in parallel out of band. For instance, if U=1, the Classic queue
would require the maximum of two random numbers. So, instead of would require the maximum of two random numbers. So, instead of
calling maxrand(2*U) in-band, the maximum of every pair of values calling maxrand(2*U) in-band, the maximum of every pair of values
from a pseudorandom number generator could be generated out-of-band, from a pseudorandom number generator could be generated out-of-band,
and held in a buffer ready for the Classic queue to consume. and held in a buffer ready for the Classic queue to consume.
1: cred_dequeue(lq, cq, pkt) { % Couples L4S & Classic queues 1: cred_dequeue(lq, cq, pkt) { % Couples L4S & Classic queues
2: while ( lq.len() + cq.len() > 0 ) { 2: while ( lq.len() + cq.len() > 0 ) {
3: if ( scheduler() == lq ) { 3: if ( scheduler() == lq ) {
4: lq.dequeue(pkt) % L4S scheduled 4: lq.dequeue(pkt) % L4S scheduled
5: if ((lq.time() > T) OR (cq.ns() >> (S_L-2) > maxrand(U))) 5: if ((lq.time() > T) OR (Q_C >> (S_L-2) > maxrand(U)))
6: mark(pkt) 6: mark(pkt)
7: } else { 7: } else {
8: cq.dequeue(pkt) % Classic scheduled 8: cq.dequeue(pkt) % Classic scheduled
9: Q_C += (cq.ns() - Q_C) >> g_C % Classic Q EWMA 9: Q_C += (qc.ns() - Q_C) >> g_C % Classic Q EWMA
10: if ( (Q_C >> (S_C-2) ) > maxrand(2*U) ) { 10: if ( (Q_C >> (S_C-2) ) > maxrand(2*U) ) {
11: if ( (ecn(pkt) == 0) { % ECN field = not-ECT 11: if ( (ecn(pkt) == 0) { % ECN field = not-ECT
12: drop(pkt) % Squared drop, redo loop 12: drop(pkt) % Squared drop, redo loop
13: continue % continue to the top of the while loop 13: continue % continue to the top of the while loop
14: } 14: }
15: mark(pkt) 15: mark(pkt)
16: } 16: }
17: } 17: }
18: return(pkt) % return the packet and stop here 18: return(pkt) % return the packet and stop here
19: } 19: }
skipping to change at page 50, line 51 skipping to change at page 50, line 51
that division can be implemented as a right bit-shift (>>) in lines 5 that division can be implemented as a right bit-shift (>>) in lines 5
and 10 of the integer variant of the pseudocode (Figure 11). and 10 of the integer variant of the pseudocode (Figure 11).
For the integer variant of the pseudocode, an integer version of the For the integer variant of the pseudocode, an integer version of the
rand() function used at line 25 of the maxrand(function) in Figure 10 rand() function used at line 25 of the maxrand(function) in Figure 10
would be arranged to return an integer in the range 0 <= maxrand() < would be arranged to return an integer in the range 0 <= maxrand() <
2^32 (not shown). This would scale up all the floating point 2^32 (not shown). This would scale up all the floating point
probabilities in the range [0,1] by 2^32. probabilities in the range [0,1] by 2^32.
Queuing delays are also scaled up by 2^32, but in two stages: i) In Queuing delays are also scaled up by 2^32, but in two stages: i) In
lines 5 and 10 queuing times cq.ns() and pkt.ns() are returned in line 9 queuing time qc.ns() is returned in integer nanoseconds,
integer nanoseconds, making the values about 2^30 times larger than making the value about 2^30 times larger than when the units were
when the units were seconds, ii) then in lines 3 and 9 an adjustment seconds, ii) then in lines 5 and 10 an adjustment of -2 to the right
of -2 to the right bit-shift multiplies the result by 2^2, to bit-shift multiplies the result by 2^2, to complete the scaling by
complete the scaling by 2^32. 2^32.
In line 8 of the initialization function, the EWMA constant gamma is In line 8 of the initialization function, the EWMA constant gamma is
represented as an integer power of 2, g_C, so that in line 9 of the represented as an integer power of 2, g_C, so that in line 9 of the
integer code the division needed to weight the moving average can be integer code the division needed to weight the moving average can be
implemented by a right bit-shift (>> g_C). implemented by a right bit-shift (>> g_C).
Appendix C. Choice of Coupling Factor, k Appendix C. Choice of Coupling Factor, k
C.1. RTT-Dependence C.1. RTT-Dependence
 End of changes. 16 change blocks. 
42 lines changed or deleted 42 lines changed or added

This html diff was produced by rfcdiff 1.47. The latest version is available from http://tools.ietf.org/tools/rfcdiff/