--- 1/draft-ietf-rmcat-gcc-01.txt 2016-07-08 10:17:28.853080860 -0700 +++ 2/draft-ietf-rmcat-gcc-02.txt 2016-07-08 10:17:29.197089481 -0700 @@ -1,22 +1,22 @@ Network Working Group S. Holmer Internet-Draft H. Lundin Intended status: Informational Google -Expires: April 21, 2016 G. Carlucci +Expires: January 9, 2017 G. Carlucci L. De Cicco S. Mascolo Politecnico di Bari - October 19, 2015 + July 8, 2016 A Google Congestion Control Algorithm for Real-Time Communication - draft-ietf-rmcat-gcc-01 + draft-ietf-rmcat-gcc-02 Abstract This document describes two methods of congestion control when using real-time communications on the World Wide Web (RTCWEB); one delay- based and one loss-based. It is published as an input document to the RMCAT working group on congestion control for media streams. The mailing list of that working group is rmcat@ietf.org. @@ -35,69 +35,72 @@ Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." - This Internet-Draft will expire on April 21, 2016. + This Internet-Draft will expire on January 9, 2017. Copyright Notice - Copyright (c) 2015 IETF Trust and the persons identified as the + Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. 
Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1.1. Mathematical notation conventions . . . . . . . . . . . . 3 2. System model . . . . . . . . . . . . . . . . . . . . . . . . 4 3. Feedback and extensions . . . . . . . . . . . . . . . . . . . 4 - 4. Delay-based control . . . . . . . . . . . . . . . . . . . . . 5 - 4.1. Arrival-time model . . . . . . . . . . . . . . . . . . . 5 - 4.2. Arrival-time filter . . . . . . . . . . . . . . . . . . . 7 - 4.3. Over-use detector . . . . . . . . . . . . . . . . . . . . 8 - 4.4. Rate control . . . . . . . . . . . . . . . . . . . . . . 9 - 4.5. Parameters settings . . . . . . . . . . . . . . . . . . . 12 - 5. Loss-based control . . . . . . . . . . . . . . . . . . . . . 13 - 6. Interoperability Considerations . . . . . . . . . . . . . . . 13 - 7. Implementation Experience . . . . . . . . . . . . . . . . . . 14 - 8. Further Work . . . . . . . . . . . . . . . . . . . . . . . . 14 - 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14 - 10. Security Considerations . . . . . . . . . . . . . . . . . . . 14 - 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 15 - 12. References . . . . . . . . . . . . . . . . . . . . . . . . . 15 - 12.1. Normative References . . . . . . . . . . . . . . . . . . 15 - 12.2. Informative References . . . . . . . . . . . . . . . . . 15 + 4. Sending Engine . . . . . . . . . . . . . . . . . . . . . . . 5 + 5. Delay-based control . . . . . . . . . . . . . . . . . . . . . 5 + 5.1. Arrival-time model . . . . . . . . . . . . . . . . . . . 6 + 5.2. Pre-filtering . . . . . . . . . . . . . . . . . . . . . . 7 + 5.3. Arrival-time filter . . . . . . . . . . . . . . . . . . . 7 + 5.4. Over-use detector . . . . . . . . . . . . . . . . . . 
. . 8 + 5.5. Rate control . . . . . . . . . . . . . . . . . . . . . . 10 + 5.6. Parameters settings . . . . . . . . . . . . . . . . . . . 12 + 6. Loss-based control . . . . . . . . . . . . . . . . . . . . . 13 + 7. Interoperability Considerations . . . . . . . . . . . . . . . 14 + 8. Implementation Experience . . . . . . . . . . . . . . . . . . 14 + 9. Further Work . . . . . . . . . . . . . . . . . . . . . . . . 15 + 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 15 + 11. Security Considerations . . . . . . . . . . . . . . . . . . . 15 + 12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 15 + 13. References . . . . . . . . . . . . . . . . . . . . . . . . . 16 + 13.1. Normative References . . . . . . . . . . . . . . . . . . 16 + 13.2. Informative References . . . . . . . . . . . . . . . . . 16 Appendix A. Change log . . . . . . . . . . . . . . . . . . . . . 16 A.1. Version -00 to -01 . . . . . . . . . . . . . . . . . . . 16 - A.2. Version -01 to -02 . . . . . . . . . . . . . . . . . . . 16 - A.3. Version -02 to -03 . . . . . . . . . . . . . . . . . . . 16 - A.4. rtcweb-03 to rmcat-00 . . . . . . . . . . . . . . . . . . 16 + A.2. Version -01 to -02 . . . . . . . . . . . . . . . . . . . 17 + A.3. Version -02 to -03 . . . . . . . . . . . . . . . . . . . 17 + A.4. rtcweb-03 to rmcat-00 . . . . . . . . . . . . . . . . . . 17 A.5. rmcat -00 to -01 . . . . . . . . . . . . . . . . . . . . 17 A.6. rmcat -01 to -02 . . . . . . . . . . . . . . . . . . . . 17 - A.7. rmcat -02 to -03 . . . . . . . . . . . . . . . . . . . . 17 - A.8. ietf-rmcat -00 to ietf-rmcat -01 . . . . . . . . . . . . 17 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 17 + A.7. rmcat -02 to -03 . . . . . . . . . . . . . . . . . . . . 18 + A.8. ietf-rmcat -00 to ietf-rmcat -01 . . . . . . . . . . . . 18 + A.9. ietf-rmcat -01 to ietf-rmcat -02 . . . . . . . . . . . . 18 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 18 1. 
Introduction Congestion control is a requirement for all applications sharing the Internet resources [RFC2914]. Congestion control for real-time media is challenging for a number of reasons: o The media is usually encoded in forms that cannot be quickly @@ -137,24 +140,24 @@ subscript i. E{X} The expected value of the stochastic variable X 2. System model The following elements are in the system: o RTP packet - an RTP packet containing media data. - o Packet group - a set of RTP packets transmitted from the sender - uniquely identified by the group departure and group arrival time - (absolute send time) [abs-send-time]. These could be video - packets, audio packets, or a mix of audio and video packets. + o Group of packets - a set of RTP packets transmitted from the + sender uniquely identified by the group departure and group + arrival time (absolute send time) [abs-send-time]. These could be + video packets, audio packets, or a mix of audio and video packets. o Incoming media stream - a stream of frames consisting of RTP packets. o RTP sender - sends the RTP stream over the network to the RTP receiver. It generates the RTP timestamp and the abs-send-time header extension o RTP receiver - receives the RTP stream, marks the time of arrival. @@ -210,57 +213,54 @@ RTP header extension [abs-send-time] to enable the receiver to compute the inter-group delay variation. The output from the delay- based controller will be a bitrate, which will be sent back to the sender using the REMB feedback message [I-D.alvestrand-rmcat-remb]. The packet loss ratio is sent back via RTCP receiver reports. At the sender the bitrate in the REMB message and the fraction of packets lost are fed into the loss-based controller, which outputs a final target bitrate. It is RECOMMENDED to send the REMB message as soon as congestion is detected, and otherwise at least once every second. -4. Delay-based control +4. 
Sending Engine - The delay-based control algorithm can be further decomposed into - three parts: an arrival-time filter, an over-use detector, and a rate - controller. + Pacing is used to actuate the target bitrate computed by the + controllers. -4.1. Arrival-time model + When media encoder produces data, this is fed into a Pacer queue. + The Pacer sends a group of packets to the network every burst_time + interval. RECOMMENDED value for burst_time is 5 ms. The size of a + group of packets is computed as the product between the target + bitrate and the burst_time. + +5. Delay-based control + + The delay-based control algorithm can be further decomposed into four + parts: a pre-filtering, an arrival-time filter, an over-use detector, + and a rate controller. + +5.1. Arrival-time model This section describes an adaptive filter that continuously updates estimates of network parameters based on the timing of the received - packets. + groups of packets. We define the inter-arrival time, t(i) - t(i-1), as the difference in - arrival time of two packets or two groups of packets. - Correspondingly, the inter-departure time, T(i) - T(i-1), is defined - as the difference in departure-time of two packets or two groups of - packets. Finally, the inter-group delay variation, d(i), is defined - as the difference between the inter-arrival time and the inter- - departure time. Or interpreted differently, as the difference - between the delay of group i and group i-1. + arrival time of two groups of packets. Correspondingly, the inter- + departure time, T(i) - T(i-1), is defined as the difference in + departure-time of two groups of packets. Finally, the inter-group + delay variation, d(i), is defined as the difference between the + inter-arrival time and the inter-departure time. Or interpreted + differently, as the difference between the delay of group i and group + i-1. 
d(i) = t(i) - t(i-1) - (T(i) - T(i-1)) - At the receiving side we are observing groups of incoming packets, - where a group of packets is defined as follows: - - o A sequence of packets which are sent within a burst_time interval - constitute a group. RECOMMENDED value for burst_time is 5 ms. - - o In addition, any packet which has an inter-arrival time less than - burst_time and an inter-group delay variation d(i) less than 0 is - also considered being part of the current group of packets. The - reasoning behind including these packets in the group is to better - handle delay transients, caused by packets being queued up for - reasons unrelated to congestion. As an example this has been - observed to happen on many Wi-Fi and wireless networks. - An inter-departure time is computed between consecutive groups as T(i) - T(i-1), where T(i) is the departure timestamp of the last packet in the current packet group being processed. Any packets received out of order are ignored by the arrival-time model. Each group is assigned a receive time t(i), which corresponds to the time at which the last packet of the group was received. A group is delayed relative to its predecessor if t(i) - t(i-1) > T(i) - T(i-1), i.e., if the inter-arrival time is larger than the inter-departure time. @@ -279,21 +279,39 @@ Breaking out the mean, m(i), from w(i) to make the process zero mean, we get Equation 1 d(i) = m(i) + v(i) The noise term v(i) represents network jitter and other delay effects not captured by the model. -4.2. Arrival-time filter +5.2. Pre-filtering + + The pre-filtering aims at handling delay transients caused by channel + outages. During an outage, packets being queued in network buffers, + for reasons unrelated to congestion, are delivered in a burst when + the outage ends. + + The pre-filtering merges together groups of packets that arrive in a + burst. 
Packets are merged in the same group if one of these two + conditions holds: + + o A sequence of packets which are sent within a burst_time interval + constitute a group. + + o A Packet which has an inter-arrival time less than burst_time and + an inter-group delay variation d(i) less than 0 is considered + being part of the current group of packets. + +5.3. Arrival-time filter The parameter d(i) is readily available for each group of packets, i > 1. We want to estimate m(i) and use this estimate to detect whether or not the bottleneck link is over-used. The parameter can be estimated by any adaptive filter - we are using the Kalman filter. Let m(i) be the estimate at time i We model the state evolution from time i to time i+1 as @@ -336,21 +353,21 @@ highest rate at which the last K packet groups have been received and chi is a filter coefficient typically chosen as a number in the interval [0.1, 0.001]. Since our assumption that v(i) should be zero mean WGN is less accurate in some cases, we have introduced an additional outlier filter around the updates of var_v_hat. If z(i) > 3*sqrt(var_v_hat) the filter is updated with 3*sqrt(var_v_hat) rather than z(i). For instance v(i) will not be white in situations where packets are sent at a higher rate than the channel capacity, in which case they will be queued behind each other. -4.3. Over-use detector +5.4. Over-use detector The inter-group delay variation estimate m(i), obtained as the output of the arrival-time filter, is compared with a threshold del_var_th(i). An estimate above the threshold is considered as an indication of over-use. Such an indication is not enough for the detector to signal over-use to the rate control subsystem. A definitive over-use will be signaled only if over-use has been detected for at least overuse_time_th milliseconds. However, if m(i) < m(i-1), over-use will not be signaled even if all the above conditions are met. 
Similarly, the opposite state, under-use, is @@ -401,21 +418,21 @@ decreased so that a lower queuing delay can be achieved. It is RECOMMENDED to choose K_u > K_d so that the rate at which del_var_th is increased is higher than the rate at which it is decreased. With this setting it is possible to increase the threshold in the case of a concurrent TCP flow and prevent starvation as well as enforcing intra-protocol fairness. RECOMMENDED values for del_var_th(0), overuse_time_th, K_u and K_d are respectively 12.5 ms, 10 ms, 0.01 and 0.00018. -4.4. Rate control +5.5. Rate control The rate control is split in two parts, one controlling the bandwidth estimate based on delay, and one controlling the bandwidth estimate based on loss. Both are designed to increase the estimate of the available bandwidth A_hat as long as there is no detected congestion and to ensure that we will eventually match the available bandwidth of the channel and detect an over-use. As soon as over-use has been detected, the available bandwidth estimated by the delay-based controller is decreased. In this way we @@ -530,22 +547,21 @@ will enter the hold state, where the receive-side available bandwidth estimate will be held constant while waiting for the queues to stabilize at a lower level - a way of keeping the delay as low as possible. This decrease of delay is wanted, and expected, immediately after the estimate has been reduced due to over-use, but can also happen if the cross traffic over some links is reduced. It is RECOMMENDED that the routine to update A_hat(i) is run at least once every response_time interval. -4.5. Parameters settings - +5.6. 
Parameters settings +-----------------+-----------------------------------+-------------+ | Parameter | Description | RECOMMENDED | | | | Value | +-----------------+-----------------------------------+-------------+ | burst_time | Time limit in milliseconds | 5 ms | | | between packet bursts which | | | | identifies a group | | | q | State noise covariance matrix | q = 10^-3 | | e(0) | Initial value of the system | e(0) = 0.1 | | | error covariance | | @@ -561,21 +577,21 @@ | | threshold | | | T | Time window for measuring the | [0.5, 1] s | | | received bitrate | | | beta | Decrease rate factor | 0.85 | +-----------------+-----------------------------------+-------------+ Table 1: RECOMMENDED values for delay based controller -5. Loss-based control +6. Loss-based control A second part of the congestion controller bases its decisions on the round-trip time, packet loss and available bandwidth estimates A_hat received from the delay-based controller. The available bandwidth estimates computed by the loss-based controller are denoted with As_hat. The available bandwidth estimates A_hat produced by the delay-based controller are only reliable when the size of the queues along the path is sufficiently large. If the queues are very short, over-use will @@ -601,92 +617,92 @@ between As_hat and A_hat. We motivate the packet loss thresholds by noting that if the transmission channel has a small amount of packet loss due to over-use, that amount will soon increase if the sender does not adjust its bitrate. Therefore we will soon enough reach above the 10% threshold and adjust As_hat(i). However, if the packet loss ratio does not increase, the losses are probably not related to self-inflicted congestion and therefore we should not react to them. -6. Interoperability Considerations +7. 
Interoperability Considerations In case a sender implementing these algorithms talks to a receiver that does not implement any of the proposed RTCP messages and RTP header extensions, it is suggested that the sender monitors RTCP receiver reports and uses the fraction of lost packets and the round-trip time as input to the loss-based controller. The delay-based controller should be left disabled. -7. Implementation Experience +8. Implementation Experience This algorithm has been implemented in the open-source WebRTC project, has been in use in Chrome since M23, and is being used by Google Hangouts. Deployment of the algorithm has revealed problems related to, e.g., congested or otherwise problematic WiFi networks, which have led to algorithm improvements. The algorithm has also been tested in a multi-party conference scenario with a conference server which terminates the congestion control between endpoints. This ensures that no assumptions are being made by the congestion control about maximum send and receive bitrates, etc., which are typically outside the control of a conference server. -8. Further Work +9. Further Work This draft is offered as input to the congestion control discussion. Work that can be done on this basis includes: o Considerations of integrated loss control: How loss and delay control can be better integrated, and the loss control improved. o Considerations of locus of control: evaluate the performance of having all congestion control logic at the sender, compared to splitting logic between sender and receiver. o Considerations of utilizing ECN as a signal for congestion estimation and link over-use detection. -9. IANA Considerations +10. IANA Considerations This document makes no request of IANA. Note to RFC Editor: this section may be removed on publication as an RFC. -10. Security Considerations +11. Security Considerations An attacker with the ability to insert or remove messages on the connection would have the ability to disrupt rate control. 
This could cause the algorithm to produce either a sending rate that under-utilizes the bottleneck link capacity, or a sending rate that is too high, causing network congestion. In this case, the control information is carried inside RTP, and can be protected against modification or message insertion using SRTP, just as for the media. Given that timestamps are carried in the RTP header, which is not encrypted, this is not protected against disclosure, but it seems hard to mount an attack based on timing information only. -11. Acknowledgements +12. Acknowledgements Thanks to Randell Jesup, Magnus Westerlund, Varun Singh, Tim Panton, Soo-Hyun Choo, Jim Gettys, Ingemar Johansson, Michael Welzl and others for providing valuable feedback on earlier versions of this draft. -12. References +13. References -12.1. Normative References +13.1. Normative References [I-D.alvestrand-rmcat-remb] Alvestrand, H., "RTCP message for Receiver Estimated Maximum Bitrate", draft-alvestrand-rmcat-remb-03 (work in progress), October 2013. [I-D.holmer-rmcat-transport-wide-cc-extensions] Holmer, S., Flodman, M., and E. Sprang, "RTP Extensions for Transport-wide Congestion Control", draft-holmer-rmcat-transport-wide-cc-extensions-00 (work in progress), @@ -697,21 +713,21 @@ [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003. [abs-send-time] "RTP Header Extension for Absolute Sender Time", . -12.2. Informative References +13.2. Informative References [Pv13] De Cicco, L., Carlucci, G., and S. Mascolo, "Understanding the Dynamic Behaviour of the Google Congestion Control", Packet Video Workshop, December 2013. [RFC2914] Floyd, S., "Congestion Control Principles", BCP 41, RFC 2914, September 2000. Appendix A. Change log @@ -793,29 +809,34 @@ A.8. ietf-rmcat -00 to ietf-rmcat -01 o Arrival-time filter converted from a two-dimensional Kalman filter to a scalar Kalman filter. 
o The use of the TFRC equation was removed from the loss-based controller, as it turned out to have little to no effect in practice. +A.9. ietf-rmcat -01 to ietf-rmcat -02 + + o Added a section which better describes the pre-filtering + algorithm. + Authors' Addresses + Stefan Holmer Google Kungsbron 2 Stockholm 11122 Sweden Email: holmer@google.com - Henrik Lundin Google Kungsbron 2 Stockholm 11122 Sweden Email: hlundin@google.com Gaetano Carlucci Politecnico di Bari