draft-ietf-tsvwg-tunnel-congestion-feedback-00.txt   draft-ietf-tsvwg-tunnel-congestion-feedback-01.txt 
Internet Engineering Task Force X. Wei Internet Engineering Task Force X. Wei
INTERNET-DRAFT Huawei Technologies INTERNET-DRAFT Huawei Technologies
Intended Status: Informational L.Zhu Intended Status: Informational L.Zhu
Expires: March 19, 2016 Huawei Technologies Expires: June 2, 2016 Huawei Technologies
L.Deng L.Deng
China Mobile China Mobile
September 16, 2015 November 30, 2015
Tunnel Congestion Feedback Tunnel Congestion Feedback
draft-ietf-tsvwg-tunnel-congestion-feedback-00 draft-ietf-tsvwg-tunnel-congestion-feedback-01
Abstract Abstract
This document describes a mechanism to calculate congestion of a This document describes a mechanism to calculate congestion of a
tunnel segment based on RFC 6040 recommendations, and a feedback tunnel segment based on RFC6040 recommendations, and a feedback
protocol by which to send the measured congestion of the tunnel from protocol by which to send the measured congestion of the tunnel from
egress to ingress. A basic model for measuring tunnel congestion egress to ingress. A basic model for measuring tunnel congestion
and feedback is described, and a protocol for carrying the feedback and feedback is described, and a protocol for carrying the feedback
data is outlined. data is outlined.
Status of this Memo Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
skipping to change at page 2, line 23 skipping to change at page 2, line 23
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Conventions And Terminologies . . . . . . . . . . . . . . . . . 3
3. Congestion Information Feedback Models . . . . . . . . . . . . 4 3. Congestion Information Feedback Models . . . . . . . . . . . . 4
3.1 Direct Model . . . . . . . . . . . . . . . . . . . . . . . . 4 3.1 Direct Model . . . . . . . . . . . . . . . . . . . . . . . . 4
3.2 Centralized Model . . . . . . . . . . . . . . . . . . . . . 4 3.2 Centralized Model . . . . . . . . . . . . . . . . . . . . . 4
4. Congestion Level Measurement . . . . . . . . . . . . . . . . . 5 4. Congestion Level Measurement . . . . . . . . . . . . . . . . . 5
5. Congestion Information Delivery . . . . . . . . . . . . . . . . 7 5. Congestion Information Delivery . . . . . . . . . . . . . . . . 8
5.1 IPFIX Extensions . . . . . . . . . . . . . . . . . . . . . . 7 5.1 IPFIX Extensions . . . . . . . . . . . . . . . . . . . . . . 9
5.1.1 ce-cePacketTotalCount . . . . . . . . . . . . . . . . . 7 5.1.1 ce-cePacketTotalCount . . . . . . . . . . . . . . . . . 9
5.1.2 ect0-nectPacketTotalCount . . . . . . . . . . . . . . . 8 5.1.2 ect0-nectPacketTotalCount . . . . . . . . . . . . . . . 9
5.1.3 ect1-nectPacketTotalCount . . . . . . . . . . . . . . . 8 5.1.3 ect1-nectPacketTotalCount . . . . . . . . . . . . . . . 10
5.1.4 ce-nectPacketTotalCount . . . . . . . . . . . . . . . . 8 5.1.4 ce-nectPacketTotalCount . . . . . . . . . . . . . . . . 10
5.1.5 ce-ect0PacketTotalCount . . . . . . . . . . . . . . . . 9 5.1.5 ce-ect0PacketTotalCount . . . . . . . . . . . . . . . . 10
5.1.6 ce-ect1PacketTotalCount . . . . . . . . . . . . . . . . 9 5.1.6 ce-ect1PacketTotalCount . . . . . . . . . . . . . . . . 11
5.1.7 ect0-ect0PacketTotalCount . . . . . . . . . . . . . . . 9 5.1.7 ect0-ect0PacketTotalCount . . . . . . . . . . . . . . . 11
5.1.8 ect1-ect1PacketTotalCount . . . . . . . . . . . . . . . 10 5.1.8 ect1-ect1PacketTotalCount . . . . . . . . . . . . . . . 11
6. Congestion Management . . . . . . . . . . . . . . . . . . . . . 10 6. Congestion Management . . . . . . . . . . . . . . . . . . . . . 12
7. Security . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 7. Security . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 11 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 12
9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 11 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 13
9.1 Normative References . . . . . . . . . . . . . . . . . . . 11 9.1 Normative References . . . . . . . . . . . . . . . . . . . 13
9.2 Informative References . . . . . . . . . . . . . . . . . . 12 9.2 Informative References . . . . . . . . . . . . . . . . . . 13
10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 12 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 14
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 12 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 14
1. Introduction 1. Introduction
In IP networks, persistent congestion (also called congestion collapse) In IP networks, persistent congestion (also called congestion collapse)
lowers transport throughput, leading to waste of network resources. lowers transport throughput, leading to waste of network resources.
Appropriate congestion control mechanisms are therefore critical to Appropriate congestion control mechanisms are therefore critical to
prevent the network from falling into the persistent congestion prevent the network from falling into the persistent congestion
state. Currently, transport protocols such as TCP, SCTP, DCCP, have state. Currently, transport protocols such as TCP[RFC793],
their built-in congestion control mechanisms, and even for certain SCTP[RFC4960], DCCP[RFC4340], have their built-in congestion control
single transport protocol like TCP there can be a couple of different mechanisms, and even for certain single transport protocol like TCP
congestion control mechanisms to choose from. All these congestion there can be a couple of different congestion control mechanisms to
control mechanisms are implemented on host side, and there are choose from. All these congestion control mechanisms are implemented
reasons that only host side congestion control is not sufficient for on host side, and there are reasons that only host side congestion
the whole network to keep away from persistent congestion. For control is not sufficient for the whole network to keep away from
example, (1) some protocol's congestion control scheme may have persistent congestion. For example, (1) some protocol's congestion
internal design flaws; (2) improper software implementation of control scheme may have internal design flaws; (2) improper software
protocol; (3) some transport protocols do not even provide congestion implementation of protocol; (3) some transport protocols do not even
control at all. provide congestion control at all.
In order to have better control of network congestion status, it is In order to have better control of network congestion status, it is
necessary for the network side to do some kind of traffic control. necessary for the network side to do some kind of traffic control.
For example, ConEx [ConEx] provides a method for a network operator to For example, ConEx [ConEx] provides a method for a network operator to
learn about traffic's congestion contribution information, and then learn about traffic's congestion contribution information, and then
congestion management action can be taken based on this information. congestion management action can be taken based on this information.
Tunnels are widely deployed in various networks, including the public Tunnels are widely deployed in various networks, including the public
Internet, datacenter networks, and enterprise networks. A tunnel Internet, datacenter networks, and enterprise networks. A tunnel
consists of an ingress, an egress and a set of interior routers. For the consists of an ingress, an egress and a set of interior routers. For the
skipping to change at page 3, line 51 skipping to change at page 3, line 51
would require additional knowledge of downstream capacity and would require additional knowledge of downstream capacity and
topology, as well as cross traffic that does not pass through this topology, as well as cross traffic that does not pass through this
ingress. ingress.
This document provides a mechanism of feeding back inner tunnel This document provides a mechanism of feeding back inner tunnel
congestion level to the ingress. Using this mechanism the egress can congestion level to the ingress. Using this mechanism the egress can
feed the tunnel congestion level information it collects back to the feed the tunnel congestion level information it collects back to the
ingress. After receiving this information the ingress will be able to ingress. After receiving this information the ingress will be able to
perform congestion management according to network management policy. perform congestion management according to network management policy.
2. Conventions 2. Conventions And Terminologies
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119] document are to be interpreted as described in RFC 2119 [RFC2119]
DP: Decision Point, a logical entity that makes congestion management
decisions based on the received congestion feedback information.
AP: Action Point, a logical entity that implements congestion
management actions according to the decisions made by the Decision Point.
3. Congestion Information Feedback Models 3. Congestion Information Feedback Models
According to specific network deployment, there are two kinds of According to specific network deployment, there are two kinds of
feedback model: direct model and centralized model. feedback model: direct model and centralized model.
3.1 Direct Model 3.1 Direct Model
Feedback Feedback
|-----------------------------------------| +-----------------------------------+
| | | |
| | | |
| V | V
+----------+ tunnel +-----------+ +--------------+ +-------------+
|Egress |========================== |Ingress | | +--------+ | | +---------+ |
|(Exporter)| |(Collector)| | |Exporter| | | |Collector| |
+----------+ +-----------+ | +---|----+ | | +---|-----+ |
| +--|--+ | | +|-+ |
| |Meter| | | |DP| |
| +-----+ | | +--+ |
| | | +--+ |
| | | |AP| |
| | | +--+ |
|Egress | | Ingress |
+--------------+ +-------------+
(a) Direct Feedback Model. (a) Direct Feedback Model.
Direct model means egress feeds information directly to ingress. In Direct model means egress feeds information directly to ingress. The
this model, egress collects network congestion level information and egress consists of a Meter function and an Exporter function. The
feeds the information back to the ingress for congestion management. Meter function collects network congestion level information and
The ingress here will act as both the decision point that decides how conveys it to the Exporter, which feeds it back to the Collector
to do congestion management and the action point that implements function located at the ingress. The congestion management Decision
congestion management decision. Point (DP) function on the ingress then makes congestion management
decisions based on the information from the Collector. The ingress
here acts as both the decision point that decides how to do
congestion management and the action point that implements the
congestion management decision.
3.2 Centralized Model 3.2 Centralized Model
+-------------------+
Feedback +-----------+ |+---------+ +--+ |
--------->|Controller |##################### feedback ||Collector|---|DP| |
| |(Collector)| # +---->|+---------+ +--+ |#########
| +-----------+ # | | | #
| # | | Controller | #
+----------+ tunnel +-----V-+ | +-------------------+ #
|Egress | ===========================|Ingress| | #
|(Exporter)| +-------+ | #
+----------+ +--------------+ +------V------+
| +--------+ | | |
| |Exporter| | | |
| +---|----+ | | |
| +--|--+ | | |
| |Meter| | | |
| +-----+ | | |
| | | +--+ |
| | | |AP| |
| | | +--+ |
|Egress | | Ingress |
+--------------+ +-------------+
(b) Centralized Feedback Model (b) Centralized Feedback Model
In the centralized model, the ingress only takes the role of action In the centralized model, the ingress only takes the role of action
point, and it implements traffic control decisions from another entity point, and it implements traffic control decisions from another entity
named "controller". Here, after the egress has collected network named "controller". Here, after the Exporter function on the egress has
congestion level information, it feeds the information back to a collected network congestion level information, it feeds the
controller instead of the ingress. Then the controller makes a information back to the Collector of a controller instead of the ingress.
congestion management decision and sends the decision to the ingress Then the controller makes a congestion management decision and sends
to implement. the decision to the ingress to implement.
4. Congestion Level Measurement 4. Congestion Level Measurement
This section describes how to measure congestion level in a tunnel. This section describes how to measure congestion level in a tunnel.
There may be different approaches to packet loss detection for There may be different approaches to packet loss detection for
different tunneling protocol scenarios. For instance, if there is a different tunneling protocol scenarios. For instance, if there is a
sequence field in the tunneling protocol header, it will be easy for sequence field in the tunneling protocol header, it will be easy for
egress to detect packet loss through the gaps in sequence number egress to detect packet loss through the gaps in sequence number
space. Another approach is to compare the number of packets entering space. Another approach is to compare the number of packets entering
skipping to change at page 7, line 38 skipping to change at page 8, line 33
network data traffic, referred to as out of band signal. Because an out of network data traffic, referred to as out of band signal. Because an out of
band scheme needs an additional separate path, which might limit its band scheme needs an additional separate path, which might limit its
actual deployment, the in band scheme will be discussed here. actual deployment, the in band scheme will be discussed here.
Because the message is transmitted in band, the message packet may Because the message is transmitted in band, the message packet may
get lost in case of network congestion. To cope with the situation get lost in case of network congestion. To cope with the situation
that the message packet gets lost, the packet count values are sent that the message packet gets lost, the packet count values are sent
as cumulative counters. Then if a message is lost, the next message as cumulative counters. Then if a message is lost, the next message
will recover the missing information. will recover the missing information.
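The recovery property of cumulative counters can be illustrated with a minimal Python sketch. The message layout (`FeedbackMessage`, `ce_total`, `packet_total`) and the receiver class are hypothetical names chosen for illustration only; they are not defined by this document, and counter wrap-around is ignored.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FeedbackMessage:
    """One feedback message carrying cumulative counters (hypothetical layout)."""
    ce_total: int      # CE-marked packets seen since meter (re-)initialization
    packet_total: int  # all packets seen since meter (re-)initialization


class FeedbackReceiver:
    """Turns cumulative counters into per-interval increments."""

    def __init__(self) -> None:
        self.last_ce = 0
        self.last_packets = 0

    def on_message(self, msg: FeedbackMessage) -> Tuple[int, int]:
        # The increment covers everything since the last message that
        # actually arrived, so a lost message leaves no permanent gap.
        delta_ce = msg.ce_total - self.last_ce
        delta_packets = msg.packet_total - self.last_packets
        self.last_ce = msg.ce_total
        self.last_packets = msg.packet_total
        return delta_ce, delta_packets


def replay(messages: List[FeedbackMessage]) -> Tuple[int, int]:
    """Sum the increments a receiver derives from the messages that arrive."""
    receiver = FeedbackReceiver()
    ce_sum = packet_sum = 0
    for msg in messages:
        d_ce, d_packets = receiver.on_message(msg)
        ce_sum += d_ce
        packet_sum += d_packets
    return ce_sum, packet_sum
```

If three messages are sent and the second one is dropped, `replay` over the first and third messages yields the same totals as `replay` over all three. Only the time resolution of the increments is reduced.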
IPFIX [RFC7011] is selected as a candidate protocol. SCTP is the IPFIX [RFC7011] is selected as the information feedback protocol. SCTP is the
preferred transport for IPFIX. SCTP allows partially reliable preferred transport for IPFIX. SCTP allows partially reliable
delivery [RFC3758], which ensures the feedback message will not be delivery [RFC3758], which ensures the feedback message will not be
blocked in case of packet loss due to network congestion. blocked in case of packet loss due to network congestion.
Ingress can do congestion management at different granularities, which
means that both the overall aggregated inner tunnel congestion level and
the congestion level contributed by certain traffic can be measured
for different congestion management purposes. For example, if the
ingress only wants to limit the congestion volume caused by certain
traffic, e.g. UDP-based traffic, then the congestion volume for that
traffic will be fed back; if the ingress does overall congestion
management, the aggregated congestion volume will be fed back.
When sending message from ingress to egress, the ingress acts as When sending message from ingress to egress, the ingress acts as
IPFIX exporter and egress acts as IPFIX collector; when sending IPFIX exporter and egress acts as IPFIX collector; when feeding back
message from egress to ingress or controller, the egress acts as congestion level information from egress to ingress or to controller,
IPFIX exporter and ingress or controller acts as IPFIX collector. the egress acts as IPFIX exporter and ingress or controller acts
as IPFIX collector.
The combination of congestion level measurement and congestion
information delivery procedure should be as follows:
# The ingress determines the template record to be used. The template
record can be preconfigured or determined at runtime. Its content is
determined by the granularity of congestion management: if the
ingress wants to limit the congestion volume contributed by a
specific traffic flow, then elements such as source IP address,
destination IP address, flow id and CE-marked packet volume of the
flow will be included in the template record.
# The Meter on the ingress measures traffic volume according to the
chosen template record, and the measurement records are then sent to
the egress in band.
# The Meter on the egress measures congestion level information
according to the template record. The template record can be
preconfigured or taken from the ingress; its content should be the
same as that of the ingress template record.
# The Exporter of the egress sends its measurement record together
with the measurement record of the ingress to the Controller or back
to the ingress.
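The egress side of this procedure can be sketched in Python as follows. The sketch is illustrative only. The counter names are the information elements listed in Section 5.1. Reading each name as outer marking followed by inner marking is an assumption, and the `Meter` class, the `export_record` helper and the content of the ingress record are hypothetical.

```python
from typing import Dict, Tuple

# Information element names as listed in Section 5.1 of this document.
# ASSUMPTION: each name is read as <outer marking>-<inner marking>.
ELEMENT_NAMES: Dict[Tuple[str, str], str] = {
    ("ce", "ce"): "ce-cePacketTotalCount",
    ("ect0", "nect"): "ect0-nectPacketTotalCount",
    ("ect1", "nect"): "ect1-nectPacketTotalCount",
    ("ce", "nect"): "ce-nectPacketTotalCount",
    ("ce", "ect0"): "ce-ect0PacketTotalCount",
    ("ce", "ect1"): "ce-ect1PacketTotalCount",
    ("ect0", "ect0"): "ect0-ect0PacketTotalCount",
    ("ect1", "ect1"): "ect1-ect1PacketTotalCount",
}


class Meter:
    """Hypothetical egress Meter keeping one cumulative counter per element."""

    def __init__(self) -> None:
        self.counters = {name: 0 for name in ELEMENT_NAMES.values()}

    def observe(self, outer: str, inner: str) -> None:
        name = ELEMENT_NAMES.get((outer, inner))
        if name is None:
            return  # combination has no element in Section 5.1
        self.counters[name] += 1


def export_record(ingress_record: Dict[str, int], meter: Meter) -> dict:
    """Bundle the ingress record with the egress counters, as in the last step."""
    return {"ingress": dict(ingress_record), "egress": dict(meter.counters)}
```

A usage example: after `meter.observe("ce", "ect0")` is called for each arriving packet with that combination, `export_record(ingress_record, meter)` returns one record that the Exporter could send to the Controller or back to the ingress. Encoding the record as an IPFIX message is out of scope for this sketch.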
5.1 IPFIX Extensions 5.1 IPFIX Extensions
5.1.1 ce-cePacketTotalCount 5.1.1 ce-cePacketTotalCount
Description: The total number of incoming packets with CE|CE ECN Description: The total number of incoming packets with CE|CE ECN
marking combination for this Flow at the Observation Point since the marking combination for this Flow at the Observation Point since the
Metering Process (re-)initialization for this Observation Point. Metering Process (re-)initialization for this Observation Point.
Abstract Data Type: unsigned64 Abstract Data Type: unsigned64
Data Type Semantics: totalCounter Data Type Semantics: totalCounter
ElementId: TBD1 ElementId: TBD1
skipping to change at page 11, line 42 skipping to change at page 13, line 23
[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
of Explicit Congestion Notification (ECN) to IP", of Explicit Congestion Notification (ECN) to IP",
RFC 3168, September 2001, <http://www.rfc- RFC 3168, September 2001, <http://www.rfc-
editor.org/info/rfc3168>. editor.org/info/rfc3168>.
[RFC3758] Stewart, R., Ramalho, M., Xie, Q., Tuexen, M., and P. [RFC3758] Stewart, R., Ramalho, M., Xie, Q., Tuexen, M., and P.
Conrad, "Stream Control Transmission Protocol (SCTP) Conrad, "Stream Control Transmission Protocol (SCTP)
Partial Reliability Extension", RFC 3758, May 2004, Partial Reliability Extension", RFC 3758, May 2004,
<http://www.rfc-editor.org/info/rfc3758>. <http://www.rfc-editor.org/info/rfc3758>.
[RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram
Congestion Control Protocol (DCCP)", RFC 4340, March 2006,
<http://www.rfc-editor.org/info/rfc4340>.
[RFC4960] Stewart, R., Ed., "Stream Control Transmission Protocol",
RFC 4960, September 2007, <http://www.rfc-
editor.org/info/rfc4960>.
[RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion [RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion
Notification", RFC 6040, November 2010, <http://www.rfc- Notification", RFC 6040, November 2010, <http://www.rfc-
editor.org/info/rfc6040>. editor.org/info/rfc6040>.
[RFC7011] Claise, B., Ed., Trammell, B., Ed., and P. Aitken, [RFC7011] Claise, B., Ed., Trammell, B., Ed., and P. Aitken,
"Specification of the IP Flow Information Export (IPFIX) "Specification of the IP Flow Information Export (IPFIX)
Protocol for the Exchange of Flow Information", STD 77, Protocol for the Exchange of Flow Information", STD 77,
RFC 7011, September 2013, <http://www.rfc- RFC 7011, September 2013, <http://www.rfc-
editor.org/info/rfc7011>. editor.org/info/rfc7011>.
 End of changes. 18 change blocks. 
69 lines changed or deleted 141 lines changed or added
