draft-ietf-tsvwg-ecn-encap-guidelines-00.txt   draft-ietf-tsvwg-ecn-encap-guidelines-01.txt 
Transport Area Working Group B. Briscoe Transport Area Working Group B. Briscoe
Internet-Draft BT Internet-Draft BT
Updates: 3819 (if approved) J. Kaippallimalil Updates: 3819 (if approved) J. Kaippallimalil
Intended status: Best Current Practice Huawei Intended status: Best Current Practice Huawei
Expires: September 8, 2014 P. Thaler Expires: May 20, 2015 P. Thaler
Broadcom Corporation Broadcom Corporation
March 07, 2014 November 16, 2014
Guidelines for Adding Congestion Notification to Protocols that Guidelines for Adding Congestion Notification to Protocols that
Encapsulate IP Encapsulate IP
draft-ietf-tsvwg-ecn-encap-guidelines-00 draft-ietf-tsvwg-ecn-encap-guidelines-01
Abstract Abstract
The purpose of this document is to guide the design of congestion The purpose of this document is to guide the design of congestion
notification in any lower layer or tunnelling protocol that notification in any lower layer or tunnelling protocol that
encapsulates IP. The aim is for explicit congestion signals to encapsulates IP. The aim is for explicit congestion signals to
propagate consistently from lower layer protocols into IP. Then the propagate consistently from lower layer protocols into IP. Then the
IP internetwork layer can act as a portability layer to carry IP internetwork layer can act as a portability layer to carry
congestion notification from non-IP-aware congested nodes up to the congestion notification from non-IP-aware congested nodes up to the
transport layer (L4). Following these guidelines should assure transport layer (L4). Following these guidelines should assure
skipping to change at page 1, line 42 skipping to change at page 1, line 42
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 8, 2014. This Internet-Draft will expire on May 20, 2015.
Copyright Notice Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 3, line 47 skipping to change at page 3, line 47
delivery time predictably short [RFC2884]; delivery time predictably short [RFC2884];
o As ECN is used more widely by end-systems, it will gradually o As ECN is used more widely by end-systems, it will gradually
remove the need to configure a degree of delay into buffers before remove the need to configure a degree of delay into buffers before
they start to notify congestion (the cause of bufferbloat). This they start to notify congestion (the cause of bufferbloat). This
is because drop involves a trade-off between sending a timely is because drop involves a trade-off between sending a timely
signal and trying to avoid impairment, whereas ECN is solely a signal and trying to avoid impairment, whereas ECN is solely a
signal not an impairment, so there is no harm triggering it signal not an impairment, so there is no harm triggering it
earlier. earlier.
Some lower layer technologies (e.g. MPLS, Ethernet) are used to form Some lower layer technologies (e.g. MPLS, Ethernet) are used to form
subnetworks with IP-aware nodes only at the edges. These networks subnetworks with IP-aware nodes only at the edges. These networks
are often sized so that it is rare for interior queues to overflow. are often sized so that it is rare for interior queues to overflow.
However, this has often been more due to the inability of the original However, this has often been more due to the inability of the original
TCP protocol to saturate the links. For many years, fixes such as TCP protocol to saturate the links. For many years, fixes such as
window scaling [RFC1323] proved hard to deploy. But now that modern window scaling [RFC1323] proved hard to deploy. But now that modern
operating systems are finally capable of saturating interior links, operating systems are finally capable of saturating interior links,
even the buffers of well-provisioned interior switches will need to even the buffers of well-provisioned interior switches will need to
signal episodes of queuing. signal episodes of queuing.
Propagation of ECN is defined for MPLS [RFC5129], and is being Propagation of ECN is defined for MPLS [RFC5129], and is being
defined for TRILL [trill-rbridge-options], but it remains to be defined for TRILL [I-D.ietf-trill-rfc7180bis], but it remains to be
defined for a number of other subnetwork technologies. defined for a number of other subnetwork technologies.
Similarly, ECN propagation is yet to be defined for many tunnelling Similarly, ECN propagation is yet to be defined for many tunnelling
protocols. [RFC6040] defines how ECN should be propagated for IP-in- protocols. [RFC6040] defines how ECN should be propagated for IP-in-
IP [RFC2003] and IPsec [RFC4301] tunnels. However, as Section 9.3 of IP [RFC2003] and IPsec [RFC4301] tunnels. However, as Section 9.3 of
RFC3168 pointed out, ECN support will need to be defined for other RFC3168 pointed out, ECN support will need to be defined for other
tunnelling protocols, e.g. L2TP [RFC2661], GRE [RFC1701], [RFC2784], tunnelling protocols, e.g. L2TP [RFC2661], GRE [RFC1701], [RFC2784],
PPTP [RFC2637] and GTP [GTPv1], [GTPv1-U], [GTPv2-C]. PPTP [RFC2637] and GTP [GTPv1], [GTPv1-U], [GTPv2-C].
Incremental deployment is the trickiest aspect when adding support Incremental deployment is the trickiest aspect when adding support
for ECN. The original ECN protocol in IP [RFC3168] was carefully for ECN. The original ECN protocol in IP [RFC3168] was carefully
designed so that a congested buffer would not mark a packet (rather designed so that a congested buffer would not mark a packet (rather
than drop it) unless both source and destination hosts were ECN- than drop it) unless both source and destination hosts were ECN-
capable. Otherwise its congestion markings would never be detected capable. Otherwise its congestion markings would never be detected
and congestion would just deteriorate further. However, to support and congestion would just deteriorate further. However, to support
congestion marking below the IP layer, it is not sufficient to only congestion marking below the IP layer, it is not sufficient to only
check that the two end-points support ECN; correct operation also check that the two end-points support ECN; correct operation also
skipping to change at page 5, line 28 skipping to change at page 5, line 28
1.1. Scope 1.1. Scope
This document only concerns wire protocol processing of explicit This document only concerns wire protocol processing of explicit
notification of congestion and makes no changes or recommendations notification of congestion and makes no changes or recommendations
concerning algorithms for congestion marking or for congestion concerning algorithms for congestion marking or for congestion
response (algorithm issues should be independent of the layer the response (algorithm issues should be independent of the layer the
algorithm operates in). algorithm operates in).
The question of congestion notification signals with different The question of congestion notification signals with different
semantics to those of ECN in IP is touched on in a couple of specific semantics to those of ECN in IP is touched on in a couple of specific
cases (e.g. QCN [IEEE802.1Qau] and schemes with multiple cases (e.g. QCN [IEEE802.1Qau] and schemes with multiple
severity levels such as PCN [RFC6660]). However, no attempt is made severity levels such as PCN [RFC6660]). However, no attempt is made
to give guidelines about schemes with different semantics that are to give guidelines about schemes with different semantics that are
yet to be invented. yet to be invented.
The semantics of congestion signals can be relative to the traffic The semantics of congestion signals can be relative to the traffic
class. Therefore correct propagation of congestion signals could class. Therefore correct propagation of congestion signals could
depend on correct propagation of any traffic class field between the depend on correct propagation of any traffic class field between the
layers. In this document, correct propagation of traffic class layers. In this document, correct propagation of traffic class
information is assumed, while what 'correct' means and how it is information is assumed, while what 'correct' means and how it is
achieved is covered elsewhere (e.g. [RFC2983]) and is outside the achieved is covered elsewhere (e.g. [RFC2983]) and is outside the
scope of the present document. scope of the present document.
Note that these guidelines do not require the subnet wire protocol to Note that these guidelines do not require the subnet wire protocol to
be changed to accommodate congestion notification. Another way to be changed to accommodate congestion notification. Another way to
add congestion notification without consuming header space in the add congestion notification without consuming header space in the
subnet protocol might be to use a parallel control plane protocol. subnet protocol might be to use a parallel control plane protocol.
This document focuses on the congestion notification interface This document focuses on the congestion notification interface
between IP and lower layer protocols that can encapsulate IP, where between IP and lower layer protocols that can encapsulate IP, where
the term 'IP' includes v4 or v6, unicast, multicast or anycast. the term 'IP' includes v4 or v6, unicast, multicast or anycast.
However, it is likely that the guidelines will also be useful when a However, it is likely that the guidelines will also be useful when a
lower layer protocol or tunnel encapsulates itself (e.g. Ethernet MAC lower layer protocol or tunnel encapsulates itself (e.g. Ethernet
in MAC [IEEE802.1Qah]) or when it encapsulates other protocols. In MAC in MAC [IEEE802.1Qah]) or when it encapsulates other protocols.
the feed-backward mode, propagation of congestion signals for
In the feed-backward mode, propagation of congestion signals for
multicast and anycast packets is out-of-scope (because it would be so multicast and anycast packets is out-of-scope (because it would be so
complicated that it is hoped no-one would attempt such an complicated that it is hoped no-one would attempt such an
abomination). abomination).
2. Terminology 2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119]. document are to be interpreted as described in RFC 2119 [RFC2119].
skipping to change at page 6, line 30 skipping to change at page 6, line 31
(payload) of that layer. The scope of this document includes (payload) of that layer. The scope of this document includes
layer 2 and layer 3 networks, where the PDU is respectively termed layer 2 and layer 3 networks, where the PDU is respectively termed
a frame or a packet (or a cell in ATM). PDU is a general term for a frame or a packet (or a cell in ATM). PDU is a general term for
any of these. This definition also includes a payload with a shim any of these. This definition also includes a payload with a shim
header lying somewhere between layer 2 & 3. header lying somewhere between layer 2 & 3.
Transport: The end-to-end transmission control function, Transport: The end-to-end transmission control function,
conventionally considered at layer-4 in the OSI reference model. conventionally considered at layer-4 in the OSI reference model.
Given the audience for this document will often use the word Given the audience for this document will often use the word
transport to mean low level bit carriage, whenever the term is transport to mean low level bit carriage, whenever the term is
used it will be qualified, e.g. 'L4 transport'. used it will be qualified, e.g. 'L4 transport'.
Encapsulator: The link or tunnel endpoint function that adds an Encapsulator: The link or tunnel endpoint function that adds an
outer header to a PDU (also termed the 'link ingress', the 'subnet outer header to a PDU (also termed the 'link ingress', the 'subnet
ingress', the 'ingress tunnel endpoint' or just the 'ingress' ingress', the 'ingress tunnel endpoint' or just the 'ingress'
where the context is clear). where the context is clear).
Decapsulator: The link or tunnel endpoint function that removes an Decapsulator: The link or tunnel endpoint function that removes an
outer header from a PDU (also termed the 'link egress', the outer header from a PDU (also termed the 'link egress', the
'subnet egress', the 'egress tunnel endpoint' or just the 'egress' 'subnet egress', the 'egress tunnel endpoint' or just the 'egress'
where the context is clear). where the context is clear).
skipping to change at page 7, line 49 skipping to change at page 7, line 49
normative guidelines for designers of explicit congestion normative guidelines for designers of explicit congestion
notification protocols, taking each mode in turn: notification protocols, taking each mode in turn:
Feed-Forward-and-Up: Nodes feed forward congestion notification Feed-Forward-and-Up: Nodes feed forward congestion notification
towards the egress within the lower layer then up and along the towards the egress within the lower layer then up and along the
layers towards the end-to-end destination at the transport layer. layers towards the end-to-end destination at the transport layer.
The following local optimisation is possible: The following local optimisation is possible:
Feed-Up-and-Forward: A lower layer switch feeds-up congestion Feed-Up-and-Forward: A lower layer switch feeds-up congestion
notification directly into the ECN field in the higher layer notification directly into the ECN field in the higher layer
(e.g. IP) header, irrespective of whether the node is at the (e.g. IP) header, irrespective of whether the node is at the
egress of a subnet. egress of a subnet.
Feed-Backward: Nodes feed back congestion signals towards the Feed-Backward: Nodes feed back congestion signals towards the
ingress of the lower layer and (optionally) attempt to control ingress of the lower layer and (optionally) attempt to control
congestion within their own layer. congestion within their own layer.
Null: Nodes cannot experience congestion at the lower layer except Null: Nodes cannot experience congestion at the lower layer except
at ingress nodes (which are IP-aware or equivalently higher-layer- at ingress nodes (which are IP-aware or equivalently higher-layer-
aware). aware).
3.1. Feed-Forward-and-Up Mode 3.1. Feed-Forward-and-Up Mode
Like IP and MPLS, many subnet technologies are based on self- Like IP and MPLS, many subnet technologies are based on self-
contained protocol data units (PDUs) or frames sent unreliably. They contained protocol data units (PDUs) or frames sent unreliably. They
provide no feedback channel at the subnetwork layer, instead relying provide no feedback channel at the subnetwork layer, instead relying
on higher layers (e.g. TCP) to feed back loss signals. on higher layers (e.g. TCP) to feed back loss signals.
In these cases, ECN may best be supported by standardising explicit In these cases, ECN may best be supported by standardising explicit
notification of congestion into the lower layer protocol that carries notification of congestion into the lower layer protocol that carries
the data forwards. It will then also be necessary to define how the the data forwards. It will then also be necessary to define how the
egress of the lower layer subnet propagates this explicit signal into egress of the lower layer subnet propagates this explicit signal into
the forwarded upper layer (IP) header. It can then continue forwards the forwarded upper layer (IP) header. It can then continue forwards
until it finally reaches the destination transport (at L4). Then until it finally reaches the destination transport (at L4). Then
typically the destination will feed this congestion notification back typically the destination will feed this congestion notification back
to the source transport using an end-to-end protocol (e.g. TCP). to the source transport using an end-to-end protocol (e.g. TCP).
This is the arrangement that has already been used to add ECN to IP- This is the arrangement that has already been used to add ECN to IP-
in-IP tunnels [RFC6040], IP-in-MPLS and MPLS-in-MPLS [RFC5129]. in-IP tunnels [RFC6040], IP-in-MPLS and MPLS-in-MPLS [RFC5129].
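The feed-forward-and-up arrangement can be sketched roughly as follows. This is only an illustrative model, not part of any specification: the Pdu class, its field names and both functions are hypothetical, and the drop for a non-ECN-capable transport is an assumption modelled on the decapsulation behaviour of [RFC6040] and [RFC5129].

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pdu:
    """One IP packet inside one lower-layer frame (all names hypothetical)."""
    ip_ecn_capable: bool        # L3 header says the L4 transport understands ECN
    ip_congested: bool = False  # 'C' marking in the L3 (IP) header
    l2_congested: bool = False  # 'C' marking in the L2 header


def switch_forward(pdu: Pdu, queue_congested: bool) -> Pdu:
    """A non-IP-aware switch can only mark the lower-layer header."""
    if queue_congested:
        pdu.l2_congested = True
    return pdu


def subnet_egress(pdu: Pdu) -> Optional[Pdu]:
    """Strip the L2 header, propagating any marking up into the IP header."""
    if pdu.l2_congested:
        if not pdu.ip_ecn_capable:
            return None  # assumption: drop, since the transport cannot read a mark
        pdu.ip_congested = True
    pdu.l2_congested = False  # the L2 header has been removed
    return pdu


# A marking applied at L2 inside the subnet surfaces at L3 on egress.
marked = subnet_egress(switch_forward(Pdu(ip_ecn_capable=True), queue_congested=True))
```

In this sketch the marking is carried only in the lower-layer header inside the subnet and moves into the IP header at the egress, from where it continues forwards to the destination transport.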
This mode is illustrated in Figure 1. Along the middle of the This mode is illustrated in Figure 1. Along the middle of the
figure, layers 2, 3 & 4 of the protocol stack are shown, and one figure, layers 2, 3 & 4 of the protocol stack are shown, and one
packet is shown along the bottom as it progresses across the network packet is shown along the bottom as it progresses across the network
from source to destination, crossing two subnets connected by a from source to destination, crossing two subnets connected by a
router, and crossing two switches on the path across each subnet. router, and crossing two switches on the path across each subnet.
Congestion at the output of the first switch (shown as *) leads to a Congestion at the output of the first switch (shown as *) leads to a
congestion marking in the L2 header (shown as C in the illustration congestion marking in the L2 header (shown as C in the illustration
skipping to change at page 8, line 52 skipping to change at page 8, line 52
router forwards the marked L3 header into subnet 2, and when it adds router forwards the marked L3 header into subnet 2, and when it adds
a new L2 header it copies the L3 marking into the L2 header as well, a new L2 header it copies the L3 marking into the L2 header as well,
as shown by the 'C's in both layers (assuming the technology of as shown by the 'C's in both layers (assuming the technology of
subnet 2 also supports explicit congestion marking). subnet 2 also supports explicit congestion marking).
Note that there is no implication that each 'C' marking is encoded Note that there is no implication that each 'C' marking is encoded
the same; a different encoding might be used for the 'C' marking in the same; a different encoding might be used for the 'C' marking in
each protocol. each protocol.
Finally, for completeness, we show the L3 marking arriving at the Finally, for completeness, we show the L3 marking arriving at the
destination, where the host transport protocol (e.g. TCP) feeds it destination, where the host transport protocol (e.g. TCP) feeds it
back to the source in the L4 acknowledgement (the 'C' at L4 in the back to the source in the L4 acknowledgement (the 'C' at L4 in the
packet at the top of the diagram). packet at the top of the diagram).
_ _ _ _ _ _
/_______ | | |C| ACK Packet (V) /_______ | | |C| ACK Packet (V)
\ |_|_|_| \ |_|_|_|
+---+ layer: 2 3 4 header +---+ +---+ layer: 2 3 4 header +---+
| <|<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Packet V <<<<<<<<<<<<<|<< |L4 | <|<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Packet V <<<<<<<<<<<<<|<< |L4
| | +---+ | ^ | | | +---+ | ^ |
| | . . . . . . Packet U. . | >>|>>> Packet U >>>>>>>>>>>>|>^ |L3 | | . . . . . . Packet U. . | >>|>>> Packet U >>>>>>>>>>>>|>^ |L3
skipping to change at page 11, line 38 skipping to change at page 11, line 38
| | | | | | | | | | | | | | | | | | | data________\ | | | | | | | | | | | | | | | | | | | data________\
|__|_|_|_| |__|_|_|_| |__|_|_| |__|_|_|_| packet (U) / |__|_|_|_| |__|_|_|_| |__|_|_| |__|_|_|_| packet (U) /
layer: 4 3 2 4 3 2 4 3 4 3 2 layer: 4 3 2 4 3 2 4 3 4 3 2
header header
Figure 3: Feed-Backward Mode Figure 3: Feed-Backward Mode
ATM's feed-backward approach doesn't fit well when layered beneath ATM's feed-backward approach doesn't fit well when layered beneath
IP's feed-forward approach--unless the initial data source is the IP's feed-forward approach--unless the initial data source is the
same node as the ATM ingress. Figure 3 shows the feed-backward same node as the ATM ingress. Figure 3 shows the feed-backward
approach being used in subnet H. If the final switch on the path is approach being used in subnet H. If the final switch on the path is
congested (*), it doesn't feed-forward any congestion indications on congested (*), it doesn't feed-forward any congestion indications on
packet (U). Instead it sends a control cell (V) back to the router packet (U). Instead it sends a control cell (V) back to the router
at the ATM ingress. at the ATM ingress.
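As a rough illustration of this feed-backward step, the sketch below models a congested switch that leaves the data packet unmarked and instead signals the subnet ingress. The Ingress class, the function name and the halving of the rate are all assumptions made for illustration; the actual rate response depends on the lower layer protocol.

```python
from dataclasses import dataclass


@dataclass
class Ingress:
    """Router at the subnet ingress (hypothetical model)."""
    rate: float  # current sending rate into the subnet, arbitrary units

    def on_control_cell(self) -> None:
        # Assumption: halve the rate; the real response is protocol-specific.
        self.rate /= 2


def congested_switch_forward(packet: dict, ingress: Ingress) -> dict:
    """Forward the data packet unmarked and signal backwards instead."""
    ingress.on_control_cell()  # control cell (V) travels back to the ingress
    return packet              # packet (U) continues with no marking added


ingress = Ingress(rate=100.0)
forwarded = congested_switch_forward({"ecn": "ECT(0)"}, ingress)
```

The forwarded packet carries no trace of the congestion, so nothing beyond the subnet ingress learns of it from this packet.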
However, the backward feedback doesn't reach the original data source However, the backward feedback doesn't reach the original data source
directly because IP doesn't support backward feedback (and subnet G directly because IP doesn't support backward feedback (and subnet G
is independent of subnet H). Instead, the router in the middle is independent of subnet H). Instead, the router in the middle
throttles down its sending rate but the original data sources don't throttles down its sending rate but the original data sources don't
reduce their rates. The resulting rate mismatch causes the middle reduce their rates. The resulting rate mismatch causes the middle
router's buffer at layer 3 to back up until it becomes congested, router's buffer at layer 3 to back up until it becomes congested,
skipping to change at page 13, line 35 skipping to change at page 13, line 35
[RFC6040] can be applied directly are: [RFC6040] can be applied directly are:
o L2TP [RFC2661] o L2TP [RFC2661]
o GRE [RFC1701], [RFC2784] o GRE [RFC1701], [RFC2784]
o PPTP [RFC2637] o PPTP [RFC2637]
o GTP [GTPv1], [GTPv1-U], [GTPv2-C] o GTP [GTPv1], [GTPv1-U], [GTPv2-C]
o VXLAN [vxlan]. o VXLAN [RFC7348].
4.2. Wire Protocol Design: Indication of ECN Support 4.2. Wire Protocol Design: Indication of ECN Support
This section is intended to guide the redesign of any lower layer This section is intended to guide the redesign of any lower layer
protocol that encapsulates IP to add native ECN support at the lower protocol that encapsulates IP to add native ECN support at the lower
layer. It reflects the approaches used in [RFC6040] and in layer. It reflects the approaches used in [RFC6040] and in
[RFC5129]. Therefore IP-in-IP tunnels or IP-in-MPLS or MPLS-in-MPLS [RFC5129]. Therefore IP-in-IP tunnels or IP-in-MPLS or MPLS-in-MPLS
encapsulations that already comply with [RFC6040] or [RFC5129] will encapsulations that already comply with [RFC6040] or [RFC5129] will
already satisfy this guidance. already satisfy this guidance.
skipping to change at page 15, line 16 skipping to change at page 15, line 16
marked packet, if the decapsulator discovers that the higher layer marked packet, if the decapsulator discovers that the higher layer
(inner header) indicates the transport is not ECN-capable, it drops (inner header) indicates the transport is not ECN-capable, it drops
the packet--effectively on behalf of the earlier congested node (see the packet--effectively on behalf of the earlier congested node (see
Decapsulation Guideline 1 in Section 4.4). Decapsulation Guideline 1 in Section 4.4).
It was only appropriate to define such an incremental deployment It was only appropriate to define such an incremental deployment
strategy because MPLS is targeted solely at professional operators, strategy because MPLS is targeted solely at professional operators,
who can be expected to ensure that a whole subnetwork is consistently who can be expected to ensure that a whole subnetwork is consistently
configured. This strategy might not be appropriate for other link configured. This strategy might not be appropriate for other link
technologies targeted at zero-configuration deployment or deployment technologies targeted at zero-configuration deployment or deployment
by the general public (e.g. Ethernet). For such 'plug-and-play' by the general public (e.g. Ethernet). For such 'plug-and-play'
environments it will be necessary to invent a failsafe approach that environments it will be necessary to invent a failsafe approach that
ensures congestion markings will never fall into black holes, no ensures congestion markings will never fall into black holes, no
matter how inconsistently a system is put together. Alternatively, matter how inconsistently a system is put together. Alternatively,
congestion notification relying on correct system configuration could congestion notification relying on correct system configuration could
be confined to flavours of Ethernet intended only for professional be confined to flavours of Ethernet intended only for professional
network operators, such as IEEE 802.1ah Provider Backbone Bridges network operators, such as IEEE 802.1ah Provider Backbone Bridges
(PBB). (PBB).
QCN [IEEE802.1Qau] provides another example of how to indicate to QCN [IEEE802.1Qau] provides another example of how to indicate to
lower layer devices that the end-points will not understand ECN. An lower layer devices that the end-points will not understand ECN. An
skipping to change at page 16, line 6 skipping to change at page 16, line 6
congestion notification added to the outer header across the congestion notification added to the outer header across the
subnet. This is necessary in addition to checking that an subnet. This is necessary in addition to checking that an
incoming PDU indicates an ECN-capable (L4) transport. Examples incoming PDU indicates an ECN-capable (L4) transport. Examples
of how this guarantee might be provided include: of how this guarantee might be provided include:
* by configuration (e.g. if any label switches in a domain * by configuration (e.g. if any label switches in a domain
support ECN marking, [RFC5129] requires all egress nodes to support ECN marking, [RFC5129] requires all egress nodes to
have been configured to propagate ECN) have been configured to propagate ECN)
* by the ingress explicitly checking that the egress propagates * by the ingress explicitly checking that the egress propagates
ECN (e.g. TRILL uses IS-IS to check path capabilities before ECN (e.g. TRILL uses IS-IS to check path capabilities before
using critical options [trill-rbridge-options]) using critical options [I-D.ietf-trill-rfc7180bis])
* by inherent design of the protocol (e.g. by encoding ECN * by inherent design of the protocol (e.g. by encoding ECN
marking on the outer header in such a way that a legacy egress marking on the outer header in such a way that a legacy egress
that does not understand ECN will consider the PDU corrupt and that does not understand ECN will consider the PDU corrupt and
discard it, thus at least propagating a form of congestion discard it, thus at least propagating a form of congestion
signal). signal).
2. Egress Fails Capability Check: If the ingress cannot guarantee 2. Egress Fails Capability Check: If the ingress cannot guarantee
that the egress will propagate congestion notification, the that the egress will propagate congestion notification, the
ingress SHOULD disable ECN when it forwards the PDU at the lower ingress SHOULD disable ECN when it forwards the PDU at the lower
skipping to change at page 18, line 29 skipping to change at page 18, line 29
in case they represent an attack. However, an approach using in case they represent an attack. However, an approach using
alarms and policy-mediated drop is preferable to hard-coded drop, alarms and policy-mediated drop is preferable to hard-coded drop,
so that operators can keep track of possible attacks but so that operators can keep track of possible attacks but
currently unused combinations are not precluded from future use currently unused combinations are not precluded from future use
through new standards actions. through new standards actions.
4.5. Sequences of Similar Tunnels or Subnets 4.5. Sequences of Similar Tunnels or Subnets
In some deployments, particularly in 3GPP networks, an IP packet may In some deployments, particularly in 3GPP networks, an IP packet may
traverse two or more IP-in-IP tunnels in sequence that all use traverse two or more IP-in-IP tunnels in sequence that all use
identical technology (e.g. GTP). identical technology (e.g. GTP).
In such cases, it would be sufficient for every encapsulation and In such cases, it would be sufficient for every encapsulation and
decapsulation in the chain to comply with RFC 6040. Alternatively, decapsulation in the chain to comply with RFC 6040. Alternatively,
as an optimisation, a node that decapsulates a packet and immediately as an optimisation, a node that decapsulates a packet and immediately
re-encapsulates it for the next tunnel MAY copy the incoming outer re-encapsulates it for the next tunnel MAY copy the incoming outer
ECN field directly to the outgoing outer and the incoming inner ECN ECN field directly to the outgoing outer and the incoming inner ECN
field directly to the outgoing inner. Then the overall behavior field directly to the outgoing inner. Then the overall behavior
across the sequence of tunnel segments would still be consistent with across the sequence of tunnel segments would still be consistent with
RFC 6040. RFC 6040.
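The consistency claim can be checked with a small sketch. It assumes the decapsulation behaviour and the normal mode encapsulation (copy inner to outer) of RFC 6040, and for simplicity it assumes no further marking is applied in the second tunnel. The function names are hypothetical.

```python
from enum import IntEnum
from itertools import product
from typing import Optional, Tuple


class Ecn(IntEnum):
    NOT_ECT = 0b00
    ECT1 = 0b01
    ECT0 = 0b10
    CE = 0b11


def decapsulate(inner: Ecn, outer: Ecn) -> Optional[Ecn]:
    """RFC 6040 decapsulation; None means the packet is dropped."""
    if inner == Ecn.NOT_ECT:
        return None if outer == Ecn.CE else Ecn.NOT_ECT
    if Ecn.CE in (inner, outer):
        return Ecn.CE
    return Ecn.ECT1 if outer == Ecn.ECT1 else inner


def relay_full(inner: Ecn, outer: Ecn) -> Optional[Tuple[Ecn, Ecn]]:
    """Decapsulate, then re-encapsulate by copying the result to the new outer."""
    result = decapsulate(inner, outer)
    return None if result is None else (result, result)


def relay_shortcut(inner: Ecn, outer: Ecn) -> Tuple[Ecn, Ecn]:
    """The optimisation: copy inner to inner and outer to outer unchanged."""
    return (inner, outer)


def end_to_end(relay, inner: Ecn, outer: Ecn) -> Optional[Ecn]:
    """Pass through the relay node, then decapsulate at the final egress."""
    relayed = relay(inner, outer)
    return None if relayed is None else decapsulate(*relayed)


# Compare both relay behaviours over every (inner, outer) combination.
mismatches = [
    (i, o) for i, o in product(Ecn, Ecn)
    if end_to_end(relay_full, i, o) != end_to_end(relay_shortcut, i, o)
]
```

Under these assumptions the list of mismatches is empty: the only difference is where a drop occurs, at the relay node in the full case and at the final egress with the shortcut.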
skipping to change at page 20, line 22 skipping to change at page 20, line 22
o IPv4 and IPv6 are not the only layer-3 protocols that might be o IPv4 and IPv6 are not the only layer-3 protocols that might be
encapsulated by lower layer protocols encapsulated by lower layer protocols
o Link-layer encryption might be in use, making the layer-2 payload o Link-layer encryption might be in use, making the layer-2 payload
inaccessible inaccessible
o Many Ethernet switches do not have 'layer-3 switch' capabilities o Many Ethernet switches do not have 'layer-3 switch' capabilities
so they cannot read or modify an IP payload so they cannot read or modify an IP payload
o It might be costly to find an IP header (v4 or v6) when it may be o It might be costly to find an IP header (v4 or v6) when it may be
encapsulated by more than one lower layer header, e.g. Ethernet encapsulated by more than one lower layer header, e.g. Ethernet
MAC in MAC [IEEE802.1Qah]. MAC in MAC [IEEE802.1Qah].
Nonetheless, configuring lower layer equipment to look for an ECN Nonetheless, configuring lower layer equipment to look for an ECN
field in an encapsulated IP header is a useful optimisation. If the field in an encapsulated IP header is a useful optimisation. If the
implementation follows the guidelines below, this optimisation does implementation follows the guidelines below, this optimisation does
not have to be confined to a controlled environment such as within a not have to be confined to a controlled environment such as within a
data centre; it could usefully be applied on any network--even if the data centre; it could usefully be applied on any network--even if the
operator is not sure whether the above issues will never apply: operator is not sure whether the above issues will never apply:
1. If a native lower-layer congestion notification mechanism exists 1. If a native lower-layer congestion notification mechanism exists
skipping to change at page 24, line 29 skipping to change at page 24, line 29
Category for ATM VCs", Design Technote 10415, June 2005. Category for ATM VCs", Design Technote 10415, June 2005.
[Buck00] Buckwalter, J., "Frame Relay: Technology and Practice", [Buck00] Buckwalter, J., "Frame Relay: Technology and Practice",
Pub. Addison Wesley ISBN-13: 978-0201485240, 2000. Pub. Addison Wesley ISBN-13: 978-0201485240, 2000.
[DCTCP] Alizadeh, M., Greenberg, A., Maltz, D., Padhye, J., Patel, [DCTCP] Alizadeh, M., Greenberg, A., Maltz, D., Padhye, J., Patel,
P., Prabhakar, B., Sengupta, S., and M. Sridharan, "Data P., Prabhakar, B., Sengupta, S., and M. Sridharan, "Data
Center TCP (DCTCP)", ACM SIGCOMM CCR 40(4)63--74, October Center TCP (DCTCP)", ACM SIGCOMM CCR 40(4)63--74, October
2010, <http://portal.acm.org/citation.cfm?id=1851192>. 2010, <http://portal.acm.org/citation.cfm?id=1851192>.
[GTPv1] 3GPP, "GPRS Tunnelling Protocol (GTP) across the Gn and Gp
interface", Technical Specification TS 29.060, .
[GTPv1-U] 3GPP, "General Packet Radio System (GPRS) Tunnelling [GTPv1-U] 3GPP, "General Packet Radio System (GPRS) Tunnelling
Protocol User Plane (GTPv1-U)", Technical Specification TS Protocol User Plane (GTPv1-U)", Technical Specification TS
29.281, . 29.281, .
[GTPv1] 3GPP, "GPRS Tunnelling Protocol (GTP) across the Gn and Gp
interface", Technical Specification TS 29.060, .
[GTPv2-C] 3GPP, "Evolved General Packet Radio Service (GPRS) [GTPv2-C] 3GPP, "Evolved General Packet Radio Service (GPRS)
Tunnelling Protocol for Control plane (GTPv2-C)", Tunnelling Protocol for Control plane (GTPv2-C)",
Technical Specification TS 29.274, . Technical Specification TS 29.274, .
[I-D.ietf-conex-abstract-mech] [I-D.ietf-conex-abstract-mech]
Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx) Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx)
Concepts and Abstract Mechanism", draft-ietf-conex- Concepts, Abstract Mechanism and Requirements", draft-
abstract-mech-08 (work in progress), October 2013. ietf-conex-abstract-mech-13 (work in progress), October
2014.
[I-D.ietf-trill-rfc7180bis]
Eastlake, D., Zhang, M., Perlman, R., Banerjee, A.,
Ghanwani, A., and S. Gupta, "TRILL: Clarifications,
Corrections, and Updates", draft-ietf-trill-rfc7180bis-00
(work in progress), November 2014.
[I-D.moncaster-tcpm-rcv-cheat] [I-D.moncaster-tcpm-rcv-cheat]
Moncaster, T., "A TCP Test to Allow Senders to Identify Moncaster, T., Briscoe, B., and A. Jacquet, "A TCP Test to
Receiver Non-Compliance", draft-moncaster-tcpm-rcv- Allow Senders to Identify Receiver Non-Compliance", draft-
cheat-01 (work in progress), June 2007. moncaster-tcpm-rcv-cheat-03 (work in progress), July 2014.
[IEEE802.1Qah] [IEEE802.1Qah]
IEEE, "IEEE Standard for Local and Metropolitan Area IEEE, "IEEE Standard for Local and Metropolitan Area
Networks--Virtual Bridged Local Area Networks--Amendment Networks--Virtual Bridged Local Area Networks--Amendment
6: Provider Backbone Bridges", IEEE Std 802.1Qah-2008, 6: Provider Backbone Bridges", IEEE Std 802.1Qah-2008,
August 2008, August 2008,
<http://www.ieee802.org/1/pages/802.1ah.html>. <http://www.ieee802.org/1/pages/802.1ah.html>.
(Access Controlled link within page) (Access Controlled link within page)
[IEEE802.1Qau] [IEEE802.1Qau]
Finn, N., Ed., "IEEE Standard for Local and Metropolitan Finn, N., Ed., "IEEE Standard for Local and Metropolitan
Area Networks--Virtual Bridged Local Area Networks - Area Networks--Virtual Bridged Local Area Networks -
Amendment 13: Congestion Notification", IEEE Std Amendment 13: Congestion Notification", IEEE Std 802.1Qau-
802.1Qau-2010, March 2010, <http://ieeexplore.ieee.org/xpl 2010, March 2010, <http://ieeexplore.ieee.org/xpl/
/mostRecentIssue.jsp?punumber=5454061>. mostRecentIssue.jsp?punumber=5454061>.
(Access Controlled link within page) (Access Controlled link within page)
[ITU-T.I.371] [ITU-T.I.371]
ITU-T, "Traffic Control and Congestion Control in B-ISDN", ITU-T, "Traffic Control and Congestion Control in B-ISDN",
ITU-T Rec. I.371 (03/04), March 2004, ITU-T Rec. I.371 (03/04), March 2004,
<http://ieeexplore.ieee.org/xpl/ <http://ieeexplore.ieee.org/xpl/
mostRecentIssue.jsp?punumber=5454061>. mostRecentIssue.jsp?punumber=5454061>.
[LTE-RA] 3GPP, "Evolved Universal Terrestrial Radio Access (E-UTRA) [LTE-RA] 3GPP, "Evolved Universal Terrestrial Radio Access (E-UTRA)
skipping to change at page 26, line 31 skipping to change at page 26, line 35
Internet Protocol", RFC 4301, December 2005. Internet Protocol", RFC 4301, December 2005.
[RFC6633] Gont, F., "Deprecation of ICMP Source Quench Messages", [RFC6633] Gont, F., "Deprecation of ICMP Source Quench Messages",
RFC 6633, May 2012. RFC 6633, May 2012.
[RFC6660] Briscoe, B., Moncaster, T., and M. Menth, "Encoding Three [RFC6660] Briscoe, B., Moncaster, T., and M. Menth, "Encoding Three
Pre-Congestion Notification (PCN) States in the IP Header Pre-Congestion Notification (PCN) States in the IP Header
Using a Single Diffserv Codepoint (DSCP)", RFC 6660, July Using a Single Diffserv Codepoint (DSCP)", RFC 6660, July
2012. 2012.
[trill-rbridge-options] [RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
Eastlake, D., Ghanwani, A., Manral, V., and C. Bestler, L., Sridhar, T., Bursell, M., and C. Wright, "Virtual
"RBridges: Further TRILL Header Extensions", draft-ietf- eXtensible Local Area Network (VXLAN): A Framework for
trill-rbridge-options-07 (work in progress), June 2012. Overlaying Virtualized Layer 2 Networks over Layer 3
Networks", RFC 7348, August 2014.
[vxlan] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
L., Sridhar, T., Bursell, M., and C. Wright, "VXLAN: A
Framework for Overlaying Virtualized Layer 2 Networks over
Layer 3 Networks", draft-mahalingam-dutt-dcops-vxlan-08
(work in progress), February 2014.
Appendix A. Outstanding Document Issues Appendix A. Outstanding Document Issues
1. [GF] Concern that certain guidelines warrant a MUST (NOT) rather 1. [GF] Concern that certain guidelines warrant a MUST (NOT) rather
than a SHOULD (NOT). Given the guidelines say that if any SHOULD than a SHOULD (NOT). Given the guidelines say that if any SHOULD
(NOT)s are not followed, a strong justification will be needed, (NOT)s are not followed, a strong justification will be needed,
they have been left as SHOULD (NOT) pending further list they have been left as SHOULD (NOT) pending further list
discussion. In particular: discussion. In particular:
* If inner is a Not-ECN-PDU and Outer is CE (or highest severity * If inner is a Not-ECN-PDU and Outer is CE (or highest severity
congestion level), MUST (not SHOULD) drop? congestion level), MUST (not SHOULD) drop?
2. Consider whether an IETF Standard Track doc will be needed to 2. Consider whether an IETF Standard Track doc will be needed to
Update the IP-in-IP protocols listed in Section 4.1--at least Update the IP-in-IP protocols listed in Section 4.1--at least
those that the IET those that the IET
Appendix B. Changes in This Version (to be removed by RFC Editor) Appendix B. Changes in This Version (to be removed by RFC Editor)
From ietf-00 to ietf-01: Updated references.
From briscoe-04 to ietf-00: Changed filename following tsvwg From briscoe-04 to ietf-00: Changed filename following tsvwg
adoption. adoption.
From briscoe-03 to 04: From briscoe-03 to 04:
* Re-arranged the introduction to describe the purpose of the * Re-arranged the introduction to describe the purpose of the
document first before introducing ECN in more depth. And document first before introducing ECN in more depth. And
clarified the introduction throughout. clarified the introduction throughout.
* Added applicability to 3GPP TS 36.300. * Added applicability to 3GPP TS 36.300.
 End of changes. 28 change blocks. 
45 lines changed or deleted 50 lines changed or added
