draft-ietf-tsvwg-ecn-tunnel-08.txt   draft-ietf-tsvwg-ecn-tunnel-09.txt 
Transport Area Working Group B. Briscoe Transport Area Working Group B. Briscoe
Internet-Draft BT Internet-Draft BT
Updates: 3168, 4301 March 03, 2010 Updates: 3168, 4301, 4774 July 30, 2010
(if approved) (if approved)
Intended status: Standards Track Intended status: Standards Track
Expires: September 4, 2010 Expires: January 31, 2011
Tunnelling of Explicit Congestion Notification Tunnelling of Explicit Congestion Notification
draft-ietf-tsvwg-ecn-tunnel-08 draft-ietf-tsvwg-ecn-tunnel-09
Abstract Abstract
This document redefines how the explicit congestion notification This document redefines how the explicit congestion notification
(ECN) field of the IP header should be constructed on entry to and (ECN) field of the IP header should be constructed on entry to and
exit from any IP in IP tunnel. On encapsulation it updates RFC3168 exit from any IP in IP tunnel. On encapsulation it updates RFC3168
to bring all IP in IP tunnels (v4 or v6) into line with RFC4301 IPsec to bring all IP in IP tunnels (v4 or v6) into line with RFC4301 IPsec
ECN processing. On decapsulation it updates both RFC3168 and RFC4301 ECN processing. On decapsulation it updates both RFC3168 and RFC4301
to add new behaviours for previously unused combinations of inner and to add new behaviours for previously unused combinations of inner and
outer header. The new rules ensure the ECN field is correctly outer header. The new rules ensure the ECN field is correctly
propagated across a tunnel whether it is used to signal one or two propagated across a tunnel whether it is used to signal one or two
severity levels of congestion, whereas before only one severity level severity levels of congestion, whereas before only one severity level
was supported. Tunnel endpoints can be updated in any order without was supported. Tunnel endpoints can be updated in any order without
affecting pre-existing uses of the ECN field, providing backward affecting pre-existing uses of the ECN field, thus ensuring backward
compatibility. Nonetheless, operators wanting to support two compatibility. Nonetheless, operators wanting to support two
severity levels (e.g. for pre-congestion notification--PCN) can severity levels (e.g. for pre-congestion notification--PCN) can
require compliance with this new specification. A thorough analysis require compliance with this new specification. A thorough analysis
of the reasoning for these changes and the implications is included. of the reasoning for these changes and the implications is included.
In the unlikely event that the new rules do not meet a specific need, In the unlikely event that the new rules do not meet a specific need,
RFC4774 gives guidance on designing alternate ECN semantics and this RFC4774 gives guidance on designing alternate ECN semantics and this
document extends that to include tunnelling issues. document extends that to include tunnelling issues.
Status of This Memo Status of This Memo
This Internet-Draft is submitted to IETF in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF). Note that other groups may also distribute
other groups may also distribute working documents as Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at This Internet-Draft will expire on January 31, 2011.
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on September 4, 2010.
Copyright Notice Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the Copyright (c) 2010 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the BSD License. described in the Simplified BSD License.
This document may contain material from IETF Documents or IETF
Contributions published or made publicly available before November
10, 2008. The person(s) controlling the copyright in some of this
material may not have granted the IETF Trust the right to allow
modifications of such material outside the IETF Standards Process.
Without obtaining an adequate license from the person(s) controlling
the copyright in such materials, this document may not be modified
outside the IETF Standards Process, and derivative works of it may
not be created outside the IETF Standards Process, except to format
it for publication as an RFC or to translate it into languages other
than English.
Table of Contents Table of Contents
1. Introduction 1. Introduction
1.1. Scope 1.1. Scope
2. Terminology 2. Terminology
3. Summary of Pre-Existing RFCs 3. Summary of Pre-Existing RFCs
3.1. Encapsulation at Tunnel Ingress 3.1. Encapsulation at Tunnel Ingress
3.2. Decapsulation at Tunnel Egress 3.2. Decapsulation at Tunnel Egress
4. New ECN Tunnelling Rules 4. New ECN Tunnelling Rules
4.1. Default Tunnel Ingress Behaviour 4.1. Default Tunnel Ingress Behaviour
4.2. Default Tunnel Egress Behaviour 4.2. Default Tunnel Egress Behaviour
4.3. Encapsulation Modes 4.3. Encapsulation Modes
4.4. Single Mode of Decapsulation 4.4. Single Mode of Decapsulation
5. Updates to Earlier RFCs 5. Updates to Earlier RFCs
5.1. Changes to RFC4301 ECN processing 5.1. Changes to RFC4301 ECN processing
5.2. Changes to RFC3168 ECN processing 5.2. Changes to RFC3168 ECN processing
5.3. Motivation for Changes 5.3. Motivation for Changes
5.3.1. Motivation for Changing Encapsulation 5.3.1. Motivation for Changing Encapsulation
5.3.2. Motivation for Changing Decapsulation 5.3.2. Motivation for Changing Decapsulation
6. Backward Compatibility 6. Backward Compatibility
6.1. Non-Issues Updating Decapsulation 6.1. Non-Issues Updating Decapsulation
6.2. Non-Update of RFC4301 IPsec Encapsulation 6.2. Non-Update of RFC4301 IPsec Encapsulation
6.3. Update to RFC3168 Encapsulation 6.3. Update to RFC3168 Encapsulation
7. Design Principles for Alternate ECN Tunnelling Semantics 7. Design Principles for Alternate ECN Tunnelling Semantics
8. Security Considerations 8. IANA Considerations (to be removed on publication):
9. Conclusions 9. Security Considerations
10. Acknowledgements 10. Conclusions
11. References 11. Acknowledgements
11.1. Normative References 12. References
11.2. Informative References 12.1. Normative References
Editorial Comments 12.2. Informative References
Appendix A. Early ECN Tunnelling RFCs Appendix A. Early ECN Tunnelling RFCs
Appendix B. Design Constraints Appendix B. Design Constraints
B.1. Security Constraints B.1. Security Constraints
B.2. Control Constraints B.2. Control Constraints
B.3. Management Constraints B.3. Management Constraints
Appendix C. Contribution to Congestion across a Tunnel Appendix C. Contribution to Congestion across a Tunnel
Appendix D. Why Losing ECT(1) on Decapsulation Impedes PCN Appendix D. Compromise on Decap with ECT(1) Inner and ECT(0)
(to be removed before publication) Outer
Appendix E. Why Resetting ECN on Encapsulation Impedes PCN Appendix E. Open Issues
(to be removed before publication)
Appendix F. Compromise on Decap with ECT(1) Inner and ECT(0)
Outer
Appendix G. Open Issues
Request to the RFC Editor (to be removed on publication): Request to the RFC Editor (to be removed on publication):
In the RFC index, RFC3168 should be identified as an update to In the RFC index, RFC3168 should be identified as an update to
RFC2003. RFC4301 should be identified as an update to RFC3168. RFC2003. RFC4301 should be identified as an update to RFC3168.
Changes from previous drafts (to be removed by the RFC Editor) Changes from previous drafts (to be removed by the RFC Editor)
Full text differences between IETF draft versions are available at Full text differences between IETF draft versions are available at
<http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-ecn-tunnel/>, and <http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-ecn-tunnel/>, and
between earlier individual draft versions at between earlier individual draft versions at
<http://www.briscoe.net/pubs.html#ecn-tunnel> <http://www.briscoe.net/pubs.html#ecn-tunnel>
From ietf-06 to ietf-07 (current): From ietf-08 to ietf-09 (current): Added change log entry for -07 to
-08 that was previously omitted.
* Changes to standards action text:
+ Added RFC4774 to 'Updates:' header (the draft has always
extended the advice in RFC4774 (BCP124), which said very
little about tunnels; the GENART reviewer merely pointed
out that the header did not highlight this fact).
* Editorial changes:
+ Abstract: s/providing backward compatibility./thus ensuring
backward compatibility./
+ Moved PCN-related text motivating changes to decapsulation
from "Default Tunnel Egress Behaviour" (Section 4.2) to
"Motivation for Changing Decapsulation" (Section 5.3.2)
where it was merged with existing similar text.
+ In the non-normative Design Principles avoided using words
in lower case where they were in contexts that might make
them confusable with upper case RFC2119 normative language.
+ Added Stephen Hanna and Ben Campbell to acks and corrected
spelling of Agarwal.
+ Deleted endnote discussing corner case with IKEv2 manual
keying (identified as "to be removed before publication
following SecDir review").
+ Deleted Appendices D & E on why existing ingress & egress
tunnelling behaviour impede PCN and the endnotes that
referred to them (identified as "to be removed before
publication").
+ Various minor corrections pointed out by reviewers.
From ietf-07 to ietf-08:
* Changes to standards actions:
+ Section 4: Changed non-RFC2119 phrase 'NOT RECOMMENDED' to
'SHOULD be avoided', wrt alternate ECN tunnelling schemes.
+ Section 4.2: Used upper-case in 'Alarms SHOULD be rate-
limited'.
+ Section 7: Made bullet #1 in the decapsulation guidelines
for alternate schemes more precise. Also changed any upper-
case keywords in this informative section to lower case.
* Editorial changes:
+ Changed copyright notice to allow for pre-5378 material.
+ Shifted supporting text intended for deletion on publication
into editorial comments.
+ Explained how to read the decapsulation matrices in their
captions.
+ Minor clarifications throughout.
From ietf-06 to ietf-07:
* Emphasised that this is the opposite of a fork in the RFC * Emphasised that this is the opposite of a fork in the RFC
series. series.
* Altered Section 5 to focus on updates to implementations of * Altered Section 5 to focus on updates to implementations of
earlier RFCs, rather than on updates to the text of the RFCs. earlier RFCs, rather than on updates to the text of the RFCs.
* Removed potential loop-holes in normative text that * Removed potential loop-holes in normative text that
implementers might have used to claim compliance without implementers might have used to claim compliance without
implementing normal mode. Highlighted the deliberate implementing normal mode. Highlighted the deliberate
skipping to change at page 6, line 36 skipping to change at page 7, line 51
Schemes" after all the normative sections. Schemes" after all the normative sections.
+ Added Appendix A on early history of ECN tunnelling RFCs. + Added Appendix A on early history of ECN tunnelling RFCs.
+ Removed specialist appendix on "Relative Placement of + Removed specialist appendix on "Relative Placement of
Tunnelling and In-Path Load Regulation" (Appendix D in the Tunnelling and In-Path Load Regulation" (Appendix D in the
-02 draft) -02 draft)
+ Moved and updated specialist text on "Compromise on Decap + Moved and updated specialist text on "Compromise on Decap
with ECT(1) Inner and ECT(0) Outer" from Security with ECT(1) Inner and ECT(0) Outer" from Security
Considerations to Appendix F Considerations to Appendix D
* Textual changes: * Textual changes:
+ Simplified vocabulary for non-native-English speakers + Simplified vocabulary for non-native-English speakers
+ Simplified Introduction and defined regularly used terms in + Simplified Introduction and defined regularly used terms in
an expanded Terminology section. an expanded Terminology section.
+ More clearly distinguished statically configured tunnels + More clearly distinguished statically configured tunnels
from dynamic tunnel endpoint discovery, before explaining from dynamic tunnel endpoint discovery, before explaining
skipping to change at page 9, line 9 skipping to change at page 10, line 27
Roadmap), added new Introductory subsection on "Scope" and Roadmap), added new Introductory subsection on "Scope" and
improved clarity; improved clarity;
* Added Design Guidelines for New Encapsulations of Congestion * Added Design Guidelines for New Encapsulations of Congestion
Notification; Notification;
* Considerably clarified the Backward Compatibility section * Considerably clarified the Backward Compatibility section
(Section 6); (Section 6);
* Considerably extended the Security Considerations section * Considerably extended the Security Considerations section
(Section 8); (Section 9);
* Summarised the primary rationale much better in the * Summarised the primary rationale much better in the
conclusions; conclusions;
* Added numerous extra acknowledgements; * Added numerous extra acknowledgements;
* Added Appendix E. "Why resetting CE on encapsulation harms * Added Appendix E. "Why resetting CE on encapsulation harms
PCN", Appendix C. "Contribution to Congestion across a Tunnel" PCN", Appendix C. "Contribution to Congestion across a Tunnel"
and Appendix D. "Ideal Decapsulation Rules"; and Appendix D. "Ideal Decapsulation Rules";
skipping to change at page 17, line 26 skipping to change at page 18, line 49
will not amplify into a flood of alarm messages. It MUST be will not amplify into a flood of alarm messages. It MUST be
possible to suppress alarms or logging, e.g. if it becomes possible to suppress alarms or logging, e.g. if it becomes
apparent that a combination that previously was not used has apparent that a combination that previously was not used has
started to be used for legitimate purposes such as a new standards started to be used for legitimate purposes such as a new standards
action. action.
The above logic allows for ECT(0) and ECT(1) to both represent the The above logic allows for ECT(0) and ECT(1) to both represent the
same severity of congestion marking (e.g. "not congestion marked"). same severity of congestion marking (e.g. "not congestion marked").
But it also allows future schemes to be defined where ECT(1) is a But it also allows future schemes to be defined where ECT(1) is a
more severe marking than ECT(0), in particular enabling the simplest more severe marking than ECT(0), in particular enabling the simplest
possible encoding for PCN [I-D.ietf-pcn-3-in-1-encoding]. Before the possible encoding for PCN [I-D.ietf-pcn-3-in-1-encoding] (see
present specification was written, the PCN working-group had proposed Section 5.3.2). Treating ECT(1) as either the same as ECT(0) or as a
a number of work-rounds to the problem of a tunnel egress not higher severity level is explained in the discussion of the ECN nonce
propagating two severity levels of congestion. Without wishing to
disparage the ingenuity of these work-rounds, none were chosen for [RFC3540] in Section 9, which in turn refers to Appendix D.
the standards track because they were either somewhat wasteful,
imprecise or complicated [Note_PCN_egress]. Treating ECT(1) as
either the same as ECT(0) or as a higher severity level is explained
in the discussion of the ECN nonce [RFC3540] in Section 8, which in
turn refers to Appendix F.
4.3. Encapsulation Modes 4.3. Encapsulation Modes
Section 4.1 introduces two encapsulation modes, normal mode and Section 4.1 introduces two encapsulation modes, normal mode and
compatibility mode, defining their encapsulation behaviour (i.e. compatibility mode, defining their encapsulation behaviour (i.e.
header copying or zeroing respectively). Note that these are modes header copying or zeroing respectively). Note that these are modes
of the ingress tunnel endpoint only, not the tunnel as a whole. of the ingress tunnel endpoint only, not the tunnel as a whole.
To comply with this specification, a tunnel ingress MUST at least To comply with this specification, a tunnel ingress MUST at least
implement `normal mode'. Unless it will never be used with legacy implement `normal mode'. Unless it will never be used with legacy
skipping to change at page 18, line 42 skipping to change at page 20, line 11
it recognises that proprietary tunnel endpoint discovery protocols it recognises that proprietary tunnel endpoint discovery protocols
exist. It therefore sets down some constraints on discovery exist. It therefore sets down some constraints on discovery
protocols to ensure safe interworking. protocols to ensure safe interworking.
If dynamic tunnel endpoint discovery might pair an ingress with a If dynamic tunnel endpoint discovery might pair an ingress with a
legacy egress (RFC2003, RFC2401 or RFC2481 or the limited legacy egress (RFC2003, RFC2401 or RFC2481 or the limited
functionality mode of RFC3168), the ingress MUST implement both functionality mode of RFC3168), the ingress MUST implement both
normal and compatibility mode. If the tunnel discovery process is normal and compatibility mode. If the tunnel discovery process is
arranged to only ever find a tunnel egress that propagates ECN arranged to only ever find a tunnel egress that propagates ECN
(RFC3168 full functionality mode, RFC4301 or this present (RFC3168 full functionality mode, RFC4301 or this present
specification), then a tunnel ingress can be complaint with the specification), then a tunnel ingress can be compliant with the
present specification without implementing compatibility mode. present specification without implementing compatibility mode.
While a compliant tunnel ingress is discovering an egress, it MUST While a compliant tunnel ingress is discovering an egress, it MUST
send packets in compatibility mode in case the egress it discovers send packets in compatibility mode in case the egress it discovers
is a legacy egress. If, through the discovery protocol, the is a legacy egress. If, through the discovery protocol, the
egress indicates that it is compliant with the present egress indicates that it is compliant with the present
specification, with RFC4301 or with RFC3168 full functionality specification, with RFC4301 or with RFC3168 full functionality
mode, the ingress can switch itself into normal mode. If the mode, the ingress can switch itself into normal mode. If the
egress denies compliance with any of these or returns an error egress denies compliance with any of these or returns an error
that implies it does not understand a request to work to any of that implies it does not understand a request to work to any of
skipping to change at page 19, line 33 skipping to change at page 20, line 50
Implementation note: If a compliant node is the ingress for multiple Implementation note: If a compliant node is the ingress for multiple
tunnels, a mode setting will need to be stored for each tunnel tunnels, a mode setting will need to be stored for each tunnel
ingress. However, if a node is the egress for multiple tunnels, ingress. However, if a node is the egress for multiple tunnels,
none of the tunnels will need to store a mode setting, because a none of the tunnels will need to store a mode setting, because a
compliant egress only needs one mode. compliant egress only needs one mode.
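The per-tunnel mode handling described in Section 4.3 can be sketched roughly as follows. This is an illustration only and not part of either draft version: the names EcnMode, TunnelIngress, on_discovery_result and outer_ecn are hypothetical. Staying in compatibility mode on any discovery result other than a confirmed ECN-propagating egress is an assumption made here for safety.

```python
from enum import Enum

# Two-bit ECN field codepoints as defined in RFC 3168.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


class EcnMode(Enum):
    NORMAL = "normal"                # copy the inner ECN field to the outer header
    COMPATIBILITY = "compatibility"  # zero the outer ECN field (Not-ECT)


class TunnelIngress:
    """Hypothetical ingress state; one instance is kept per tunnel."""

    def __init__(self):
        # While the egress is still being discovered, packets are sent in
        # compatibility mode in case the egress turns out to be legacy.
        self.mode = EcnMode.COMPATIBILITY

    def on_discovery_result(self, egress_propagates_ecn):
        # egress_propagates_ecn is True when the egress indicated compliance
        # with the present specification, RFC4301 or RFC3168 full
        # functionality mode. Anything else keeps compatibility mode
        # (an assumption of this sketch).
        if egress_propagates_ecn:
            self.mode = EcnMode.NORMAL
        else:
            self.mode = EcnMode.COMPATIBILITY

    def outer_ecn(self, inner_ecn):
        # Normal mode copies the inner field; compatibility mode zeroes it.
        if self.mode is EcnMode.NORMAL:
            return inner_ecn
        return NOT_ECT
```

Because the mode lives on each TunnelIngress instance, a node that is the ingress for several tunnels keeps one setting per tunnel, whereas an egress needs no equivalent state.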
4.4. Single Mode of Decapsulation 4.4. Single Mode of Decapsulation
A compliant decapsulator only needs one mode of operation. However, A compliant decapsulator only needs one mode of operation. However,
if a complaint egress is implemented to be dynamically discoverable, if a compliant egress is implemented to be dynamically discoverable,
it may need to respond to discovery requests from various types of it may need to respond to discovery requests from various types of
legacy tunnel ingress. This specification does not define how legacy tunnel ingress. This specification does not define how
dynamic negotiation might be done by (proprietary) discovery dynamic negotiation might be done by (proprietary) discovery
protocols, but it sets down some constraints to ensure safe protocols, but it sets down some constraints to ensure safe
interworking. interworking.
Through the discovery protocol, a tunnel ingress compliant with the Through the discovery protocol, a tunnel ingress compliant with the
present specification might ask if the egress is compliant with the present specification might ask if the egress is compliant with the
present specification, with RFC4301 or with RFC3168 full present specification, with RFC4301 or with RFC3168 full
functionality mode. Or an RFC3168 tunnel ingress might try to functionality mode. Or an RFC3168 tunnel ingress might try to
skipping to change at page 20, line 44 skipping to change at page 22, line 13
dropped rather than forwarded as Not-ECT; dropped rather than forwarded as Not-ECT;
* Certain combinations of inner and outer ECN field have been * Certain combinations of inner and outer ECN field have been
identified as currently unused. These can trigger logging identified as currently unused. These can trigger logging
and/or raise alarms. and/or raise alarms.
Modes: RFC4301 tunnel endpoints do not need modes and are not Modes: RFC4301 tunnel endpoints do not need modes and are not
updated by the modes in the present specification. Effectively an updated by the modes in the present specification. Effectively an
RFC4301 IPsec ingress solely uses the REQUIRED normal mode of RFC4301 IPsec ingress solely uses the REQUIRED normal mode of
encapsulation, which is unchanged from RFC4301 encapsulation. It encapsulation, which is unchanged from RFC4301 encapsulation. It
will never [Note_Manual_Keying] need the OPTIONAL compatibility will never need the OPTIONAL compatibility mode as explained in
mode as explained in Section 4.3. Section 4.3.
5.2. Changes to RFC3168 ECN processing 5.2. Changes to RFC3168 ECN processing
Ingress: On encapsulation, the new rule in Figure 3 that a normal Ingress: On encapsulation, the new rule in Figure 3 that a normal
mode tunnel ingress copies any ECN field into the outer header mode tunnel ingress copies any ECN field into the outer header
updates the full functionality behaviour of an RFC3168 ingress. updates the full functionality behaviour of an RFC3168 ingress.
Nonetheless, the new compatibility mode encapsulates packets Nonetheless, the new compatibility mode encapsulates packets
identically to the limited functionality mode of an RFC3168 identically to the limited functionality mode of an RFC3168
ingress. ingress.
Egress: An RFC3168 egress will need to be updated to the new Egress: An RFC3168 egress will need to be updated to the new
decapsulation behaviour in Figure 4, in order to comply with the decapsulation behaviour in Figure 4, in order to comply with the
present specification. However, the changes are backward present specification. However, the changes are backward
skipping to change at page 21, line 46 skipping to change at page 23, line 11
behaviour covers all cases. behaviour covers all cases.
The normal mode of encapsulation is an update to the encapsulation The normal mode of encapsulation is an update to the encapsulation
behaviour of the full functionality mode of an RFC3168 ingress. behaviour of the full functionality mode of an RFC3168 ingress.
The compatibility mode of encapsulation is identical to the The compatibility mode of encapsulation is identical to the
encapsulation behaviour of the limited functionality mode of an encapsulation behaviour of the limited functionality mode of an
RFC3168 ingress, except it is optional. RFC3168 ingress, except it is optional.
The constraints on how tunnel discovery protocols set modes in The constraints on how tunnel discovery protocols set modes in
Section 4.3 and Section 4.4 are an update to RFC3168, but they are Section 4.3 and Section 4.4 are an update to RFC3168, but they are
unlikely to require code changes as they document safe practice. unlikely to require code changes as they document existing safe
practice.
5.3. Motivation for Changes 5.3. Motivation for Changes
An overriding goal is to ensure the same ECN signals can mean the An overriding goal is to ensure the same ECN signals can mean the
same thing whatever tunnels happen to encapsulate an IP packet flow. same thing whatever tunnels happen to encapsulate an IP packet flow.
This removes gratuitous inconsistency, which otherwise constrains the This removes gratuitous inconsistency, which otherwise constrains the
available design space and makes it harder to design networks and new available design space and makes it harder to design networks and new
protocols that work predictably. protocols that work predictably.
5.3.1. Motivation for Changing Encapsulation 5.3.1. Motivation for Changing Encapsulation
skipping to change at page 22, line 32 skipping to change at page 23, line 41
compatibility with legacy decapsulators that do not propagate ECN compatibility with legacy decapsulators that do not propagate ECN
correctly. correctly.
The trigger that motivated this update to RFC3168 encapsulation was a The trigger that motivated this update to RFC3168 encapsulation was a
standards track proposal for pre-congestion notification (PCN standards track proposal for pre-congestion notification (PCN
[RFC5670]). PCN excess rate marking only works correctly if the ECN [RFC5670]). PCN excess rate marking only works correctly if the ECN
field is copied on encapsulation (as in RFC4301 and RFC5129); it does field is copied on encapsulation (as in RFC4301 and RFC5129); it does
not work if ECN is reset (as in RFC3168). This is because PCN excess not work if ECN is reset (as in RFC3168). This is because PCN excess
rate marking depends on the outer header revealing any congestion rate marking depends on the outer header revealing any congestion
experienced so far on the whole path, not just since the last tunnel experienced so far on the whole path, not just since the last tunnel
ingress [Note_PCN_ingress]. ingress.
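The difference between copying and resetting can be illustrated with a small sketch. It is an illustration under stated assumptions, not text from either draft: the function names are hypothetical, and the reset mapping reflects RFC3168 full functionality encapsulation as understood here, in which a CE inner header yields an ECT(0) outer header and other codepoints are copied.

```python
NOT_ECT, ECT_0, ECT_1, CE = "Not-ECT", "ECT(0)", "ECT(1)", "CE"


def encap_copy(inner):
    # RFC4301-style encapsulation: the outer ECN field is a copy of the inner.
    return inner


def encap_reset(inner):
    # RFC3168 full functionality encapsulation (as understood for this
    # sketch): a CE marking is reset to ECT(0) in the outer header.
    return ECT_0 if inner == CE else inner


def upstream_congestion_visible(outer):
    # An interior node only sees the outer header, so congestion marked
    # before the tunnel ingress is visible only if the outer field is CE.
    return outer == CE


if __name__ == "__main__":
    for inner in (NOT_ECT, ECT_0, ECT_1, CE):
        print(inner, "copy ->", encap_copy(inner), "| reset ->", encap_reset(inner))
```

With the reset behaviour, a packet that arrives at the ingress already marked CE enters the tunnel looking unmarked, so a meter inside the tunnel cannot account for congestion experienced earlier on the path.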
PCN allows a network operator to add flow admission and termination PCN allows a network operator to add flow admission and termination
for inelastic traffic at the edges of a Diffserv domain, but without for inelastic traffic at the edges of a Diffserv domain, but without
any per-flow mechanisms in the interior and without the generous any per-flow mechanisms in the interior and without the generous
provisioning typical of Diffserv, aiming to significantly reduce provisioning typical of Diffserv, aiming to significantly reduce
costs. The PCN architecture [RFC5559] states that RFC3168 IP in IP costs. The PCN architecture [RFC5559] states that RFC3168 IP in IP
tunnelling of the ECN field cannot be used for any tunnel ingress in tunnelling of the ECN field cannot be used for any tunnel ingress in
a PCN domain. Prior to the present specification, this left a stark a PCN domain. Prior to the present specification, this left a stark
choice between not being able to use PCN for inelastic traffic choice between not being able to use PCN for inelastic traffic
control or not being able to use the many tunnels already deployed control or not being able to use the many tunnels already deployed
skipping to change at page 24, line 19 skipping to change at page 25, line 32
As well as being useful for general future-proofing, this problem As well as being useful for general future-proofing, this problem
is immediately pressing for standardisation of pre-congestion is immediately pressing for standardisation of pre-congestion
notification (PCN), which uses two severity levels of congestion. notification (PCN), which uses two severity levels of congestion.
If a congested queue used ECT(1) in the outer header to signal If a congested queue used ECT(1) in the outer header to signal
more severe congestion than ECT(0), the pre-existing more severe congestion than ECT(0), the pre-existing
decapsulation rules would have thrown away this congestion decapsulation rules would have thrown away this congestion
signal, preventing tunnelled traffic from ever knowing that it signal, preventing tunnelled traffic from ever knowing that it
should reduce its load. should reduce its load.
The PCN working group has had to consider a number of wasteful or Before the present specification was written, the PCN working
convoluted work-rounds to this problem [Note_PCN_egress]. But by group had to consider a number of wasteful or convoluted work-
far the simplest approach is just to remove the covert channel rounds to this problem. Without wishing to disparage the
blockages from tunnelling behaviour--now deemed unnecessary ingenuity of these work-rounds, none were chosen for the
anyway. Then network operators that want to support two standards track because they were either somewhat wasteful,
congestion severity-levels for PCN can specify that every tunnel imprecise or complicated. Instead a baseline PCN encoding was
egress in a PCN region must comply with this latest specified [RFC5696] that supported only one severity level of
specification. congestion but allowed space for these work-rounds as
experimental extensions.
But by far the simplest approach is that taken by the current
specification: just to remove the covert channel blockages from
tunnelling behaviour--now deemed unnecessary anyway. Then
network operators that want to support two congestion severity-
levels for PCN can specify that every tunnel egress in a PCN
region must comply with this latest specification. Having taken
this step, the simplest possible encoding for PCN with two
severity levels of congestion [I-D.ietf-pcn-3-in-1-encoding] can
be used.
Not only does this make two congestion severity-levels available Not only does this make two congestion severity-levels available
for PCN standardisation, but also for other potential uses of the for PCN, but also for other potential uses of the extra ECN
extra ECN codepoint (e.g. [VCP]). codepoint (e.g. [VCP]).
2. Cases are documented where a middlebox (e.g. a firewall) drops 2. Cases are documented where a middlebox (e.g. a firewall) drops
packets with header values that were currently unused (CU) when packets with header values that were currently unused (CU) when
the box was deployed, often on the grounds that anything the box was deployed, often on the grounds that anything
unexpected might be an attack. This tends to bar future use of unexpected might be an attack. This tends to bar future use of
CU values. The new decapsulation rules specify optional logging CU values. The new decapsulation rules specify optional logging
and/or alarms for specific combinations of inner and outer header and/or alarms for specific combinations of inner and outer header
that are currently unused. The aim is to give implementers a that are currently unused. The aim is to give implementers a
recourse other than drop if they are concerned about the security recourse other than drop if they are concerned about the security
of CU values. It recognises legitimate security concerns about of CU values. It recognises legitimate security concerns about
skipping to change at page 27, line 26 skipping to change at page 28, line 50
7. Design Principles for Alternate ECN Tunnelling Semantics 7. Design Principles for Alternate ECN Tunnelling Semantics
This section is informative, not normative. This section is informative, not normative.
S.5 of RFC3168 permits the Diffserv codepoint (DSCP) [RFC2474] to S.5 of RFC3168 permits the Diffserv codepoint (DSCP) [RFC2474] to
'switch in' alternative behaviours for marking the ECN field, just as 'switch in' alternative behaviours for marking the ECN field, just as
it switches in different per-hop behaviours (PHBs) for scheduling. it switches in different per-hop behaviours (PHBs) for scheduling.
[RFC4774] gives best current practice for designing such alternative [RFC4774] gives best current practice for designing such alternative
ECN semantics and very briefly mentions in section 5.4 that ECN semantics and very briefly mentions in section 5.4 that
tunnelling should be considered. The guidance below extends RFC4774, tunnelling needs to be considered. The guidance below complements
giving additional guidance on designing any alternate ECN semantics and extends RFC4774, giving additional guidance on designing any
that would also require alternate tunnelling semantics. alternate ECN semantics that would also require alternate tunnelling
semantics.
The overriding guidance is: "Avoid designing alternate ECN tunnelling The overriding guidance is: "Avoid designing alternate ECN tunnelling
semantics, if at all possible." If a scheme requires tunnels to semantics, if at all possible." If a scheme requires tunnels to
implement special processing of the ECN field for certain DSCPs, it implement special processing of the ECN field for certain DSCPs, it
will be hard to guarantee that every implementer of every tunnel will will be hard to guarantee that every implementer of every tunnel will
have added the required exception or that operators will have have added the required exception or that operators will have
ubiquitously deployed the required updates. It is unlikely a single ubiquitously deployed the required updates. It is unlikely a single
authority is even aware of all the tunnels in a network, which may authority is even aware of all the tunnels in a network, which may
include tunnels set up by applications between endpoints, or include tunnels set up by applications between endpoints, or
dynamically created in the network. Therefore it is highly likely dynamically created in the network. Therefore it is highly likely
that some tunnels within a network or on hosts connected to it will that some tunnels within a network or on hosts connected to it will
not implement the required special case. not implement the required special case.
That said, if a non-default scheme for tunnelling the ECN field is That said, if a non-default scheme for tunnelling the ECN field is
really required, the following guidelines may prove useful in its really required, the following guidelines might prove useful in its
design: design:
On encapsulation in any alternate scheme: On encapsulation in any alternate scheme:
1. The ECN field of the outer header should be cleared to Not-ECT 1. The ECN field of the outer header ought to be cleared to Not-
("00") unless it is guaranteed that the corresponding tunnel ECT ("00") unless it is guaranteed that the corresponding
egress will correctly propagate congestion markings introduced tunnel egress will correctly propagate congestion markings
across the tunnel in the outer header. introduced across the tunnel in the outer header.
2. If it has established that ECN will be correctly propagated, 2. If it has established that ECN will be correctly propagated,
an encapsulator should also copy incoming congestion an encapsulator ought to also copy incoming congestion
notification into the outer header. The general principle notification into the outer header. The general principle
here is that the outer header should reflect congestion here is that the outer header should reflect congestion
accumulated along the whole upstream path, not just since the accumulated along the whole upstream path, not just since the
tunnel ingress (as Appendix B.3 on management and monitoring tunnel ingress (as Appendix B.3 on management and monitoring
explains). explains).
In some circumstances (e.g. pseudowires, PCN), the whole path In some circumstances (e.g. pseudowires, PCN), the whole path
is divided into segments, each with its own congestion is divided into segments, each with its own congestion
notification and feedback loop. In these cases, the function notification and feedback loop. In these cases, the function
that regulates load at the start of each segment will need to that regulates load at the start of each segment will need to
reset congestion notification for its segment. Often the reset congestion notification for its segment. Often the
point where congestion notification is reset will also be point where congestion notification is reset will also be
located at the start of a tunnel. However, the resetting located at the start of a tunnel. However, the resetting
function should be thought of as being applied to packets function can be thought of as being applied to packets after
after the encapsulation function--two logically separate the encapsulation function--two logically separate functions
functions even though they might run on the same physical box. even though they might run on the same physical box. Then the
Then the code module doing encapsulation can keep to the code module doing encapsulation can keep to the copying rule
copying rule and the load regulator module can reset and the load regulator module can reset congestion, without
congestion, without any code in either module being any code in either module being conditional on whether the
conditional on whether the other is there. other is there.
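The two encapsulation guidelines, and the separation between the encapsulator and the load regulator, can be sketched as follows. This is a minimal illustration, not part of either draft: the function names and the `egress_propagates_ecn` flag are hypothetical, and the choice to reset CE to ECT(0) is only an assumption about what "reset" might mean in a given scheme.

```python
# ECN codepoints of the 2-bit field, as defined in RFC3168.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def encapsulate_outer_ecn(inner_ecn: int, egress_propagates_ecn: bool) -> int:
    """Return the outer ECN field for an encapsulated packet.

    Guideline 1: clear the outer to Not-ECT unless the egress is known to
    propagate congestion markings. Guideline 2: otherwise copy the
    incoming field so the outer reflects the whole upstream path.
    """
    if not egress_propagates_ecn:
        return NOT_ECT
    return inner_ecn


def reset_segment_congestion(outer_ecn: int) -> int:
    """Hypothetical load-regulator step, applied after encapsulation.

    Assumption: resetting means re-marking CE as ECT(0); a real scheme
    would define its own reset rule.
    """
    return ECT_0 if outer_ecn == CE else outer_ecn


# The encapsulator always copies; the regulator, if present, resets.
outer = encapsulate_outer_ecn(CE, egress_propagates_ecn=True)
outer_after_reset = reset_segment_congestion(outer)
```

Neither function tests whether the other is present, which is the point of keeping the two functions logically separate.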
On decapsulation in any new scheme: On decapsulation in any new scheme:
1. If the arriving inner header is Not-ECT it implies the 1. If the arriving inner header is Not-ECT it implies the
transport will not understand other ECN codepoints. If the transport will not understand other ECN codepoints. If the
outer header carries an explicit congestion marking, the outer header carries an explicit congestion marking, the
alternate scheme would be expected to drop the packet--the alternate scheme would be expected to drop the packet--the
only indication of congestion the transport will understand. only indication of congestion the transport will understand.
If the alternate scheme recommends forwarding rather than If the alternate scheme recommends forwarding rather than
dropping such a packet, it must clearly justify this decision. dropping such a packet, it will need to clearly justify this
If the inner is Not-ECT and the outer carries any other ECN decision. If the inner is Not-ECT and the outer carries any
codepoint that does not indicate congestion, the alternate other ECN codepoint that does not indicate congestion, the
scheme can forward the packet, but probably only as Not-ECT. alternate scheme can forward the packet, but probably only as
Not-ECT.
2. If the arriving inner header is other than Not-ECT, the ECN 2. If the arriving inner header is other than Not-ECT, the ECN
field that the alternate decapsulation scheme forwards should field that the alternate decapsulation scheme forwards ought
reflect the more severe congestion marking of the arriving to reflect the more severe congestion marking of the arriving
inner and outer headers. inner and outer headers.
3. Any alternate scheme must define a behaviour for all 3. Any alternate scheme will need to define a behaviour for all
combinations of inner and outer headers, even those that would combinations of inner and outer headers, even those that would
not be expected to result from standards known at the time and not be expected to result from standards known at the time and
even those that would not be expected from the tunnel ingress even those that would not be expected from the tunnel ingress
paired with the egress at run-time. Consideration should be paired with the egress at run-time. Consideration should be
given to logging such unexpected combinations and raising an given to logging such unexpected combinations and raising an
alarm, particularly if there is a danger that the invalid alarm, particularly if there is a danger that the invalid
combination implies congestion signals are not being combination implies congestion signals are not being
propagated correctly. The presence of currently unused propagated correctly. The presence of currently unused
combinations may represent an attack, but the new scheme combinations may represent an attack, but the new scheme
should try to define a way to forward such packets, at least should try to define a way to forward such packets, at least
if a safe outgoing codepoint can be defined. if a safe outgoing codepoint can be defined.
Raising an alarm allows a management system to decide whether Raising an alarm allows a management system to decide whether
the anomaly is indeed an attack, in which case it can decide the anomaly is indeed an attack, in which case it can decide
to drop such packets. This is a preferable approach to hard- to drop such packets. This is a preferable approach to hard-
coded discard of packets that seem anomalous today, but may be coded discard of packets that seem anomalous today, but may be
needed tomorrow in future standards actions. needed tomorrow in future standards actions.
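The three decapsulation guidelines can be sketched as a single function that is defined for every combination of inner and outer header. This is only an illustration under stated assumptions: the severity ordering (Not-ECT < ECT(0) < ECT(1) < CE), the set of codepoints treated as congestion markings, and the condition that triggers logging are all hypothetical choices that an alternate scheme would have to define for itself.

```python
import logging
from typing import Optional

# ECN codepoints of the 2-bit field, as defined in RFC3168.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11

# Assumed severity ordering for this sketch only.
SEVERITY = {NOT_ECT: 0, ECT_0: 1, ECT_1: 2, CE: 3}
# Assumed set of codepoints that explicitly signal congestion.
CONGESTION_MARKS = frozenset({CE})

log = logging.getLogger("ecn_decap_sketch")


def decapsulate_ecn(inner: int, outer: int) -> Optional[int]:
    """Return the outgoing ECN field, or None if the packet is dropped."""
    if inner == NOT_ECT:
        if outer != NOT_ECT:
            # Guideline 3: log the unexpected combination rather than
            # hard-coding a discard (alarm condition is an assumption).
            log.warning("unexpected ECN combination: inner=%d outer=%d",
                        inner, outer)
        if outer in CONGESTION_MARKS:
            # Guideline 1: drop is the only congestion signal a
            # Not-ECT transport will understand.
            return None
        # Guideline 1: otherwise forward, but only as Not-ECT.
        return NOT_ECT
    # Guideline 2: forward the more severe of inner and outer.
    return max(inner, outer, key=SEVERITY.__getitem__)
```

Returning `None` for a drop keeps the decision visible to the caller, so a management system could later override it in response to an alarm.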
IANA Considerations (to be removed on publication): 8. IANA Considerations (to be removed on publication):
This memo includes no request to IANA. This memo includes no request to IANA.
8. Security Considerations 9. Security Considerations
Appendix B.1 discusses the security constraints imposed on ECN tunnel Appendix B.1 discusses the security constraints imposed on ECN tunnel
processing. The new rules for ECN tunnel processing (Section 4) processing. The new rules for ECN tunnel processing (Section 4)
trade off between information security (covert channels) and traffic trade off between information security (covert channels) and traffic
security (congestion monitoring & control). Ensuring congestion security (congestion monitoring & control). Ensuring congestion
markings are not lost is itself an aspect of security, because if we markings are not lost is itself an aspect of security, because if we
allowed congestion notification to be lost, any attempt to enforce a allowed congestion notification to be lost, any attempt to enforce a
response to congestion would be much harder. response to congestion would be much harder.
Specialist security issues: Security issues in unlikely but possible scenarios:
Tunnels intersecting Diffserv regions with alternate ECN semantics: Tunnels intersecting Diffserv regions with alternate ECN semantics:
If alternate congestion notification semantics are defined for a If alternate congestion notification semantics are defined for a
certain Diffserv PHB, the scope of the alternate semantics might certain Diffserv PHB, the scope of the alternate semantics might
typically be bounded by the limits of a Diffserv region or typically be bounded by the limits of a Diffserv region or
regions, as envisaged in [RFC4774] (e.g. the pre-congestion regions, as envisaged in [RFC4774] (e.g. the pre-congestion
notification architecture [RFC5559]). The inner headers in notification architecture [RFC5559]). The inner headers in
tunnels crossing the boundary of such a Diffserv region but ending tunnels crossing the boundary of such a Diffserv region but ending
within the region can potentially leak the external congestion within the region can potentially leak the external congestion
notification semantics into the region, or leak the internal notification semantics into the region, or leak the internal
skipping to change at page 30, line 10 skipping to change at page 31, line 34
other outside the domain. [RFC5559] gives specific advice on this other outside the domain. [RFC5559] gives specific advice on this
for the PCN case, but other definitions of alternate semantics for the PCN case, but other definitions of alternate semantics
will need to discuss the specific security implications in each will need to discuss the specific security implications in each
case. case.
ECN nonce tunnel coverage: The new decapsulation rules improve the ECN nonce tunnel coverage: The new decapsulation rules improve the
coverage of the ECN nonce [RFC3540] relative to the previous rules coverage of the ECN nonce [RFC3540] relative to the previous rules
in RFC3168 and RFC4301. However, nonce coverage is still not in RFC3168 and RFC4301. However, nonce coverage is still not
perfect, as this would have led to a safety problem in another perfect, as this would have led to a safety problem in another
case. Both are corner-cases, so discussion of the compromise case. Both are corner-cases, so discussion of the compromise
between them is deferred to Appendix F. between them is deferred to Appendix D.
Covert channel not turned off: A legacy (RFC3168) tunnel ingress Covert channel not turned off: A legacy (RFC3168) tunnel ingress
could ask an RFC3168 egress to turn off ECN processing as well as could ask an RFC3168 egress to turn off ECN processing as well as
itself turning off ECN. An egress compliant with the present itself turning off ECN. An egress compliant with the present
specification will agree to such a request from a legacy ingress, specification will agree to such a request from a legacy ingress,
but it relies on the ingress always sending Not-ECT in the outer. but it relies on the ingress always sending Not-ECT in the outer.
If the egress receives other ECN codepoints in the outer it will If the egress receives other ECN codepoints in the outer it will
process them as normal, so it will actually still copy congestion process them as normal, so it will actually still copy congestion
markings from the outer to the outgoing header. Referring for markings from the outer to the outgoing header. Referring for
example to Figure 5 (Appendix B.1), although the tunnel ingress example to Figure 5 (Appendix B.1), although the tunnel ingress
'I' will set all ECN fields in outer headers to Not-ECT, 'M' could 'I' will set all ECN fields in outer headers to Not-ECT, 'M' could
still toggle CE or ECT(1) on and off to communicate covertly with still toggle CE or ECT(1) on and off to communicate covertly with
'B', because we have specified that 'E' only has one mode 'B', because we have specified that 'E' only has one mode
regardless of what mode it says it has negotiated. We could have regardless of what mode it says it has negotiated. We could have
specified that 'E' should have a limited functionality mode and specified that 'E' should have a limited functionality mode and
check for such behaviour. But we decided not to add the extra check for such behaviour. But we decided not to add the extra
complexity of two modes on a compliant tunnel egress merely to complexity of two modes on a compliant tunnel egress merely to
cater for an historic security concern that is now considered cater for an historic security concern that is now considered
manageable. manageable.
9. Conclusions 10. Conclusions
This document allows tunnels to propagate an extra level of This document allows tunnels to propagate an extra level of
congestion severity. It uses previously unused combinations of inner congestion severity. It uses previously unused combinations of inner
and outer header to augment the rules for calculating the ECN field and outer header to augment the rules for calculating the ECN field
when decapsulating IP packets at the egress of IPsec (RFC4301) and when decapsulating IP packets at the egress of IPsec (RFC4301) and
non-IPsec (RFC3168) tunnels. non-IPsec (RFC3168) tunnels.
This document also updates the ingress tunnelling encapsulation of This document also updates the ingress tunnelling encapsulation of
RFC3168 ECN to bring all IP in IP tunnels into line with the new RFC3168 ECN to bring all IP in IP tunnels into line with the new
behaviour in the IPsec architecture of RFC4301, which copies rather behaviour in the IPsec architecture of RFC4301, which copies rather
skipping to change at page 31, line 22 skipping to change at page 32, line 47
At the same time as removing these legacy constraints, the At the same time as removing these legacy constraints, the
opportunity has been taken to draw together diverging tunnel opportunity has been taken to draw together diverging tunnel
specifications into a single consistent behaviour. Then any tunnel specifications into a single consistent behaviour. Then any tunnel
can be deployed unilaterally, and it will support the full range of can be deployed unilaterally, and it will support the full range of
congestion control and management schemes without any modes or congestion control and management schemes without any modes or
configuration. Further, any host or router can expect the ECN field configuration. Further, any host or router can expect the ECN field
to behave in the same way, whatever type of tunnel might intervene in to behave in the same way, whatever type of tunnel might intervene in
the path. This new certainty could enable new uses of the ECN field the path. This new certainty could enable new uses of the ECN field
that would otherwise be confounded by ambiguity. that would otherwise be confounded by ambiguity.
10. Acknowledgements 11. Acknowledgements
Thanks to David Black for his insightful reviews and patient Thanks to David Black for his insightful reviews and patient
explanations of better ways to think about function placement and explanations of better ways to think about function placement and
alarms. Thanks to David and to Anil Agawaal for pointing out cases alarms. Thanks to David and to Anil Agarwal for pointing out cases
where it is safe to forward CU combinations of headers. Also thanks where it is safe to forward CU combinations of headers. Also thanks
to Arnaud Jacquet for the idea for Appendix C. Thanks to Gorry to Arnaud Jacquet for the idea for Appendix C. Thanks to Gorry
Fairhurst, Teco Boot, Michael Menth, Bruce Davie, Toby Moncaster, Fairhurst, Teco Boot, Michael Menth, Bruce Davie, Toby Moncaster,
Sally Floyd, Alfred Hoenes, Gabriele Corliano, Ingemar Johansson and Sally Floyd, Alfred Hoenes, Gabriele Corliano, Ingemar Johansson and
Phil Eardley for their thoughts and careful review comments. Philip Eardley for their thoughts and careful review comments, and to
Stephen Hanna and Ben Campbell respectively for conducting the
Security Directorate and General Area reviews.
Bob Briscoe is partly funded by Trilogy, a research project (ICT- Bob Briscoe is partly funded by Trilogy, a research project (ICT-
216372) supported by the European Community under its Seventh 216372) supported by the European Community under its Seventh
Framework Programme. The views expressed here are those of the Framework Programme. The views expressed here are those of the
author only. author only.
Comments Solicited (to be removed by the RFC Editor): Comments Solicited (to be removed by the RFC Editor):
Comments and questions are encouraged and very welcome. They can be Comments and questions are encouraged and very welcome. They can be
addressed to the IETF Transport Area working group mailing list addressed to the IETF Transport Area working group mailing list
<tsvwg@ietf.org>, and/or to the authors. <tsvwg@ietf.org>, and/or to the authors.
11. References 12. References
11.1. Normative References
[RFC2003] Perkins, C., "IP Encapsulation
within IP", RFC 2003, October 1996.
[RFC2119] Bradner, S., "Key words for use in
RFCs to Indicate Requirement
Levels", BCP 14, RFC 2119,
March 1997.
[RFC3168] Ramakrishnan, K., Floyd, S., and D.
Black, "The Addition of Explicit
Congestion Notification (ECN) to
IP", RFC 3168, September 2001.
[RFC4301] Kent, S. and K. Seo, "Security
Architecture for the Internet
Protocol", RFC 4301, December 2005.
11.2. Informative References
[I-D.ietf-pcn-3-in-1-encoding] Briscoe, B. and T. Moncaster, "PCN
3-State Encoding Extension in a
single DSCP",
draft-ietf-pcn-3-in-1-encoding-01
(work in progress), February 2010.
[I-D.ietf-pcn-3-state-encoding] Briscoe, B., Moncaster, T., and M.
Menth, "A PCN encoding using 2
DSCPs to provide 3 or more states",
draft-ietf-pcn-3-state-encoding-01
(work in progress), February 2010.
[I-D.ietf-pcn-psdm-encoding] Menth, M., Babiarz, J., Moncaster, 12.1. Normative References
T., and B. Briscoe, "PCN Encoding
for Packet-Specific Dual Marking
(PSDM)",
draft-ietf-pcn-psdm-encoding-00
(work in progress), June 2009.
[I-D.ietf-pcn-sm-edge-behaviour] Charny, A., Karagiannis, G., Menth, [RFC2003] Perkins, C., "IP Encapsulation within
M., and T. Taylor, "PCN Boundary IP", RFC 2003, October 1996.
Node Behaviour for the Single
Marking (SM) Mode of Operation",
draft-ietf-pcn-sm-edge-behaviour-01
(work in progress), October 2009.
[I-D.satoh-pcn-st-marking] Satoh, D., Ueno, H., Maeda, Y., and [RFC2119] Bradner, S., "Key words for use in
O. Phanachet, "Single PCN Threshold RFCs to Indicate Requirement Levels",
Marking by using PCN baseline BCP 14, RFC 2119, March 1997.
encoding for both admission and
termination controls",
draft-satoh-pcn-st-marking-02 (work
in progress), September 2009.
[RFC2401] Kent, S. and R. Atkinson, "Security [RFC3168] Ramakrishnan, K., Floyd, S., and D.
Architecture for the Internet Black, "The Addition of Explicit
Protocol", RFC 2401, November 1998. Congestion Notification (ECN) to IP",
RFC 3168, September 2001.
[RFC2474] Nichols, K., Blake, S., Baker, F., [RFC4301] Kent, S. and K. Seo, "Security
and D. Black, "Definition of the Architecture for the Internet
Differentiated Services Field (DS Protocol", RFC 4301, December 2005.
Field) in the IPv4 and IPv6
Headers", RFC 2474, December 1998.
[RFC2481] Ramakrishnan, K. and S. Floyd, "A 12.2. Informative References
Proposal to add Explicit Congestion
Notification (ECN) to IP",
RFC 2481, January 1999.
[RFC2983] Black, D., "Differentiated Services [I-D.ietf-pcn-3-in-1-encoding] Briscoe, B., Moncaster, T., and M.
and Tunnels", RFC 2983, Menth, "Encoding 3 PCN-States in the
October 2000. IP header using a single DSCP",
draft-ietf-pcn-3-in-1-encoding-03
(work in progress), July 2010.
[RFC3540] Spring, N., Wetherall, D., and D. [RFC2401] Kent, S. and R. Atkinson, "Security
Ely, "Robust Explicit Congestion Architecture for the Internet
Notification (ECN) Signaling with Protocol", RFC 2401, November 1998.
Nonces", RFC 3540, June 2003.
[RFC4306] Kaufman, C., "Internet Key Exchange [RFC2474] Nichols, K., Blake, S., Baker, F.,
(IKEv2) Protocol", RFC 4306, and D. Black, "Definition of the
December 2005. Differentiated Services Field (DS
Field) in the IPv4 and IPv6 Headers",
RFC 2474, December 1998.
[RFC4774] Floyd, S., "Specifying Alternate [RFC2481] Ramakrishnan, K. and S. Floyd, "A
Semantics for the Explicit Proposal to add Explicit Congestion
Congestion Notification (ECN) Notification (ECN) to IP", RFC 2481,
Field", BCP 124, RFC 4774, January 1999.
November 2006.
[RFC5129] Davie, B., Briscoe, B., and J. Tay, [RFC2983] Black, D., "Differentiated Services
"Explicit Congestion Marking in and Tunnels", RFC 2983, October 2000.
MPLS", RFC 5129, January 2008.
[RFC5559] Eardley, P., "Pre-Congestion [RFC3540] Spring, N., Wetherall, D., and D.
Notification (PCN) Architecture", Ely, "Robust Explicit Congestion
RFC 5559, June 2009. Notification (ECN) Signaling with
Nonces", RFC 3540, June 2003.
[RFC5670] Eardley, P., "Metering and Marking [RFC4306] Kaufman, C., "Internet Key Exchange
Behaviour of PCN-Nodes", RFC 5670, (IKEv2) Protocol", RFC 4306,
November 2009. December 2005.
[RFC5696] Moncaster, T., Briscoe, B., and M. [RFC4774] Floyd, S., "Specifying Alternate
Menth, "Baseline Encoding and Semantics for the Explicit Congestion
Transport of Pre-Congestion Notification (ECN) Field", BCP 124,
Information", RFC 5696, RFC 4774, November 2006.
November 2009.
[VCP] Xia, Y., Subramanian, L., Stoica, [RFC5129] Davie, B., Briscoe, B., and J. Tay,
I., and S. Kalyanaraman, "One more "Explicit Congestion Marking in
bit is enough", Proc. SIGCOMM'05, MPLS", RFC 5129, January 2008.
ACM CCR 35(4)37--48, 2005,
<http://doi.acm.org/10.1145/1080091.1080098>.
Editorial Comments [RFC5559] Eardley, P., "Pre-Congestion
Notification (PCN) Architecture",
RFC 5559, June 2009.
[Note_Manual_Keying] Bob Briscoe: Note (To be removed by the RFC [RFC5670] Eardley, P., "Metering and Marking
Editor): One corner case can exist where an Behaviour of PCN-Nodes", RFC 5670,
RFC4301 ingress does not use IKEv2, but uses November 2009.
manual keying instead. Then an RFC4301 ingress
could conceivably be configured to tunnel to an
egress with limited functionality ECN handling.
Strictly, for this corner-case, the requirement
to use compatibility mode in this specification
updates RFC4301. However, this is such a remote
possibility that RFC4301 IPsec implementations
are not required to implement compatibility
mode. It is planned to remove this note after
the review process has completed to avoid
unnecessarily complicating the document with a
largely theoretical corner case.
[Note_PCN_egress] Bob Briscoe: During the review process Appendix [RFC5696] Moncaster, T., Briscoe, B., and M.
D is provided to expand on this point, but it Menth, "Baseline Encoding and
will be deleted before publication. Transport of Pre-Congestion
Information", RFC 5696,
November 2009.
[Note_PCN_ingress] Bob Briscoe: During the review process Appendix [VCP] Xia, Y., Subramanian, L., Stoica, I.,
E is provided to expand on this point, but it and S. Kalyanaraman, "One more bit is
will be deleted before publication. enough", Proc. SIGCOMM'05, ACM
CCR 35(4)37--48, 2005,
<http://doi.acm.org/10.1145/1080091.1080098>.
Appendix A. Early ECN Tunnelling RFCs Appendix A. Early ECN Tunnelling RFCs
IP in IP tunnelling was originally defined in [RFC2003]. On IP in IP tunnelling was originally defined in [RFC2003]. On
encapsulation, the incoming header was copied to the outer and on encapsulation, the incoming header was copied to the outer and on
decapsulation the outer was simply discarded. Initially, IPsec decapsulation the outer was simply discarded. Initially, IPsec
tunnelling [RFC2401] followed the same behaviour. tunnelling [RFC2401] followed the same behaviour.
When ECN was introduced experimentally in [RFC2481], legacy (RFC2003 When ECN was introduced experimentally in [RFC2481], legacy (RFC2003
or RFC2401) tunnels would have discarded any congestion markings or RFC2401) tunnels would have discarded any congestion markings
skipping to change at page 40, line 18 skipping to change at page 40, line 28
| | | represents 100 packets | | | represents 100 packets
| 30 | | | 30 | |
| | | p_t = 12/(100-30) | | | p_t = 12/(100-30)
p_t + +---------+ = 12/70 p_t + +---------+ = 12/70
| | 12 | = 17% | | 12 | = 17%
0 +-----+---------+---> 0 +-----+---------+--->
0 30% 100% inner header marking 0 30% 100% inner header marking
Figure 7: Tunnel Marking of Packets Already Marked at Ingress Figure 7: Tunnel Marking of Packets Already Marked at Ingress
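The arithmetic shown in Figure 7 can be checked with a short sketch. The function and parameter names are illustrative only. The reading that the 12 tunnel-marked packets are counted among the 70 that arrived at the ingress unmarked is taken from the figure's own expression, 12/(100-30).

```python
def tunnel_marking_fraction(total: int,
                            marked_at_ingress: int,
                            marked_by_tunnel: int) -> float:
    """Fraction of packets marked across the tunnel (p_t in Figure 7).

    Only packets that arrived unmarked at the ingress can reveal marking
    introduced by the tunnel, so they form the denominator.
    """
    unmarked_at_ingress = total - marked_at_ingress
    if unmarked_at_ingress <= 0:
        raise ValueError("no unmarked packets to measure against")
    return marked_by_tunnel / unmarked_at_ingress


# Values from Figure 7: 100 packets, 30 marked at ingress, 12 by the tunnel.
p_t = tunnel_marking_fraction(100, 30, 12)
print(f"p_t = {p_t:.0%}")
```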
Appendix D. Why Losing ECT(1) on Decapsulation Impedes PCN (to be Appendix D. Compromise on Decap with ECT(1) Inner and ECT(0) Outer
removed before publication)
Congestion notification with two severity levels is currently on the
IETF's standards track agenda in the Congestion and Pre-Congestion
Notification (PCN) working group. PCN needs all four possible states
of congestion signalling in the 2-bit ECN field to be propagated at
the egress, but pre-existing tunnels only propagate three. The four
PCN states are: not PCN-enabled, not marked and two increasingly
severe levels of congestion marking. The less severe marking means
'stop admitting new traffic' and the more severe marking means
'terminate some existing flows', which may be needed after reroutes
(see [RFC5559] for more details). (Note on terminology: wherever
this document counts four congestion states, the PCN working group
would count this as three PCN states plus a not-PCN-enabled state.)
Figure 2 (Section 3.2) shows that pre-existing decapsulation
behaviour would have discarded any ECT(1) markings in outer headers
if the inner was ECT(0). This prevented the PCN working group from
using ECT(1) -- if a PCN node used ECT(1) to indicate one of the
severity levels of congestion, any later tunnel egress would revert
the marking to ECT(0) as if nothing had happened. Effectively the
decapsulation rules of RFC4301 and RFC3168 waste one ECT codepoint;
they treat the ECT(0) and ECT(1) codepoints as a single codepoint.
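The loss of ECT(1) described above can be illustrated with a minimal sketch of the pre-existing decapsulation behaviour. It covers only the case where the inner header is ECT(0) or ECT(1), and it assumes the rule "propagate CE from the outer, otherwise keep the inner". The function and constant names are hypothetical.

```python
ECT0, ECT1, CE, NOT_ECT = "ECT(0)", "ECT(1)", "CE", "Not-ECT"


def legacy_decap_ect_inner(inner, outer):
    """Sketch of pre-existing (RFC3168 / RFC4301) decapsulation.

    Assumption: only an ECT(0) or ECT(1) inner is handled here. A CE
    outer is propagated; any other outer value is simply discarded.
    """
    if inner not in (ECT0, ECT1):
        raise ValueError("sketch only covers an ECT(0) or ECT(1) inner")
    return CE if outer == CE else inner


# An ECT(1) marking applied inside the tunnel is reverted to ECT(0).
print(legacy_decap_ect_inner(inner=ECT0, outer=ECT1))
```

Because the outer is consulted only for CE, an ECT(1) outer and an ECT(0) outer give the same result, which is why the two ECT codepoints behave as one.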
A number of work-rounds to this problem were proposed in the PCN w-g,
either to add the fourth state another way or to avoid needing it.
Without wishing to disparage the ingenuity of these work-rounds, none
was chosen for the standards track because each was somewhat
wasteful, imprecise or complicated:
o One uses a pair of Diffserv codepoints in place of each PCN DSCP
to encode the extra state [I-D.ietf-pcn-3-state-encoding], using
up the rapidly exhausting DSCP space while leaving an ECN
codepoint unused.
o Another survives tunnelling without an extra DSCP
[I-D.ietf-pcn-psdm-encoding], but it requires the PCN edge
gateways to share the initial state of a packet out of band.
o Another proposes a more involved marking algorithm in forwarding
elements to encode the three congestion notification states using
only two ECN codepoints [I-D.satoh-pcn-st-marking].
o Another takes a different approach; it compromises the precision
of the admission control mechanism in some network scenarios, but
manages to work with just three encoding states and a single
marking algorithm [I-D.ietf-pcn-sm-edge-behaviour].
Rather than require the IETF to bless any of these experimental
encoding work-rounds, the present specification fixes the root cause
of the problem so that operators deploying PCN can simply require
that tunnel end-points within a PCN region should comply with this
new ECN tunnelling specification. On the public Internet it would
not be possible to know whether all tunnels complied with this new
specification, but universal compliance is feasible for PCN, because
it is intended to be deployed in a controlled Diffserv region.
Given the present specification, the PCN w-g could progress a
trivially simple four-state ECN encoding
[I-D.ietf-pcn-3-in-1-encoding]. This would replace the interim
standards track baseline encoding of just three states [RFC5696],
which leaves a fourth state available for any of the experimental
alternatives.
Appendix E. Why Resetting ECN on Encapsulation Impedes PCN (to be
removed before publication)
The PCN architecture says "...if encapsulation is done within the
PCN-domain: Any PCN-marking is copied into the outer header. Note: A
tunnel will not provide this behaviour if it complies with [RFC3168]
tunnelling in either mode, but it will if it complies with [RFC4301]
IPsec tunnelling."
The specific issue here concerns PCN excess rate marking [RFC5670].
The purpose of excess rate marking is to provide a bulk mechanism for
interior nodes within a PCN domain to mark traffic that is exceeding
a configured threshold bit-rate, perhaps after an unexpected event
such as a reroute, a link or node failure, or a more widespread
disaster. Reroutes are a common cause of QoS degradation in IP
networks. After reroutes it is common for multiple links in a
network to become stressed at once. Therefore, PCN excess rate
marking has been carefully designed to ensure traffic marked at one
queue will not be counted again for marking at subsequent queues (see
the `Excess traffic meter function' of [RFC5670]).
However, if an RFC3168 tunnel ingress intervenes, it resets the ECN
field in all the outer headers. This will cause excess traffic to be
counted more than once, leading to many flows being removed that did
not need to be removed at all. This is why an RFC3168 tunnel
ingress cannot be used in a PCN domain.
The ECN reset in RFC3168 is no longer deemed necessary; it is
inconsistent with RFC4301, it is not as simple as RFC4301, and it is
impeding deployment of new protocols like PCN. The present
specification corrects this perverse situation.
Appendix F. Compromise on Decap with ECT(1) Inner and ECT(0) Outer
A packet with an ECT(1) inner and an ECT(0) outer should never arise
from any known IETF protocol. Without giving a reason, RFC3168 and
RFC4301 both say the outer should be ignored when decapsulating such
a packet. This appendix explains why it was decided not to change
this advice.
In summary, ECT(0) always means 'not congested' and ECT(1) may imply
the same [RFC3168] or it may imply a higher severity congestion
signal [RFC4774], [I-D.ietf-pcn-3-in-1-encoding], depending on the
skipping to change at page 43, line 20 skipping to change at page 41, line 32
Superficially, the opposite case where the inner and outer carry
different ECT values, but with an ECT(1) outer and ECT(0) inner,
seems to require a similar compromise. However, because that case is
reversed, no compromise is necessary; it is best to forward the outer
whether the transport expects the ECT(1) to mean a higher severity
than ECT(0) or the same severity. Forwarding the outer either
preserves a higher value (if it is higher) or it reveals an anomaly
to the transport (if the two ECT codepoints mean the same severity).
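The two mismatched cases discussed in this appendix can be summarised in a short sketch. It covers only packets whose inner and outer headers both carry an ECT value; Not-ECT and CE are outside its scope. The function name is hypothetical and the code paraphrases the text above rather than the normative rules of Section 4.2.

```python
ECT0, ECT1 = "ECT(0)", "ECT(1)"


def decap_ect_pair(inner, outer):
    """Outgoing codepoint when inner and outer are both ECT(0) or ECT(1).

    Sketch only: other codepoint combinations are deliberately rejected.
    """
    for codepoint in (inner, outer):
        if codepoint not in (ECT0, ECT1):
            raise ValueError("sketch only covers ECT(0) and ECT(1)")
    if inner == ECT1 and outer == ECT0:
        # The compromise: the outer is ignored, as RFC3168 and RFC4301
        # already advise.
        return inner
    # Equal values, or ECT(1) outer with ECT(0) inner: forward the outer.
    return outer


print(decap_ect_pair(inner=ECT1, outer=ECT0))
print(decap_ect_pair(inner=ECT0, outer=ECT1))
```

In both mismatched cases the outgoing codepoint is ECT(1), so a marking that may signal higher severity is never replaced by ECT(0).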
Appendix G. Open Issues Appendix E. Open Issues
The new decapsulation behaviour defined in Section 4.2 adds support
for propagation of 2 severity levels of congestion. However,
transports have no way to discover whether there are any legacy
tunnels on their path that will not propagate 2 severity levels. It
would have been useful to add a feature for transports to check path
support, but this remains an open issue that will have to be
addressed in any future standards action to define an end-to-end
scheme that requires 2 severity levels of congestion. PCN avoids
this problem because it is only for a controlled region, so all
 End of changes. 61 change blocks. 
349 lines changed or deleted 248 lines changed or added