draft-ietf-tsvwg-datagram-plpmtud-17.txt   draft-ietf-tsvwg-datagram-plpmtud-18.txt 
Internet Engineering Task Force G. Fairhurst Internet Engineering Task Force G. Fairhurst
Internet-Draft T. Jones Internet-Draft T. Jones
Updates: 4821, 4960, 6951, 8085, 8261 (if University of Aberdeen Updates: 4821, 4960, 6951, 8085, 8261 (if University of Aberdeen
approved) M. Tuexen approved) M. Tuexen
Intended status: Standards Track I. Ruengeler Intended status: Standards Track I. Ruengeler
Expires: 24 September 2020 T. Voelker Expires: 4 October 2020 T. Voelker
Muenster University of Applied Sciences Muenster University of Applied Sciences
23 March 2020 2 April 2020
Packetization Layer Path MTU Discovery for Datagram Transports Packetization Layer Path MTU Discovery for Datagram Transports
draft-ietf-tsvwg-datagram-plpmtud-17 draft-ietf-tsvwg-datagram-plpmtud-18
Abstract Abstract
This document describes a robust method for Path MTU Discovery This document describes a robust method for Path MTU Discovery
(PMTUD) for datagram Packetization Layers (PLs). It describes an (PMTUD) for datagram Packetization Layers (PLs). It describes an
extension to RFC 1191 and RFC 8201, which specifies ICMP-based Path extension to RFC 1191 and RFC 8201, which specifies ICMP-based Path
MTU Discovery for IPv4 and IPv6. The method allows a PL, or a MTU Discovery for IPv4 and IPv6. The method allows a PL, or a
datagram application that uses a PL, to discover whether a network datagram application that uses a PL, to discover whether a network
path can support the current size of datagram. This can be used to path can support the current size of datagram. This can be used to
detect and reduce the message size when a sender encounters a packet detect and reduce the message size when a sender encounters a packet
skipping to change at page 2, line 15 skipping to change at page 2, line 15
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on 24 September 2020. This Internet-Draft will expire on 4 October 2020.
Copyright Notice Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document. license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights Please review these documents carefully, as they describe your rights
skipping to change at page 2, line 41 skipping to change at page 2, line 41
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1. Classical Path MTU Discovery . . . . . . . . . . . . . . 4 1.1. Classical Path MTU Discovery . . . . . . . . . . . . . . 4
1.2. Packetization Layer Path MTU Discovery . . . . . . . . . 6 1.2. Packetization Layer Path MTU Discovery . . . . . . . . . 6
1.3. Path MTU Discovery for Datagram Services . . . . . . . . 7 1.3. Path MTU Discovery for Datagram Services . . . . . . . . 7
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 8 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 8
3. Features Required to Provide Datagram PLPMTUD . . . . . . . . 10 3. Features Required to Provide Datagram PLPMTUD . . . . . . . . 10
4. DPLPMTUD Mechanisms . . . . . . . . . . . . . . . . . . . . . 13 4. DPLPMTUD Mechanisms . . . . . . . . . . . . . . . . . . . . . 13
4.1. PLPMTU Probe Packets . . . . . . . . . . . . . . . . . . 13 4.1. PLPMTU Probe Packets . . . . . . . . . . . . . . . . . . 13
4.2. Confirmation of Probed Packet Size . . . . . . . . . . . 14 4.2. Confirmation of Probed Packet Size . . . . . . . . . . . 15
4.3. Black Hole Detection . . . . . . . . . . . . . . . . . . 15 4.3. Black Hole Detection . . . . . . . . . . . . . . . . . . 15
4.4. The Maximum Packet Size (MPS) . . . . . . . . . . . . . . 16 4.4. The Maximum Packet Size (MPS) . . . . . . . . . . . . . . 16
4.5. Disabling the Effect of PMTUD . . . . . . . . . . . . . . 17 4.5. Disabling the Effect of PMTUD . . . . . . . . . . . . . . 17
4.6. Response to PTB Messages . . . . . . . . . . . . . . . . 17 4.6. Response to PTB Messages . . . . . . . . . . . . . . . . 17
4.6.1. Validation of PTB Messages . . . . . . . . . . . . . 17 4.6.1. Validation of PTB Messages . . . . . . . . . . . . . 17
4.6.2. Use of PTB Messages . . . . . . . . . . . . . . . . . 18 4.6.2. Use of PTB Messages . . . . . . . . . . . . . . . . . 18
5. Datagram Packetization Layer PMTUD . . . . . . . . . . . . . 20 5. Datagram Packetization Layer PMTUD . . . . . . . . . . . . . 20
5.1. DPLPMTUD Components . . . . . . . . . . . . . . . . . . . 20 5.1. DPLPMTUD Components . . . . . . . . . . . . . . . . . . . 21
5.1.1. Timers . . . . . . . . . . . . . . . . . . . . . . . 21 5.1.1. Timers . . . . . . . . . . . . . . . . . . . . . . . 21
5.1.2. Constants . . . . . . . . . . . . . . . . . . . . . . 21 5.1.2. Constants . . . . . . . . . . . . . . . . . . . . . . 22
5.1.3. Variables . . . . . . . . . . . . . . . . . . . . . . 22 5.1.3. Variables . . . . . . . . . . . . . . . . . . . . . . 22
5.1.4. Overview of DPLPMTUD Phases . . . . . . . . . . . . . 23 5.1.4. Overview of DPLPMTUD Phases . . . . . . . . . . . . . 23
5.2. State Machine . . . . . . . . . . . . . . . . . . . . . . 25 5.2. State Machine . . . . . . . . . . . . . . . . . . . . . . 25
5.3. Search to Increase the PLPMTU . . . . . . . . . . . . . . 28 5.3. Search to Increase the PLPMTU . . . . . . . . . . . . . . 28
5.3.1. Probing for a larger PLPMTU . . . . . . . . . . . . . 28 5.3.1. Probing for a larger PLPMTU . . . . . . . . . . . . . 28
5.3.2. Selection of Probe Sizes . . . . . . . . . . . . . . 29 5.3.2. Selection of Probe Sizes . . . . . . . . . . . . . . 29
5.3.3. Resilience to Inconsistent Path Information . . . . . 29 5.3.3. Resilience to Inconsistent Path Information . . . . . 29
5.4. Robustness to Inconsistent Paths . . . . . . . . . . . . 30 5.4. Robustness to Inconsistent Paths . . . . . . . . . . . . 30
6. Specification of Protocol-Specific Methods . . . . . . . . . 30 6. Specification of Protocol-Specific Methods . . . . . . . . . 30
6.1. Application support for DPLPMTUD with UDP or UDP-Lite . . 30 6.1. Application support for DPLPMTUD with UDP or UDP-Lite . . 30
skipping to change at page 3, line 46 skipping to change at page 3, line 46
6.3.1. Initial Connectivity . . . . . . . . . . . . . . . . 35 6.3.1. Initial Connectivity . . . . . . . . . . . . . . . . 35
6.3.2. Sending QUIC Probe Packets . . . . . . . . . . . . . 35 6.3.2. Sending QUIC Probe Packets . . . . . . . . . . . . . 35
6.3.3. Validating the Path with QUIC . . . . . . . . . . . . 36 6.3.3. Validating the Path with QUIC . . . . . . . . . . . . 36
6.3.4. Handling of PTB Messages by QUIC . . . . . . . . . . 36 6.3.4. Handling of PTB Messages by QUIC . . . . . . . . . . 36
7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 36 7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 36
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 36 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 36
9. Security Considerations . . . . . . . . . . . . . . . . . . . 36 9. Security Considerations . . . . . . . . . . . . . . . . . . . 36
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 38 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 38
10.1. Normative References . . . . . . . . . . . . . . . . . . 38 10.1. Normative References . . . . . . . . . . . . . . . . . . 38
10.2. Informative References . . . . . . . . . . . . . . . . . 39 10.2. Informative References . . . . . . . . . . . . . . . . . 39
Appendix A. Revision Notes . . . . . . . . . . . . . . . . . . . 40 Appendix A. Revision Notes . . . . . . . . . . . . . . . . . . . 41
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 45 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 45
1. Introduction 1. Introduction
The IETF has specified datagram transport using UDP, SCTP, and DCCP, The IETF has specified datagram transport using UDP, SCTP, and DCCP,
as well as protocols layered on top of these transports (e.g., SCTP/ as well as protocols layered on top of these transports (e.g., SCTP/
UDP, DCCP/UDP, QUIC/UDP), and direct datagram transport over the IP UDP, DCCP/UDP, QUIC/UDP), and direct datagram transport over the IP
network layer. This document describes a robust method for Path MTU network layer. This document describes a robust method for Path MTU
Discovery (PMTUD) that can be used with these transport protocols (or Discovery (PMTUD) that can be used with these transport protocols (or
the applications that use their transport service) to discover an the applications that use their transport service) to discover an
skipping to change at page 6, line 24 skipping to change at page 6, line 24
validate the message, because validation depends on information validate the message, because validation depends on information
about the active transport flows at an endpoint node (e.g., the about the active transport flows at an endpoint node (e.g., the
socket/address pairs being used, and other protocol header socket/address pairs being used, and other protocol header
information). information).
* When a packet is encapsulated/tunneled over an encrypted * When a packet is encapsulated/tunneled over an encrypted
transport, the tunnel/encapsulation ingress might have transport, the tunnel/encapsulation ingress might have
insufficient context, or computational power, to reconstruct the insufficient context, or computational power, to reconstruct the
transport header that would be needed to perform validation. transport header that would be needed to perform validation.
* When an ICMP message is generated by a router in a network segment
that has inserted a header into a packet, the quoted packet could
contain additional protocol header information that was not
included in the original sent packet, and which the PL sender does
not process or may not know how to process. This could disrupt
the ability of the sender to validate this PTB message.
* A Network Address Translation (NAT) device that translates a * A Network Address Translation (NAT) device that translates a
packet header, ought to also translate ICMP messages and update packet header, ought to also translate ICMP messages and update
the ICMP quoted packet [RFC5508] in that message. If this is not the ICMP quoted packet [RFC5508] in that message. If this is not
correctly translated then the sender would not be able to correctly translated then the sender would not be able to
associate the message with the PL that originated the packet, and associate the message with the PL that originated the packet, and
hence this ICMP message cannot be validated. hence this ICMP message cannot be validated.
1.2. Packetization Layer Path MTU Discovery 1.2. Packetization Layer Path MTU Discovery
The term Packetization Layer (PL) has been introduced to describe the The term Packetization Layer (PL) has been introduced to describe the
skipping to change at page 9, line 20 skipping to change at page 9, line 23
Effective PMTU: The Effective PMTU is the current estimated value Effective PMTU: The Effective PMTU is the current estimated value
for PMTU that is used by a PMTUD. This is equivalent to the for PMTU that is used by a PMTUD. This is equivalent to the
PLPMTU derived by PLPMTUD plus the size of any headers added below PLPMTU derived by PLPMTUD plus the size of any headers added below
the PL, including the IP layer headers. the PL, including the IP layer headers.
EMTU_S: The Effective MTU for sending (EMTU_S) is defined in EMTU_S: The Effective MTU for sending (EMTU_S) is defined in
[RFC1122] as "the maximum IP datagram size that may be sent, for a [RFC1122] as "the maximum IP datagram size that may be sent, for a
particular combination of IP source and destination addresses...". particular combination of IP source and destination addresses...".
EMTU_R: The Effective MTU for receiving (EMTU_R) is designated in EMTU_R: The Effective MTU for receiving (EMTU_R) is designated in
[RFC1122] as the largest datagram size that can be reassembled by [RFC1122] as "the largest datagram size that can be reassembled".
EMTU_R (Effective MTU to receive).
Link: A Link is a communication facility or medium over which nodes Link: A Link is a communication facility or medium over which nodes
can communicate at the link layer, i.e., a layer below the IP can communicate at the link layer, i.e., a layer below the IP
layer. Examples are Ethernet LANs and Internet (or higher) layer layer. Examples are Ethernet LANs and Internet (or higher) layer
tunnels. tunnels.
Link MTU: The Link Maximum Transmission Unit (MTU) is the size in Link MTU: The Link Maximum Transmission Unit (MTU) is the size in
bytes of the largest IP packet, including the IP header and bytes of the largest IP packet, including the IP header and
payload, that can be transmitted over a link. Note that this payload, that can be transmitted over a link. Note that this
could more properly be called the IP MTU, to be consistent with could more properly be called the IP MTU, to be consistent with
skipping to change at page 9, line 47 skipping to change at page 9, line 49
[RFC4821], that states "All links MUST enforce their MTU: links [RFC4821], that states "All links MUST enforce their MTU: links
that might non-deterministically deliver packets that are larger that might non-deterministically deliver packets that are larger
than their rated MTU MUST consistently discard such packets." than their rated MTU MUST consistently discard such packets."
MAX_PLPMTU: The MAX_PLPMTU is the largest size of PLPMTU that MAX_PLPMTU: The MAX_PLPMTU is the largest size of PLPMTU that
DPLPMTUD will attempt to use. DPLPMTUD will attempt to use.
MIN_PLPMTU: The MIN_PLPMTU is the smallest size of PLPMTU that MIN_PLPMTU: The MIN_PLPMTU is the smallest size of PLPMTU that
DPLPMTUD will attempt to use. DPLPMTUD will attempt to use.
MPS: MPS: The Maximum Packet Size (MPS) is the largest size of MPS: The Maximum Packet Size (MPS) is the largest size of
application data block that can be sent across a network path by a application data block that can be sent across a network path by a
PL using a single Datagram. PL using a single Datagram.
Packet: A Packet is the IP header plus the IP payload. Packet: A Packet is the IP header plus the IP payload.
Packetization Layer (PL): The PL is a layer of the network stack Packetization Layer (PL): The PL is a layer of the network stack
that places data into packets and performs transport protocol that places data into packets and performs transport protocol
functions. Examples of a PL include: TCP, SCTP, SCTP over DTLS or functions. Examples of a PL include: TCP, SCTP, SCTP over DTLS or
QUIC. QUIC.
skipping to change at page 11, line 18 skipping to change at page 11, line 21
able to transmit a packet larger than the PLPMTU. This is used able to transmit a packet larger than the PLPMTU. This is used
to send a probe packet. In IPv4, a probe packet MUST be sent to send a probe packet. In IPv4, a probe packet MUST be sent
with the Don't Fragment (DF) bit set in the IP header, and with the Don't Fragment (DF) bit set in the IP header, and
without network layer endpoint fragmentation. In IPv6, a probe without network layer endpoint fragmentation. In IPv6, a probe
packet is always sent without source fragmentation (as specified packet is always sent without source fragmentation (as specified
in section 5.4 of [RFC8201]). in section 5.4 of [RFC8201]).
3. Reception feedback: The destination PL endpoint is REQUIRED to 3. Reception feedback: The destination PL endpoint is REQUIRED to
provide a feedback method that indicates to the DPLPMTUD sender provide a feedback method that indicates to the DPLPMTUD sender
when a probe packet has been received by the destination PL when a probe packet has been received by the destination PL
endpoint. endpoint. Section 6 provides examples of how a PL can provide
this acknowledgment of received probe packets.
4. Probe loss recovery: It is RECOMMENDED to use probe packets that 4. Probe loss recovery: It is RECOMMENDED to use probe packets that
do not carry any user data that would require retransmission if do not carry any user data that would require retransmission if
lost. Most datagram transports permit this. If a probe packet lost. Most datagram transports permit this. If a probe packet
contains user data requiring retransmission in case of loss, the contains user data requiring retransmission in case of loss, the
PL (or layers above) are REQUIRED to arrange any retransmission/ PL (or layers above) are REQUIRED to arrange any retransmission/
repair of any resulting loss. The PL is REQUIRED to be robust repair of any resulting loss. The PL is REQUIRED to be robust
in the case where probe packets are lost due to other reasons in the case where probe packets are lost due to other reasons
(including link transmission error, congestion). (including link transmission error, congestion).
skipping to change at page 12, line 23 skipping to change at page 12, line 26
the interval between probe packets MUST be at least one RTT. If the interval between probe packets MUST be at least one RTT. If
transmission of probe packets is limited by the congestion transmission of probe packets is limited by the congestion
controller, this could result in transmission of probe packets controller, this could result in transmission of probe packets
being delayed or suspended during congestion. being delayed or suspended during congestion.
8. Loss of a probe packet SHOULD NOT be treated as an indication of 8. Loss of a probe packet SHOULD NOT be treated as an indication of
congestion and SHOULD NOT trigger a congestion control reaction congestion and SHOULD NOT trigger a congestion control reaction
[RFC4821], because this could result in unnecessary reduction of [RFC4821], because this could result in unnecessary reduction of
the sending rate. the sending rate.
9. An update to the PLPMTU (or MPS) MUST NOT modify the congestion 9. An update to the PLPMTU (or MPS) MUST NOT increase the
window measured in bytes [RFC4821]. Therefore, an increase in congestion window measured in bytes [RFC4821]. Therefore, an
the packet size does not cause an increase the data rate in increase in the packet size does not cause an increase in the
bytes per second. data rate in bytes per second.
10. Probing and flow control: Flow control at the PL concerns the 10. A PL that maintains the congestion window in terms of a limit to
the number of outstanding fixed size packets SHOULD adapt this
limit to compensate for the size of the actual packets.
11. Probing and flow control: Flow control at the PL concerns the
end-to-end flow of data using the PL service. This does not end-to-end flow of data using the PL service. This does not
apply to DPLPMTUD when probe packets use a design that does not apply to DPLPMTUD when probe packets use a design that does not
carry user data to the remote application. carry user data to the remote application.
11. Shared PLPMTU state: The PMTU value calculated from the PLPMTU 12. Shared PLPMTU state: The PMTU value calculated from the PLPMTU
MAY also be stored with the corresponding entry associated with MAY also be stored with the corresponding entry associated with
the destination in the IP layer cache, and used by other PL the destination in the IP layer cache, and used by other PL
instances. The specification of PLPMTUD [RFC4821] states: "If instances. The specification of PLPMTUD [RFC4821] states: "If
PLPMTUD updates the MTU for a particular path, all Packetization PLPMTUD updates the MTU for a particular path, all Packetization
Layer sessions that share the path representation (as described Layer sessions that share the path representation (as described
in Section 5.2 of [RFC4821]) SHOULD be notified to make use of in Section 5.2 of [RFC4821]) SHOULD be notified to make use of
the new MTU". Such methods MUST be robust to the wide variety the new MTU". Such methods MUST be robust to the wide variety
of underlying network forwarding behaviors. Section 5.2 of of underlying network forwarding behaviors. Section 5.2 of
[RFC8201] provides guidance on the caching of PMTU information [RFC8201] provides guidance on the caching of PMTU information
and also the relation to IPv6 flow labels. and also the relation to IPv6 flow labels.
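The congestion window requirements above (a PLPMTU update must not increase the congestion window in bytes, and a PL that counts fixed-size packets adapts its limit) can be illustrated with a minimal sketch. The function name and parameters are hypothetical and not taken from the draft; it assumes the window is kept as a packet count and is rescaled so that the byte budget does not grow when the packet size changes.

```python
def rescale_packet_cwnd(cwnd_packets: int, old_size: int, new_size: int) -> int:
    """Return a packet-count window for new_size that keeps the byte budget.

    Hypothetical helper: the byte budget is cwnd_packets * old_size, and the
    result is rounded down so the budget is not increased by a larger PLPMTU.
    """
    if cwnd_packets < 1 or old_size <= 0 or new_size <= 0:
        raise ValueError("window and packet sizes must be positive")
    byte_budget = cwnd_packets * old_size
    # Assumption: always allow at least one packet so the flow can progress,
    # even though one large packet may then exceed the previous byte budget.
    return max(1, byte_budget // new_size)
```

For example, a window of 10 packets of 1200 bytes becomes 8 packets of 1500 bytes, so the number of bytes in flight stays at or below 12000.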
skipping to change at page 14, line 29 skipping to change at page 14, line 36
Probing using application data and padding data: A probe packet that Probing using application data and padding data: A probe packet that
contains a data block supplied by an application that is combined contains a data block supplied by an application that is combined
with padding to inflate the length of the datagram to the size of with padding to inflate the length of the datagram to the size of
the probe packet. the probe packet.
Probing using application data: A probe packet that contains a data Probing using application data: A probe packet that contains a data
block supplied by an application that matches the size of the block supplied by an application that matches the size of the
probe packet. This method requests the application to issue a probe packet. This method requests the application to issue a
data block of the desired probe size. data block of the desired probe size.
A PL that uses a probe packet carrying an application data and needs A PL that uses a probe packet carrying application data and needs
protection from the loss of this probe packet, could perform protection from the loss of this probe packet could perform
transport-layer retransmission/repair of the data block (e.g., by transport-layer retransmission/repair of the data block (e.g., by
retransmission after loss is detected or by duplicating the data retransmission after loss is detected or by duplicating the data
block in a datagram without the padding data). This retransmited block in a datagram without the padding data). This retransmitted
data block might possibly need to be sent using a smaller PLPMTU, data block might possibly need to be sent using a smaller PLPMTU,
which could need the PL to use a smaller packet size to traverse which could need the PL to use a smaller packet size to traverse
the end-to-end path. (This could utilize endpoint network-layer the end-to-end path. (This could utilize endpoint network-layer
fragmentation or a PL that can re-segment the data block into fragmentation or a PL that can re-segment the data block into
multiple datagrams.) multiple datagrams.)
DPLPMTUD MAY choose to use only one of these methods to simplify the DPLPMTUD MAY choose to use only one of these methods to simplify the
implementation. implementation.
Probe messages sent by a PL MUST contain enough information to Probe messages sent by a PL MUST contain enough information to
uniquely identify the probe within Maximum Segment Lifetime (e.g., uniquely identify the probe within Maximum Segment Lifetime (e.g.,
skipping to change at page 15, line 21 skipping to change at page 15, line 28
A PL that does not acknowledge data reception (e.g., UDP and UDP- A PL that does not acknowledge data reception (e.g., UDP and UDP-
Lite) is unable itself to detect when the packets that it sends are Lite) is unable itself to detect when the packets that it sends are
discarded because their size is greater than the actual PMTU. These discarded because their size is greater than the actual PMTU. These
PLs need to rely on an application protocol to detect this loss. PLs need to rely on an application protocol to detect this loss.
Section 6 specifies this function for a set of IETF-specified Section 6 specifies this function for a set of IETF-specified
protocols. protocols.
4.3. Black Hole Detection 4.3. Black Hole Detection
The description that follows uses the set of constants defined in
Section 5.1.2 and variables defined in Section 5.1.3.
Black Hole Detection is triggered by an indication that the network Black Hole Detection is triggered by an indication that the network
path could be unable to support the current PLPMTU size. path could be unable to support the current PLPMTU size.
There are three ways to detect black holes: There are three ways to detect black holes:
* A validated PTB message can be received that indicates a * A validated PTB message can be received that indicates a
PL_PTB_SIZE less than the current PLPMTU. A DPLPMTUD method MUST PL_PTB_SIZE less than the current PLPMTU. A DPLPMTUD method MUST
NOT rely solely on this method. NOT rely solely on this method.
* A PL can use the DPLPMTUD probing mechanism to periodically * A PL can use the DPLPMTUD probing mechanism to periodically
skipping to change at page 15, line 48 skipping to change at page 16, line 10
* A PL can utilize an event that indicates the network path no * A PL can utilize an event that indicates the network path no
longer sustains the sender's PLPMTU size. This could use a longer sustains the sender's PLPMTU size. This could use a
mechanism implemented within the PL to detect excessive loss of mechanism implemented within the PL to detect excessive loss of
data sent with a specific packet size and then conclude that this data sent with a specific packet size and then conclude that this
excessive loss could be a result of an invalid PLPMTU (as in excessive loss could be a result of an invalid PLPMTU (as in
PLPMTUD for TCP [RFC4821]). PLPMTUD for TCP [RFC4821]).
A PL MAY inhibit sending probe packets when no application data has A PL MAY inhibit sending probe packets when no application data has
been sent since the previous probe packet. A PL preferring to use an been sent since the previous probe packet. A PL preferring to use an
up-to-date PLPMTU once user data is sent again, MAY choose to up-to-date PLPMTU once user data is sent again MAY choose to continue
continue PLPMTU discovery for each path. However, this could result PLPMTU discovery for each path. However, this could result in
in additional packets being sent. additional packets being sent.
When the method detects the current PLPMTU is not supported, DPLPMTUD When the method detects the current PLPMTU is not supported, DPLPMTUD
sets a lower PLPMTU, and sets a lower MPS. The PL then confirms that sets a lower PLPMTU, and sets a lower MPS. The PL then confirms that
the new PLPMTU can be successfully used across the path. A probe the new PLPMTU can be successfully used across the path. A probe
packet could need to have a size less than the size of the data block packet could need to have a size less than the size of the data block
generated by the application. generated by the application.
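The black hole handling described above could be sketched as follows. This is an illustration under stated assumptions, not the state machine of Section 5.2: the class name, the fallback size, and the threshold of three consecutive lost probes are hypothetical, and the sketch covers only the validated-PTB trigger and the probe-loss trigger.

```python
from dataclasses import dataclass


@dataclass
class BlackHoleDetector:
    plpmtu: int                  # current PLPMTU in bytes
    fallback_plpmtu: int         # hypothetical lower size used after detection
    max_probes: int = 3          # assumed threshold of consecutive lost probes
    lost_probes: int = 0
    needs_confirmation: bool = False

    def _reduce(self, new_plpmtu: int) -> None:
        # Lower the PLPMTU (the MPS would be lowered with it) and mark the
        # new size as unconfirmed until a probe of that size is acknowledged.
        self.plpmtu = new_plpmtu
        self.lost_probes = 0
        self.needs_confirmation = True

    def on_probe_acked(self) -> None:
        self.lost_probes = 0
        self.needs_confirmation = False

    def on_probe_lost(self) -> bool:
        """Count a lost probe of the current PLPMTU; True if a black hole is declared."""
        self.lost_probes += 1
        if self.lost_probes >= self.max_probes:
            self._reduce(self.fallback_plpmtu)
            return True
        return False

    def on_validated_ptb(self, pl_ptb_size: int) -> bool:
        """Handle an already validated PTB message; True if the PLPMTU was lowered."""
        if pl_ptb_size < self.plpmtu:
            # Simplification: never go below the fallback size.
            self._reduce(max(pl_ptb_size, self.fallback_plpmtu))
            return True
        return False
```

Because a sender cannot rely solely on PTB messages, the probe-loss path is kept independent of the PTB path in this sketch.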
4.4. The Maximum Packet Size (MPS) 4.4. The Maximum Packet Size (MPS)
The result of probing determines a usable PLPMTU, which is used to The result of probing determines a usable PLPMTU, which is used to
skipping to change at page 17, line 4 skipping to change at page 17, line 14
If DPLPMTUD results in a change to the MPS, the application needs to If DPLPMTUD results in a change to the MPS, the application needs to
adapt to the new MPS. A particular case can arise when packets have adapt to the new MPS. A particular case can arise when packets have
been sent with a size less than the MPS and the PLPMTU was been sent with a size less than the MPS and the PLPMTU was
subsequently reduced. If these packets are lost, the PL MAY segment subsequently reduced. If these packets are lost, the PL MAY segment
the data using the new MPS. If a PL is unable to re-segment a the data using the new MPS. If a PL is unable to re-segment a
previously sent datagram (e.g., [RFC4960]), then the sender either previously sent datagram (e.g., [RFC4960]), then the sender either
discards the datagram or could perform retransmission using network- discards the datagram or could perform retransmission using network-
layer fragmentation to form multiple IP packets not larger than the layer fragmentation to form multiple IP packets not larger than the
PLPMTU. For IPv4, the use of endpoint fragmentation by the sender is PLPMTU. For IPv4, the use of endpoint fragmentation by the sender is
preferred over clearing the DF-bit in the IPv4 header. Operational preferred over clearing the DF bit in the IPv4 header. Operational
experience reveals that IP fragmentation can reduce the reliability experience reveals that IP fragmentation can reduce the reliability
of Internet communication [I-D.ietf-intarea-frag-fragile], which may of Internet communication [I-D.ietf-intarea-frag-fragile], which may
reduce the success of retransmission. reduce the success of retransmission.
4.5. Disabling the Effect of PMTUD 4.5. Disabling the Effect of PMTUD
A PL implementing this specification MUST suspend network layer A PL implementing this specification MUST suspend network layer
processing of outgoing packets that enforces a PMTU processing of outgoing packets that enforces a PMTU
[RFC1191][RFC8201] for each flow utilizing DPLPMTUD, and instead use [RFC1191][RFC8201] for each flow utilizing DPLPMTUD, and instead use
DPLPMTUD to control the size of packets that are sent by a flow. DPLPMTUD to control the size of packets that are sent by a flow.
skipping to change at page 17, line 38 skipping to change at page 17, line 48
both of which are referred to as PTB messages in this document. both of which are referred to as PTB messages in this document.
4.6.1. Validation of PTB Messages 4.6.1. Validation of PTB Messages
This section specifies utilization of PTB messages. This section specifies utilization of PTB messages.
* A simple implementation MAY ignore received PTB messages and in * A simple implementation MAY ignore received PTB messages and in
this case the PLPMTU is not updated when a PTB message is this case the PLPMTU is not updated when a PTB message is
received. received.
* An implementation that supports PTB messages MUST validate * A PL that supports PTB messages MUST validate these messages
messages before they are further processed. before they are further processed.
A PL that receives a PTB message from a router or middlebox, performs A PL that receives a PTB message from a router or middlebox performs
ICMP validation as specified in Section 5.2 of [RFC8085][RFC8201]. ICMP validation as specified in Section 5.2 of [RFC8085][RFC8201].
Because DPLPMTUD operates at the PL, the PL needs to check that each Because DPLPMTUD operates at the PL, the PL needs to check that each
received PTB message is received in response to a packet transmitted received PTB message is received in response to a packet transmitted
by the endpoint PL performing DPLPMTUD. by the endpoint PL performing DPLPMTUD.
The PL MUST check the protocol information in the quoted packet The PL MUST check the protocol information in the quoted packet
carried in an ICMP PTB message payload to validate the message carried in an ICMP PTB message payload to validate the message
originated from the sending node. This validation includes originated from the sending node. This validation includes
determining that the combination of the IP addresses, the protocol, determining that the combination of the IP addresses, the protocol,
the source port and destination port match those returned in the the source port and destination port match those returned in the
quoted packet - this is also necessary for the PTB message to be quoted packet - this is also necessary for the PTB message to be
passed to the corresponding PL. passed to the corresponding PL.
The validation SHOULD utilize information that it is not simple for The validation SHOULD utilize information that it is not simple for
an off-path attacker to determine [RFC8085]. For example, by an off-path attacker to determine [RFC8085]. For example, it could
checking the value of a protocol header field known only to the two check the value of a protocol header field known only to the two PL
PL endpoints. A datagram application that uses well-known source and endpoints. A datagram application that uses well-known source and
destination ports ought to also rely on other information to complete destination ports ought to also rely on other information to complete
this validation. this validation.
These checks are intended to provide protection from packets that These checks are intended to provide protection from packets that
originate from a node that is not on the network path. A PTB message originate from a node that is not on the network path. A PTB message
that does not complete the validation MUST NOT be further utilized by that does not complete the validation MUST NOT be further utilized by
the DPLPMTUD method. the DPLPMTUD method.
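The validation steps above can be illustrated with a minimal sketch. It assumes the PL has already parsed the quoted packet from the PTB payload into a flow description, and that the PL carries some value known only to the two endpoints (called a token here). The names FlowKey, validate_ptb and the token parameters are hypothetical and are not defined by this document.

```python
from dataclasses import dataclass
from typing import Optional
import hmac


@dataclass(frozen=True)
class FlowKey:
    """Addressing information of a packet sent by the PL (hypothetical type)."""
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int


def validate_ptb(sent_flow: FlowKey,
                 quoted_flow: FlowKey,
                 expected_token: Optional[bytes] = None,
                 quoted_token: Optional[bytes] = None) -> bool:
    """Return True only if the quoted packet matches a flow we sent on.

    The address, protocol and port comparison is the mandatory check.
    The token comparison is an assumed example of "information that is
    not simple for an off-path attacker to determine".
    """
    if quoted_flow != sent_flow:
        return False
    if expected_token is None:
        # No endpoint-only value is available; rely on the flow match alone.
        return True
    if quoted_token is None:
        # The quoted packet was too short to carry the token.
        return False
    return hmac.compare_digest(expected_token, quoted_token)
```

A PTB message for which this check returns False would not be passed to the DPLPMTUD method.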
PTB messages that have been validated MAY be utilized by the DPLPMTUD PTB messages that have been validated MAY be utilized by the DPLPMTUD
algorithm, but MUST NOT be used directly to set the PLPMTU. The algorithm, but MUST NOT be used directly to set the PLPMTU. The
skipping to change at page 18, line 48 skipping to change at page 19, line 11
This section provides a summary of how PTB messages can be utilized. This section provides a summary of how PTB messages can be utilized.
This processing depends on the PL_PTB_SIZE and the current value of a This processing depends on the PL_PTB_SIZE and the current value of a
set of variables: set of variables:
PL_PTB_SIZE < MIN_PLPMTU PL_PTB_SIZE < MIN_PLPMTU
* Invalid PL_PTB_SIZE see Section 4.6.1. * Invalid PL_PTB_SIZE see Section 4.6.1.
* PTB message ought to be discarded without further processing * PTB message ought to be discarded without further processing
(i.e., PLPMTU is not modified). (i.e., PLPMTU is not modified).
* The information could be utilized as an input to a trigger that * The information could be utilized as an input that triggers
would enable a resilience mode. enabling a resilience mode (see Section 5.3.3).
MIN_PLPMTU < PL_PTB_SIZE < BASE_PLPMTU MIN_PLPMTU < PL_PTB_SIZE < BASE_PLPMTU
* A robust PL MAY enter an error state (see Section 5.2) for an * A robust PL MAY enter an error state (see Section 5.2) for an
IPv4 path when the PL_PTB_SIZE reported in the PTB message is IPv4 path when the PL_PTB_SIZE reported in the PTB message is
larger than or equal to 68 bytes [RFC0791] and when this is larger than or equal to 68 bytes [RFC0791] and when this is
less than the BASE_PLPMTU. less than the BASE_PLPMTU.
* A robust PL MAY enter an error state (see Section 5.2) for an * A robust PL MAY enter an error state (see Section 5.2) for an
IPv6 path when the PL_PTB_SIZE reported in the PTB message is IPv6 path when the PL_PTB_SIZE reported in the PTB message is
larger than or equal to 1280 bytes [RFC8200] and when this is larger than or equal to 1280 bytes [RFC8200] and when this is
less than the BASE_PLPMTU. less than the BASE_PLPMTU.
PL_PTB_SIZE = PLPMTU
* Completes the search for a larger PLPMTU.
PL_PTB_SIZE > PROBED_SIZE
* Inconsistent network signal.
* PTB message ought to be discarded without further processing
(i.e., PLPMTU is not modified).
* The information could be utilized as an input to trigger
enabling a resilience mode.
BASE_PLPMTU <= PL_PTB_SIZE < PLPMTU BASE_PLPMTU <= PL_PTB_SIZE < PLPMTU
* This could be an indication of a black hole. The PLPMTU SHOULD * This could be an indication of a black hole. The PLPMTU SHOULD
be set to BASE_PLPMTU (the PLPMTU is reduced to the BASE_PLPMTU be set to BASE_PLPMTU (the PLPMTU is reduced to the BASE_PLPMTU
to avoid unnecessary packet loss when a black hole is to avoid unnecessary packet loss when a black hole is
encountered). encountered).
* The PL ought to start a search to quickly discover the new * The PL ought to start a search to quickly discover the new
PLPMTU. The PL_PTB_SIZE reported in the PTB message can be PLPMTU. The PL_PTB_SIZE reported in the PTB message can be
used to initialize a search algorithm. used to initialize a search algorithm.
PL_PTB_SIZE = PLPMTU
* Completes the search for a larger PLPMTU.
PLPMTU < PL_PTB_SIZE < PROBED_SIZE PLPMTU < PL_PTB_SIZE < PROBED_SIZE
* The PLPMTU continues to be valid, but the size of a packet used * The PLPMTU continues to be valid, but the size of a packet used
to search (PROBED_SIZE) was larger than the actual PMTU. to search (PROBED_SIZE) was larger than the actual PMTU.
* The PLPMTU is not updated. * The PLPMTU is not updated.
* The PL can use the reported PL_PTB_SIZE from the PTB message as * The PL can use the reported PL_PTB_SIZE from the PTB message as
the next search point when it resumes the search algorithm. the next search point when it resumes the search algorithm.
PL_PTB_SIZE > PROBED_SIZE
* Inconsistent network signal.
* PTB message ought to be discarded without further processing
(i.e., PLPMTU is not modified).
* The information could be utilized as an input to trigger
enabling a resilience mode.
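The size ranges listed above can be sketched as a single classification function. This is an illustrative reading of the list, not a normative algorithm. It assumes MIN_PLPMTU <= BASE_PLPMTU <= PLPMTU < PROBED_SIZE, treats a reported size equal to MIN_PLPMTU as belonging to the second range, and returns a separate value for PL_PTB_SIZE = PROBED_SIZE because the list does not name that case. The names PtbAction and classify_ptb are hypothetical.

```python
from enum import Enum


class PtbAction(Enum):
    DISCARD_INVALID = "discard: PL_PTB_SIZE below MIN_PLPMTU"
    MAY_ENTER_ERROR = "a robust PL may enter the error state"
    BLACK_HOLE_SUSPECTED = "set PLPMTU to BASE_PLPMTU and start a search"
    SEARCH_COMPLETE = "search for a larger PLPMTU is complete"
    NEXT_SEARCH_POINT = "keep PLPMTU; use PL_PTB_SIZE as next search point"
    DISCARD_INCONSISTENT = "discard: PL_PTB_SIZE above PROBED_SIZE"
    UNSPECIFIED = "case not listed (PL_PTB_SIZE equals PROBED_SIZE)"


def classify_ptb(pl_ptb_size: int, min_plpmtu: int, base_plpmtu: int,
                 plpmtu: int, probed_size: int) -> PtbAction:
    """Map a validated PL_PTB_SIZE onto the cases listed in the text."""
    if pl_ptb_size < min_plpmtu:
        return PtbAction.DISCARD_INVALID
    if pl_ptb_size < base_plpmtu:
        return PtbAction.MAY_ENTER_ERROR
    if pl_ptb_size < plpmtu:
        return PtbAction.BLACK_HOLE_SUSPECTED
    if pl_ptb_size == plpmtu:
        return PtbAction.SEARCH_COMPLETE
    if pl_ptb_size < probed_size:
        return PtbAction.NEXT_SEARCH_POINT
    if pl_ptb_size > probed_size:
        return PtbAction.DISCARD_INCONSISTENT
    return PtbAction.UNSPECIFIED
```

In every case the PLPMTU itself is only changed by the DPLPMTUD method acting on the result, never directly by the PTB message.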
5. Datagram Packetization Layer PMTUD 5. Datagram Packetization Layer PMTUD
This section specifies Datagram PLPMTUD (DPLPMTUD). The method can This section specifies Datagram PLPMTUD (DPLPMTUD). The method can
be introduced at various points (as indicated with * in the figure be introduced at various points (as indicated with * in the figure
below) in the IP protocol stack to discover the PLPMTU so that an below) in the IP protocol stack to discover the PLPMTU so that an
application can utilize an appropriate MPS for the current network application can utilize an appropriate MPS for the current network
path. path.
DPLPMTUD SHOULD NOT be used by an upper PL or application if it is DPLPMTUD SHOULD NOT be used by an upper PL or application if it is
already used in a lower layer, DPLPMTUD SHOULD only be performed once already used in a lower layer. DPLPMTUD SHOULD only be performed once
between a pair of endpoints. A PL MUST adjust the MPS indicated by between a pair of endpoints. A PL MUST adjust the MPS indicated by
DPLPMTUD to account for any additional overhead introduced by the PL. DPLPMTUD to account for any additional overhead introduced by the PL.
+----------------------+ +----------------------+
| Application* | | Application* |
+-----+------------+---+ +-----+------------+---+
| | | |
+---+--+ +--+--+ +---+--+ +--+--+
| QUIC*| |SCTP*| | QUIC*| |SCTP*|
+---+--+ +-+-+-+ +---+--+ +-+-+-+
skipping to change at page 21, line 23 skipping to change at page 21, line 28
timer value are provided in section 3.1.1 of the UDP Usage timer value are provided in section 3.1.1 of the UDP Usage
Guidelines [RFC8085]. Guidelines [RFC8085].
PMTU_RAISE_TIMER: The PMTU_RAISE_TIMER is configured to the period a PMTU_RAISE_TIMER: The PMTU_RAISE_TIMER is configured to the period a
sender will continue to use the current PLPMTU, after which it re- sender will continue to use the current PLPMTU, after which it re-
enters the Search phase. This timer has a period of 600 seconds, enters the Search phase. This timer has a period of 600 seconds,
as recommended by PLPMTUD [RFC4821]. as recommended by PLPMTUD [RFC4821].
DPLPMTUD MAY inhibit sending probe packets when no application DPLPMTUD MAY inhibit sending probe packets when no application
data has been sent since the previous probe packet. A PL data has been sent since the previous probe packet. A PL
preferring to use an up-to-data PMTU once user data is sent again, preferring to use an up-to-date PMTU once user data is sent again,
can choose to continue PMTU discovery for each path. However, can choose to continue PMTU discovery for each path. However,
this could result in sending additional packets. this could result in sending additional packets.
CONFIRMATION_TIMER: When an acknowledged PL is used, this timer MUST CONFIRMATION_TIMER: When an acknowledged PL is used, this timer MUST
NOT be used. For other PLs, the CONFIRMATION_TIMER is configured NOT be used. For other PLs, the CONFIRMATION_TIMER is configured
to the period a PL sender waits before confirming the current to the period a PL sender waits before confirming the current
PLPMTU is still supported. This is less than the PMTU_RAISE_TIMER PLPMTU is still supported. This is less than the PMTU_RAISE_TIMER
and used to decrease the PLPMTU (e.g., when a black hole is and used to decrease the PLPMTU (e.g., when a black hole is
encountered). Confirmation needs to be frequent enough when data encountered). Confirmation needs to be frequent enough when data
is flowing that the sending PL does not black hole extensive is flowing that the sending PL does not black hole extensive
amounts of traffic. Guidance on selection of the timer value is amounts of traffic. Guidance on selection of the timer value is
provided in section 3.1.1 of the UDP Usage Guidelines [RFC8085]. provided in section 3.1.1 of the UDP Usage Guidelines [RFC8085].
DPLPMTUD MAY inhibit sending probe packets when no application DPLPMTUD MAY inhibit sending probe packets when no application
data has been sent since the previous probe packet. A PL data has been sent since the previous probe packet. A PL
preferring to use an up-to-date PMTU once user data is sent again, preferring to use an up-to-date PMTU once user data is sent again,
can choose to continue PMTU discovery for each path. However, can choose to continue PMTU discovery for each path. However,
this could result in sending additional packets. this could result in sending additional packets.
An implementation could implement the various timers using a single The various timers could be implemented using a single timer.
timer.
5.1.2. Constants 5.1.2. Constants
The following constants are defined: The following constants are defined:
MAX_PROBES: The MAX_PROBES is the maximum value of the PROBE_COUNT MAX_PROBES: The MAX_PROBES is the maximum value of the PROBE_COUNT
counter (see Section 5.1.3). MAX_PROBES represents the limit for counter (see Section 5.1.3). MAX_PROBES represents the limit for
the number of consecutive probe attempts of any size. Search the number of consecutive probe attempts of any size. Search
algorithms benefit from a MAX_PROBES valugreater than 1 because algorithms benefit from a MAX_PROBES value greater than 1 because
this can provide robustness to isolated packet loss. The default this can provide robustness to isolated packet loss. The default
value of MAX_PROBES is 3. value of MAX_PROBES is 3.
MIN_PLPMTU: The MIN_PLPMTU is the smallest allowed probe packet MIN_PLPMTU: The MIN_PLPMTU is the smallest allowed probe packet
size. For IPv6, this value is 1280 bytes, as specified in size. For IPv6, this value is 1280 bytes, as specified in
[RFC8200]. For IPv4, the minimum value is 68 bytes. [RFC8200]. For IPv4, the minimum value is 68 bytes.
Note: An IPv4 router is required to be able to forward a datagram Note: An IPv4 router is required to be able to forward a datagram
of 68 bytes without further fragmentation. This is the combined of 68 bytes without further fragmentation. This is the combined
size of an IPv4 header and the minimum fragment size of 8 bytes. size of an IPv4 header and the minimum fragment size of 8 bytes.
skipping to change at page 23, line 28 skipping to change at page 24, line 5
Figure 3: Relationships between packet size constants and variables Figure 3: Relationships between packet size constants and variables
5.1.4. Overview of DPLPMTUD Phases 5.1.4. Overview of DPLPMTUD Phases
This section provides a high-level informative view of the DPLPMTUD This section provides a high-level informative view of the DPLPMTUD
method, by describing the movement of the method through several method, by describing the movement of the method through several
phases of operation. More detail is available in the state machine phases of operation. More detail is available in the state machine
Section 5.2. Section 5.2.
+------+ +------+
+------->| Base |-----------------+ Connectivity +------->| Base |-----------------+ Connectivity
| +------+ | or BASE_PLPMTU | +------+ | or BASE_PLPMTU
| | | confirmation failed | | | confirmation failed
| | v | | v
| | Connectivity +-------+ | | Connectivity +-------+
| | and BASE_PLPMTU | Error | | | and BASE_PLPMTU | Error |
| | confirmed +-------+ | | confirmed +-------+
| | | Consistent | | | Consistent
| v | connectivity | v | connectivity
PLPMTU | +--------+ | and BASE_PLPMTU Black Hole | +--------+ | and BASE_PLPMTU
confirmation | | Search |<---------------+ confirmed detected | | Search |<---------------+ confirmed
failed | +--------+ | +--------+
| ^ | | ^ |
| | | | | |
| Raise | | Search | Raise | | Search
| timer | | algorithm | timer | | algorithm
| expired | | completed | expired | | completed
| | | | | |
| | v | | v
| +-----------------+ | +-----------------+
+---| Search Complete | +---| Search Complete |
+-----------------+ +-----------------+
Figure 4: DPLPMTUD Phases Figure 4: DPLPMTUD Phases
Base: The Base Phase confirms connectivity to the remote peer using Base: The Base Phase confirms connectivity to the remote peer using
packets of the BASE_PLPMTU. This phase is implicit for a packets of the BASE_PLPMTU. This phase is implicit for a
connection-oriented PL (where it can be performed in a PL connection-oriented PL (where it can be performed in a PL
connection handshake). A connectionless PL sends a probe packet connection handshake). A connectionless PL sends a probe packet
and uses acknowledgment of this probe packet to confirm that the and uses acknowledgment of this probe packet to confirm that the
remote peer is reachable. remote peer is reachable.
skipping to change at page 24, line 41 skipping to change at page 25, line 18
Complete Phase. Complete Phase.
A PL could respond to PTB messages using the PTB to advance or A PL could respond to PTB messages using the PTB to advance or
terminate the search, see Section 4.6. terminate the search, see Section 4.6.
Search Complete: The Search Complete Phase is entered when the Search Complete: The Search Complete Phase is entered when the
PLPMTU is supported across the network path. A PL can use a PLPMTU is supported across the network path. A PL can use a
CONFIRMATION_TIMER to periodically repeat a probe packet for the CONFIRMATION_TIMER to periodically repeat a probe packet for the
current PLPMTU size. If the sender is unable to confirm current PLPMTU size. If the sender is unable to confirm
reachability (e.g., if the CONFIRMATION_TIMER expires) or the PL reachability (e.g., if the CONFIRMATION_TIMER expires) or the PL
signals a lack of reachability, DPLPMTUD enters the Base phase. signals a lack of reachability, a black hole has been detected and
DPLPMTUD enters the Base phase.
The PMTU_RAISE_TIMER is used to periodically resume the search The PMTU_RAISE_TIMER is used to periodically resume the search
phase to discover if the PLPMTU can be raised. Black Hole phase to discover if the PLPMTU can be raised. Black Hole
Detection causes the sender to enter the Base Phase. Detection causes the sender to enter the Base Phase.
Error: The Error Phase is entered when there is conflicting or Error: The Error Phase is entered when there is conflicting or
invalid PLPMTU information for the path (e.g., a failure to invalid PLPMTU information for the path (e.g., a failure to
support the BASE_PLPMTU) that cause DPLPMTUD to be unable to support the BASE_PLPMTU) that cause DPLPMTUD to be unable to
progress and the PLPMTU is lowered. progress and the PLPMTU is lowered.
DPLPMTUD remains in the Error Phase until a consistent view of the DPLPMTUD remains in the Error Phase until a consistent view of the
path can be discovered and it has also been confirmed that the path can be discovered and it has also been confirmed that the
path supports the BASE_PLPMTU (or DPLPMTUD is suspended). path supports the BASE_PLPMTU (or DPLPMTUD is suspended).
An implementation that only reduces the PLPMTU to a suitable size A method that only reduces the PLPMTU to a suitable size would be
would be sufficient to ensure reliable operation, but can be very sufficient to ensure reliable operation, but can be very inefficient
inefficient when the actual PMTU changes or when the method (for when the actual PMTU changes or when the method (for whatever reason)
whatever reason) makes a suboptimal choice for the PLPMTU. makes a suboptimal choice for the PLPMTU.
A full implementation of DPLPMTUD provides an algorithm enabling the A full implementation of DPLPMTUD provides an algorithm enabling the
DPLPMTUD sender to increase the PLPMTU following a change in the DPLPMTUD sender to increase the PLPMTU following a change in the
characteristics of the path, such as when a link is reconfigured with characteristics of the path, such as when a link is reconfigured with
a larger MTU, or when there is a change in the set of links traversed a larger MTU, or when there is a change in the set of links traversed
by an end-to-end flow (e.g., after a routing or path fail-over by an end-to-end flow (e.g., after a routing or path fail-over
decision). decision).
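The phase transitions shown in Figure 4 can be sketched as a small lookup table. This is only an informative illustration of the figure. The names Phase, Event and next_phase are hypothetical, and leaving the phase unchanged for any (phase, event) pair not drawn in the figure is an assumption of this sketch. The normative behaviour is given by the state machine in Section 5.2.

```python
from enum import Enum, auto


class Phase(Enum):
    BASE = auto()
    ERROR = auto()
    SEARCH = auto()
    SEARCH_COMPLETE = auto()


class Event(Enum):
    BASE_CONFIRMED = auto()          # connectivity and BASE_PLPMTU confirmed
    BASE_FAILED = auto()             # connectivity or BASE_PLPMTU confirmation failed
    CONSISTENT_CONFIRMED = auto()    # consistent connectivity and BASE_PLPMTU confirmed
    SEARCH_DONE = auto()             # search algorithm completed
    RAISE_TIMER_EXPIRED = auto()     # PMTU_RAISE_TIMER expired
    BLACK_HOLE_DETECTED = auto()     # black hole detected


# One entry per arrow in Figure 4.
TRANSITIONS = {
    (Phase.BASE, Event.BASE_CONFIRMED): Phase.SEARCH,
    (Phase.BASE, Event.BASE_FAILED): Phase.ERROR,
    (Phase.ERROR, Event.CONSISTENT_CONFIRMED): Phase.SEARCH,
    (Phase.SEARCH, Event.SEARCH_DONE): Phase.SEARCH_COMPLETE,
    (Phase.SEARCH_COMPLETE, Event.RAISE_TIMER_EXPIRED): Phase.SEARCH,
    (Phase.SEARCH_COMPLETE, Event.BLACK_HOLE_DETECTED): Phase.BASE,
}


def next_phase(phase: Phase, event: Event) -> Phase:
    """Return the phase after an event; unlisted pairs keep the phase."""
    return TRANSITIONS.get((phase, event), phase)
```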
5.2. State Machine 5.2. State Machine
skipping to change at page 29, line 41 skipping to change at page 29, line 41
sizes from a table of common PMTU sizes. When selecting the sizes from a table of common PMTU sizes. When selecting the
appropriate next size to search, an implementer ought to also appropriate next size to search, an implementer ought to also
consider that there can be common sizes of MPS that applications seek consider that there can be common sizes of MPS that applications seek
to use, and there could be common sizes of MTU used within the to use, and there could be common sizes of MTU used within the
network. network.
5.3.3. Resilience to Inconsistent Path Information 5.3.3. Resilience to Inconsistent Path Information
A decision to increase the PLPMTU needs to be resilient to the A decision to increase the PLPMTU needs to be resilient to the
possibility that information learned about the network path is possibility that information learned about the network path is
inconsistent. A path is inconsistent, when, for example, probe inconsistent. A path is inconsistent when, for example, probe
packets are lost due to other reasons (i.e., not packet size) or due packets are lost due to other reasons (i.e., not packet size) or due
to frequent path changes. Frequent path changes could occur by to frequent path changes. Frequent path changes could occur by
unexpected "flapping" - where some packets from a flow pass along one unexpected "flapping" - where some packets from a flow pass along one
path, but other packets follow a different path with different path, but other packets follow a different path with different
properties. properties.
A PL sender is able to detect inconsistency from the sequence of A PL sender is able to detect inconsistency from the sequence of
PLPMTU probes that are acknowledged or the sequence of PTB messages PLPMTU probes that are acknowledged or the sequence of PTB messages
that it receives. When inconsistent path information is detected, a that it receives. When inconsistent path information is detected, a
PL sender could use an alternate search mode that clamps the offered PL sender could use an alternate search mode that clamps the offered
MPS to a smaller value for a period of time. This avoids unnecessary MPS to a smaller value for a period of time. This avoids unnecessary
loss of packets. loss of packets.
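One possible way to detect inconsistency and clamp the offered MPS is sketched below. It assumes that a probe size whose outcome changes (acknowledged, then lost, or the reverse) counts as one sign of inconsistency, and that a fixed number of such changes triggers the clamp. The class name, the threshold, the clamp value and the hold period are all hypothetical; this document does not define them. Isolated packet loss can also produce such changes, so a real PL would likely need a more careful detector.

```python
from typing import Dict, Optional


class ResilienceClamp:
    """Clamp the offered MPS for a period after inconsistent probe results."""

    def __init__(self, clamp_mps: int, hold_seconds: float,
                 flip_threshold: int = 2) -> None:
        self.clamp_mps = clamp_mps            # smaller MPS used while clamped
        self.hold_seconds = hold_seconds      # how long the clamp lasts
        self.flip_threshold = flip_threshold  # outcome changes that trigger it
        self._last_outcome: Dict[int, bool] = {}
        self._flips = 0
        self._clamped_until: Optional[float] = None

    def record_probe(self, size: int, acked: bool, now: float) -> None:
        """Record a probe result; count a change of outcome for one size."""
        previous = self._last_outcome.get(size)
        if previous is not None and previous != acked:
            self._flips += 1
        self._last_outcome[size] = acked
        if self._flips >= self.flip_threshold:
            self._clamped_until = now + self.hold_seconds
            self._flips = 0

    def offered_mps(self, current_mps: int, now: float) -> int:
        """Return the MPS to offer to the application at time `now`."""
        if self._clamped_until is not None and now < self._clamped_until:
            return min(current_mps, self.clamp_mps)
        return current_mps
```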
5.4. Robustness to Inconsistent Paths 5.4. Robustness to Inconsistent Paths
Some paths could be unable to sustain packets of the BASE_PLPMTU Some paths could be unable to sustain packets of the BASE_PLPMTU
size. To be robust to these paths an implementation could implement size. The Error State could be implemented to provide robustness to
the Error State. This allows fallback to a smaller than desired such paths. This allows fallback to a smaller than desired PLPMTU,
PLPMTU, rather than suffer connectivity failure. This could utilize rather than suffer connectivity failure. This could utilize methods
methods such as endpoint IP fragmentation to enable the PL sender to such as endpoint IP fragmentation to enable the PL sender to
communicate using packets smaller than the BASE_PLPMTU. communicate using packets smaller than the BASE_PLPMTU.
6. Specification of Protocol-Specific Methods 6. Specification of Protocol-Specific Methods
DPLPMTUD requires protocol-specific details to be specified for each DPLPMTUD requires protocol-specific details to be specified for each
PL that is used. PL that is used.
The first subsection provides guidance on how to implement the The first subsection provides guidance on how to implement the
DPLPMTUD method as a part of an application using UDP or UDP-Lite. DPLPMTUD method as a part of an application using UDP or UDP-Lite.
The guidance also applies to other datagram services that do not The guidance also applies to other datagram services that do not
skipping to change at page 31, line 11 skipping to change at page 31, line 11
use common method for managing the PLPMTU has benefits, both in the use common method for managing the PLPMTU has benefits, both in the
ability to share state between different processes and opportunities ability to share state between different processes and opportunities
to coordinate probing. to coordinate probing.
6.1.1. Application Request 6.1.1. Application Request
An application needs an application-layer protocol mechanism (such as An application needs an application-layer protocol mechanism (such as
a message acknowledgment method) that solicits a response from a a message acknowledgment method) that solicits a response from a
destination endpoint. The method SHOULD allow the sender to check destination endpoint. The method SHOULD allow the sender to check
the value returned in the response to provide additional protection the value returned in the response to provide additional protection
from off-path insertion of data [RFC8085], suitable methods include a from off-path insertion of data [RFC8085]. Suitable methods include
parameter known only to the two endpoints, such as a session ID or a parameter known only to the two endpoints, such as a session ID or
initialized sequence number. initialized sequence number.
6.1.2. Application Response 6.1.2. Application Response
An application needs an application-layer protocol mechanism to An application needs an application-layer protocol mechanism to
communicate the response from the destination endpoint. This communicate the response from the destination endpoint. This
response could indicate successful reception of the probe across the response could indicate successful reception of the probe across the
path, but could also indicate that some (or all packets) have failed path, but could also indicate that some (or all packets) have failed
to reach the destination. to reach the destination.
skipping to change at page 36, line 4 skipping to change at page 36, line 4
A probe packet consists of a QUIC Header and a payload containing A probe packet consists of a QUIC Header and a payload containing
PADDING Frames and a PING Frame. PADDING Frames are a single octet PADDING Frames and a PING Frame. PADDING Frames are a single octet
(0x00) and several of these can be used to create a probe packet of (0x00) and several of these can be used to create a probe packet of
size PROBED_SIZE. QUIC provides an acknowledged PL, a sender can size PROBED_SIZE. QUIC provides an acknowledged PL, a sender can
therefore enter the BASE state as soon as connectivity has been therefore enter the BASE state as soon as connectivity has been
confirmed. confirmed.
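The payload of such a probe can be sketched as follows. The PADDING octet (0x00) is taken from the text above. The PING frame type (0x01) is taken from the QUIC transport specification and is an assumption relative to this document. The function name and the overhead parameter are hypothetical: the caller is assumed to supply the total number of bytes consumed by the IP, UDP and QUIC headers and any authentication tag, since those sizes depend on the connection.

```python
PADDING_FRAME = b"\x00"  # single-octet PADDING frame, as stated in the text
PING_FRAME = b"\x01"     # PING frame type, assumed from the QUIC specification


def build_probe_payload(probed_size: int, overhead: int) -> bytes:
    """Build a frame payload so the whole packet is PROBED_SIZE bytes.

    `overhead` is the caller-supplied count of all non-payload bytes
    in the packet (hypothetical parameter).
    """
    payload_len = probed_size - overhead
    if payload_len < 1:
        raise ValueError("PROBED_SIZE leaves no room for a PING frame")
    # One PING frame to elicit an acknowledgment, padded to the target size.
    return PING_FRAME + PADDING_FRAME * (payload_len - 1)
```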
The current specification of QUIC sets the following: The current specification of QUIC sets the following:
* BASE_PLPMTU: A QUIC sender pads initial packets to confirm the * BASE_PLPMTU: A QUIC sender pads initial packets to confirm the
path can support packets of the required size, this sets the path can support packets of the required size, which sets the
BASE_PLPMTU and MIN_PLPMTU. BASE_PLPMTU and MIN_PLPMTU.
* MIN_PLPMTU: A QUIC sender that determines the MIN_PLPMTU has * MIN_PLPMTU: A QUIC sender that determines the MIN_PLPMTU has
fallen MUST immediately stop sending on the affected path. fallen MUST immediately stop sending on the affected path.
6.3.3. Validating the Path with QUIC 6.3.3. Validating the Path with QUIC
QUIC provides an acknowledged PL. A sender therefore MUST NOT QUIC provides an acknowledged PL. A sender therefore MUST NOT
implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.
skipping to change at page 36, line 50 skipping to change at page 36, line 50
The security considerations for the use of UDP and SCTP are provided The security considerations for the use of UDP and SCTP are provided
in the referenced RFCs. in the referenced RFCs.
To avoid excessive load, the interval between individual probe To avoid excessive load, the interval between individual probe
packets MUST be at least one RTT, and the interval between rounds of packets MUST be at least one RTT, and the interval between rounds of
probing is determined by the PMTU_RAISE_TIMER. probing is determined by the PMTU_RAISE_TIMER.
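The pacing rule above can be expressed as a simple check before each probe is sent. This is a minimal sketch. The function name is hypothetical, and the caller is assumed to supply the current time, the time of the previous probe and an RTT estimate in the same units.

```python
from typing import Optional


def may_send_probe(now: float, last_probe_time: Optional[float],
                   rtt: float) -> bool:
    """Return True if at least one RTT has passed since the last probe."""
    if last_probe_time is None:
        # No probe has been sent yet on this path.
        return True
    return (now - last_probe_time) >= rtt
```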
A PL sender needs to ensure that the method used to confirm reception A PL sender needs to ensure that the method used to confirm reception
of probe packets protects from off-path attackers injecting packets of probe packets protects from off-path attackers injecting packets
into the path. This protection if provided in IETF-defined protocols into the path. This protection is provided in IETF-defined protocols
(e.g., TCP, SCTP) using a randomly-initialized sequence number. A (e.g., TCP, SCTP) using a randomly-initialized sequence number. A
description of one way to do this when using UDP is provided in description of one way to do this when using UDP is provided in
section 5.1 of [RFC8085]. section 5.1 of [RFC8085].
There are cases where ICMP Packet Too Big (PTB) messages are not There are cases where ICMP Packet Too Big (PTB) messages are not
delivered due to policy, configuration or equipment design (see delivered due to policy, configuration or equipment design (see
Section 1.1), this method therefore does not rely upon PTB messages Section 1.1). This method therefore does not rely upon PTB messages
being received, but is able to utilize these when they are received being received, but is able to utilize these when they are received
by the sender. PTB messages could potentially be used to cause a by the sender. PTB messages could potentially be used to cause a
node to inappropriately reduce the PLPMTU. A node supporting node to inappropriately reduce the PLPMTU. A node supporting
DPLPMTUD MUST therefore appropriately validate the payload of PTB DPLPMTUD MUST therefore appropriately validate the payload of PTB
messages to ensure these are received in response to transmitted messages to ensure these are received in response to transmitted
traffic (i.e., a reported error condition that corresponds to a traffic (i.e., a reported error condition that corresponds to a
datagram actually sent by the path layer, see Section 4.6.1). datagram actually sent by the path layer, see Section 4.6.1).
An on-path attacker, able to create a PTB message could forge PTB An on-path attacker able to create a PTB message could forge PTB
messages that include a valid quoted IP packet. Such an attack could messages that include a valid quoted IP packet. Such an attack could
be used to drive down the PLPMTU. There are two ways this method can be used to drive down the PLPMTU. There are two ways this method can
be mitigated against such attacks: First, by ensuring that a PL be mitigated against such attacks: First, by ensuring that a PL
sender never reduces the PLPMTU below the base size, solely in sender never reduces the PLPMTU below the base size, solely in
response to receiving a PTB message. This is achieved by first response to receiving a PTB message. This is achieved by first
entering the BASE state when such a message is received. Second, the entering the BASE state when such a message is received. Second, the
design does not require processing of PTB messages, a PL sender could design does not require processing of PTB messages, a PL sender could
therefore suspend processing of PTB messages (e.g., in a robustness therefore suspend processing of PTB messages (e.g., in a robustness
mode after detecting that subsequent probes actually confirm that a mode after detecting that subsequent probes actually confirm that a
size larger than the PTB_SIZE is supported by a path). size larger than the PTB_SIZE is supported by a path).
Parsing the quoted packet inside a PTB message can introduce additional
per-packet processing at the PL sender. This processing SHOULD be
limited to avoid a denial of service attack when arbitrary headers
are included. Rate-limiting the processing could result in PTB
messages not being received by a PL, however the DPLPMTUD method is
robust to such loss.
The successful processing of an ICMP message can trigger a probe when The successful processing of an ICMP message can trigger a probe when
the reported PTB size is valid, but this does not directly update the the reported PTB size is valid, but this does not directly update the
PLPMTU for the path. This prevents a message attempting to black PLPMTU for the path. This prevents a message attempting to black
hole data by indicating a size larger than supported by the path. hole data by indicating a size larger than supported by the path.
draft -17:
Parallel forwarding paths SHOULD be considered. Section 5.4
identifies the need for robustness in the method because the path
information might be inconsistent.
draft -18:
It is possible that the information about a path is not stable. This
could be a result of forwarding across more than one path that has a
different actual PMTU or a single path presents a varying PMTU. The
design of a PLPMTUD implementation SHOULD consider how to mitigate
the effects of varying path information. One possible mitigation is
to provide robustness (see Section 5.4) in the method that avoids
oscillation in the MPS.
draft -17:
A node performing DPLPMTUD could experience conflicting information
about the size of supported probe packets. This could occur when
there are multiple paths are concurrently in use and these exhibit a
different PMTU. If not considered, this could result in packets not
being delivered (black holed) when the PLPMTU results in a packet
larger than the smallest actual PMTU.
draft -18:
A node performing DPLPMTUD could experience conflicting information
about the size of supported probe packets. This could occur when
multiple paths are concurrently in use and these exhibit a different
PMTU. If not considered, this could result in packets not being
delivered (black holed) when the PLPMTU results in a packet larger
than the smallest actual PMTU.
DPLPMTUD methods can introduce padding data to inflate the length of DPLPMTUD methods can introduce padding data to inflate the length of
the datagram to the total size required for a probe packet. The the datagram to the total size required for a probe packet. The
total size of a probe packet includes all headers and padding added total size of a probe packet includes all headers and padding added
to the payload data being sent (e.g., including security-related to the payload data being sent (e.g., including security-related
fields such as an AEAD tag and TLS record layer padding). The value fields such as an AEAD tag and TLS record layer padding). The value
of the padding data does not influence the DPLPMTUD search algorithm, of the padding data does not influence the DPLPMTUD search algorithm,
and therefore needs to be set consistent with the policy of the PL. and therefore needs to be set consistent with the policy of the PL.
If a PL can make use of cryptographic confidentiality or data- If a PL can make use of cryptographic confidentiality or data-
skipping to change at page 45, line 11 skipping to change at page 45, line 23
Working group draft -17: Working group draft -17:
* Updated text after GENART and IETF-LC. * Updated text after GENART and IETF-LC.
* Renamed BASE_MTU to BASE_PLPMTU, and MIN and MAX PMTU to PLPMTU * Renamed BASE_MTU to BASE_PLPMTU, and MIN and MAX PMTU to PLPMTU
(because these are about a base for the PLPMTU), and ensured (because these are about a base for the PLPMTU), and ensured
consistent separation of PMTU and PLPMTU. consistent separation of PMTU and PLPMTU.
* Adopted US-style English throughout. * Adopted US-style English throughout.
Working group draft -18:
* Updated text and address nits from OPSDIR, ART and IESG reviews.
* Order PTB processing based on PL_PTB_SIZE
Authors' Addresses Authors' Addresses
Godred Fairhurst Godred Fairhurst
University of Aberdeen University of Aberdeen
School of Engineering School of Engineering
Fraser Noble Building Fraser Noble Building
Aberdeen Aberdeen
AB24 3UE AB24 3UE
United Kingdom United Kingdom
 End of changes. 46 change blocks. 
96 lines changed or deleted 128 lines changed or added
