draft-ietf-mmusic-ice-04.txt   draft-ietf-mmusic-ice-05.txt 
MMUSIC J. Rosenberg MMUSIC J. Rosenberg
Internet-Draft Cisco Systems Internet-Draft Cisco Systems
Expires: August 22, 2005 February 21, 2005 Expires: January 18, 2006 July 17, 2005
Interactive Connectivity Establishment (ICE): A Methodology for Interactive Connectivity Establishment (ICE): A Methodology for Network
Network Address Translator (NAT) Traversal for Multimedia Session Address Translator (NAT) Traversal for Offer/Answer Protocols
Establishment Protocols draft-ietf-mmusic-ice-05
draft-ietf-mmusic-ice-04
Status of this Memo Status of this Memo
This document is an Internet-Draft and is subject to all provisions By submitting this Internet-Draft, each author represents that any
of section 3 of RFC 3667. By submitting this Internet-Draft, each applicable patent or other IPR claims of which he or she is aware
author represents that any applicable patent or other IPR claims of have been or will be disclosed, and any of which he or she becomes
which he or she is aware have been or will be disclosed, and any of aware will be disclosed, in accordance with Section 6 of BCP 79.
which he or she become aware will be disclosed, in accordance with
RFC 3668.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as other groups may also distribute working documents as Internet-
Internet-Drafts. Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on August 22, 2005. This Internet-Draft will expire on January 18, 2006.
Copyright Notice Copyright Notice
Copyright (C) The Internet Society (2005). Copyright (C) The Internet Society (2005).
Abstract Abstract
This document describes a methodology for Network Address Translator This document describes a methodology for Network Address Translator
(NAT) traversal for multimedia session signaling protocols, such as (NAT) traversal for multimedia session signaling protocols, such as
the Session Initiation Protocol (SIP). This methodology is called the Session Initiation Protocol (SIP). This methodology is called
Interactive Connectivity Establishment (ICE). ICE makes use of Interactive Connectivity Establishment (ICE). ICE makes use of
existing protocols, such as Simple Traversal of UDP Through NAT existing protocols, such as Simple Traversal of UDP Through NAT
(STUN) and Traversal Using Relay NAT (TURN). ICE makes use of STUN (STUN) and Traversal Using Relay NAT (TURN). ICE makes use of STUN
in peer-to-peer cooperative fashion, allowing participants to in peer-to-peer cooperative fashion, allowing participants to
discover, create and verify mutual connectivity. discover, create and verify mutual connectivity.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
2. Multimedia Signaling Protocol Abstraction . . . . . . . . . . 5 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4
3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 3. Overview of ICE . . . . . . . . . . . . . . . . . . . . . . . 6
4. Overview of ICE . . . . . . . . . . . . . . . . . . . . . . . 8 4. Sending the Initial Offer . . . . . . . . . . . . . . . . . . 8
5. Detailed ICE Algorithm . . . . . . . . . . . . . . . . . . . . 10 5. Receipt of the Offer and Generation of the Answer . . . . . . 9
5.1 Initiator Processing . . . . . . . . . . . . . . . . . . . 11 6. Processing the Answer . . . . . . . . . . . . . . . . . . . . 9
5.1.1 Sending the Initiate Message . . . . . . . . . . . . . 11 7. Common Procedures . . . . . . . . . . . . . . . . . . . . . . 10
5.1.2 Processing the Accept . . . . . . . . . . . . . . . . 12 7.1 Gathering Candidates . . . . . . . . . . . . . . . . . . . 10
5.2 Responder Processing . . . . . . . . . . . . . . . . . . . 12 7.2 Encoding Candidates into SDP . . . . . . . . . . . . . . . 13
5.2.1 Processing the Initiate Message . . . . . . . . . . . 12 7.3 Prioritizing the Transport Addresses and Choosing an
5.3 Common Procedures . . . . . . . . . . . . . . . . . . . . 13 Active One . . . . . . . . . . . . . . . . . . . . . . . . 15
5.3.1 Gathering Transport Addresses . . . . . . . . . . . . 13 7.4 Connectivity Checks . . . . . . . . . . . . . . . . . . . 17
5.3.2 Enabling STUN on Each Local Transport Address . . . . 15 7.4.1 UDP Connectivity Checks . . . . . . . . . . . . . . . 19
5.3.3 Prioritizing the Transport Addresses and Choosing 7.4.1.1 Send Validation . . . . . . . . . . . . . . . . . 19
a Default . . . . . . . . . . . . . . . . . . . . . . 17 7.4.1.2 Receive Validation . . . . . . . . . . . . . . . . 20
5.3.4 Sending STUN Connectivity Checks . . . . . . . . . . . 19 7.4.1.3 Learning New Candidates from Connectivity
5.3.5 Receiving STUN Requests . . . . . . . . . . . . . . . 24 Checks . . . . . . . . . . . . . . . . . . . . . . 22
5.3.6 Management of Resources . . . . . . . . . . . . . . . 25 7.4.1.3.1 On Receipt of a Binding Request . . . . . . . 23
5.3.7 Binding Keepalives . . . . . . . . . . . . . . . . . . 25 7.4.1.3.2 On Receipt of a Binding Response . . . . . . . 26
6. Running STUN on Derived Transport Addresses . . . . . . . . . 26 7.4.2 TCP Connectivity Checks . . . . . . . . . . . . . . . 26
6.1 STUN on a TURN Derived Transport Address . . . . . . . . . 27 7.4.2.1 Connection Establishment . . . . . . . . . . . . . 26
6.2 STUN on a STUN Derived Transport Address . . . . . . . . . 29 7.4.2.2 Sending STUN Binding Requests . . . . . . . . . . 27
7. XML Schema for ICE Messages . . . . . . . . . . . . . . . . . 30 7.4.2.3 Receiving STUN Requests . . . . . . . . . . . . . 29
8. Example . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 7.5 Promoting a Valid Candidate to Active . . . . . . . . . . 30
9. Mapping ICE into SIP . . . . . . . . . . . . . . . . . . . . . 35 7.5.1 Minimum Requirements . . . . . . . . . . . . . . . . . 30
9.1 Message Mapping . . . . . . . . . . . . . . . . . . . . . 35 7.5.2 Suggested Algorithm . . . . . . . . . . . . . . . . . 31
9.2 SIP and SDP Specific Security Considerations . . . . . . . 37 7.6 Subsequent Offer/Answer Exchanges . . . . . . . . . . . . 33
9.3 Updates in the Offer/Answer Model . . . . . . . . . . . . 37 7.6.1 Sending of an Offer . . . . . . . . . . . . . . . . . 33
10. Security Considerations . . . . . . . . . . . . . . . . . . 37 7.6.2 Receiving the Offer and Sending an Answer . . . . . . 34
11. IANA Considerations . . . . . . . . . . . . . . . . . . . . 38 7.6.3 Receiving the Answer . . . . . . . . . . . . . . . . . 36
11.1 SDP Attribute Name . . . . . . . . . . . . . . . . . . . . 38 7.7 Binding Keepalives . . . . . . . . . . . . . . . . . . . . 37
11.2 URN Sub-Namespace Registration . . . . . . . . . . . . . . 39 7.8 Sending Media . . . . . . . . . . . . . . . . . . . . . . 38
11.3 XML Schema Registration . . . . . . . . . . . . . . . . . 40 8. Interactions with Forking . . . . . . . . . . . . . . . . . . 38
12. IAB Considerations . . . . . . . . . . . . . . . . . . . . . 40 9. Interactions with Preconditions . . . . . . . . . . . . . . . 38
12.1 Problem Definition . . . . . . . . . . . . . . . . . . . . 41 10. Example . . . . . . . . . . . . . . . . . . . . . . . . . . 39
12.2 Exit Strategy . . . . . . . . . . . . . . . . . . . . . . 41 11. Grammar . . . . . . . . . . . . . . . . . . . . . . . . . . 39
12.3 Brittleness Introduced by ICE . . . . . . . . . . . . . . 42 12. Security Considerations . . . . . . . . . . . . . . . . . . 40
12.4 Requirements for a Long Term Solution . . . . . . . . . . 42 13. IANA Considerations . . . . . . . . . . . . . . . . . . . . 42
12.5 Issues with Existing NAPT Boxes . . . . . . . . . . . . . 43 14. IAB Considerations . . . . . . . . . . . . . . . . . . . . . 42
13. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 43 14.1 Problem Definition . . . . . . . . . . . . . . . . . . . . 42
14. References . . . . . . . . . . . . . . . . . . . . . . . . . 43 14.2 Exit Strategy . . . . . . . . . . . . . . . . . . . . . . 43
14.1 Normative References . . . . . . . . . . . . . . . . . . . . 43 14.3 Brittleness Introduced by ICE . . . . . . . . . . . . . . 43
14.2 Informative References . . . . . . . . . . . . . . . . . . . 44 14.4 Requirements for a Long Term Solution . . . . . . . . . . 44
Author's Address . . . . . . . . . . . . . . . . . . . . . . . 45 14.5 Issues with Existing NAPT Boxes . . . . . . . . . . . . . 45
Intellectual Property and Copyright Statements . . . . . . . . 46 15. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 45
16. References . . . . . . . . . . . . . . . . . . . . . . . . . 45
16.1 Normative References . . . . . . . . . . . . . . . . . . . 45
16.2 Informative References . . . . . . . . . . . . . . . . . . 46
Author's Address . . . . . . . . . . . . . . . . . . . . . . . 47
Intellectual Property and Copyright Statements . . . . . . . . 48
1. Introduction 1. Introduction
A multimedia session signaling protocol is a protocol that exchanges A multimedia session signaling protocol is a protocol that exchanges
control messages between a pair of agents for the purposes of control messages between a pair of agents for the purposes of
establishing the flow of media traffic between them. This media flow establishing the flow of media traffic between them. This media flow
is distinct from the flow of control messages, and may take a is distinct from the flow of control messages, and may take a
different path through the network. Examples of such protocols are different path through the network. Examples of such protocols are
the Session Initiation Protocol (SIP) [3], the Real Time Streaming the Session Initiation Protocol (SIP) [3], the Real Time Streaming
Protocol (RTSP) [9] and the International Telecommunications Union Protocol (RTSP) [16] and the International Telecommunications Union
(ITU) H.323. (ITU) H.323.
These protocols, by nature of their design, are difficult to operate These protocols, by nature of their design, are difficult to operate
through Network Address Translators (NAT). Because their purpose in through Network Address Translators (NAT). Because their purpose in
life is to establish a flow of packets, they tend to carry IP life is to establish a flow of packets, they tend to carry IP
addresses within their messages, which is known to be problematic addresses within their messages, which is known to be problematic
through NAT [10]. The protocols also seek to create a media flow through NAT [17]. The protocols also seek to create a media flow
directly between participants, so that there is no application layer directly between participants, so that there is no application layer
intermediary between them. This is done to reduce media latency, intermediary between them. This is done to reduce media latency,
decrease packet loss, and reduce the operational costs of deploying decrease packet loss, and reduce the operational costs of deploying
the application. However, this is difficult to accomplish through the application. However, this is difficult to accomplish through
NAT. A full treatment of the reasons for this is beyond the scope of NAT. A full treatment of the reasons for this is beyond the scope of
this specification. this specification.
Numerous solutions have been proposed for allowing these protocols to Numerous solutions have been proposed for allowing these protocols to
operate through NAT. These include Application Layer Gateways operate through NAT. These include Application Layer Gateways
(ALGs), the Middlebox Control Protocol [11], Simple Traversal of UDP (ALGs), the Middlebox Control Protocol [18], Simple Traversal of UDP
through NAT (STUN) [1], Traversal Using Relay NAT [8], and Realm through NAT (STUN) [1], Traversal Using Relay NAT [14], and Realm
Specific IP [12][13] along with session description extensions needed Specific IP [19] [20] along with session description extensions
to make them work, such as the SDP attribute for RTCP [2]. needed to make them work, such as the Session Description Protocol
(SDP) [7] attribute for the Real Time Control Protocol (RTCP) [2].
Unfortunately, these techniques all have pros and cons which make Unfortunately, these techniques all have pros and cons which make
each one optimal in some network topologies, but a poor choice in each one optimal in some network topologies, but a poor choice in
others. The result is that administrators and implementors are others. The result is that administrators and implementors are
making assumptions about the topologies of the networks in which making assumptions about the topologies of the networks in which
their solutions will be deployed. This introduces a lot of their solutions will be deployed. This introduces complexity and
complexity and brittleness into the system. What is needed is a brittleness into the system. What is needed is a single solution
single solution which is flexible enough to work well in all which is flexible enough to work well in all situations.
situations.
This specification provides that solution. It is called Interactive
Connectivity Establishment, or ICE. ICE makes use of many of the
protocols above, but uses them in a specific methodology which avoids
many of the pitfalls of using any one alone. ICE uses STUN and TURN
without extension, and allows for other similar protocols to be used
as well. However, it does require additional signaling capabilities
to be introduced into the multimedia session signaling protocols.
For those protocols which make use of the Session Description
Protocol (SDP), this specification defines the necessary extensions
to it. Other protocols will need to define their own mechanisms.
2. Multimedia Signaling Protocol Abstraction
This specification defines a general methodology that allows the
media streams of multimedia signaling protocols to successfully
traverse NAT. This methodology is independent of any particular
signaling protocol. In order to discuss the methodology, we need to
define an abstraction of a multimedia signaling system, and define
terms that can be used throughout this specification. Figure 1 shows
the abstraction.
                          +-----------+
                          |           |
                          |           |
                        > | Signaling |\
                       /  |   Relay   | \
                      /   |           |  \
           Initiate  /    |           |   \  Initiate
           Message  /  /  +-----------+    \  Message
                   /  /                 <   \
                  /  /                   \   \
                 /  /                     \   \
                /  /  Accept      Accept   \   \
               /  /   Message     Message   \   >
              /  /                           \
  +-----------+ /                             \  +-----------+
  |           |<                                 |           |
  |           |           Media Stream           |           |
  |  Session  | ................................ |  Session  |
  | Initiator |                                  | Responder |
  |           |           Media Stream           |           |
  |           | ................................ |           |
  +-----------+                                  +-----------+

                           Figure 1
Communications occur between two clients - the session initiator and
the session responder, also referred to as the initiator and
responder. The initiator is the one that decides to engage in
communications. To do so, it sends an initiate message. The
initiate message contains parameters that describe the capabilities
and configuration of media streams for the initiator. This message
may travel through signaling intermediaries, called a signaling
relay, before finally arriving at the session responder. Assuming
the session responder wishes to communicate, it generates an accept
message, which is relayed back to the initiator. This message
contains capabilities and configuration of media streams for the
responder. As a result, media streams are established between the
initiator and responder. The signaling protocol may also support an
operation that allows for termination of the communications session.
We refer to this signaling message as a terminate message.
This abstraction is readily mapped to SIP, RTSP, and H.323, amongst
others. For SIP, the initiator is the user agent that generates
an SDP offer [4], the responder is a SIP user agent that generates an
SDP answer to the offer, the initiate message is a SIP message
containing an SDP offer (for example, an INVITE), the accept message
is a SIP message containing an SDP answer (for example, a 200 OK),
and the terminate message is a BYE. For RTSP, the initiator is the
RTSP client, the responder is the RTSP server, the initiate message
is a SETUP message, and the accept message is a SETUP response.
The initiate and accept messages need to contain parameters, defined This specification provides that solution for protocols based on the
by this specification, for the protocol to operate. The initiate and offer-answer model, RFC 3264 [4]. It is called Interactive
accept messages are therefore defined by this specification as XML Connectivity Establishment, or ICE. ICE makes use of STUN and TURN,
documents containing the relevant information. Of course, multimedia but uses them in a specific methodology which avoids many of the
signaling protocols will not use these XML documents directly. pitfalls of using any one alone.
Rather, those protocols will need to define extensions as needed to
show how the initiate, accept and terminate messages map to messages
in the actual protocol, and how every element and attribute in the
XML document for those messages maps into parameters of the actual
protocol. Section 9 provides such a mapping for SIP.
3. Terminology 2. Terminology
Several new terms are introduced in this specification: Several new terms are introduced in this specification:
Session Initiator: A software or hardware entity that, at the request Peer: From the perspective of one of the agents in a session, its
of a user, tries to establish communications with another entity, peer is the other agent. Specifically, from the perspective of
called the session responder. A session initiator is also called the offeror, the peer is the answerer. From the perspective of
an initiator. the answerer, the peer is the offeror.
Initiator: Another term for a session initiator.
Session Responder: A software or hardware entity that receives a
request for establishment of communications from the session
initiator, and either accepts or declines the request. A session
responder is also called a responder.
Responder: Another term for a session responder.
Client: Either the initiator or responder.
Peer: From the perspective of one of the clients in a session, its
peer is the other client. Specifically, from the perspective of
the initiator, the peer is the responder. From the perspective of
the responder, the peer is the initiator.
Signaling Relay: An intermediary of signaling messages. Examples are
SIP proxies and H.323 Gatekeepers.
Initiate Message: The signaling message used by an initiator to
establish communications. It contains capabilities and other
information needed by the responder to send media to the
initiator. For SIP, this is any SIP message that contains an
offer. Usually, this is the initial INVITE.
Accept Message: The signaling message used by a responder to agree to
communications. It contains capabilities and other information
needed by the initiator to send media to the responder. For SIP,
this is any SIP message that contains an answer. Usually, this is
a 200 OK.
Terminate Message: The signaling message used by a client to terminate
the session and associated media streams.
Transport Address: The combination of an IP address and port. Transport Address: The combination of an IP address and port.
Local Transport Address: A local transport address is a transport Local Transport Address: A local transport address is a transport
address that has been allocated from the operating system on the address that has been allocated from the operating system on the
host. This includes transport addresses obtained through Virtual host. This includes transport addresses obtained through Virtual
Private Networks (VPNs) and transport addresses obtained through Private Networks (VPNs) and transport addresses obtained through
Realm Specific IP (RSIP) [12] (which lives at the operating system Realm Specific IP (RSIP) [19] (which lives at the operating system
level). Transport addresses are typically obtained by binding to level). Transport addresses are typically obtained by binding to
an interface. an interface.
Usable Local Transport Address: A local transport address created for m/c line: The media and connection lines in the SDP, which together
the purposes of advertisement to ICE peers. hold the transport address used for the receipt of media.
Associated Local Transport Address: An associated transport address
is a local transport address used solely to obtain a derived
transport address. Associated local transport addresses are never
advertised in ICE messages. However, packets are received on them
when sent to the derived transport address.
Derived Transport Address: A derived transport address is a transport Derived Transport Address: A derived transport address is a transport
address which is derived from an associated local transport address which is derived from a local transport address. The
address. The derived transport address is related to the derived transport address is related to the associated local
associated local transport address in that packets sent to the transport address in that packets sent to the derived transport
derived transport address are received on the socket bound to its address are received on the socket bound to its associated local
associated local transport address. Derived addresses are transport address. Derived addresses are obtained using protocols
obtained using protocols like STUN and TURN, and more generally, like STUN and TURN, and more generally, any UNSAF protocol [21].
any UNSAF protocol [14].
Advertised Transport Addresses: The union of the usable local Candidate Transport Address: A transport address advertised by an
transport addresses and the derived transport addresses. These agent in an offer or answer. A candidate transport address can
are the ones used in ICE messages. either be a local transport address or a derived transport
address.
Peer Derived Transport Address: A peer derived transport address is a Peer Derived Transport Address: A peer derived transport address is a
derived transport address learned from a STUN server running derived transport address learned from a STUN server running
within a peer in a media session. within a peer in a media session.
TURN Derived Transport Address: A derived transport address obtained TURN Derived Transport Address: A derived transport address obtained
from a TURN server. from a TURN server.
STUN Derived Transport Address: A derived transport address obtained STUN Derived Transport Address: A derived transport address obtained
from a STUN server whose address has been provisioned into the UA. from a STUN server whose address has been provisioned into the UA.
This, by definition, excludes Peer Derived Transport Addresses. This, by definition, excludes Peer Derived Transport Addresses.
Unilateral Allocations: Queries made to a network server which Candidate: A sequence of candidate transport addresses that form an
provides an UNSAF service. atomic set for usage with a particular media stream. In the case
of RTP, there are two candidate transport addresses per candidate:
one for RTP, and another for RTCP. Connectivity is verified to
all of the candidate transport addresses within a candidate before
that candidate is used. The transport addresses that compose a
candidate are all of the same type - local, STUN derived, TURN
derived or peer derived.
Bilateral Allocations: Addresses obtained by using an UNSAF service Local Candidate: A candidate whose transport addresses are local
that actually runs on the peer of the communications session. transport addresses.
Peer derived transport addresses are synonymous with bilateral
allocations.
4. Overview of ICE STUN Candidate: A candidate whose transport addresses are STUN
derived transport addresses.
TURN Candidate: A candidate whose transport addresses are TURN
derived transport addresses.
Peer Candidate: A candidate whose transport addresses are peer
derived transport addresses.
Active Candidate: The candidate that is in use for exchange of media.
This is the one that an agent places in the m/c line of an offer
or answer.
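The candidate terminology above can be pictured as a small data model. The following is only an illustrative sketch, not part of either draft: the class names, field names, and the rtp_candidate helper are hypothetical, and the only rules it encodes are the ones stated in the definitions (a candidate is an atomic set of transport addresses, all of the same type, with two addresses in the RTP case).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class AddressType(Enum):
    """The four address types named in the terminology section."""
    LOCAL = "local"
    STUN = "stun"
    TURN = "turn"
    PEER = "peer"


@dataclass(frozen=True)
class TransportAddress:
    """An IP address and port, tagged with how it was obtained."""
    ip: str
    port: int
    type: AddressType


@dataclass(frozen=True)
class Candidate:
    """An atomic set of transport addresses for one media stream."""
    addresses: Tuple[TransportAddress, ...]

    def __post_init__(self):
        if not self.addresses:
            raise ValueError("a candidate needs at least one transport address")
        # All transport addresses in a candidate must share one type.
        if len({a.type for a in self.addresses}) != 1:
            raise ValueError("all transport addresses must be of the same type")

    @property
    def type(self) -> AddressType:
        return self.addresses[0].type


def rtp_candidate(ip: str, rtp_port: int, rtcp_port: int,
                  addr_type: AddressType) -> Candidate:
    """Hypothetical helper: build the two-address candidate used for RTP."""
    return Candidate((
        TransportAddress(ip, rtp_port, addr_type),   # RTP
        TransportAddress(ip, rtcp_port, addr_type),  # RTCP
    ))
```

Under this sketch, a "Local Candidate" is simply a Candidate whose type property is AddressType.LOCAL, and likewise for the STUN, TURN and peer variants.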
3. Overview of ICE
ICE makes the fundamental assumption that clients exist in a network ICE makes the fundamental assumption that clients exist in a network
of segmented connectivity. This segmentation is the result of a of segmented connectivity. This segmentation is the result of a
number of addressing realms in which a client can simultaneously be number of addressing realms in which a client can simultaneously be
connected. We use "realms" here in the broadest sense. A realm is connected. We use "realms" here in the broadest sense. A realm is
defined purely by connectivity. Two clients are in the same realm defined purely by connectivity. Two clients are in the same realm
if, when they exchange the addresses each has in that realm, they are if, when they exchange the addresses each has in that realm, they are
able to send packets to each other. This includes IPv6 and IPv4 able to send packets to each other. This includes IPv6 and IPv4
realms, which actually use different address spaces, in addition to realms, which actually use different address spaces, in addition to
private networks connected to the public Internet through NAT. private networks connected to the public Internet through NAT.
The key assumption in ICE is that a client cannot know, a priori, The key assumption in ICE is that a client cannot know, a priori,
which address realms it shares with any peer it may wish to which address realms it shares with any peer it may wish to
communicate with. Therefore, in order to communicate, it has to try communicate with. Therefore, in order to communicate, it has to try
connecting to addresses in all of the realms. connecting to addresses in all of the realms.
Initiator TURN,STUN Servers Responder Agent A TURN,STUN Servers Agent B
|(1) Gather Addresses | | |(1) Gather Addresses | |
|-------------------->| | |-------------------->| |
|(2) Initiate Msg. | | |(2) Offer | |
|------------------------------------------>| |------------------------------------------>|
| |(3) Gather Addresses | | |(3) Gather Addresses |
| |<--------------------| | |<--------------------|
|(4) Accept Msg. | | |(4) Answer | |
|<------------------------------------------| |<------------------------------------------|
|(5) STUN Checks | | |(5) Media | |
|<------------------------------------------| |<------------------------------------------|
|(6) STUN Checks | | |(6) Media | |
|------------------------------------------>| |------------------------------------------>|
|(7) Media | | |(7) STUN Checks | |
|<------------------------------------------| |<------------------------------------------|
|(8) Media | | |(8) STUN Checks | |
|------------------------------------------>|
|(9) Offer | |
|------------------------------------------>|
|(10) Answer | |
|<------------------------------------------|
|(11) Media | |
|<------------------------------------------|
|(12) Media | |
|------------------------------------------>| |------------------------------------------>|
Figure 2 Figure 1
The basic flow of operation for ICE is shown in Figure 2. Before the
initiator establishes a session, it obtains as many IP address and
port combinations in as many address realms as it can. These
addresses all represent potential points at which the initiator will
receive a specific media stream. Any protocol that provides a client
with an IP address and port on which it can receive traffic can be
used. These include STUN, TURN, RSIP, and even a VPN. The client
also uses any local interface addresses. A dual-stack v4/v6 client
will obtain both a v6 and a v4 address/port. The only requirement is
that, across all of these addresses, the initiator can be certain
that at least one of them will work for any responder it might
communicate with. Unfortunately, if the initiator communicates with
a peer that doesn't support ICE, only one address can be provided to
that peer. As such, the client will need to choose one default
address, which will be used by non-ICE clients. This would typically
be a TURN derived transport address, as it is most likely to work
with unknown non-ICE peers.
The initiator then runs a STUN server on each of the local transport
addresses it has obtained. These include ones that will be
advertised directly through ICE, and so-called associated local
transport addresses, which are not directly advertised; rather, the
transport address derived from them is advertised. The initiator
will need to be able to demultiplex STUN messages and media messages
received on that IP address and port, and process them appropriately.
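The draft does not say how this demultiplexing is done. One common heuristic, shown here purely as an assumption, looks at the first byte of each packet: RTP and RTCP version 2 packets carry the bits 10 in the top two positions, while the STUN message types defined in RFC 3489 all begin with a byte of 0x00 or 0x01. The function name classify_packet is hypothetical.

```python
def classify_packet(data: bytes) -> str:
    """Guess whether a datagram is STUN or RTP/RTCP from its first byte.

    This is a heuristic sketch, not a rule taken from the draft.
    """
    if not data:
        return "unknown"
    top_two_bits = data[0] >> 6
    if top_two_bits == 0b10:
        # RTP and RTCP version 2 set the version field to binary 10.
        return "rtp"
    if top_two_bits == 0b00:
        # RFC 3489 STUN message types (0x0001, 0x0101, ...) start with 00.
        return "stun"
    return "unknown"
```

A receiver would call classify_packet on each datagram read from the shared socket and hand the payload either to its STUN server or to the media stack.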
All of these addresses are placed into the initiate message, and they
are ordered in terms of preference. Preference is a matter of local
policy, but typically, lowest preference would be given to transport
addresses learned from a TURN server (i.e., TURN derived transport
addresses). The initiate message also conveys one half of the
STUN username and the password which are required to gain access to
the STUN server on each address/port combination.
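As a rough illustration of the preference ordering described above, the sketch below sorts addresses by type. The text only says that preference is local policy and that TURN derived addresses typically rank lowest; the specific ranking of local above STUN, the numeric weights, and the tuple layout are assumptions made for the example.

```python
# Hypothetical local policy: higher number means more preferred.
# Only "TURN is lowest" comes from the text; the rest is an assumption.
TYPE_PREFERENCE = {"local": 3, "stun": 2, "turn": 1}


def order_by_preference(addresses):
    """Return addresses sorted from most to least preferred.

    Each address is a (type, ip, port) tuple. The sort is stable, so
    addresses of the same type keep their original relative order.
    """
    return sorted(addresses,
                  key=lambda addr: TYPE_PREFERENCE[addr[0]],
                  reverse=True)


gathered = [
    ("turn", "198.51.100.7", 40000),
    ("local", "10.0.0.5", 5000),
    ("stun", "203.0.113.9", 62000),
]
for addr in order_by_preference(gathered):
    print(addr)
```

A different deployment could swap in its own TYPE_PREFERENCE table without changing the ordering function.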
The initiate message is sent to the responder. This specification
does not address the issue of how the signaling messages themselves
traverse NAT. It is assumed that signaling protocol specific
mechanisms are used for that purpose. The responder follows a
similar process as the initiator followed; it obtains addresses from
local interfaces, STUN servers, TURN servers, etc., and it places all
of them, along with the other half of the STUN username and its
password, into the accept message.
Once the responder receives the initiate message, it has a set of
potential addresses it can use to communicate with the initiator.
The initiator will be running a STUN server at each address. The
responder sends a STUN request to each address, in parallel. When
the initiator receives these, it sends a STUN response. If the
responder receives the STUN response, it knows that it can reach its
peer at that address. It can then begin to send media to that
address. As additional STUN responses arrive, the responder will
learn about additional transport addresses which work. If one of
those has a higher priority than the one currently in use, it starts
sending media to that one instead. No additional control messages
(i.e., SIP signaling) occur for this change.
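The switching behavior in the paragraph above can be sketched as a small state holder: each STUN response marks an address as reachable, and media moves to that address only if it outranks the one currently in use. The class name, method name, and the use of integer priorities are assumptions for illustration; no networking is performed.

```python
class ActiveAddressSelector:
    """Track which peer address media is currently sent to."""

    def __init__(self, priorities):
        # priorities maps (ip, port) -> priority; higher is preferred.
        self.priorities = dict(priorities)
        self.active = None

    def on_stun_response(self, addr):
        """Record a successful check; return True if media should switch."""
        if addr not in self.priorities:
            return False  # response for an address we never checked
        if (self.active is None
                or self.priorities[addr] > self.priorities[self.active]):
            self.active = addr
            return True
        return False


selector = ActiveAddressSelector({
    ("198.51.100.7", 40000): 1,   # e.g. a TURN derived address
    ("203.0.113.9", 62000): 2,    # e.g. a STUN derived address
})
selector.on_stun_response(("198.51.100.7", 40000))  # first success: use it
selector.on_stun_response(("203.0.113.9", 62000))   # higher priority: switch
```

Because the decision is made locally from STUN responses alone, no extra signaling message is needed when the active address changes, which matches the text.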
The STUN messages described above happen while the accept message is
being sent to the initiator. Once the initiator receives the
accept message, it too will have a set of potential addresses with
which it can communicate to the responder. It follows exactly the
same process described above.
Furthermore, when either the initiator or responder receives a STUN
request, it takes note of the source IP address and port of that
request. It compares that transport address to the existing set of
potential addresses. If it's not amongst them, it gets added as
another potential address. The incoming STUN message provides the
client with enough context to associate that transport address with a
STUN username, STUN password, and priority, just as if it had been
sent in an initiate or accept message. As such, the client begins
sending STUN messages to it as well, and if those succeed, the
address can be used if it has a higher priority.
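The handling of a previously unknown source address can be sketched as below. This is only an illustration under stated assumptions: the PotentialAddress record and note_request_source() helper are hypothetical names, and the sketch assumes the caller has already worked out which advertised address the request's credentials belong to, so that the learned address can inherit its username, password, and priority.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PotentialAddress:
    # Hypothetical record; field names are not taken from the draft.
    host: str
    port: int
    username: str
    password: str
    priority: float


def note_request_source(known, source, matched):
    """Add the source of an incoming STUN request if it is not yet known.

    known   -- dict mapping (host, port) to PotentialAddress
    source  -- (host, port) the request arrived from
    matched -- advertised PotentialAddress whose credentials the request
               carried (assumed to be resolved by the caller)
    Returns the newly added entry, or None if the source was already known.
    """
    if source in known:
        return None
    # Assumption: the learned address inherits the matched credentials
    # and priority, as if it had arrived in an initiate or accept message.
    learned = PotentialAddress(source[0], source[1], matched.username,
                               matched.password, matched.priority)
    known[source] = learned
    return learned
```

An address returned by this helper would then be probed with STUN like any other potential address.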
5. Detailed ICE Algorithm
This section describes the detailed processing needed for ICE.
5.1 Initiator Processing
5.1.1 Sending the Initiate Message
When the initiator wishes to begin communications, it starts by
gathering transport addresses, as described in Section 5.3.1, and
starting a STUN server on each local transport address, both usable
and associated, as described in Section 5.3.2. This process can
actually happen at any time before sending an initiate message. A
client can pre-gather transport addresses, using a user interface cue
(such as picking up the phone, or entry into an address book) as a
hint that communications is imminent. Doing so eliminates any
additional perceivable call setup delays due to address gathering.
When it comes time to initiate communications, it determines a
priority for each one and identifies one as a default, as described
in Section 5.3.3.
The next step is to construct the initiate message. Section 7 The basic flow of operation for ICE is shown in Figure 1. Before the
provides the XML schema for the initiate message. The message offeror establishes a session, it obtains local transport addresses
consists of a series of media streams. For each media stream, there from its operating system on as many interfaces as it has access to.
is an IPv4 and/or an IPv6 default address, and a list of candidates. These interfaces can include IPv4 and IPv6 interfaces, in addition to
Each candidate has information for RTP and optionally RTCP. RTCP Virtual Private Network (VPN) interfaces or ones associated with
information is optional since, unfortunately, many systems don't RSIP. For media protocols that support both UDP and TCP (such as the
support it. If ICE did not indicate that RTCP was not supported, Real Time Transport Protocol (RTP) [22], which can run over either),
connectivity checks would be made to the RTCP ports and fail, it obtains both TCP and UDP transport addresses. In addition, the
confusing operation and adding unnecessary overhead. agent obtains derived transport addresses from each local transport
address using protocols such as STUN and TURN. Each local and
derived transport address becomes a candidate for receipt of media
traffic.
The default address is the one that will be used by responders that The agent will choose one of its candidate transport addresses as its
don't understand ICE (for SIP, this is accomplished by mapping the initial media transport address for inclusion in the connection and
default address into the m and c line in the SDP). The candidates media lines in the offer. This transport address will be utilized
represent addresses that the responder should try using the for media traffic while connectivity is verified to all of the
mechanisms of this specification. The list of candidates includes candidates. Since these checks may take time to execute, media
the defaults. In SIP, the candidates are conveyed with the new SDP clipping will occur if the media transport address is not reachable
candidate parameter. by the peer. To minimize the probability of clipping, the transport
address that is most likely to work is chosen. This is normally a
TURN-derived transport address, but others can be utilized based on
local policy.
The client then encodes its usable local transport addresses and Each candidate transport address (including the one being used as the
derived transport addresses (including the one set as the default) as media transport address) is listed in an a=candidate attribute in the
a series of candidate elements. Each candidate element conveys a offer. Each candidate is given a preference. Preference is a matter
transport address for RTP, a transport address for RTCP, a STUN of local policy, but typically, lowest preference would be given to
username fragment and STUN password for RTP, and one for RTCP. The transport addresses learned from a TURN server (i.e., TURN derived
client MUST assign each candidate a unique identifier. These transport addresses). Each candidate is also assigned a distinct ID,
identifiers MUST be unique across all candidates used within the called a transport ID (tid).
session. Though they are not used in this specification, they serve
as a convenient and short handle for each candidate within the
document. Experience has shown that explicit identifiers for
elements in SDP are a good idea. This identifier is encoded in the
"id" attribute of the <candidate> element. The priority for the
transport address, as computed above, is included as an attribute as
well.
Once the initiate message is constructed, it is sent. The offer is then sent to the answerer. This specification does not
address the issue of how the signaling messages themselves traverse
NAT. It is assumed that signaling protocol specific mechanisms are
used for that purpose. The answerer follows a similar process as the
offeror followed; it obtains addresses from local interfaces, obtains
derived transport addresses from those, and the combination becomes
its set of candidate transport addresses. It picks one as its
initial media transport address and places it into the m/c line in
the answer, and then lists all of them in the a=candidate attributes
in the answer, along with a preference and tid.
5.1.2 Processing the Accept Once the offer/answer exchange has completed, each agent sends media
from its media transport address to the media transport address of
its peer. This media stream may or may not work, depending on
whether or not the media transport address is reachable. In parallel
with the transmission of media, a connectivity check begins. This
check makes use of STUN messages sent from each candidate to each
other candidate. These checks will allow each agent to determine
whether it can send packets from a particular candidate to a
candidate from its peer, and whether packets can be sent back. If,
after a certain period of time, an agent determines that a pair of
candidates works, and has a higher priority than the transport
addresses currently in use for media (perhaps because the ones in use
don't work), it sends a new offer that "promotes" its candidate into
the m/c line. This causes the media traffic to switch to this new
transport address.
There are two possible cases for processing of the Accept message. 4. Sending the Initial Offer
If the recipient of the Initiate message did not support ICE, the
Accept message will only contain the default address information. As
a result, the initiator knows that it cannot perform its connectivity
checks. In this case, it SHOULD just send to the transport address
listed. However, if local configuration information tells the
initiator to try connectivity checks by sending them through the TURN
server, this means that packets sent directly to responder may be
dropped by a local firewall. To deal with this, the initiator SHOULD
issue a SEND command using this new transport address as the
destination. The SEND command contains the media packet to send to
the responder. Once this command has been accepted, the initiator
SHOULD send all media packets through the TURN server, which will
then forward them towards the responder.
If the Accept message contains candidates, it implies that the When an agent wishes to begin a session by sending an initial offer,
responder supported ICE. In that case, the initiator takes each it starts by gathering transport addresses, as described in
candidate transport address, STUN username fragment, STUN password Section 7.1. This will produce a set of candidates, including local
and priority, and places them into a list, called the candidate list. ones, STUN-derived ones, and TURN-derived ones.
It then begins processing the candidate list as described in Section
5.3.4. That processing associates a state with each transport
address. As described there, once a successful STUN query is made to
the STUN server at an address, the initiator can begin sending media
to that address.
5.2 Responder Processing This process of gathering candidates can actually happen at any time
before sending the initial offer. An agent can pre-gather transport
addresses, using a user interface cue (such as picking up the phone,
or entry into an address book) as a hint that communications is
imminent. Doing so eliminates any additional perceivable call setup
delays due to address gathering.
5.2.1 Processing the Initiate Message When it comes time to offer communications, it determines a priority
for each candidate and identifies the active candidate that will be
used for receipt of media, as described in Section 7.3.
Upon receipt of the initiate message, the client starts gathering The next step is to construct the offer message. For each media
transport addresses, as described in Section 5.3.1, and starts a STUN stream, it places its candidates into a=candidate attributes in the
server on each local transport address, as described in Section offer and puts its active candidate into the m/c line. The process
5.3.2. This processing is done immediately on receipt of the for doing this is described in Section 7.2. The offer is then sent.
request, to prepare for the case where the user should accept the
call, or early media needs to be generated. By gathering addresses
while the user is being alerted to the request for communications,
session establishment delays due to that gathering can be eliminated.
At some point, the responder will decide to accept or reject the 5. Receipt of the Offer and Generation of the Answer
communications. A rejection terminates ICE processing, of course.
In the case of acceptance, the accept message is constructed as
follows.
The client first determines a priority for each usable local Upon receipt of the offer message, the agent checks if the offer
transport address and derived transport address it has gathered, and contains any a=candidate attributes. If it does, the offeror
identifies one as a default, as described in Section 5.3.3. supports ICE. In that case, it starts gathering candidates, as
described in Section 7.1, and prioritizes them as described in Section 7.3. This
processing is done immediately on receipt of the offer, to prepare
for the case where the user should accept the call, or early media
needs to be generated. By gathering candidates while the user is
being alerted to the request for communications, session
establishment delays due to that gathering can be eliminated.
Constructing the accept message proceeds identically to the way in At some point, the answerer will decide to accept or reject the
which the initiate message is constructed (Section 5.1.1). communications. A rejection terminates ICE processing. In the case
of acceptance, the answer is constructed, and if the offeror
supported ICE, the candidates are encoded into the SDP as described
in Section 7.2. The answer is then sent. If the offeror supported
ICE, the answerer begins its connectivity checks as described in
Section 7.4.
The accept message is then sent. In addition, and regardless of whether the offeror supported ICE, the
answerer can begin sending media packets as it normally would. It
sends media according to the procedures in Section 7.8.
5.3 Common Procedures 6. Processing the Answer
This section discusses procedures that are common between initiator There are two possible cases for processing of the answer. If the
and responder. answerer did not support ICE, the answer will not contain any
a=candidate attributes. As a result, the offeror knows that it
cannot perform its connectivity checks. In this case, it proceeds
with normal media processing as if ICE was not in use. The
procedures for sending media, described in Section 7.8, MUST be
followed however.
5.3.1 Gathering Transport Addresses If the answer contains candidates, it implies that the answerer
supported ICE. In that case, the offeror begins connectivity checks
as described in Section 7.4. It also starts sending media, using the
candidate in the m/c line, based on the procedures described in
Section 7.8.
A client gathers addresses when it believes that communications is 7. Common Procedures
imminent. For initiators, this occurs before sending an initiate
message (Section 5.1.1). For responders, it occurs before sending an
accept message (Section 5.2.1).
There are two types of addresses a client can gather - usable local This section discusses procedures that are common between offeror and
transport addresses and derived transport addresses. Usable local answerer.
transport addresses are obtained by binding to an ephemeral port on
an interface (physical or virtual) on the host. A multi-homed host
SHOULD attempt to bind on all interfaces for all media streams it
wishes to receive. For media streams carried using the Real Time
Transport Protocol (RTP) [15], the client will need to bind to an
ephemeral port for both RTP and RTCP.
The result will be a set of usable local transport addresses. The 7.1 Gathering Candidates
client may also have access to servers that provide unilateral
self-address fixing (UNSAF) [14]. Examples of such protocols include
STUN, TURN, and TEREDO [18]. UNSAF protocols work by having the
client send, from a specific associated local transport address, some
kind of message to a server. The server provides to the client, in
some kind of response, an additional transport address, called a
derived transport address. This derived transport address is derived
from the associated local transport address. Here, derivation means
that a request sent to the derived transport address might (under
good network conditions) reach the client on its associated local
transport address.
All ICE implementations SHOULD implement and use STUN and TURN for An agent gathers candidates when it believes that communications is
unilateral allocation. STUN is an integral part of this imminent. For offerors, this occurs before sending an offer
specification for connectivity checks and will always be present for (Section 4). For answerers, it occurs before sending an answer
that purpose. The usage of TURN and STUN for unilateral allocations (Section 5).
is at SHOULD strength, and not MUST, since there are many network
environments, and there will be deployments for which one of these
will never be used and will impose needless cost. However, one of
the key ideas behind ICE is that network conditions and connectivity
assumptions can, and will change. Just because a client is
communicating with a server on the public network today, doesn't mean
that it won't need to communicate with one behind a NAT tomorrow.
Just because a client is behind a full cone NAT today, doesn't mean
that tomorrow they won't pick up their client and take it to a public
network access point where there is a symmetric NAT. The way to
handle these cases and build a reliable system is for clients to
implement a diverse set of techniques for allocating addresses, so
that at least one of them is almost certainly going to work in any
situation. The combination of TURN, STUN and local address
allocations provides sufficient coverage to handle nearly any NAT
configuration. Implementors should consider very carefully any
assumptions that they make about deployments before electing not to
implement one of these mechanisms for address allocation. In
particular, implementors should consider whether the elements in the
system may be mobile, and connect through different networks with
different connectivity. They should also consider whether endpoints
which are under their control, in terms of location and network
connectivity, would always be under their control. Only in cases
where implementors truly believe that these cases will not require
either TURN or STUN allocations, should those techniques not be
implemented.
For each UNSAF protocol, the client may have access to a multiplicity Each candidate is composed of a series of transport addresses of the
of servers. For example, a user connected to a natted cable access same type. In the case of RTP, the candidate is composed of either
network might have access to a STUN server in the private cable one or two transport addresses. Normally there are two - one for
network and in the public Internet. For each server for each UNSAF RTP, and one for RTCP. However, if RTCP is not in use, a candidate
protocol, the client MUST bind to a new local transport address, and will only contain a single transport address.
uses it to obtain a single derived transport address for it. This
local IP address and port is called an associated transport address.
These addresses are not advertised to peers in ICE messages; their
derived transport addresses are. As a result of using a different
local transport address for each derived transport address, every
transport address advertised in an ICE message is either a unique
local transport address, or else is derived from a unique local
transport address.
If a derived transport address is equal to the associated local The first step is to gather local candidates. Local candidates are
transport address from which it was derived, the local transport obtained by binding to ephemeral ports on an interface (physical or
address SHOULD be promoted to a usable local transport address. It virtual, including VPN interfaces) on the host. Specifically, for
is preferable to do this than to use a new local transport address; each UDP-only media stream the agent wishes to use, the agent SHOULD
the UNSAF protocol may have caused pinholes to open in intervening obtain a set of candidates (one for each interface) by binding to N
firewalls. ephemeral UDP ports on each interface, where N is the number of
transport addresses needed for the candidate. For RTP, N is
typically two. For each TCP-only media stream the agent wishes to
use, the agent SHOULD obtain a set of candidates by binding to N
ephemeral TCP ports on each interface, where N is the number of
transport addresses needed for the candidate. For media streams that
can support either UDP or TCP, the agent SHOULD obtain a set of
candidates by binding to N ephemeral UDP and N ephemeral TCP ports on
each interface, where N is the number of transport addresses needed
for the candidate.
Implementations MAY use other protocols that provide derived If a host has K local interfaces, this will result in K candidates
transport addresses, as long as those techniques meet the following for each UDP stream (requiring K*N transport addresses), K candidates
conditions: for each TCP stream (requiring K*N transport addresses), and 2K
candidates for streams that support UDP and TCP (requiring 2*K*N
transport addresses).
1. The technique does not require its peer to know about, or Media streams carried using the Real Time Transport Protocol (RTP)
understand the technique in order to interoperate. [22] can run over TCP [27]. As such, it is RECOMMENDED that both UDP
and TCP candidates be obtained. Transmission of real time media over
UDP is generally preferred to TCP. However, many network
environments, for better or for worse, permit only TCP traffic.
Obtaining a TCP candidate, and then using it in conjunction with a
TURN relay as described below, allows for ICE to make use of the TCP
media only when UDP connectivity is non-existent, as it may be in
these restricted environments. However, providers of real-time
communications services may decide that it is preferable to have no
media at all than it is to have media over TCP. To allow for choice,
it is RECOMMENDED that agents be configurable with whether they
obtain TCP candidates for real time media.
2. The technique can provide the client with an IP address and port Having it be configurable, and then configuring it to be off, is
that may be reachable by some peers. far better than not having the capability at all. An important
goal of this specification is to provide a single mechanism that
can be used across all types of endpoints. As such, it is
preferable to account for provider and network variation through
configuration, instead of hard-coded limitations in an
implementation. Furthermore, network characteristics and
connectivity assumptions can, and will change over time. Just
because an agent is communicating with a server on the public
network today, doesn't mean that it won't need to communicate with
one behind a NAT tomorrow. Just because an agent is behind a full
cone NAT today, doesn't mean that tomorrow they won't pick up
their agent and take it to a public network access point where
there is a symmetric NAT or one that only allows outbound TCP.
The way to handle these cases and build a reliable system is for
agents to implement a diverse set of techniques for allocating
addresses, so that at least one of them is almost certainly going
to work in any situation. Implementors should consider very
carefully any assumptions that they make about deployments before
electing not to implement one of the mechanisms for address
allocation. In particular, implementors should consider whether
the elements in the system may be mobile, and connect through
different networks with different connectivity. They should also
consider whether endpoints which are under their control, in terms
of location and network connectivity, would always be under their
control. Only in cases where there isn't now, and never will be,
endpoint mobility or nomadicity of any sort, should a technique be
omitted.
3. The technique allows the client to receive STUN connectivity Once the agent has obtained local candidates, it obtains candidates
checks in addition to media packets on the same IP address and with derived transport addresses. Agents which serve end users
port. directly, such as softphones, hardphones, terminal adaptors and so
on, MUST implement STUN and SHOULD use it to obtain STUN candidates.
These devices SHOULD implement and SHOULD use TURN to obtain TURN
candidates. They MAY implement and MAY use other protocols that
provide derived transport addresses, such as TEREDO [25]. As with
TCP, usage of STUN and TURN is at SHOULD strength to allow for
provider variation. If it is not to be used, it is also RECOMMENDED
that it be implemented and just disabled through configuration, so
that it can be re-enabled through configuration if conditions change in
the future.
4. The technique allows the client to send packets to a peer, so Agents which represent network servers under the control of a service
that the peer will see the derived transport address as the provider, such as gateways to the telephone network, media servers,
source IP address and port of the packet. or conferencing servers that are targeted at deployment only in
networks with public IP addresses MAY use STUN, TURN or other similar
protocols to obtain candidates.
5.3.2 Enabling STUN on Each Local Transport Address Why would these types of endpoints even bother to implement ICE?
The answer is that such an implementation greatly facilitates NAT
traversal for endpoints that connect to it. The ability to
process STUN connectivity checks allows for the network server to
obtain peer-derived transport addresses that can be used to
provide relay-free traversal of symmetric NAT for endpoints that
connect to it. Furthermore, implementation of the STUN
connectivity checks allows for NAT bindings along the way to be
kept open. ICE also provides numerous security properties that
are independent of NAT traversal, and would benefit any multimedia
endpoint. See Section 12 for a discussion on these benefits.
Once the client has obtained a set of transport addresses, it starts To obtain STUN candidates (which are always UDP), the client takes a
a STUN server on each local transport address, including both local UDP candidate, and for each configured STUN server, produces a
associated local transport addresses and usable transport addresses. STUN candidate. It is anticipated that clients may have a
These include ones used for both RTP and RTCP. This, by definition, multiplicity of STUN servers configured in network environments where
means that the STUN service will be reached for requests sent to the there are multiple layers of NAT, and that layering is known to the
derived addresses. provider of the client. To produce the STUN candidate from the local
candidate, it follows the procedures of Section 9 of RFC 3489 for
each local transport address in the local candidate. It obtains a
shared secret from the STUN server and then initiates a Binding
Request transaction from the local transport address to that server.
The Binding Response will provide the client with its STUN derived
transport address in the MAPPED-ADDRESS attribute. If the client had
K local candidates, this will produce S*K STUN candidates, where S is
the number of configured STUN servers.
However, the client does not need to provide STUN service on any To obtain UDP TURN candidates, the client takes a local UDP
other IP address or port, unlike the STUN usage described in [1]. candidate, and for each configured TURN server, produces a TURN
The need to run the service on multiple ports is to support the candidate. It is anticipated that clients may have a multiplicity of
change flags. However, those flags are not needed with ICE, and the TURN servers configured in network environments where there are
server SHOULD reject, with a 400 response, any STUN requests with multiple layers of NAT, and that layering is known to the provider of
these flags set. The CHANGED-ADDRESS attribute in a BindingResponse the client. To produce the TURN candidate from the local candidate,
is set to the transport address on which the server is running. it follows the procedures of Section 8 of [14] for each local
transport address in the local candidate. It initiates an Allocate
Request transaction from the local transport address to that server.
Furthermore, there is no need to support TLS or to be prepared to The Allocate Response will provide the client with its TURN derived
receive SharedSecret request messages. Those messages are used to transport address in the MAPPED-ADDRESS attribute. If the client had
obtain shared secrets to be used with BindingRequests. However, with K local candidates, this will produce S*K UDP TURN candidates, where
ICE, usernames and passwords are exchanged in the signaling protocol. S is the number of configured TURN servers.
The client will receive both STUN requests and media packets on each To obtain TURN-derived TCP candidates, the client takes a local TCP
local transport address. The client MUST be able to disambiguate candidate, and for each configured TURN server, produces a TCP TURN
them. In the case of RTP/RTCP, this disambiguation is easy. RTP and candidate. It is anticipated that clients may have a multiplicity of
RTCP packets start with the bits 0b10 (v=2). The first two bits in TURN servers configured in network environments where there are
STUN are always 0b00. This disambiguation also works for packets multiple layers of NAT, and that layering is known to the provider of
sent using Secure RTP [16], since the RTP header is in the clear. the client. To produce the TURN candidate from the local candidate,
Disambiguating STUN with other media stream protocols may be more it iterates through the local transport addresses in the local
complicated. However, it can always be possible with arbitrarily candidate, and for each one, initiates a TCP connection from the
high probabilities by selecting an appropriately random username (see same interface as the local transport address to the TURN server. It is
below). not necessary to initiate the connection from the actual port in the
local transport address. Following the procedures of Section 8 of
[14], it initiates an Allocate Request transaction over the
connection. The Allocate Response will provide the client with its
TCP TURN derived transport address in the MAPPED-ADDRESS attribute.
If the client had K local TCP candidates, this will produce S*K TCP
TURN candidates, where S is the number of configured TURN servers.
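The candidate counts stated in the -05 text of this section (K local candidates per transport, S*K STUN candidates, and S*K TURN candidates per transport) can be checked with a small worked calculation. The function name and parameters below are hypothetical, and the sketch assumes one media stream with a separate count of STUN and TURN servers.

```python
def candidate_counts(interfaces, n_addrs, stun_servers, turn_servers,
                     udp=True, tcp=True):
    """Count candidates for one media stream.

    interfaces   -- K, the number of local interfaces
    n_addrs      -- N, transport addresses per candidate (2 for RTP+RTCP)
    stun_servers -- S for STUN; turn_servers -- S for TURN
    """
    local_udp = interfaces if udp else 0
    local_tcp = interfaces if tcp else 0
    counts = {
        "local_udp": local_udp,
        "local_tcp": local_tcp,
        "stun": stun_servers * local_udp,      # STUN candidates are UDP only
        "turn_udp": turn_servers * local_udp,
        "turn_tcp": turn_servers * local_tcp,
    }
    # Sum the five per-type counts before adding the derived totals.
    counts["total_candidates"] = sum(counts.values())
    # Each local candidate binds N ports: K*N per transport, 2*K*N for both.
    counts["local_transport_addresses"] = (local_udp + local_tcp) * n_addrs
    return counts
```

For example, a host with two interfaces, RTP plus RTCP (N = 2), one STUN server and one TURN server ends up with ten candidates and binds eight local ports when both UDP and TCP are enabled.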
The need to run STUN on the same transport address as the media 7.2 Encoding Candidates into SDP
stream represents the "ugliest" piece of ICE. However, it is an
essential part of the story. By sending STUN requests to the very
same place media is sent, any bindings learned through STUN will be
useful even when communicating through symmetric NATs. This results
in a substantial increase in the scope of applicability of STUN.
For each transport address advertised in the initiate message, the For each candidate to be placed into the SDP, the agent includes a
client MUST choose a username fragment and a password. The username series of a=candidate attributes as media-level attributes, one for
fragment created by the client (called the local username fragment) each transport address in the candidate. Each of the transport
is concatenated with the fragment created by its peer (called the addresses for the same candidate MUST have the same value of the
remote username fragment) to create the actual username used for candidate-id attribute. The a=candidate attributes for different
access to the STUN server that will receive packets sent to that candidates MUST be unique within that media stream. Using a simple
transport address. This username will be present in STUN requests sequence number, incrementing by one for each candidate for a media
sent by its peer. By creating the username as a combination of stream, meets these requirements. The transport, unicast-address and
information from each side of a call, it allows a client to correlate port of the attribute are set to those for the candidate. The qvalue
the source of the request with a candidate transport address. This is set to the priority of this candidate (note that, for RTP, the RTP
is discussed further below. and RTCP transport addresses MUST have equal priority values). The
tid MUST be chosen randomly with 128 bits of randomness. The tid is
chosen only when the transport address is placed into the SDP for the
first time; subsequent offers or answers within the same session
containing that same transport address would use the same tid used
previously.
The username fragment MUST be globally unique with high probability, The tid serves as a unique identifier for each transport address. It
and different for each advertised transport address. It SHOULD be also gets combined, through concatenation, with the tid of a peer
persistently used over time for that particular transport address. A candidate to form the username and password that is placed in the
value computed as the 128 bit hash of the transport address STUN checks between the peers. This allows the STUN message to
concatenated with a 128 bit random number selected to identify the uniquely identify the pairing whose connectivity it is checking. The
host will meet these requirements. This results in two properties. tid is needed as a unique identifier because the IP address within
First - each transport address can be uniquely identified. Secondly, the candidate fails to provide that uniqueness as a consequence of
no other host will select a username with the same value. The NAT.
password MUST be random with at least 128 bits of randomness and is
selected separately for each transport address advertised as part of
a distinct session. This means that RTP and RTCP, which run on
different transport addresses, will get different usernames and
passwords. The password will remain constant during a session with a
peer, but will otherwise vary across sessions. The username fragment
and password will be passed to its peer in an initiate or accept
message. Because the password is conveyed through these signaling
protocols, those protocols MUST provide facilities for encryption,
authentication and message integrity, and those facilities SHOULD be
used when ICE is employed. As such, the process described in this
section will associate, with each local transport address, a username
fragment and password. The client also associates this same username
fragment and password with any transport addresses derived from the
local transport address.
The global uniqueness requirement stems from the lack of uniqueness Consider agents A, B, and C. A and B are within private enterprise 1,
afforded by IP addresses. Consider clients A, B, and C. A and B are which is using 10.0.0.0/8. C is within private enterprise 2, which
within private enterprise 1, which is using 10.0.0.0/8. C is within is also using 10.0.0.0/8. As it turns out, B and C both have IP
private enterprise 2, which is also using 10.0.0.0/8. As it turns address 10.0.1.1. A sends an offer to C. C, in its answer, provides
out, B and C both have IP address 10.0.1.1. A initiates A with its transport addresses. In this case, that's 10.0.1.1:8866
communications to C. C, in its accept message, provides A with its and 8877. As it turns out, B is in a session at that same time, and
transport addresses. In this case, that's 10.0.1.1:8866 and 8877. As is also using 10.0.1.1:8866 and 8877. This means that B is prepared
it turns out, B is in a session at that same time, and is also using to accept STUN messages on those ports, just as C is. A will send a
10.0.1.1:8866 and 8877. This means that B has a STUN server running STUN request to 10.0.1.1:8866 and 8877. However, these do not go to
on those ports, just as C does. A will send a STUN request to C as expected. Instead, they go to B. If B just replied to them, A
10.0.1.1:8866 and 8877. However, these do not go to C as expected. would believe it has connectivity to C, when in fact it has
Instead, they go to B. If B just replied to them, A would believe it connectivity to a completely different user, B. To fix this, tid
has connectivity to C, when in fact it has connectivity to a takes on the role of a unique identifier. C provides A with an
completely different user, B. To fix this, the STUN username identifier for its transport address, and A provides one to C. A
fragment takes on the role of a unique identifier. C provides A with concatenates these two identifiers and uses the result as the
a unique username fragment, and A provides one to C. A uses these username and password in its STUN query to 10.0.1.1:8866. This STUN
two fragments to construct the username in its STUN query to query arrives at B. However, the username is unknown to B, and so the
10.0.1.1:8866. This STUN query arrives at B. However, the username request is rejected. A treats the rejected STUN request as if there
is unknown to B, and so the request is rejected. A treats the were no connectivity to C (which is actually true). Therefore, the
rejected STUN request as if there were no connectivity to C (which is error is avoided.
actually true). Therefore, the error is avoided.
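The A, B, C scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names new_tid, check_username and Listener are hypothetical, tids are hex-encoded, and the USERNAME is taken to be the receiver's tid followed by the sender's tid. The point it shows is that B, which shares C's IP address and ports, still rejects A's check because it never learned the identifier pair.

```python
import secrets


def new_tid() -> str:
    """Hypothetical helper: an identifier with 128 bits of randomness."""
    return secrets.token_hex(16)  # 16 random bytes, hex-encoded


def check_username(receiver_tid: str, sender_tid: str) -> str:
    """USERNAME carried in a check: the receiver's tid, then the sender's tid.

    The ordering is an assumption of this sketch.
    """
    return receiver_tid + sender_tid


class Listener:
    """A host that accepts STUN checks only for pairings it knows about."""

    def __init__(self) -> None:
        self.expected = set()

    def expect(self, own_tid: str, peer_tid: str) -> None:
        # Record the USERNAME this host expects from a given peer.
        self.expected.add(own_tid + peer_tid)

    def accepts(self, username: str) -> bool:
        # An unknown USERNAME means the request is rejected.
        return username in self.expected
```

Used this way, C would accept a check built from the tids it exchanged with A, while B, holding unrelated tids for its own session, would not.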
An unfortunate consequence of the non-uniqueness of IP addresses is An unfortunate consequence of the non-uniqueness of IP addresses is
that, in the above example, B might not even be an ICE client. It that, in the above example, B might not even be an ICE agent. It
could be any host, and the port to which the STUN packet is directed could be any host, and the port to which the STUN packet is directed
could be any ephemeral port on that host. If there is an application could be any ephemeral port on that host. If there is an application
listening on this socket for packets, and it is not prepared to listening on this socket for packets, and it is not prepared to
handle malformed packets for whatever protocol is in use, the handle malformed packets for whatever protocol is in use, the
operation of that application could be affected. Fortunately, since operation of that application could be affected. Fortunately, since
the ports exchanged in SDP are ephemeral and usually drawn from the the ports exchanged in SDP are ephemeral and usually drawn from the
dynamic or registered range, the odds are good that the port is not dynamic or registered range, the odds are good that the port is not
used to run a server on host B, but rather is the client side of some used to run a server on host B, but rather is the agent side of some
protocol. This decreases the probability of hitting a port in-use, protocol. This decreases the probability of hitting a port in-use,
due to the transient nature of port usage in this range. However, due to the transient nature of port usage in this range. However,
the possibility of a problem does exist, and network deployers should the possibility of a problem does exist, and network deployers should
be prepared for it. be prepared for it.
Termination of the local STUN servers is discussed in Section 5.3.6. Note that, because there are separate transport addresses for RTP and
RTCP, each will have a distinct tid.
5.3.3 Prioritizing the Transport Addresses and Choosing a Default The active candidate is placed into the m/c lines of the SDP. For
RTP streams, this is done by placing the RTP address and port into
the c and m lines in the SDP respectively. If the agent is utilizing
RTCP, it MUST encode its address and port using the a=rtcp attribute
as defined in RFC 3605 [2]. If RTCP is not in use, the agent MUST
signal that using b=RS:0 and b=RR:0 as defined in RFC 3556 [8].
The prioritization process takes the list of the advertised transport For media streams that are inherently TCP-based (as opposed to ones
addresses, and associates each with a priority. This priority where TCP is a fallback and would be listed as a candidate but not
reflects the desire that the UA has to receive media on that address, the initial active address), the connections MUST be signaled using
and is assigned as a value from 0 to 1 (1 being most preferred). comedia [13], and those connections MUST be in "holdconn" mode. This
Priorities are ordinal, so that their significance is only relative has the effect of suspending connection attempts via the comedia
to other transport address priorities in the same list. mechanisms, allowing ICE to open the connections instead. These
connections then get removed from holdconn mode when the ICE
procedures complete and an updated offer/answer exchange takes place
that promotes one of the existing ICE-established connections to
active. Note that this has the result of increasing the post-dial-
delay for TCP-oriented media, but brings with it substantial security
and NAT traversal properties.
7.3 Prioritizing the Transport Addresses and Choosing an Active One
The prioritization process takes the set of candidates and associates
each with a priority. This priority reflects the desire that the
agent has to receive media on that address, and is assigned as a
value from 0 to 1 (1 being most preferred). Priorities are ordinal,
so that their significance is only meaningful relative to other
candidates for a particular media stream.
This specification makes no normative recommendations on how the This specification makes no normative recommendations on how the
prioritization is done. However, some useful guidelines are prioritization is done. However, some useful guidelines are
suggested on how such a prioritization can be determined. suggested on how such a prioritization can be determined.
One criterion for choosing one transport address over another is One criterion for choosing one candidate over another is whether or
whether or not that transport address involves the use of a relay. not that candidate involves the use of a relay. That is, if media is
That is, if media is sent to that transport address, will the media sent to that candidate, will the media first transit a relay before
first transit a relay before being received. TURN derived transport being received. TURN candidates make use of relays (the TURN
addresses make use of relays (the TURN server), as do any local server), as do any local candidates associated with a VPN server.
transport addresses associated with a VPN server. When media is When media is transited through a relay, it can increase the latency
transited through a relay, it can increase the latency between between transmission and reception. It can increase the packet
transmission and reception. It can increase the packet losses, losses, because of the additional router hops that may be taken. It
because of the additional router hops that may be taken. It may may increase the cost of providing service, since media will be
increase the cost of providing service, since media will be routed in routed in and right back out of a relay run by the provider. If
and right back out of a relay run by the provider. If these concerns these concerns are important, candidates with this property can be
are important, transport addresses with this property can be listed listed with lower priority.
with lower priority.
Another criterion for choosing one address over another is IP address Another criterion for choosing one candidate over another is IP
family. ICE works with both IPv4 and IPv6. It therefore provides a address family. ICE works with both IPv4 and IPv6. It therefore
transition mechanism that allows dual-stack hosts to prefer provides a transition mechanism that allows dual-stack hosts to
connectivity over IPv6, but to fall back to IPv4 in case the v6 prefer connectivity over IPv6, but to fall back to IPv4 in case the
networks are disconnected (due, for example, to a failure in a 6to4 v6 networks are disconnected (due, for example, to a failure in a
relay) [17]. It can also help with hosts that have both a native 6to4 relay) [24]. It can also help with hosts that have both a
IPv6 address and a 6to4 address. In such a case, higher priority native IPv6 address and a 6to4 address. In such a case, higher
could be afforded to the native v6 address, followed by the 6to4 priority could be afforded to the native v6 address, followed by the
address, followed by a native v4 address. This allows a site to 6to4 address, followed by a native v4 address. This allows a site to
obtain and begin using native v6 addresses immediately, yet still obtain and begin using native v6 addresses immediately, yet still
fall back to 6to4 addresses when communicating with clients in other fall back to 6to4 addresses when communicating with agents in other
sites that do not yet have native v6 connectivity. sites that do not yet have native v6 connectivity.
Another criterion for choosing one address over another is security. Another criterion for choosing one candidate over another is security.
If a user is a telecommuter, and therefore connected to their If a user is a telecommuter, and therefore connected to their
corporate network and a local home network, they may prefer their corporate network and a local home network, they may prefer their
voice traffic to be routed over the VPN in order to keep it on the voice traffic to be routed over the VPN in order to keep it on the
corporate network when communicating within the enterprise, but use corporate network when communicating within the enterprise, but use
the local network when communicating with users outside of the the local network when communicating with users outside of the
enterprise. enterprise.
Another criterion for choosing one address over another is topological Another criterion for choosing one address over another is topological
awareness. This is most useful for transport addresses which make awareness. This is most useful for candidates which make use of
use of relays (including TURN and VPN). In those cases, if a client relays (including TURN and VPN). In those cases, if an agent has
has preconfigured or dynamically discovered knowledge of the preconfigured or dynamically discovered knowledge of the topological
topological proximity of the relays to itself, it can use that to proximity of the relays to itself, it can use that to select closer
select closer relays with higher priority. relays with higher priority.
Once the transport addresses have been prioritized, one is selected Finally, the transport protocol itself is a criterion for choosing one
as the default. This is the address that will be used by a peer that candidate over another. If a particular media stream can run over
doesn't understand ICE. The default has no relevance when UDP or TCP, the UDP candidates might be preferred over the TCP
communicating with an ICE capable peer. As such, it is RECOMMENDED candidates. This allows ICE to use the lower latency UDP
that the default be chosen based on the likelihood of that address connectivity if it exists, but fall back to TCP if UDP doesn't work.
being useful when communicating with a peer that doesn't support ICE.
Unfortunately, it is difficult to ascertain which address that might Once the candidates have been prioritized, one is selected as the
be. As an example, consider a user within an enterprise. To reach active one. This is the candidate that will be used for actual
non-ICE capable clients within the enterprise, a local transport exchange of media, until replaced by an updated offer or answer.
address has to be used, since the enterprise policies may prevent Since the ICE connectivity checks can take a few seconds to execute,
media clipping can occur if this candidate doesn't work. The active
candidate will also be used to receive media from ICE-unaware peers.
As such, it is RECOMMENDED that one be chosen based on the likelihood
of that candidate to work with the peer that is being contacted.
Unfortunately, it is difficult to ascertain which candidate that
might be. As an example, consider a user within an enterprise. To
reach non-ICE capable agents within the enterprise, a local candidate
has to be used, since the enterprise policies may prevent
communication between elements using a relay on the public network. communication between elements using a relay on the public network.
However, when communicating to peers outside of the enterprise, a However, when communicating to peers outside of the enterprise, a
TURN-based public address is needed. TURN-based candidate from a publicly accessible TURN server is
needed.
Indeed, the difficulty in picking just one address that will work is Indeed, the difficulty in picking just one address that will work is
the whole problem that motivated the development of this the whole problem that motivated the development of this
specification in the first place. As such, it is RECOMMENDED that specification in the first place. As such, it is RECOMMENDED that
the default address be a TURN derived transport address from a TURN the default address be a TURN candidate from a TURN server providing
server providing public IP addresses. Furthermore, ICE is only truly public IP addresses. Furthermore, ICE is only truly effective when
effective when it is supported on both sides of the session. It is it is supported on both sides of the session. It is therefore most
therefore most prudent to deploy it to close-knit communities as a prudent to deploy it to close-knit communities as a whole, rather
whole, rather than piecemeal. In the example above, this would mean than piecemeal. In the example above, this would mean that ICE would
that ICE would ideally be deployed completely within the enterprise, ideally be deployed completely within the enterprise, rather than
rather than just to parts of it. just to parts of it.
5.3.4 Sending STUN Connectivity Checks 7.4 Connectivity Checks
Once a responder has received an initiate message, or an initiator Once the offer/answer exchange has completed, both agents will have a
has received an accept message, the list of transport addresses is set of candidates for each media stream. Each agent forms a set of
extracted from the message. These transport addresses, called the pairings for each media stream by combining each of its UDP
remote transport addresses, along with the username fragment from the candidates with each of the UDP candidates of its peer, and by
peer (called the remote username fragment), the password from the combining each of its TCP candidates with each of the TCP candidates
peer (called the remote password), and priority from the peer (called of its peer. If candidates for other transport protocols were
the remote priority) are placed into a table called the candidate signaled through the offer/answer exchange, a pairing is performed
table. There is a candidate table for RTP for each media stream, and between each of those as well. If an offer/answer exchange took
for RTCP for each media stream. So, if a session is established with place for a session comprised of an audio and a video stream, and
audio and video, there would be four tables - audio RTP, audio RTCP, each stream had two UDP and two TCP candidates from each agent, there
video RTP and video RTCP. An example of a candidate table for RTP would be 16 pairings, 8 for audio and 8 for video. Each of those
audio is shown below. eight would be comprised of four UDP and four TCP. Note that there
is no requirement that the number of candidates from each peer be the
same. One agent can offer two UDP candidates for a media stream, and
the answer can contain three UDP candidates for the same media
stream. In that case, there would be six UDP pairings.
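The pairing arithmetic in the newer text can be checked with a short sketch. It assumes each candidate is represented as a (transport, label) tuple and that pairings are formed per media stream; the function name form_pairings is hypothetical.

```python
from itertools import product


def form_pairings(native, remote):
    """Pair each native candidate with each remote candidate of the same transport.

    native, remote: lists of (transport, label) tuples for ONE media stream.
    """
    return [(n, r) for n, r in product(native, remote) if n[0] == r[0]]


# One stream where each agent offers two UDP and two TCP candidates.
offerer = [("udp", "o1"), ("udp", "o2"), ("tcp", "o3"), ("tcp", "o4")]
answerer = [("udp", "a1"), ("udp", "a2"), ("tcp", "a3"), ("tcp", "a4")]
per_stream = form_pairings(offerer, answerer)  # four UDP plus four TCP pairings

# Unequal counts: two UDP candidates offered, three in the answer.
uneven = form_pairings(
    [("udp", "o1"), ("udp", "o2")],
    [("udp", "a1"), ("udp", "a2"), ("udp", "a3")],
)
```

Running the same function once for audio and once for video gives the 16 pairings of the example, 8 per stream.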
Remote Remote Remote Remote Each candidate has a number of transport addresses. In the case of
Transport Username Password Priority RTP, there are either one or two. Within the pairing, the transport
Address Fragment addresses of each candidate are linked together one-to-one to form a
-------------------------------------------------------------------- transport address pair. In the case of RTP, the result will either
10.0.1.1:38746 asd9f8f8== 9asfhfvva9==affahnz 0.4 be one or two transport address pairs - one for RTP, and possibly
192.0.2.77:44634 xcyca87sbb f99fhaz0ftrafdgl99d 0.2 another for RTCP. The relationship between a candidate, transport
address, pairing and transport address pair is shown in Figure 2.
This figure shows the pairing as seen by the agent that owns the
candidate {A,B}. The candidate owned by that agent is called the
native candidate, and the one owned by its peer is the remote
candidate. As the figure shows, there is one pairing between two
candidates, and two transport address pairs ({A,C} and {B,D}). If
one of the candidates only had one transport address (in the case
where RTCP was not being used by one agent), there would only be one
transport address pair, {A,C}. Each transport address is associated
with a tid. Furthermore, each transport address pair is associated
with an ID, the transport address pair ID. This ID is equal to the
concatenation of the tid of the native transport address with the tid
of the remote transport address. This means that the identifiers are
different for each agent. For the agent that owns {A,B}, the
transport address pair ID is WY for the first transport address pair,
and XZ for the second. For the agent that owns {C,D}, it would be
reversed - YW for the first transport address pair, and ZX for the
second.
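As a rough illustration of how the transport address pair IDs differ between the two agents, the sketch below links transport addresses one-to-one and concatenates the native tid with the remote tid. The representation (ordered lists of (address, tid) tuples, RTP first) and the name transport_address_pairs are assumptions of this example.

```python
def transport_address_pairs(native, remote):
    """Link transport addresses one-to-one and compute each pair ID.

    native, remote: ordered lists of (address, tid), RTP first and RTCP
    second if present. zip() stops at the shorter list, so a candidate
    without RTCP yields a single transport address pair.
    """
    return [
        {"pair": (n_addr, r_addr), "id": n_tid + r_tid}
        for (n_addr, n_tid), (r_addr, r_tid) in zip(native, remote)
    ]


left = [("A", "W"), ("B", "X")]    # candidate {A,B} with tids W and X
right = [("C", "Y"), ("D", "Z")]   # candidate {C,D} with tids Y and Z

seen_by_left = transport_address_pairs(left, right)    # IDs WY and XZ
seen_by_right = transport_address_pairs(right, left)   # IDs YW and ZX
rtp_only = transport_address_pairs(left, right[:1])    # only the pair {A,C}
```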
Figure 3 ...........................................
The client then creates a new table, called the connection table. . .
There is a row in this table for each gathered address and remote .......... . . ..........
transport address pair. This table has a column for the local . . . ............. ............. . . .
transport address, which is equal to the gathered address if it was a . . . . . . . . . .
usable local transport address, else equal to the associated local . -- . . . -- . . -- . . . -- .
transport address if the gathered address was a derived address. . | A|<<<<<<<<<<| A|--------------------| C|>>>>>>>>>>>>| K| .
There is also a column for the remote transport address, the local . -- . . . -- . Transport . -- . . . -- .
username fragment, the remote username fragment, the remote password . . . . Transport . Address . Transport . . . .
and the state. Each row in this table is called a connection, and it . . . . Address . Pair . Address . . . .
provides information on the connectivity when sending packets from . . . . tid=W . ID=WY . tid=Y . . . .
the local transport address to the remote transport address. . . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. -- . . . -- . . -- . . . -- .
. | J|<<<<<<<<<<| B|--------------------| D|>>>>>>>>>>>>| D| .
. -- . . . -- . Transport . -- . . . -- .
.......... . . Transport . Address . Transport . . ..........
Associated . . Address . Pair . Address . . Associated
Local . . tid=X . ID=XZ . tid=Z . . Local
Transport . . . . . . Transport
Addresses . ............. ............. . Addresses
. Native Remote .
. Candidate Candidate .
. and and .
. Transport Addresses Transport Addresses .
. .
...........................................
There are four possible states for each connection. These states Pairing
are:
INIT: No STUN transaction has been completed towards this remote Figure 2
transport address from this local transport address.
HANDSHAKING: One or more STUN transactions have failed, but The figure also shows that each transport address has an associated
insufficient time has passed since leaving the INIT state to be local transport address. The associated local transport address is
certain that the remote transport address is unreachable from this the local transport address at which the agent will receive packets
local transport address. This state is important for connectivity sent to the transport address. For a local transport address, its
checks made to STUN derived transport addresses through port associated local transport address is the same. That is the case of
restricted NAT. transport address A and D in the diagram. For STUN derived and TURN
derived transport addresses, however, they are not the same. The
associated local transport address is the one from which the STUN or
TURN transport was derived.
BAD: All STUN transactions to this remote transport address from this Next, each agent begins sending connectivity checks for each
local transport address have either timed out, or failed with a transport address pair. The procedure differs for UDP and TCP.
600 response, and a sufficient amount of time has elapsed since
the INIT state to have high confidence that the remote transport
address cannot be reached from this local transport address.
GOOD: A STUN transaction to this remote transport address from this 7.4.1 UDP Connectivity Checks
local transport address was successful.
When the client first populates the tables from the initiate or An agent considers a UDP pairing validated when all of its transport
accept message, all of the connections are set to the INIT state. address pairs have been validated. Each transport address pair is
validated if an agent successfully completed a STUN Binding Request
transaction from its native transport address to the corresponding
remote transport address, and when it has received a STUN Binding
Request transaction on its native transport address, sent from the
remote transport address. This ensures that packets can flow in each
direction.
Consider the following example. An initiator sends an initiate Because validation of a transport address pair involves a STUN
message with one media stream (audio), with two RTP transport transaction in each direction, a pair can be in one of five states -
addresses, 10.0.1.1:38746 (which we denote "A" for shorthand) and unknown, invalid, send-valid, receive-valid and valid. Each
192.0.2.77:44634 (which we denote "B" for shorthand). A is a usable transport address pair starts in the unknown state.
local transport address, and B is a STUN derived transport address
(although that fact is not signaled in the message). The usernames
and passwords for these transport addresses are shown in Figure 3.
The initiate message is sent to the responder. The responder has a
local transport address (10.0.1.76:43443), and a STUN derived
transport address (192.0.2.64:54766) derived from (10.0.1.76:43444).
Call these two local transport addresses X and Y respectively. The
connection table created by the responder would have four rows (two
local transport addresses times two remote transport addresses).
Such a table might look like this:
Remote Local Remote Local Remote Remote 7.4.1.1 Send Validation
Trans. Trans. Username Username Password Priority
Address Address Fragment Fragment State
------------------------------------------------------------------------
A X asd9f8f8== 8asd77fa9 9asfhfvva9==affahnz 0.4 INIT
A Y asd9f8f8== zhff8dga^ 9asfhfvva9==affahnz 0.4 INIT
B X xcyca87sbb 8asd77fa9 f99fhaz0ftrafdgl99d 0.2 INIT
B Y xcyca87sbb zhff8dga^ f99fhaz0ftrafdgl99d 0.2 INIT
The client begins a STUN BindingRequest transaction for each To validate a transport address pair in the send direction, an agent
connection. This STUN transaction is sent to the IP address and port needs to complete a successful STUN Binding Request transaction.
from the Remote Transport Address column. It sends the request from This means it needs to send a Binding Request from its native
the IP address and port in the Local Transport Address column. The transport address to the remote transport address, and receive a
STUN USERNAME attribute MUST be present. It is set to the successful Binding Response back.
concatenation of the remote user fragment with the local user
fragment from the table. Thus, for the candidate with remote For UDP-based transport addresses, an agent initiates a STUN Binding
transport address A and local transport address X, the USERNAME would Request transaction by sending from its native transport address, and
be set to "asd9f8f8==8asd77fa9". The BindingRequest SHOULD contain a sends it to the remote transport address. The meaning of "sending
MESSAGE-INTEGRITY attribute, computed using the username in the from its native transport address" is clear in the case of a local
USERNAME attribute, and the password from the password field in the transport address - the request is sent such that the source IP
row. The BindingRequest MUST NOT contain the CHANGE-REQUEST or address and port of the packet is equal to that local transport
RESPONSE-ADDRESS attribute. address. However, the meaning is different for STUN and TURN derived
transport addresses. For STUN derived transport addresses, it is sent
by sending from the local transport address used to derive that STUN
address. For TURN derived transport addresses, it is sent by using
TURN mechanisms to send the request through the TURN server (using
the SEND primitive). Sending the request through the TURN server
necessarily requires that the request be sent from the client, using
the local transport address used to derive the TURN transport
address.
The Binding Request sent by the agent MUST contain the USERNAME
attribute. This attribute MUST be set to the transport address pair
ID of the corresponding transport address pair as seen by its peer.
Thus, for the first transport address pair in the example above, if
the agent on the left sends the STUN Binding Request, the USERNAME
will have the value YW. The request MAY contain the MESSAGE-
INTEGRITY attribute, computed according to RFC 3489 procedures. The
Binding Request MUST NOT contain the CHANGE-REQUEST or RESPONSE-
ADDRESS attribute.
Each of these STUN transactions will generate either a timeout, or a Each of these STUN transactions will generate either a timeout, or a
response. If the response is a 420, 500, or 401, the client should response. If the response is a 420, 500, or 401, the agent should
try again as described in RFC 3489. Either initially, or after such try again as described in RFC 3489. Either initially, or after such
a retry, the STUN transaction will produce a timeout result, a a retry, the STUN transaction might produce a non-recoverable failure
success result, a fundamentally non-recoverable failure result (error response (error codes 400, 431, or 600) or a failure result
codes 400, 431, or 600) or a failure result inapplicable to this inapplicable to this usage of STUN and thus unrecoverable (432, 433).
usage of STUN and thus unrecoverable (432, 433), or a 430 error. If this happens the transport address pair and its corresponding
These correspond to the "timeout", "success", "error" and candidate is considered invalid. If the STUN transaction produces a
"race-failure" events, respectively. The 430 response code, as 430 error or times out, the client SHOULD retry with a new STUN
described below, is generated when the server doesn't recognize the Binding Request transaction. The 430 response code, as described
STUN username, presumably because the BindingRequest was sent to the below, is generated when the server doesn't recognize the STUN
initiator prior to receipt of the ICE Accept message by the username because the BindingRequest was received prior to the
initiator. Its occurrence is thus a result of a failed race between receipt of the answer. Its occurrence is a result of a failed race
the BindingRequest and Accept message. As the state machine below between the BindingRequest and the answer. This is remedied by
discusses, the client will retry in this case. retrying, which allows the "slower" answer to be received. These
retry transactions carry the same USERNAME value as the original
Binding Request, and differ only in their STUN transaction ID. If
these retries have not produced a success response after Tg seconds,
the transport address pair is considered invalid. Tg SHOULD be
configurable. It is RECOMMENDED that it default to 50 seconds. This
is a reasonable approximation of the maximum SIP transaction
duration.
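The outcome handling described in the newer text can be summarized as a small decision function. This is a sketch, not a normative procedure: the function name, the returned labels, and the choice to measure Tg from the first Binding Request for the pair are assumptions, while the response codes and the 50 second default come from the text.

```python
RETRY_PER_RFC3489 = {420, 500, 401}       # try again as RFC 3489 describes
UNRECOVERABLE = {400, 431, 600, 432, 433}  # pair is considered invalid
TG_DEFAULT = 50.0                          # recommended default for Tg, seconds


def next_action(outcome, elapsed, tg=TG_DEFAULT):
    """Decide what to do after one STUN Binding Request transaction.

    outcome: "success", "timeout", or an integer STUN error code.
    elapsed: seconds since the first check for this pair (assumed Tg start).
    """
    if outcome == "success":
        return "success"
    if outcome in RETRY_PER_RFC3489:
        return "retry-per-rfc3489"
    if outcome in UNRECOVERABLE:
        return "invalid"
    if outcome == "timeout" or outcome == 430:
        # A 430 or timeout may just mean the answer has not arrived yet.
        # Retry with a new transaction (same USERNAME) until Tg runs out.
        return "retry-new-transaction" if elapsed < tg else "invalid"
    raise ValueError(f"unhandled outcome: {outcome!r}")
```

For instance, a 430 received two seconds into the checks leads to a retry, while the same response after Tg has elapsed marks the pair invalid.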
These events are fed into the finite state machine (FSM) described in If the STUN transaction succeeds for a UDP transport address pair
Figure 5. This figure shows the transitions between states that (producing a success response), and the pair was previously in the
occur on the completion of the STUN BindingRequest transaction or receive-valid state, it is considered valid. If the pair was
upon the expiration of timers set by the FSM. previously in the unknown state, it is considered send-valid.
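The five states of a transport address pair in the newer text lend themselves to a small transition sketch. Only the send-direction transitions (unknown to send-valid, receive-valid to valid) are stated in the paragraph above; the receive-direction function is assumed here to be the mirror image, and leaving other states unchanged is likewise an assumption of this example.

```python
from enum import Enum


class PairState(Enum):
    UNKNOWN = "unknown"
    INVALID = "invalid"
    SEND_VALID = "send-valid"
    RECEIVE_VALID = "receive-valid"
    VALID = "valid"


def on_send_success(state: PairState) -> PairState:
    """A Binding Request sent by this agent produced a success response."""
    if state is PairState.RECEIVE_VALID:
        return PairState.VALID
    if state is PairState.UNKNOWN:
        return PairState.SEND_VALID
    return state  # assumption: other states are left unchanged


def on_receive_success(state: PairState) -> PairState:
    """Assumed mirror image: a Binding Request from the peer was received."""
    if state is PairState.SEND_VALID:
        return PairState.VALID
    if state is PairState.UNKNOWN:
        return PairState.RECEIVE_VALID
    return state


def on_unrecoverable_failure(state: PairState) -> PairState:
    """A non-recoverable failure, or retries exhausting Tg, invalidates the pair."""
    return PairState.INVALID
```

Under these assumptions a pair reaches the valid state regardless of which direction is validated first.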
race-failure,.......... If a transport address pair is send-valid or valid, an agent MUST
timeout/ . . .......... generate a new STUN Binding Request transaction every Tr seconds.
Set . . . . Retry Fires/ This transaction ensures that NAT bindings for the transport address
Retry Timer,. V . . Retry pair remain open while the candidate is under consideration. They
+---------+ . +---------+ . can also be used to keep the bindings alive when the candidate is
| | . | | . promoted to active, as described in Section 7.7. Tr SHOULD be
| | .......| |<.... configurable, and SHOULD default to 15 seconds. Each new Binding
| INIT |......................>| HAND | Request transaction is processed according to the procedures in this
| | race-failure, | SHAKING | Section. It is possible for a previously valid candidate to later be
| | timeout/ | | invalidated by a subsequent STUN transaction. This happens in cases
+---------+ Set +---------+ where the NAT bindings expire.
. . Retry Timer, error, . .
. . Giveup Timer Giveup . . 7.4.1.2 Receive Validation
error . . Fires . .
. . ............................. . success As a result of providing a list of candidates in its offer or answer,
. . . . an ICE implementation will receive STUN Binding Request messages. An
. ...C.............................. . agent MUST be prepared to receive STUN Binding Requests on each local
. . success . . transport address from the moment it sends an offer or answer that
contains a candidate with that local transport address. Similarly,
it MUST be prepared to receive STUN Binding Requests on a local
transport address the moment it sends an offer or answer that
contains a STUN or TURN candidate derived from a local candidate
containing that local transport address. It can cease listening for
STUN messages on that local transport address after reliably sending
an updated offer or answer which does not include any candidates
equal to or derived from that local transport address. Here,
"reliably" means that the agent knows that the offer or answer was
received by its peer. This knowledge is based on the protocol
carrying the offer/answer exchanges. In the case of SIP, if the
offer is in an INVITE, the agent knows this was received by its peer
when a 200 OK or reliable provisional response [9] is received with
the answer. If the offer is in a reliable provisional response, the
agent knows it was reliably received when the PRACK arrives. If an
answer is in a 200 OK response, the agent knows this was received
when the ACK is received.
The agent does not need to provide STUN service on any other IP
address or port, unlike the STUN usage described in [1]. The need to
run the service on multiple ports is to support the change flags.
However, those flags are not needed with ICE, and the server SHOULD
reject, with a 400 response, any STUN requests with these flags set.
The CHANGED-ADDRESS attribute in a Binding Response is set to the
transport address on which the server is running.
Furthermore, there is no need to support TLS or to be prepared to
receive SharedSecret request messages. Those messages are used to
obtain shared secrets to be used with BindingRequests. However, with
ICE, a shared secret is not needed. The tid's that are exchanged and
used to form the STUN USERNAME attribute do not actually require the
security properties associated with a shared secret in order for ICE
to operate securely; this is because ICE security is bootstrapped off
of the protocol carrying the offer/answer exchanges.
One of the candidates will be in use as the active candidate. For
the transport addresses comprising that candidate, the agent will
receive both STUN requests and media packets on its associated local
transport addresses. The agent MUST be able to disambiguate them.
In the case of RTP/RTCP, this disambiguation is easy. RTP and RTCP
packets start with the bits 0b10 (v=2). The first two bits in STUN
are always 0b00. This disambiguation also works for packets sent
using Secure RTP [23], since the RTP header is in the clear.
Disambiguating STUN with other media stream protocols may be more
complicated. However, it is always possible, with arbitrarily
high probability, by selecting an appropriately random username (see
below).
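The first-two-bits test described above can be sketched as follows. This is a minimal illustration, not part of the specification; the function name classify_packet and the "unknown" result for other bit patterns are assumptions made for the example.

```python
def classify_packet(datagram: bytes) -> str:
    """Classify a datagram by the two most significant bits of its first byte.

    Hypothetical helper: returns "rtp" for RTP/RTCP (version 2, bits 0b10),
    "stun" for bits 0b00, and "unknown" for anything else.
    """
    if not datagram:
        return "unknown"
    top_two_bits = datagram[0] >> 6
    if top_two_bits == 0b10:
        return "rtp"   # RTP or RTCP; also holds for SRTP, whose header is in the clear
    if top_two_bits == 0b00:
        return "stun"
    return "unknown"
```

For media protocols other than RTP/RTCP this test is not sufficient, as the text notes.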
The STUN Binding Request can only be usefully processed once an
offer/answer exchange has completed. As a result, if an offerer
receives a STUN Binding Request message prior to the receipt of an
answer to its offer, it MUST reject the request with a 430 response.
This will cause the answerer to retry, and give time for the answer
(which is in transit) to arrive at the offerer.
If the offer/answer exchange has completed, the agent MUST follow the
procedures defined in RFC 3489 and verify that the USERNAME attribute
is known to the server. Here, this is done by taking the USERNAME
attribute, and comparing it against the transport address pair
identifiers for each transport address pair as seen by that agent.
If there is no match, the STUN Binding Request generates a 400. If
there is a match, the resulting transport address pair is called the
matching transport address pair. The user agent proceeds with the
processing of the request and generation of a response as per RFC
3489. In addition, if the state of that transport address pair
was previously unknown, it changes to receive-valid. If the state
was previously send-valid, it moves to valid.
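The receive-side checks and the resulting state changes can be sketched as follows, together with the matching send-side transitions from Section 7.4.1.1. This is an illustrative sketch only. The names PairState, after_request_accepted, after_success_response and handle_binding_request are hypothetical, and the sketch assumes that the remaining RFC 3489 processing of a matched request succeeds.

```python
from enum import Enum
from typing import Dict, Optional


class PairState(Enum):
    UNKNOWN = "unknown"
    SEND_VALID = "send-valid"
    RECEIVE_VALID = "receive-valid"
    VALID = "valid"


def after_request_accepted(state: PairState) -> PairState:
    """State after a matching Binding Request was answered with success."""
    if state is PairState.UNKNOWN:
        return PairState.RECEIVE_VALID
    if state is PairState.SEND_VALID:
        return PairState.VALID
    return state


def after_success_response(state: PairState) -> PairState:
    """State after our own Binding Request produced a success response."""
    if state is PairState.UNKNOWN:
        return PairState.SEND_VALID
    if state is PairState.RECEIVE_VALID:
        return PairState.VALID
    return state


def handle_binding_request(username: str,
                           pairs: Dict[str, PairState],
                           exchange_complete: bool) -> Optional[int]:
    """Return the error code to send, or None when the request is accepted.

    `pairs` maps transport address pair IDs, as seen by this agent, to
    their current state; it is updated in place on acceptance.
    """
    if not exchange_complete:
        return 430   # answer not yet received; the peer will retry
    if username not in pairs:
        return 400   # USERNAME matches no transport address pair
    pairs[username] = after_request_accepted(pairs[username])
    return None
```

With pairs = {"WY": PairState.UNKNOWN}, a request carrying USERNAME "WY" after the exchange has completed moves that pair to receive-valid.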
An agent will continue to receive periodic STUN transactions as long
as it has listed its transport address in an a=candidate attribute.
It MUST process those transactions according to this section. It is
possible that a transport address pair that was previously valid may
become invalidated as a result of a subsequent failed STUN
transaction.
7.4.1.3 Learning New Candidates from Connectivity Checks
ICE makes use of candidate addresses learned through protocols like
STUN, as described in Section 7.1. These addresses are learned when
STUN requests are sent to configured STUN servers. However, the
peer-to-peer STUN connectivity checks can themselves provide
additional candidates that ICE can make use of. This happens when
two agents are separated by a symmetric NAT. When the agent behind
the symmetric NAT sends a Binding Request to the other agent (which
can have a public address or be behind any type of NAT except for
symmetric), the symmetric NAT will create a new NAT binding for this
Binding Request. Because of the properties of symmetric NAT, that
binding can be used by the agent on the public side of the symmetric
NAT to send packets back to the agent behind the symmetric NAT.
To do this, ICE agents dynamically learn new candidates by examining
the source IP addresses and MAPPED-ADDRESS attributes in STUN Binding
Requests and Responses respectively. If they don't match any
existing candidates, a new candidate is added. This candidate
corresponds to the new IP address and port created by the symmetric
NAT, and is a new point of contact for the agent behind the symmetric
NAT. Since that candidate is only reachable from the very specific
IP address and port where the STUN request was sent to, the new
candidate is paired up with that transport address on the other
agent. Since all candidates need to have properties, such as tids,
priorities and candidate IDs, these are all computed algorithmically,
so that they can be determined by both agents just from the STUN
message.
The specific procedures on receipt of a Binding Request and Response
for accomplishing this are described here.
7.4.1.3.1 On Receipt of a Binding Request
When a STUN Binding Request is received which generates a success
response, the source IP address and port of that request are compared
to all existing remote transport addresses. If there is no match, the
agent creates a new remote candidate, and adds a transport address to
it. It sets the IP address and port of this new remote transport
address to the IP address and port that was present in the incoming
Binding Request. Since this is a new candidate transport address, it
requires a new tid. The agent creates one algorithmically, by
concatenating the tid of the remote transport address in the matching
transport address pair (recall that the matching transport address
pair is the one whose transport address pair ID matched the username
of the incoming Binding Request) with the string representation of
the source IP address and port from the incoming Binding Request.
This string representation is defined using the grammar for
"hostport" from RFC 3261 [3], which defines the familiar notation of
the IP address and port separated by a colon.
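The tid construction can be sketched as follows. This is an assumption-laden illustration: the function name derive_tid is hypothetical, no separator is placed between the original tid and the address because the text calls only for concatenation, and IPv6 addresses are bracketed as the RFC 3261 "hostport" grammar requires.

```python
import ipaddress


def derive_tid(base_tid: str, ip: str, port: int) -> str:
    """Concatenate an existing tid with the hostport form of an address.

    Hypothetical helper. `base_tid` is the tid of the transport address in
    the matching transport address pair; `ip` and `port` are the source
    address of the Binding Request (or the MAPPED-ADDRESS of a response).
    """
    address = ipaddress.ip_address(ip)      # raises ValueError if malformed
    host = f"[{address}]" if address.version == 6 else str(address)
    return f"{base_tid}{host}:{port}"
```

For example, derive_tid("Y", "192.0.2.88", 5063) yields "Y192.0.2.88:5063". Both agents can compute the same value from the STUN message alone.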
The priority of the new candidate MUST be set to the priority of the
remote candidate in the matching transport address pair. There is no
need to compute the candidate ID for this new candidate.
Though this is a valid transport address, the agent does not pair it
up with each of its own transport addresses. Rather, it pairs it up
only with the native transport address from the matching transport
address pair. This creates a new transport address pair. Since
connectivity has been verified in the receive direction, the agent
sets its state to receive-valid. As with all other transport address
pairs, the agent will attempt to validate send capabilities by
sending a STUN Binding Request according to the procedures in
Section 7.4.1.1.
It is important to note that this process creates a new remote
transport address, not a whole new remote candidate. For a whole
remote candidate to come into existence, all of its component
transport addresses must come into existence, and all must have been
obtained as a result of STUN Binding Requests between transport
address pairs in the same pairing. As an example, consider the
pairing in Figure 2. If the peer is behind a symmetric NAT, the
Binding Request sent from C to A might produce a new remote transport
address for RTP. To create a full candidate, a STUN Binding Request
from D to B has to also create a new remote transport address, to be
used for RTCP. If this were to happen, the resulting set of
relationships is shown in Figure 3. To simplify the diagram,
associated local transport address relationships have been omitted.
Notice how the tids of the new remote candidate have been constructed
by concatenating the tids of the original remote candidate with the
newly discovered transport addresses, here, {R,S}.
............. .............
. . . . . . . .
V V V V . -- . . -- .
+---------+ +---------+ . | A|---------------------------------------| C| .
| | | | . -- -----------+ Transport . -- .
| | | | . Transport . | Address . Transport .
| BAD |. | GOOD | . Address . | Pair . Address .
| | | | . tid=W . | ID=WY . tid=Y .
| | | | . . | . .
+---------+ +---------+ . . | . .
. . | . .
. -- . | . -- .
. | B|-----------C---------------------------| D| .
. -- ---------+ | Transport . -- .
. Transport . | | Address . Transport .
. Address . | | Pair . Address .
. tid=X . | | ID=XZ . tid=Z .
. . | | . .
............. | | .............
| | remote
native | | candidate
candidate | |
| | .............
| | . .
| | . -- .
| +---------------------------| R| .
| Transport . -- .
| Address . Transport .
| Pair . Address .
| ID=WYR . tid=YR .
| . .
| . .
| . .
| . -- .
+-----------------------------| S| .
Transport . -- .
Address . Transport .
Pair . Address .
ID=XZS . tid=ZS .
. .
.............
peer-derived
remote candidate
Figure 5 Figure 3
Starting in the INIT state, if the transaction is successful, the 7.4.1.3.2 On Receipt of a Binding Response
client has verified connectivity to that remote transport address
when sending from that local transport address. This means that
media packets sent in exactly the same way will get through. As
such, the FSM transitions to the GOOD state. If, from the INIT
state, the STUN transaction times out, the FSM enters the HANDSHAKING
state. At this point, there are two likely reasons that the STUN
transaction might have timed out. One reason is that the candidate
is simply unreachable. The other reason is that the peer is behind a
port restricted NAT, and so STUN requests from the client cannot get
through until its peer creates a permission by generating its own
STUN request. It may take some time to generate that STUN request,
as it may depend on a response message getting delivered. It is also
possible that the STUN transaction timed out due to a persistent
network failure, in which case, a retry is in order. As such, the
HANDSHAKING state allows for rapid retry of the STUN transaction
until enough time has passed to be certain that the remote transport
address is actually unreachable. Thus, upon entering the HANDSHAKING
state, two timers are set. The first, called the Rapid Retry timer,
determines how long until the next attempt. This timer SHOULD be
configurable. It is RECOMMENDED that it default to 50ms. Note that
this timer does not mean that a STUN request is repeated every 50ms.
It means that a new STUN transaction begins 50ms after the completion
of the previous one. STUN transactions themselves employ
exponentially back off retransmit timers. The second timer, called
the Giveup Timer, determines how long the client will keep trying
until it decides that the remote transport address is unreachable.
This timer SHOULD be configurable. It is RECOMMENDED that it default
to 50 seconds. This is a reasonable approximation of the maximum SIP
transaction duration.
If, from the INIT state the STUN transaction generates a race-failure When an agent receives a successful Binding Response, it examines the
event, it means that the peer has not yet completed the MAPPED-ADDRESS attribute in that response. If the MAPPED-ADDRESS
initiate/accept exchange, and thus the username has not been does not match any of the existing candidate transport addresses, this
allocated. Another BindingRequest transaction needs to take place to represents a new peer-derived transport address.
try again. Thus, the same retry and giveup timers as in the timeout
event are started.
If, from the INIT state, the STUN transaction generates an error, the The agent creates a new local candidate, and adds a transport address
FSM moves into the BAD state. This state means that the connection to it. It sets the IP address and port of this new native transport
is definitively unreachable, and it will not be used subsequently in address to the IP address and port that was present in the MAPPED-
the session. ADDRESS attribute of the Binding Response. Since this is a new
candidate transport address, it requires a new tid. The agent
creates one algorithmically, by concatenating the tid of the native
transport address in the transport address pair that was being
validated by the Binding Request with the string representation of
the source IP address and port from the MAPPED-ADDRESS attribute.
This string representation is defined using the grammar for
"hostport" from RFC 3261 [3], which defines the familiar notation of
the IP address and port separated by a colon.
If, while in the HANDSHAKING state, the Giveup timer fires, or the The priority of the new candidate MUST be set to the priority of the
STUN transaction results in an error, the client moves into the BAD native candidate that was being validated by the Binding Request.
state. If, while in the HANDSHAKING state, the Rapid Retry timer The agent SHOULD assign a new candidate ID to this candidate.
fires, a new STUN transaction is started. The output of that
transaction will be subsequently fed into the FSM, but upon
initiation of the retry attempt there is no change in state. If the
pending BindingRequest transaction succeeds, the FSM moves into the
GOOD state. This transport connection is viable for communications.
Once one of the connections in the connection table enters the GOOD Though this is a valid transport address, the agent does not pair it
state, the client SHOULD begin using it for communications. It up with each of the remote transport addresses. Rather, it pairs it
SHOULD cease any ongoing transactions and terminate FSMs for up only with the remote transport address from the transport address
connections of lower priority. If, another connection of higher pair that was being validated. This creates a new transport address
priority should subsequently enter the GOOD state, the client SHOULD pair. Since connectivity has been verified in the send direction,
switch to that one, and once more cease all ongoing transactions and the agent sets its state to send-valid. As with all other transport
terminate FSMs for connections of lower priority. It SHOULD perform address pairs, the agent will attempt to validate receive
this switch after waiting a small period of time (2 seconds is capabilities by waiting for a STUN Binding Request according to the
RECOMMENDED) to prevent against quick changes in transport address as procedures in Section 7.4.1.2.
each of the ongoing connectivity checks completes. If there are
multiple GOOD connections whose priorities are equal and higher than
any other GOOD connection, the client SHOULD pick one randomly and
use that. It SHOULD NOT change to another one of equal priority
later on. Each change in address is likely to cause a change in
transport characteristics, and manifest itself as a "glitch" to the
user.
To send media on a connection, the client sends media packets It is important to note that this process creates a new native
(whether they are RTP or RTCP or something else) to the remote transport address, not a whole new candidate. For a whole native
transport address, from the local transport address. candidate to come into existence, all of its component transport
addresses must come into existence, and all must have been obtained
as a result of STUN Binding Requests between transport address
pairs in the same pairing.
5.3.5 Receiving STUN Requests 7.4.2 TCP Connectivity Checks
When a client receives a STUN request (presumably after 7.4.2.1 Connection Establishment
disambiguating it from a media packet), it follows the logic
described in this section.
The client MUST follow the procedures defined in RFC 3489 and verify Because of the connection-oriented nature of TCP, the connectivity
that the USERNAME attribute is known to the server. Here, this is checks work differently. After the offer/answer exchange completes,
done by taking the USERNAME attribute, and doing a prefix match each agent will have a set of TCP candidates at which it is waiting
against the "local username fragment" column in the connection table. to receive a connection, and it will have a similar set from its
If it doesn't match any rows, the client generates a 400 response. peer. Thus, a pairing of TCP candidates allows for the possibility
If it matches one or more rows, the client checks the suffix of the of TCP connections in each direction. Unlike the UDP checks, where
username against the "remote username fragment" column in those the STUN packets are sent from the native transport addresses to the
matching rows. If the final result doesn't match any rows, the remote ones, the TCP connections are not opened from the native TCP
client generates a 430 response. If the final result matches a transport addresses to the remote ones. This would represent a
single row, that row identifies the connection on which the STUN simultaneous open, and represent an unusual condition that would
request was received. The client then proceeds with the processing either fail, or at best result in a single TCP connection. Rather,
of the request and generation of a response as per RFC 3489. ICE desires to attempt two connections, one in each direction, and
use one of them if both happen to succeed.
Once the response is sent, the client examines the source IP and port To accomplish this, each agent will attempt to open a connection to
where the request came from. It matches those against the remote each remote transport address in the transport address pair, and do
transport addresses of the matching connection from the previous so "from" its native transport address. Here, however, "from" means
paragraph. If they don't match, and that remote transport address is something different than the UDP case. If the native transport
not elsewhere in the table, this source transport address is itself address is a local transport address, the agent opens the TCP
another possible candidate. As with other candidates, it must be connection from the same IP interface used to obtain the local
associated with a STUN remote username fragment, remote password and transport address, but from a different and ephemeral port. Indeed,
remote priority. These are obtained from the values of these columns that port MUST NOT be the same as the port in the local transport
for the matching connection in the table. This candidate is then address. If the native transport address is a TURN-derived TCP
paired with each local transport address, and the resulting set of transport address, no attempt is made to open a connection at all.
connections are added to the connection table and verified using STUN TURN-derived TCP transport addresses can only be used in passive
connectivity checks as per Section 5.3.4. mode.
When will the source transport address of the BindingRequest not As such, for each TCP transport address pair, there will be either
match an existing candidate remote transport address? This happens zero, one, or two connection attempts. If both transport addresses in
when there is a NAT between the peers which is not on the path the pair are TURN-derived, there will be zero (both sides passive).
between each peer and the UNSAF servers. If one of the transport addresses is local, and the other TURN
derived, there will be one connection attempt. The agent owning the
local transport address will be in active mode, and the agent owning
the TURN-derived one will be in passive mode. If both are local
transport addresses, there will be two attempts, and each agent will
act in active mode.
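The rule above amounts to counting the local transport addresses in the pair, since only the owner of a local address acts in active mode. A minimal sketch, assuming only the two address kinds discussed here; the function name and the "local"/"turn" labels are hypothetical:

```python
def tcp_connection_attempts(native_kind: str, remote_kind: str) -> int:
    """Number of TCP connection attempts for one transport address pair.

    Each kind is "local" or "turn" (TURN-derived). An agent owning a
    local address connects actively; a TURN-derived address is passive.
    """
    for kind in (native_kind, remote_kind):
        if kind not in ("local", "turn"):
            raise ValueError(f"unsupported transport address kind: {kind!r}")
    return sum(1 for kind in (native_kind, remote_kind) if kind == "local")
```

Two TURN-derived addresses give zero attempts, one local address gives one, and two local addresses give two.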
5.3.6 Management of Resources Because a transport address pair can produce multiple connections,
validity becomes a property of the TCP connection itself. A
transport address pair is considered valid if at least one valid
connection has been established within it. An entire pairing is
valid if all transport address pairs are valid.
The beginning of a multimedia session results in the creation of 7.4.2.2 Sending STUN Binding Requests
several resources to support ICE. These include gathered addresses,
both local and derived, along with the local STUN servers that run on
the local addresses. These resources must be maintained and
eventually freed.
It is RECOMMENDED that all gathered addresses be retained for the Once the connection is established, the agent which opened the
duration of the session. Even if they are not used initially, this connection (that is, acted in active mode) sends a STUN Binding
allows them to be used later in the session should conditions change, Request over that connection. STUN Binding Requests as described in
requiring a signaling operation to update the set of candidate RFC 3489 are normally sent over UDP, but when used in conjunction
addresses. Maintaining these resources depends on the type of with ICE for connectivity checks, they are sent over TCP.
resource. For a local transport address, nothing is required. The
socket is maintained until freed by the ICE application. For STUN
derived transport addresses, the bindings in the NAT for that address
need to be maintained. If the derived transport address is used by
the peer for media, the media itself serves to keep the bindings
alive (see Section 5.3.7). A client can determine that a STUN
derived transport address was used for media when the RTP packet
arrives at the associated local transport address. For the other
STUN derived transport addresses, the client SHOULD periodically
generate STUN transactions to the STUN server. Every 20 seconds is
RECOMMENDED.
For TURN derived transport addresses, the bindings in the NAT along This unusual operation requires some explanation. At first glance, a
with the mappings in the TURN server need to be maintained. Media successful TCP connection ought to be sufficient. Clearly,
traffic itself can accomplish that. The client will know that its connectivity is established, as TCP packets were exchanged in both
TURN derived transport address is in use when an RTP packet arrives directions via the TCP handshake. While that is true, the STUN
at the associated local transport address. For other TURN derived Binding Requests serve many purposes, only one of which is to
transport addresses, the TURN keepalive mechanisms SHOULD be used. literally test connectivity. The STUN requests also serve as a
correlation vehicle, allowing the agent to match the source of a
connection attempt with the offer/answer signaling driving the entire
mechanism. For example, in the case of a forked SIP INVITE carrying
an offer, the UAC may receive two connection attempts to each of its
passive TCP addresses, one from each branch of the fork. These are
readily disambiguated by the STUN Binding Request which will follow,
as the tid in the USERNAME tells the UAC which branch has initiated
the connection.
Once the STUN servers are started on the local transport addresses, More importantly, however, the STUN Binding Request is an essential
they MUST run until a valid media packet is detected on that part of the security properties of ICE. Without it, an entity
transport address. Once a media packet is received, it signals that eavesdropping the signaling messages would be able to deny service or
the peer has completed its connectivity checks and has decided to use hijack media connections, and such attacks would require encryption
that transport address (or the derived transport address, as the case of the offer/answer exchanges (using a mechanism like SIPS [3]) to
may be) for media communications. While the server is running, it prevent. However, when a STUN Binding Request exchange is added,
MUST act as a normal STUN server, but MUST only accept STUN requests these attacks are completely foiled without the need for SIPS,
from clients that authenticate, as discussed below in Section 5.3.5 raising the overall security of ICE substantially with minimal cost.
These properties of ICE are discussed thoroughly in Section 12.
5.3.7 Binding Keepalives As such, once an agent has actively opened a TCP connection to the
remote agent, it sends a STUN Binding Request over that connection.
Recall that STUN messages include length indicators, allowing them to
be framed over a connection-oriented transport protocol. The Binding
Request MUST contain the USERNAME attribute. This attribute MUST be
set to the transport address pair ID of the corresponding transport
address pair as seen by its peer. Thus, for the first transport
address pair in Figure 2, if the agent on the left sends the STUN
Binding Request, the USERNAME will have the value YW. The request
MAY contain the MESSAGE-INTEGRITY attribute, computed according to
RFC 3489 procedures. The Binding Request MUST
NOT contain the CHANGE-REQUEST or ANSWER-ADDRESS attribute. The STUN
BindingRequest message SHOULD NOT be retransmitted over the
connection.
Once the STUN connectivity checks complete, STUN packets are no The STUN transaction will generate either a timeout, or a response. If the
longer used. However, bindings in intermediate NATs need to be kept response is a 420, 500, or 401, the agent should try again as
alive so that the media can continue to flow. Doing so is the described in RFC 3489. Either initially, or after such a retry, the
responsibility of the media protocol. STUN transaction might produce a non-recoverable failure response
(error codes 400, 431, or 600) or a failure result inapplicable to
this usage of STUN and thus unrecoverable (432, 433). If this
happens the connection is considered invalid. If the STUN
transaction produces a 430 error or times out, the client SHOULD
retry with a new STUN Binding Request transaction. The 430 response
code is a result of a failed race between the BindingRequest and the
answer. This is remedied by retrying, which allows the "slower"
answer to be received. These retry transactions carry the same
USERNAME value as the original Binding Request, and differ only in
their STUN transaction ID. If these retries have not produced a
success response after Tg seconds, the connection is considered
invalid. Tg SHOULD be configurable. It is RECOMMENDED that it
default to 50 seconds. This is a reasonable approximation of the
maximum SIP transaction duration.
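The outcome handling described above can be summarized in a small decision function. This is a sketch under stated assumptions: the function name, the string labels it returns, and the choice to raise an error for response codes the text does not mention are all illustrative.

```python
from typing import Union

RETRY_PER_RFC3489 = frozenset({420, 500, 401})
UNRECOVERABLE = frozenset({400, 431, 600, 432, 433})


def next_action(result: Union[int, str], elapsed: float, tg: float = 50.0) -> str:
    """Decide what to do after a STUN transaction on a TCP connection.

    `result` is "success", "timeout", or a numeric error code; `elapsed`
    is the number of seconds spent retrying so far; `tg` is the give-up
    time Tg (50 seconds is the recommended default).
    """
    if result == "success":
        return "valid"
    if result in RETRY_PER_RFC3489:
        return "retry-per-rfc3489"
    if result in UNRECOVERABLE:
        return "invalid"
    if result == 430 or result == "timeout":
        # Retry with a new transaction (same USERNAME, new transaction ID)
        # until Tg seconds have passed without a success response.
        return "invalid" if elapsed >= tg else "retry-new-transaction"
    raise ValueError(f"outcome not covered by this sketch: {result!r}")
```

A 430 received two seconds into the checks therefore leads to a new transaction, while the same response after 50 seconds marks the connection invalid.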
In the case of RTP, the RTP packets themselves normally come If the STUN Binding Request generates a successful response, the
sufficiently quickly to keep the bindings alive. However, several connection over which it was sent is considered valid. Furthermore,
cases merit further discussion. Firstly, in some RTP usages, such as the agent stores the IP address and port from the MAPPED-ADDRESS
SIP, the media streams can be "put on hold". This is accomplished by response in the STUN Binding Response. This is called the "apparent"
using the SDP "sendonly" or "inactive" attributes, as defined in RFC native transport address for the active side of the connection. It
3264 [4]. RFC 3264 directs implementations to cease transmission of will be used later if this connection is used for media transport.
media in these cases. However, doing so may cause NAT bindings to
timeout, and media won't be able to come off hold.
As such, clients SHOULD instead send a media packet periodically, Once a connection is valid, the agent which initiated the connection
independent of whether the stream is "sendonly", "recvonly" or MUST generate a new STUN Binding Request transaction every Tr
"inactive". At least once every 20 seconds is RECOMMENDED. These seconds. This transaction ensures that NAT bindings for the
packets can be sent using any of the payload formats listed by the connection remain open while the connection is under consideration as
peer in its SDP. For audio streams, It is RECOMMENDED that a candidate. Tr SHOULD be configurable, and SHOULD default to 15
implementations support the RTP payload format for comfort noise [5], seconds. Each new Binding Request transaction is processed according
which makes a good choice. For video codecs, a minimally coded frame to the procedures in this section. It is possible for a previously
is a good choice. valid candidate to later be invalidated by a subsequent STUN
transaction. This happens in cases where the NAT bindings expire.
Note that, unlike the UDP case, STUN is sent only while a connection
is not active for media. If the connection is used as the active
connection for media, STUN MUST NOT be sent.
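The keepalive rule for TCP connections can be sketched as a single predicate. The function and parameter names are hypothetical, and the sketch assumes the caller tracks the time since the last Binding Request on the connection.

```python
def should_send_keepalive(initiated: bool,
                          valid: bool,
                          active_for_media: bool,
                          seconds_since_last: float,
                          tr: float = 15.0) -> bool:
    """True when a new Binding Request transaction is due on a connection.

    Only the agent that initiated a valid connection sends, every Tr
    seconds, and never once the connection is active for media.
    """
    if active_for_media:
        return False
    return initiated and valid and seconds_since_last >= tr
```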
Secondly, some RTP payload formats, such as the payload format for 7.4.2.3 Receiving STUN Requests
text conversation [19], may send packets so infrequently that the
interval exceeds the NAT binding timeouts. In such cases, the
implementation should send some any kind of content, if possible. If
the payload type doesn't allow anything meaningful to be sent, even a
malformed RTP packet is superior to nothing at all; the malformed
packet would be rejected by the peer, and have the side effect of
keeping the NAT bindings open.
6. Running STUN on Derived Transport Addresses When an agent acted as the passive side of a TCP connection, it will
receive a STUN Binding Request over that connection.
One of the seemingly bizarre operations done during the ICE One of the candidates will be in use as the active candidate. For
processing is the transmission of a STUN request to a transport the transport addresses comprising that candidate, the agent will
address which is obtained through TURN or STUN itself. This actually receive both STUN requests and media packets on its associated local
does work, and in fact, has extremely useful properties. The transport addresses. The agent MUST be able to disambiguate them.
subsections below go through the detailed operations that would occur In the case of RTP/RTCP, this disambiguation is easy. RTP and RTCP
at each point to demonstrate correctness and the properties derived packets start with the bits 0b10 (v=2). The first two bits in STUN
from it. They are tutorial in nature. are always 0b00. This disambiguation also works for packets sent
using Secure RTP [23], since the RTP header is in the clear.
Disambiguating STUN with other media stream protocols may be more
complicated. However, it is always possible, with arbitrarily
high probability, by selecting an appropriately random username (see
below).
6.1 STUN on a TURN Derived Transport Address The STUN Binding Request can only be usefully processed once an
offer/answer exchange has completed. As a result, if an offerer
receives a STUN Binding Request message prior to the receipt of an
answer to its offer, it MUST reject the request with a 430 response.
This will cause the answerer to retry, and give time for the answer
(which is in transit) to arrive at the offerer.
+----------+ If the offer/answer exchange has completed, the agent MUST follow the
| |192.0.2.1:26524 procedures defined in RFC 3489 and verify that the USERNAME attribute
| TURN X is known to the server. Here, this is done by taking the USERNAME
| Server | attribute, and comparing it against the transport address pair
| | identifiers for each transport address pair as seen by that agent.
| | If there is no match, the STUN Binding Request generates a 400. If
+----------+ there is a match, the resulting transport address pair is called the
192.0.2.1:7764. ^192.0.2.1:7764 matching transport address pair. The user agent proceeds with the
. . processing of the request and generation of a response as per RFC
. .192.0.2.88:5063 3489. In addition, the agent stores the source IP address and port
+----------+ of the Binding Request, and associates it with the connection. This
| NAT | address is called the "apparent" remote transport address for this
+----------+ connection.
TURN . .
Response . . TURN Request
. .
10.0.1.1:8866 V .10.0.1.1:8866
+----------+ +----------+
| | | |
| Client | | Client |
| | | |
| A | | B |
| | | |
+----------+ +----------+
Figure 6 An agent will continue to receive periodic STUN transactions as long
as it had listed its transport address in an a=candidate attribute.
It MUST process those transactions according to this section. It is
possible that a transport address pair that was previously valid may
become invalidated as a result of a subsequent failed STUN
transaction.
Consider a client A that is behind a NAT, shown in Figure 6. It Note that, unlike the UDP case, there will never be simultaneous
connects to a TURN server on the public side of the NAT. To do that, transmission of media and STUN packets over TCP connections. This is
A binds to a local transport address, say 10.0.1.1:8866, and then because the connection is listed as on hold according to comedia
sends a TURN request to the TURN server. The NAT translates the procedures, and no media will be transmitted. ICE will establish the
net-10 address to 192.0.2.88:5063. Assume that the TURN server is connections as described here. Once established, an updated offer/
running on 192.0.2.1 and listening for TURN traffic on port 7764. answer exchange can promote those connections to active usage through
The TURN server allocates a derived transport address 192.0.2.1:26524 the comedia "exist" mechanism, as described below. The additional
to the client (shown as the X on the TURN server in the diagram), and offer/answer exchange provides a barrier synchronization point at
returns it in the TURN response. Remember that all traffic from the which a TCP connection switches from ICE control to control by the
TURN server to the client is sent from 192.0.2.1:7764 to media source and sinks. Once it is active, STUN packets will no
10.0.1.1:8866, including the TURN response. longer be sent on the connection.
Now, the client runs a STUN server on 10.0.1.1:8866, and advertises 7.5 Promoting a Valid Candidate to Active
that its server actually runs on 192.0.2.1:26524. Another client, B,
sends a STUN request to this server. It sends it from a local
transport address, 192.0.2.77:1296. When it arrives at
192.0.2.1:26524, it is discarded since client A has not sent a packet
to 192.0.2.77:1296. Once client A gets client B's accept message, it
will learn about B's candidate address, and generate a STUN request
towards it. This results in a permission being installed in the TURN
server, so that packets from 192.0.2.77:1296 will now be accepted.
The next STUN request from client B will therefore succeed. This is
the normal mode of operations for port restricted NAT; as described
in TURN, the server turns a symmetric NAT into a port restricted one
[8].
+----------+ 7.5.1 Minimum Requirements
| |192.0.2.1:26524 STUN Request
| TURN X<...............................
| Server | STUN Response .
| |......................... .
| |192.0.2.1:26524 . .
+----------+ . .
192.0.2.1:7764 . ^ 192.0.2.1:7764 . .
. . . .
192.0.2.88:5063 V . 192.0.2.88:5063 . .
+----------+ . .
| NAT | . .
+----------+ . .
192.0.2.1:7764 . ^ 192.0.2.1:7764 . .
. . 192.0.2.77:1296 .
. . . .
10.0.1.1:8866 V . 10.0.1.1:8866 V .192.0.2.77:1296
+----------+ +----------+
| | | |
| Client | | Client |
| | | |
| A | | B |
| | | |
+----------+ +----------+
Figure 7 As the STUN connectivity checks run, they will result in the
validation of pairings. Once validated, a pairing can be used by
promoting it to active. This promotion occurs by placing the
transport addresses for the native candidate of the pairing into the
m/c line and sending an updated offer. It MAY promote a candidate
associated with any validated pairing at any time, as long as the
candidate had been provided in the series of a=candidate attributes in
the most recent offer (in other words, an agent can't validate a
candidate, omit that candidate from the a=candidate attribute of an
offer, and then later on, generate a new offer that promotes the
candidate to active). The procedures for doing so are described
here.
As shown in Figure 7, client B will retry, sending its STUN request Any candidates which the agent would like to retain as valid
from 192.0.2.77:1296 to 192.0.2.1:26524. This successful STUN candidates are also included in a=candidate lines in the offer. It
request is forwarded to the client, sent with a source address of SHOULD include any candidates learned from the peer-to-peer discovery
192.0.2.1:7764 and a destination address of 192.0.2.88:5063. This processing of Section 7.4.1.3, and SHOULD include any candidates of
passes through the NAT, which rewrites the destination address to higher priority than the one just promoted to active. It SHOULD omit
10.0.1.1:8866. This arrives at A's STUN server. The server observes candidates of lower priority than the one being promoted to active.
the source address of 192.0.2.1:7764, and generates a STUN response It SHOULD omit any for whom all pairings that include that candidate
containing this value in the MAPPED-ADDRESS attribute. The STUN have become invalid.
response is sent with a source address of 10.0.1.1:8866, and a
destination of 192.0.2.1:7764. This arrives at the TURN server,
which, because of current destination is 192.0.2.1:7764, sends the
STUN response with a source address of 192.0.2.1:26524 and
destination of 192.0.2.77:1296, which is B's STUN client.
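The SHOULD-level rules for choosing which candidates to list in the
updated offer (Section 7.5.1) can be sketched as follows. This is a
minimal illustration under stated assumptions: the Candidate class and
the function candidates_for_offer are hypothetical, candidates of equal
priority to the promoted one are treated as omitted, and a candidate
whose pairings have all become invalid is omitted even if it was
learned through peer-to-peer discovery. The draft does not settle
either of those two cases.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    cid: str                            # hypothetical candidate identifier
    priority: float
    peer_derived: bool = False          # learned from connectivity checks
    all_pairings_invalid: bool = False  # every pairing using it is invalid


def candidates_for_offer(candidates: List[Candidate],
                         promoted: Candidate) -> List[Candidate]:
    """Return the candidates to list in a=candidate lines of the new offer."""
    kept = []
    for cand in candidates:
        if cand.cid == promoted.cid:
            kept.append(cand)           # the newly active candidate is listed
        elif cand.all_pairings_invalid:
            continue                    # SHOULD omit: nothing left to validate
        elif cand.peer_derived or cand.priority > promoted.priority:
            kept.append(cand)           # SHOULD include
        # otherwise: lower (or equal) priority, SHOULD omit
    return kept
```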
Now, as far as A is concerned, it has obtained a new candidate If a candidate is omitted, and that candidate was a TURN-derived
transport address of 192.0.2.1:7764. And indeed, it has! STUN transport address, the agent SHOULD de-allocate the address from the
derived transport addresses are scoped to the session, so they can TURN server. If a local candidate was omitted, along with all of its
only be used by the peer in the session. Furthermore, that peer has derived transport addresses, local operating system resources for
to send requests from the socket on which the STUN server was that candidate SHOULD be de-allocated.
running. In this case, A is the peer, and its STUN server was on
10.0.1.1:8866. If it sends to 192.0.2.1:7764, the packet goes to the
TURN server, and since the destination address is set to
192.0.2.77:1296, is forwarded to B, and specifically, is forwarded to
the transport address B sent the STUN request from. Therefore, the
address is indeed a valid candidate transport address. Its priority
is derived from the priority of client B's public IP address.
The benefit of this is that it allows two clients to share the same Once it has decided on the set of candidates to provide in the
TURN server for media traffic in both directions. With "normal" TURN updated offer, the agent constructs the offer and follows the
usage, both clients would obtain a derived address from their own procedures in Section 7.6 which defines general subsequent offer/
TURN servers. The result is that, for a single call, there are two answer processing.
bindings allocated by each side from their respective servers, and
all four are used. With ICE, that drops to two bindings allocated
from a single server. Of course, all four bindings are allocated
initially. However, once one of the clients begins receiving media
on its STUN derived address, it can deallocate its TURN resources.
6.2 STUN on a STUN Derived Transport Address 7.5.2 Suggested Algorithm
Consider a client A that is behind a NAT. It connects to a STUN ICE leaves substantial variability to implementors around when an
server on the public side of the NAT. To do that, A binds to a local agent decides to generate a new offer. However, there are good ways
transport address, say 10.0.1.1:8866, and then sends a STUN request to do this, and bad ways. Perhaps the worst algorithm possible would
to the STUN server. The NAT translates the net-10 address to be to generate a new offer every time a candidate with higher
192.0.2.88:5063. Assume that the STUN server is running on 192.0.2.1 priority than the active one becomes valid. This algorithm will
and listening for STUN traffic on port 3478, the default STUN port. likely result in a large number of offer/answer exchanges in rapid
The STUN server sees a source IP address of 192.0.2.88:5063, and succession, many of which will produce "glare" as each agent will
returns that to the client in the STUN response. The NAT forwards independently initiate an exchange. This will consume CPU and
the response to the client. network resources for little benefit. Rather, the ideal algorithm
strikes a balance between usage of network resources and the desire
to use the ideal pair of candidates.
Now, the client runs a STUN server on 10.0.1.1:8866, and advertises The following algorithm provides a good tradeoff, and usage of this
that its transport address is 192.0.2.88:5063. Another client, B, algorithm is RECOMMENDED. The algorithm results in a bounded number
sends a STUN request to this address. It sends it from a local of additional offer/answer exchanges after the initial one - never
transport address, 192.0.2.77:1296. When it arrives at more than two, and frequently one or zero. The algorithm almost
192.0.2.88:5063 (on the NAT), the NAT rewrites the source address to never produces a glare condition.
10.0.1.1:8866, assuming that it is of the full-cone or restricted
variety [1], and the permission for 192.0.2.77:1296 is open. This
arrives at A's local STUN server. The server observes the source
address of 192.0.2.77:1296, and generates a STUN response containing
this value in the MAPPED-ADDRESS attribute. The STUN response is
sent with a source address of 10.0.1.1:8866, and a destination of
192.0.2.77:1296. This arrives at B's STUN client.
Now, as far as A is concerned, the STUN request had a source Once the initial offer/answer exchange completes, media flow will
transport address which was already known to A, presumably from an happen, though not optimally (where optimal is defined by the
ICE exchange. As far as B is concerned, the check succeeded, and the policies used to set the priorities of the candidates), as long as
address is viable. the candidate that is active has been validated. Thus, the objective
of the algorithm is to quickly make sure that there is a valid path
for media (to avoid clipping), and then do a single offer/answer
exchange to use the highest priority pairing that was validated.
7. XML Schema for ICE Messages After the initial offer/answer exchange, each agent sets a timer Tu.
This timer SHOULD have a configurable baseline value, which SHOULD
default to 3 seconds. The actual timer is set to this baseline, plus
a time value chosen uniformly between -1 and 1 seconds. This causes
the actual timer to be randomized so that the timer doesn't fire
simultaneously at each agent. In addition, each agent monitors the
status of the active pairing. If the active media stream is UDP-
based, the status of the active candidates is equal to the status of
the pairing with matching transport addresses. In the case of TCP-
based media, the active media stream is never active initially, since
it always begins with the "holdconn" state.
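The randomized timer Tu can be sketched in Python as shown here. The
function name initial_tu and its parameters are illustrative
assumptions; only the 3 second default baseline and the uniform offset
between -1 and 1 seconds come from the text.

```python
import random
from typing import Optional


def initial_tu(baseline: float = 3.0,
               rng: Optional[random.Random] = None) -> float:
    """Return the Tu timer value in seconds: baseline plus uniform jitter.

    The rng parameter is a hypothetical hook so that callers (and tests)
    can supply a seeded generator.
    """
    if rng is None:
        rng = random.Random()
    # Jitter keeps the two agents' timers from firing at the same moment.
    return baseline + rng.uniform(-1.0, 1.0)
```

With the default baseline, the value always falls between 2 and 4
seconds.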
This section contains the XML schema used to define the initiate and If, when Tu fires, the active pairing has not been validated, and
accept messages. Any protocol that uses ICE needs to map the there exists at least one pairing that has been validated, the agent
parameters defined here into its own messages. generates a new offer. This offer promotes its highest priority
candidate with a validated pairing to the active candidate. If there
are no pairings that have been validated when the timer fires, the
agent waits until one is validated, and once that happens, sets a
timer to fire randomly between 0 and 2 seconds. When the timer
fires, a new offer is generated that promotes the candidate from this
validating pairing to active. If the active pairing is validated
when the timer fires, the agent does nothing at this time.
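The decision an agent makes when Tu fires can be sketched as follows.
This is an illustration only: the Pairing class, the function
on_tu_fired and the action labels are hypothetical names, and the
follow-up timer (0 to 2 seconds after the first validation) is left to
the caller.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Pairing:
    native_candidate: str    # hypothetical identifier of the native candidate
    native_priority: float
    validated: bool
    active: bool = False     # True for the pairing currently in the m/c line


def on_tu_fired(pairings: List[Pairing]) -> Tuple[str, Optional[str]]:
    """Return (action, candidate to promote) when the Tu timer fires."""
    active = next((p for p in pairings if p.active), None)
    if active is not None and active.validated:
        return ("do_nothing", None)            # active pairing already works
    validated = [p for p in pairings if p.validated]
    if validated:
        best = max(validated, key=lambda p: p.native_priority)
        return ("send_offer", best.native_candidate)
    # Nothing validated yet: wait, then set a timer of 0 to 2 seconds
    # once the first pairing validates.
    return ("wait_for_validation", None)
```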
Note that STUN allows both the username and password to contain the If a new offer is to be sent, the agent includes the new active
space character. However, usernames and passwords used with ICE candidate in the a=candidate attribute list. It also includes all
cannot contain the space. candidates with higher priority than the one that is active,
including ones it learned from the connectivity checks themselves.
<?xml version="1.0" encoding="UTF-8"?> At this point, media is flowing successfully, since a valid candidate
<xs:schema targetNamespace="urn:ietf:params:xml:ns:ice" is active. However, it may not be optimal. So, the next stage of
xmlns:tns="urn:ietf:params:xml:ns:ice" the algorithm is to let the connectivity checks continue. If those
xmlns:xs="http://www.w3.org/2001/XMLSchema" checks indicate that a pairing between the two highest priority
elementFormDefault="qualified" attributeFormDefault="unqualified"> candidates from both agents has been validated, each agent sets a
<xs:import namespace="http://www.w3.org/XML/1998/namespace" timer whose value is randomly set between 0 and 2 seconds. When the
schemaLocation="http://www.w3.org/2001/xml.xsd"/> timer fires, a new offer is generated that promotes the candidate
<xs:element name="message" type="tns:message"/> from this validating pairing to active. Otherwise, when the
<xs:complexType name="message"> connectivity checks have all concluded, such that no pairing exists
<xs:annotation> in the invalid state, each agent sets a timer whose value is randomly
<xs:documentation>This is the root element, which holds a set between 0 and 2 seconds. When the timer fires, a new offer is
media-streams element.</xs:documentation> generated that promotes the candidate from the valid pairing with the
</xs:annotation> highest priority to active.
<xs:sequence>
<xs:element name="media-streams" type="tns:media-streams"/>
</xs:sequence>
<xs:attribute name="type" type="tns:msg-type" use="required"/>
</xs:complexType>
<xs:complexType name="media-streams">
<xs:sequence>
<xs:element name="media-stream" minOccurs="0" maxOccurs="unbounded">
<xs:annotation>
<xs:documentation>There are zero or more media stream
elements. Each defines attributes for a specific media
stream.</xs:documentation>
</xs:annotation>
<xs:complexType>
<xs:sequence>
<xs:element name="default-addresses">
<xs:complexType>
<xs:sequence>
<xs:element name="ipv4-address" type="tns:rtp-info" minOccurs="0"/>
<xs:element name="ipv6-address" type="tns:rtp-info" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:sequence>
<xs:element name="candidate" minOccurs="0" maxOccurs="unbounded">
<xs:annotation>
<xs:documentation>Each candidate is a possible point
of media reception.</xs:documentation>
</xs:annotation>
<xs:complexType>
<xs:complexContent>
<xs:extension base="tns:transport-data">
<xs:attribute name="preference" type="xs:double" use="required"/>
<xs:attribute name="id" type="xs:string" use="required"/>
</xs:extension>
</xs:complexContent>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:simpleType name="msg-type">
<xs:restriction base="xs:string">
<xs:enumeration value="initiate"/>
<xs:enumeration value="accept"/>
<xs:enumeration value="modify"/>
</xs:restriction>
</xs:simpleType>
<xs:complexType name="transport-data">
<xs:sequence>
<xs:element name="rtp-address" type="tns:transport-address"/>
<xs:element name="rtcp-address" type="tns:transport-address" minOccurs="0"/>
<xs:element name="rtp-stun-info" type="tns:stun-info"/>
<xs:element name="rtcp-stun-info" type="tns:stun-info" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="transport-address">
<xs:sequence>
<xs:element name="ip-address" type="xs:string"/>
<xs:element name="port">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:minInclusive value="1"/>
<xs:maxInclusive value="65535"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="stun-info">
<xs:sequence>
<xs:element name="username-fragment"/>
<xs:element name="password"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="rtp-info">
<xs:sequence>
<xs:element name="rtp-address" type="tns:transport-address"/>
<xs:element name="rtcp-address" type="tns:transport-address"
minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:schema>
8. Example 7.6 Subsequent Offer/Answer Exchanges
In the example that follows, messages are labeled with "message name An offer/answer exchange within a session can occur at any time,
A,B" to mean a message from transport address A to B. For STUN whether it is the result of the algorithm described in Section 7.5.2,
Requests, this is followed by curly brackets enclosing the username or because one of the agents wishes to add or remove a media stream,
and password. For STUN responses, this is followed by square or add a codec, and so on.
brackets and the value of MAPPED ADDRESS. The example shows a flow
of two clients where one is behind a full cone NAT, and the other is
on the public Internet.
A NAT STUN B 7.6.1 Sending of an Offer
|(1) STUN Req P1,STUN-PUBLIC | |
|---------------->| | |
| |(2) STUN Req U, STUN-PUBLIC |
| |---------------->| |
| |(3) STUN Res STUN-PUBLIC, U [U] |
| |<----------------| |
|(4) STUN Res STUN-PUBLIC, P1 [U] | |
|<----------------| | |
|(5) Initiate {P2,ufrag1A,pass1A,q=0.4} |
|{U,ufrag2A,pass2A,q=0.4} | |
|---------------------------------------------------->|
| | |(6) STUN Req P3,STUN-PUBLIC
| | |<----------------|
| | |(7) STUN Res STUN-PUBLIC,P3 [P3]
| | |---------------->|
|(8) Accept {P3,ufrag1B,pass1B,q=0.4} |
|<----------------------------------------------------|
| |(9) STUN Req P3,P2 |
| |(ufrag1Aufrag1B,pass1A) |
| |<----------------------------------|
| |Timeout | |
| |(10) STUN Req P3,U |
| |(ufrag2Aufrag1B,pass2A) |
| |<----------------------------------|
|(11) STUN Req P3,P1 | |
|(ufrag2Aufrag1B,pass2A) | |
|<----------------| | |
|(12) STUN Res P1,P3 [P3] | |
|---------------->| | |
| |(13) STUN Res U,P3 [P3] |
| |---------------------------------->|
|(14) STUN Req P2,P3 | |
|(ufrag1Bufrag1A,pass1B) | |
|---------------->| | |
| |(15) STUN Req W,P3 |
| |(ufrag1Bufrag1A,pass1B) |
| |---------------------------------->|
| |(16) STUN Res P3,W [W] |
| |<----------------------------------|
|(17) STUN Res P3,P2 [W] | |
|<----------------| | |
|(18) STUN Req P1,P3 | |
|(ufrag1Bufrag2A,pass1B) | |
|---------------->| | |
| |(19) STUN Req U,P3 |
| |(ufrag1Bufrag2A,pass1B) |
| |---------------------------------->|
| |(20) STUN Res P3,U [U] |
| |<----------------------------------|
|(21) STUN Res P3,P1 [U] | |
|<----------------| | |
The initiator, client A, binds to a local transport address P1, which The a=candidate attributes within a subsequent offer have
will be used as an associated local transport address. As such, it the same meaning they do in an initial offer. They are a request for
sends a STUN request to its STUN server (message 1). This passes the peer to attempt (or continue to attempt if the candidate was
through a NAT, and the NAT maps private address P1 to public address provided previously) a connectivity check using STUN from each of its
U (message 2). The STUN server mirrors this public address in the own candidates. As such, an a=candidate attribute is included in
MAPPED-ADDRESS of the STUN response (message 3), and it is forwarded subsequent offers when (1) connectivity checks haven't concluded yet
to the initiator (message 4). Now, client A has a STUN derived to that candidate, or (2) the checks have concluded, and the
transport address of U. It also binds to a second local transport candidate is currently active. In that case, STUN is used to keep
address, P2, which will be a usable local transport address. It the bindings active.
starts STUN servers on both local transport addresses P1 and P2. It
then generates an Initiate request to client B (message 5) which
contains both of the gathered transport addresses P2 and U, along
with username fragments and passwords.
Client B is not behind a NAT. It binds to a local transport address If an agent sends an offer which omits candidates it had sent to its
P3, and sends a STUN request to its STUN server (message 6). This is peer previously, it MUST cease connectivity checks from that
responded to by the STUN server (message 7). The client observes candidate. Any pairings that include the absent native candidate are
that this address is identical to its local transport address, and discarded. Any STUN transactions in progress from that candidate are
therefore that local transport address, which was targeted for an immediately terminated - no further retransmissions take place, and
associated local transport address, is promoted to a usable local no further transactions from that candidate will be made. If a TCP
transport address. It then sends an Accept message to client A, connection was opened to or from that candidate, and that connection
including this transport address and its username fragment and is not listed as the active one in the offer, the connection is torn
password (message 8). down.
Once the Accept message is sent, the client can perform its STUN The offer MAY contain a new active candidate in the m/c line. If the
connectivity checks. B has a single local transport address (P3), new active transport address is UDP, the candidate is encoded into an
which it matches up with A's two remote transport addresses (P2 and updated offer as described in Section 7.2. The transport addresses
U). B tries P2 (message 9). This request fails since P2 is a constituting the candidate SHOULD also be listed in a=candidate
private address. In parallel, B tries U (message 10). Since A's NAT attributes, so that STUN can be used as an ongoing keepalive.
is full cone, this packet is accepted and is passed to client A
(message 11). Client A generates a response (message 12) which is
forwarded to client B (message 13). The source transport address in
the STUN packet, P3, is already known to client A, and thus no new
candidates are learned. Client B learns that client A is reachable
at transport address U, but not P2. Thus, it can begin sending media
to U from local transport address P3.
Once the Accept message arrives at client A, it can begin its If the new active transport address is TCP, it is more complicated.
connectivity checks. It has two local transport addresses P1 and P2, Recall that each TCP connection is opened from one of the agents to
which it combines with client B's single transport address P3. It the other, such that, for each connection, one agent has the active
tries to send a STUN packet from P2 to P3 (message 14). Since the role, and the other, the passive. The ICE mechanisms allow the
NAT has not seen source address P2 yet, it maps it to a new public active agent to actually choose a specific connection for use in an
transport address W, and the STUN request is forwarded to client B offer, so long as the agent has used a different ephemeral port for
(message 15). Client B generates a STUN response (message 16), which each connection it initiated (which is almost always the case). If,
is forwarded back to client A (message 17). Based on this, client A however, an agent was in the passive role, it cannot choose a
learns that it can reach P3 from P2. Client B learns a new remote specific connection. Rather, it can choose a specific native
transport address, W. However, the priority of this address is the transport address which may have been used to receive multiple
same as P2, which is 0.4, and equal to the priority of address U, to connections. This asymmetric behavior brings with it some important
which client B has already connected. Thus, it does not bother to security properties, which are discussed in Section 12.
perform the check (such a check would have succeeded if it had been
done).
While the P2->P3 check is taking place, client A also sends a STUN If the agent was the active one and established the connection, it
request from P1 to P3 (message 18). This passes through the NAT, includes its apparent native transport address in the m/c line of the
which maps the source transport address to the same public address it SDP (recall that this address was discovered via the STUN exchange
allocated previously, U. This STUN request arrives at client B over the connection). Note that this is instead of the SHOULD-
(message 19). It generates a response (message 20), which is strength recommendation in comedia, which recommends that the port
forwarded to client A (message 21). Based on this check, client A number sent by the entity which initiated the connection should be
learns that P3 is also reachable from P1. Client B did not learn a '9'. The actual port number is present to facilitate identification
new candidate transport address, since U was already known. Now, of the connection. The a=setup attribute MUST be present and MUST
client A can send media to P3 from either P1 or P2. contain the value "active". The a=connection attribute MUST be
present and MUST have the value of "existing".
9. Mapping ICE into SIP If the agent was the passive one and was the recipient of the
connection, it includes its transport address in the m/c line of the
SDP. In this case, that address will be the same as the one it had
placed into the a=candidate line of the SDP. The a=setup attribute
MUST be present and MUST contain the value of "passive". The
a=connection attribute MUST be present and MUST have the value of
"existing".
In this section, we show how to map ICE into SIP. This mapping 7.6.2 Receiving the Offer and Sending an Answer
involves three parts. The first is the actual mapping of the ICE
message into SIP and SDP messages, which requires extensions to SDP
documented here. The second are security considerations specific to
SIP. The third is handling of updates in the offer/answer model.
9.1 Message Mapping If an agent receives an updated offer with a=candidate attributes, it
checks to see if it already knows about the listed candidates. This
is done by comparing the tid with the candidates it had received in
the previous offer or answer from the peer. If the tid is already
known, processing for that candidate continues as if no offer had
been made. Any connectivity checks in progress continue, and any
ongoing STUN keepalives continue.
A new SDP attribute is defined to support ICE. It is called If a candidate which had been listed previously is no longer present
"candidate". The candidate attribute MUST be present within a media in the offer, this tells the answerer to cease connectivity checks.
block of the SDP. It contains a candidate IP address and port (or Any pairings that include the absent remote candidate are discarded.
pair of IP addresses and ports in the case of RTP) that the recipient Any STUN transactions in progress to that candidate are immediately
of the SDP can use. There MAY be multiple candidate attributes in a terminated - no further retransmissions take place, and no further
media block. In that case, each of them MUST contain a different IP transactions to that candidate will be made. If a TCP connection was
address and port (or a differing pair of IP address and ports in the opened to or from that candidate, and that connection is not listed
case of RTP). as the active one in the offer, the connection is torn down.
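The handling of candidates that disappear from an updated offer can be
sketched as follows. The function drop_absent_candidates and the
representation of a pairing as a (native id, remote id) tuple are
assumptions made for illustration. Terminating in-progress STUN
transactions and tearing down TCP connections are left to the caller,
which can act on the returned set of removed identifiers.

```python
from typing import List, Set, Tuple

Pair = Tuple[str, str]   # (native candidate id, remote candidate id)


def drop_absent_candidates(pairings: List[Pair],
                           previous_remote: Set[str],
                           offered_remote: Set[str]
                           ) -> Tuple[List[Pair], Set[str]]:
    """Discard pairings whose remote candidate is absent from the new offer.

    Returns the surviving pairings and the set of removed remote ids.
    """
    removed = previous_remote - offered_remote
    kept = [pair for pair in pairings if pair[1] not in removed]
    return kept, removed
```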
The syntax of this attribute is: The agent then sends its answer. Like the offerer, it can add or
remove candidates from its answer. If it removed candidates from its
answer, it ceases STUN connectivity checks from those candidates, and
any pairings that include those candidates are discarded. Any STUN
transactions in progress to that candidate are immediately terminated
- no further retransmissions take place, and no further transactions
to that candidate will be made. If a TCP connection was opened to or
from that candidate, and that connection is not listed as the active
one in the answer, the connection is torn down.
candidate-attribute = "candidate" ":" id SP qvalue SP After transmission of the answer, there may be a set of candidates
rtp-user-frag SP rtp-password SP which were new in the offer, and a set that were new in the answer.
rtp-unicast-address SP rtp-port [SP rtcp-user-frag The agent begins connectivity checks as described in Section 7.4,
SP rtcp-password [SP rtcp-unicast-address SP pairing each new candidate in its answer with all candidates in the
rtcp-port]] offer, and each new candidate in the offer with all of its candidates
;qvalue from RFC 3261 in the answer.
rtp-port = port
rtcp-port = port
rtp-unicast-address = unicast-address
rtcp-unicast-address = unicast-address
;unicast-address, port from RFC 2327
rtp-user-frag = non-ws-string
rtp-password = non-ws-string
rtcp-user-frag = non-ws-string
rtcp-password = non-ws-string
id = token
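A parser for the candidate-attribute syntax can be sketched as
follows. This is a minimal illustration, not a complete
implementation: the CandidateAttr class and parse_candidate function
are hypothetical names, the optional leading "a=" is accepted as a
convenience, and the token, qvalue and unicast-address productions
imported from RFC 3261 and RFC 2327 are not validated beyond basic
type conversion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CandidateAttr:
    cand_id: str
    qvalue: float
    rtp_user_frag: str
    rtp_password: str
    rtp_address: str
    rtp_port: int
    rtcp_user_frag: Optional[str] = None
    rtcp_password: Optional[str] = None
    rtcp_address: Optional[str] = None
    rtcp_port: Optional[int] = None


def parse_candidate(line: str) -> CandidateAttr:
    """Parse 'candidate:...' with 6, 8 or 10 space-separated fields."""
    text = line.strip()
    if text.startswith("a="):
        text = text[2:]
    name, sep, value = text.partition(":")
    if name != "candidate" or not sep:
        raise ValueError("not a candidate attribute")
    fields = value.split(" ")
    if len(fields) not in (6, 8, 10) or "" in fields:
        raise ValueError("unexpected number of fields")
    cand = CandidateAttr(fields[0], float(fields[1]), fields[2],
                         fields[3], fields[4], int(fields[5]))
    if len(fields) >= 8:                 # optional RTCP STUN credentials
        cand.rtcp_user_frag, cand.rtcp_password = fields[6], fields[7]
    if len(fields) == 10:                # optional explicit RTCP address
        cand.rtcp_address, cand.rtcp_port = fields[8], int(fields[9])
    return cand
```

For example, the hypothetical line "candidate:L1 0.4 ufrag1A pass1A
10.0.1.1 8866" yields a candidate with no RTCP fields set.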
With the addition of the candidate attribute, the mapping of the ICE The m/c line may have also changed, indicating a new active
messages to SIP/SDP is straightforward. The ICE initiate message candidate. If the m/c line contains a UDP stream, the agent begins
corresponds to a SIP message with an SDP offer. The ICE accept sending media to the transport addresses listed there. In addition,
message corresponds to a SIP message with an SDP answer. it checks to see if those transport addresses correspond to a remote
candidate in a valid pairing. So long as the remote agent has
offered up a candidate that has been validated by ICE, it should be
the case. Indeed, there may be a multitude of valid pairings
containing the transport addresses in the m/c line as the remote
candidate. In that case, the agent MUST choose the pairing whose
native candidate has the highest priority. It MUST place this
candidate in the m/c line. Transmission of media occurs as defined
in Section 7.8.
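The answerer's choice of native candidate for a UDP stream can be
sketched as follows. The Pairing class and the function
select_native_for_answer are hypothetical names, and transport
addresses are simplified to strings. Only the rule itself (among valid
pairings whose remote candidate matches the m/c line, pick the one
whose native candidate has the highest priority) comes from the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Pairing:
    native: str              # native candidate, e.g. "10.0.1.1:8866"
    native_priority: float
    remote: str              # remote candidate transport address
    valid: bool


def select_native_for_answer(pairings: List[Pairing],
                             remote_mc: str) -> Optional[str]:
    """Return the native candidate to place in the answer's m/c line.

    Returns None when no valid pairing matches the offered address.
    """
    matches = [p for p in pairings if p.valid and p.remote == remote_mc]
    if not matches:
        return None
    return max(matches, key=lambda p: p.native_priority).native
```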
Each media stream element in an ICE message maps to either one or two If the m/c line has changed, and now indicates a new TCP candidate,
media blocks in the SDP. If the ICE message has only an IPv4 default the agent examines it. The comedia "a=connection" attribute will
address or an IPv6 default address, but not both, one media block is normally be present and normally contain the value of "existing". If
used. If both defaults are present, two media blocks are used. Each not present, or if present but with a value of "new", comedia process
default address maps to the m and c lines in the SDP media block. In is followed, as apparently the peer has abandoned ICE operation for
particular, the <ip-address> from the <rtp-address> element maps into this media stream. Assuming it contains a value of "existing", the
the SDP c line. The <port> from the <rtp-address> maps into the port agent looks at whether the a=setup attribute is present. If its
in the SDP m line. If the ICE message indicates a default RTCP value is "active", it means that a connection that was initiated by
address whose IP address is not identical to the default RTP address, the remote agent is to be used. The agent examines the transport
and whose port is not one higher than that of the RTP, the SDP RTCP address in the m/c line. It looks for a matching value in the
attribute [2] MUST be used to convey the RTCP transport address. apparent remote transport addresses of existing connections. If it
matches multiple connections (though it should normally match just
one), one of those connections is chosen. The native transport
address of that connection is then placed into the m/c line of the
answer. If no existing connections were matched, an error has
occurred. The agent SHOULD respond with "holdconn", and then generate
its own offer with a connection to the peer which it believes is
valid.
Each <candidate> element in an ICE message maps to a candidate If the a=setup attribute had a value of "passive", it means that a
attribute in the SDP. If the IP version of the <candidate> is IPv4, connection that was initiated by the agent itself is to be used. The
it MUST be mapped into the media block containing the default IPv4 agent examines the transport address in the m/c line. It looks for a
address. If the IP version of the <candidate> is IPv6, it MUST be matching value amongst the remote transport addresses in valid
mapped into the media block containing the default IPv6 address. pairings. If multiple pairings match, it MUST choose the one whose
Mapping of each individual candidate is simple. The native transport address has the highest priority. The apparent
<username-fragment> element of the <rtp-stun-info> element maps to native transport address associated with an active connection
the rtp-user-frag component of the candidate attribute. The initiated by the agent is then placed into the m/c line, and that TCP
<password> element of the <rtp-stun-info> element maps to the connection is used to send and receive media. If no pairings match,
rtp-password component of the candidate attribute. The <rtp-address> an error has occurred. The agent SHOULD respond with "holdconn", and
element maps to the first unicast-address and port components of the then generate its own offer with a connection to the peer which it
candidate attribute. believes is valid.
If the <rtcp-stun-info> element is present, it means that RTCP is in 7.6.3 Receiving the Answer
use. The rtcp-user-frag and rtcp-password components of the
candidate attribute MUST be present, and MUST be set to the
<username-fragment> and <password> elements of the <rtcp-stun-info>
element, respectively. If the <rtcp-address> element is also
present, its IP address and port information is copied into the
rtcp-unicast-address and rtcp-port components of the candidate
attribute.
The preference attribute from the <candidate> element is mapped to If an agent receives an answer with a=candidate attributes, it checks
the q-value component of the candidate attribute. The id attribute to see if it already knows about the listed candidates. This is done
from the <candidate> element is mapped into the id component of the by comparing the tid with the candidates it had received in the
candidate attribute. previous offer or answer from the peer. If the tid is already known,
processing for that candidate continues as if no offer had been made.
Any connectivity checks in progress continue, and any ongoing STUN
keepalives continue.
If the mapping process produced both an IPv6 media block (that is, a If a candidate which had been listed previously is no longer present
media block with an IPv6 address in the c line, and with all IPv6 in the answer, this tells the offerer to cease connectivity checks.
addresses in the candidate attributes within that block) and an IPv4 Any pairings that include the absent remote candidate are discarded.
media block, these two blocks MUST be grouped using the ANAT grouping Any STUN transactions in progress to that candidate are immediately
[7]. terminated - no further retransmissions take place, and no further
transactions to that candidate will be made. If a TCP connection was
opened to or from that candidate, and that connection is not listed
as the active one in the answer, the connection is torn down.
9.2 SIP and SDP Specific Security Considerations Furthermore, there may be a set of candidates which were new in the
offer, and a set that were new in the answer. The agent begins
connectivity checks as described in Section 7.4, pairing each new
candidate in its offer with all candidates in the answer, and each
new candidate in the answer with all of its candidates in the offer.
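The pairing rule above can be sketched as follows. This is only an
illustrative sketch: the function name and the use of plain tid
strings to stand for candidates are assumptions made here, not part of
this specification.

```python
from itertools import product

def new_pairings(offer_all, offer_new, answer_all, answer_new):
    """Return the (native, remote) pairs that need fresh connectivity checks.

    offer_all / offer_new   -- this agent's candidates, and the subset that is new
    answer_all / answer_new -- the peer's candidates, and the subset that is new
    Candidates are represented by tid strings (an assumption of this sketch).
    """
    # Each new candidate in the offer is paired with all candidates in the answer.
    pairs = set(product(offer_new, answer_all))
    # Each new candidate in the answer is paired with all candidates in the offer.
    pairs |= set(product(offer_all, answer_new))
    # Pairs made only of previously known candidates are not included, since
    # their checks continue undisturbed.
    return sorted(pairs)
```

Using a set means that a pair formed from a new native candidate and a
new remote candidate is checked only once.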
The SDP messages described here contain usernames and passwords. If The m/c line may have also changed, indicating a new active
those passwords are transmitted in the clear, it introduces candidate. If the m/c line contains a UDP stream, the agent begins
significant security vulnerabilities, discussed in detail below. In sending media to the transport addresses listed there as defined in
summary, those vulnerabilities would allow an eavesdropper that can Section 7.8. It will send from the m/c line it had signaled in the
inject packets, to "steal" the media streams for a call unless secure offer.
media transport (such as SRTP) is used. Even if SRTP is used, an
attacker could disrupt a call and prevent media from flowing. These
attacks, fortunately, can be obviated by providing secure transport
of the SDP. SIP-based implementations of ICE SHOULD use the sips URI
scheme when transporting SDP with ICE information, and MAY use S/MIME
[3].
9.3 Updates in the Offer/Answer Model If the m/c line has changed, and now indicates a new TCP candidate,
the agent examines it. If the agent had, in its offer, indicated the
desire to use a specific connection that it had initiated, it would
have used the a=connection attribute with the value of "existing",
and the a=setup attribute with the value of "active", and have placed
its apparent native transport address in the m/c line. In that case,
the m/c line in the answer will normally have the a=connection
attribute with the value "existing", which means that the remote
agent agrees with the usage of that connection. The transport
addresses in the m/c line should correspond to the remote transport
addresses that the agent had initiated its connection to. If so,
that connection is used.
ICE itself only considers an initial exchange of messages. However, If the agent had, in its offer, indicated the desire to use any
the offer/answer model [4] allows for the session to be modified with connection that had been established to a specific native transport
subsequent exchanges. How is an updated offer with SDP alternate address, it would have, in its offer, used the a=connection attribute
attributes to be treated? with the value of "existing" and the a=setup attribute with the value
of "passive", and placed that address in the m/c line. In that case,
the m/c line in the answer will normally have the a=connection
attribute with the value of "existing" and the a=setup attribute with
the value of "active". The transport address in the m/c line will
correspond to the apparent remote transport address. The agent MUST
scan its existing connections to the native transport address it had
advertised in the offer, and find the one whose apparent remote
transport address matches the m/c line in the answer. If there is a
match, that connection is used for sending media. If there is no
match, an error has occurred.
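The scan described above might look like the following sketch. The
Connection type, its field names, and the function name are
hypothetical and introduced only for illustration.

```python
from typing import NamedTuple, Optional, Sequence, Tuple

Addr = Tuple[str, int]  # (IP address, port)

class Connection(NamedTuple):
    native: Addr           # native transport address advertised in the offer
    apparent_remote: Addr  # source address observed for the peer on this connection

def select_connection(connections: Sequence[Connection],
                      offered_native: Addr,
                      answer_mc: Addr) -> Optional[Connection]:
    """Find the existing connection that the answer's m/c line refers to."""
    for conn in connections:
        if conn.native == offered_native and conn.apparent_remote == answer_mc:
            return conn  # this connection is used for sending media
    return None  # no match: the text above treats this as an error
```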
If a user agent receives an updated offer with candidate attributes, 7.7 Binding Keepalives
it checks to see if it already knows about those candidates. This is
done by comparing the transport address and username fragment with
existing values. If the combination is already known, no additional
action is taken. In particular, if STUN connectivity checks had
already been made, no new ones are performed. However, if a
candidate contains a new transport address or new username fragment,
it is treated as a totally new candidate, and STUN connectivity
checks are performed per Section 5.3.4. If a candidate formerly sent
by the peer no longer appears, that candidate is considered BAD, and
if it was in use previously, it ceases being used, and the next
highest priority connection in the GOOD state is used.
The inclusion of the username fragment in the determination of Once the candidates are promoted to active, and media begins flowing,
whether a candidate is known provides a hook that allows a peer to it is still necessary to keep the bindings alive at intermediate NATs
request a new set of connectivity checks on an existing transport for the duration of the session. Normally, the RTP packets
address. It can update the username fragment and generate an updated themselves meet this objective. However, several cases merit further
offer, without changing the transport address. discussion. Firstly, in some RTP usages, such as SIP, the media
streams can be "put on hold". This is accomplished by using the SDP
"sendonly" or "inactive" attributes, as defined in RFC 3264 [4]. RFC
3264 directs implementations to cease transmission of media in these
cases. However, doing so may cause NAT bindings to time out, and
media won't be able to come off hold.
10. Security Considerations Secondly, some RTP payload formats, such as the payload format for
text conversation [28], may send packets so infrequently that the
interval exceeds the NAT binding timeouts.
STUN itself introduces many security considerations. In particular, Thirdly, if silence suppression is in use, long periods of silence
there are attacks whereby an eavesdropper replays STUN packets with a may cause media transmission to cease sufficiently long for NAT
modified source address. These modified packets can cause service bindings to time out.
disruptions and denial-of-service attacks, which are only partially
mitigated by the heuristics described in STUN [1].
Interestingly, when STUN is used within ICE, these security To prevent these problems, ICE implementations MUST continue to list
weaknesses are mitigated completely, without the need for the their active transport addresses as candidates in a=candidate lines.
heuristics defined in RFC 3489. As a consequence of this, STUN packets will be transmitted
periodically independently of the transmission (or lack thereof) of
media packets. This provides a media independent, RTP independent,
and codec independent solution for keeping the NAT bindings alive.
If an ICE implementation is communicating with one that does not
support ICE, keepalives MUST still be sent. In that case, it is
RECOMMENDED that an agent support the RTP No-Op payload format [15],
and send it at least once every 20 seconds if media is not otherwise
being sent. This No-Op MUST be sent even if the media stream is
inactive or recvonly.
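As a rough illustration of the keepalive rules in this section, the
decision could be sketched as below. The function name, its
parameters, and the returned labels are assumptions of this sketch.
Only the 20 second bound comes from the text.

```python
NO_OP_INTERVAL = 20.0  # seconds; No-Op is sent "at least once every 20 seconds"

def keepalive_action(peer_supports_ice: bool, now: float,
                     last_media_sent: float) -> str:
    """Decide which keepalive, if any, to send at time `now` (in seconds).

    The stream direction (sendonly, inactive, recvonly) is deliberately not
    a parameter: keepalives are required regardless of it.
    """
    if peer_supports_ice:
        # STUN packets to the listed candidates are sent periodically,
        # independently of media, and keep the bindings alive.
        return "stun"
    if now - last_media_sent >= NO_OP_INTERVAL:
        # Peer does not support ICE and no media was sent recently.
        return "rtp-no-op"
    return "none"
```

A deployed implementation would probably fire somewhat before the full
20 seconds have elapsed, to leave a margin for timer jitter.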
7.8 Sending Media
When an agent sends media packets, it MUST send them from the same IP
address and port it has advertised in the m/c-line. This provides a
property known as symmetry, which is an essential facet of NAT
traversal.
In the case of a STUN-derived transport address, this means that the
RTP packets are sent from the local transport address used to obtain
the STUN address. In the case of a TURN-derived transport address,
this means that media packets are sent through the TURN server (using
the TURN SEND primitive). For local transport addresses, media is
sent from that local transport address.
This symmetric behavior MUST be followed by an agent even if its peer
in the session doesn't support ICE.
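The symmetric sending rule can be summarized with a small sketch. The
candidate type labels ("local", "stun", "turn"), the function name,
and the return values are hypothetical and chosen only for this
example.

```python
from typing import Optional, Tuple

Addr = Tuple[str, int]  # (IP address, port)

def media_send_path(candidate_type: str, local_addr: Addr,
                    turn_server: Optional[Addr] = None) -> Tuple[str, Addr]:
    """Return how media is sent for the candidate advertised in the m/c line."""
    if candidate_type in ("local", "stun"):
        # Local and STUN-derived addresses: send from the local transport
        # address itself (for STUN, the one used to obtain the derived address).
        return ("direct", local_addr)
    if candidate_type == "turn":
        if turn_server is None:
            raise ValueError("a TURN-derived candidate needs its TURN server")
        # TURN-derived address: media is sent through the TURN server.
        return ("via-turn", turn_server)
    raise ValueError("unknown candidate type: " + candidate_type)
```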
8. Interactions with Forking
SIP allows INVITE requests carrying offers to fork, which means that
they are delivered to multiple user agents. Each of those user
agents then provides an answer to the offer in the INVITE. The
result is that a single offer generated by the UAC produces multiple
answers.
ICE interacts very well with forking. Indeed, ICE fixes some of the
problems associated with forking. Once the offer/answer exchange has
completed, the UAC will have an answer from each UAS that received
the INVITE. The ICE connectivity checks that ensue will carry tids
that correlate each of those checks (and thus their corresponding
source IP address and port or TCP connection) with a specific remote
user agent. As these checks happen before any media is transmitted,
ICE allows a UAC to disambiguate subsequent media traffic, and
correlate that traffic with a particular remote UA. When SIP is used
without ICE, the incoming media traffic cannot be disambiguated
without an additional offer/answer exchange.
9. Interactions with Preconditions
Because ICE involves multiple addresses and pre-session activities,
its interactions with preconditions [10] merit further discussion.
Quality of Service (QoS) preconditions, which are defined in RFC
3312, apply only to the IP addresses and ports listed in the m/c
lines in an offer/answer. If ICE changes the address and port where
media is received, this change is reflected in the m/c lines of a new
offer/answer. As such, it appears like any other re-INVITE would,
and is fully treated in RFC 3312, which applies without regard to the
fact that the m/c lines are changing due to ICE negotiations occurring
"in the background".
ICE also has (purposeful) interactions with connectivity
preconditions [12]. As described there, the precondition is
satisfied once ICE has verified that there exists a valid path of
connectivity for each media stream to which the precondition applies.
More specifically, it is satisfied when there is at least one valid
UDP transport address pairing or TCP connection for such a media
stream. Furthermore, when a subsequent offer is made to promote one
of those valid transport address pairings or connections into the
m/c-line, the precondition is marked as met in that same offer/
answer exchange.
10. Example
In the example that follows, messages are labeled with "message name
A,B" to mean a message from transport address A to B. For STUN
Requests, this is followed by curly brackets enclosing the username
(which is also the password). For STUN answers, this is followed by
square brackets containing the value of MAPPED ADDRESS. The example
shows a flow of two agents where one is behind a full cone NAT, and
the other is behind a symmetric NAT.
TODO: Fill in. This is a big complicated flow!
11. Grammar
This specification defines a new SDP attribute. It is called
"candidate". The candidate attribute MUST be present within a media
block of the SDP. It contains a transport address for a candidate
that can be used for connectivity checks. There MAY be multiple
candidate attributes in a media block.
The syntax of this attribute is:
candidate-attribute   = "candidate" ":" candidate-id SP tid SP
                        transport SP
                        qvalue SP      ;qvalue from RFC 3261
                        addr SP
                        port SP
                        ;addr, port from RFC 2327
transport             = "UDP" / "TCP" / transport-extension
transport-extension   = token
candidate-id          = 1*DIGIT
tid                   = non-ws-string
The candidate-id is used to group together the transport addresses
for a particular candidate. It MUST be a positive integer whose
value is less than (2^31 -1). It MUST have the same value for all
transport addresses within the same candidate. It MUST have a
different value for transport addresses within different candidates
for the same media stream. The tid production contains an
identifier, chosen with 128 bits of randomness, that identifies the
transport address. The tids of a pair of transport addresses are
combined to form the username and password of a STUN request from one
transport address to another. The transport production indicates the
transport protocol for the candidate. This can be either UDP or TCP.
Extensibility is provided to allow for future transport protocols to
be used with ICE, such as the Datagram Congestion Control Protocol
(DCCP) [26]. The unicast-address production is from RFC 2327, and
contains the IPv4 or IPv6 address of the candidate. The port
production contains its port.
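A minimal parsing sketch for this attribute follows. It takes the
attribute as shown in the grammar (without any leading "a="), treats
the tid as any string free of whitespace, and tolerates the trailing
SP. The class and function names, the port range check, and the
lenient whitespace splitting are assumptions of this sketch rather
than requirements of this specification.

```python
import re
from typing import NamedTuple

class Candidate(NamedTuple):
    candidate_id: int
    tid: str
    transport: str
    qvalue: float
    addr: str
    port: int

# qvalue as defined in RFC 3261: "0" or "1", with up to three decimals.
_QVALUE = re.compile(r"(0(\.\d{0,3})?|1(\.0{0,3})?)")

def parse_candidate(line: str) -> Candidate:
    """Parse 'candidate:<id> <tid> <transport> <qvalue> <addr> <port>'."""
    prefix = "candidate:"
    if not line.startswith(prefix):
        raise ValueError("not a candidate attribute")
    fields = line[len(prefix):].split()  # lenient: also accepts a trailing SP
    if len(fields) != 6:
        raise ValueError("expected 6 fields, got %d" % len(fields))
    cid, tid, transport, q, addr, port = fields
    if not (cid.isascii() and cid.isdigit()):
        raise ValueError("candidate-id must be 1*DIGIT")
    # Positive integer whose value is less than (2^31 - 1).
    if not 0 < int(cid) < 2**31 - 1:
        raise ValueError("candidate-id out of range")
    if not _QVALUE.fullmatch(q):
        raise ValueError("bad qvalue")
    # Port range check is an assumption; the grammar only references RFC 2327.
    if not (port.isascii() and port.isdigit() and 0 < int(port) <= 65535):
        raise ValueError("bad port")
    # Any other token is accepted as a transport-extension.
    return Candidate(int(cid), tid, transport, float(q), addr, int(port))
```

The address field is kept as an opaque string here. A fuller
implementation would validate it as an IPv4 or IPv6 address.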
12. Security Considerations
There are numerous threats in a system using ICE. This section
overviews these threats and discusses how they are mitigated.
STUN itself introduces many security considerations, which receive an
extensive treatment in RFC 3489. STUN is used within ICE in two ways
- one, as a technique for address gathering, and two, as a peer-to-
peer connectivity check. All of the security considerations of RFC
3489 apply directly to the former usage. However, the latter usage,
as a peer-to-peer connectivity check, is sufficiently different that
a discussion of its security considerations is appropriate.
It remains the case that many attacks are rooted in a single
primitive - an attacker attempts to inject a STUN response with an
invalid MAPPED-ADDRESS attribute. In the usages of STUN described in
RFC 3489, this injection can occur as a result of compromises of STUN
servers, attacks on the DNS, rogue NATs, injection of faked responses
coupled with a DoS attack, and replaying modified requests. With
peer-to-peer STUN, compromises of STUN servers are not much of a
concern, since the STUN servers are embedded in endpoints and
distributed throughout the network. Thus, compromising the STUN
server is equivalent to compromising the endpoint, and if that
happens, far more problematic attacks are possible than those against
ICE. Similarly, DNS attacks are irrelevant since STUN servers are
not discovered via DNS; they are signaled via SIP. Rogue NATs,
injection of fake responses and replaying modified requests all can be
handled in ICE with the countermeasures discussed below.
Consider an attacker that intercepts a STUN packet used for Consider an attacker that intercepts a STUN packet used for
connectivity checks, and replays it using a faked source address. If connectivity checks, and replays it using its own source address. If
successful, this would fool an endpoint into thinking that this faked successful, this would fool an endpoint into thinking that this faked
source address was a valid destination for media (recall that the source address was a valid destination for media (recall that the
source transport address of received STUN packets is used as a source transport address of received STUN packets is used as a
potential candidate address). However, the recipient of the replayed potential candidate address). However, the recipient of the replayed
packet will not just send media to that candidate. It will verify it packet will not just send media to that candidate. It will verify it
with a STUN connectivity check. This check will be sent to that with a STUN connectivity check. This check will be sent to that
faked source address, and if there is no response, the address will faked source address, and if there is no answer, the address will not
not be used. The attacker cannot answer the STUN request without be used. The attacker cannot answer the STUN request without access
access to the username and password, which are exchanged as part of to the username and password, which are exchanged as part of the
the signaling. Thus, if the signaling is protected as recommended signaling. Thus, if the signaling is protected as recommended above,
above, the attacker cannot obtain the username or password. the attacker cannot obtain the username or password.
If an attacker instead intercepts and replays STUN packets used for If an attacker instead intercepts and replays STUN packets used for
the purposes of unilateral allocation, a similar result occurs. The the purposes of unilateral allocation, a similar result occurs. The
target of the attack will be fooled into thinking it has a STUN target of the attack will be fooled into thinking it has a STUN
derived transport address that it does not. Its peer will perform a derived transport address that it does not. Its peer will perform a
connectivity check to this address, which will fail. The attacker connectivity check to this address, which will fail. The attacker
cannot force this check to succeed without access to the username and cannot force this check to succeed without access to the username and
password, which are protected. Thus, this address will not be used. password, which are protected. Thus, this address will not be used.
In the worst case, an attacker can generate enough traffic so that In the worst case, an attacker can generate enough traffic so that
none of the valid STUN checks or unilateral allocations succeed. none of the valid STUN checks or unilateral allocations succeed.
This would result in a service disruption. However, this attack is This would result in a service disruption. However, this attack is
no worse than any pure packet flood disruption attack launched no worse than any pure packet flood disruption attack launched
against any other protocol. These attacks cannot be prevented by any against any other protocol. These attacks cannot be prevented by any
protocol means. protocol means.
If an attacker could intercept and modify the contents of the If an attacker could intercept and modify the contents of the Offer
Initiate or Accept messages, they could disrupt the session, divert or Accept messages, they could disrupt the session, divert the media,
the media, and otherwise take control over the session. This attack and otherwise take control over the session. This attack is
is prevented by encryption, authentication and message integrity of prevented by encryption, authentication and message integrity of the
the signaling channel used for ICE. signaling channel used for ICE.
11. IANA Considerations SIP-based implementations of ICE SHOULD use the sips URI scheme when
transporting SDP with ICE information, and MAY use S/MIME [3].
11.1 SDP Attribute Name 13. IANA Considerations
This specification defines one new SDP attribute per the procedures This specification defines one new SDP attribute per the procedures
of Appendix B of RFC 2327. The required information for the of Appendix B of RFC 2327. The required information for the
registration is: registration is:
Contact Name: Jonathan Rosenberg, jdrosen@jdrosen.net. Contact Name: Jonathan Rosenberg, jdrosen@jdrosen.net.
Attribute Name: candidate Attribute Name: candidate
Long Form: candidate Long Form: candidate
skipping to change at page 39, line 22 skipping to change at page 42, line 28
Charset Considerations: The attribute is not subject to the charset Charset Considerations: The attribute is not subject to the charset
attribute. attribute.
Purpose: This attribute is used with Interactive Connectivity Purpose: This attribute is used with Interactive Connectivity
Establishment (ICE), and provides one of many possible candidate Establishment (ICE), and provides one of many possible candidate
addresses for communication. These addresses are validated with addresses for communication. These addresses are validated with
an end-to-end connectivity check using Simple Traversal of UDP an end-to-end connectivity check using Simple Traversal of UDP
with NAT (STUN). with NAT (STUN).
Appropriate Values: See Section 9 of RFC XXXX [Note to RFC-ed: please Appropriate Values: See Section 11 of RFC XXXX [Note to RFC-ed:
replace XXXX with the RFC number of this specification]. please replace XXXX with the RFC number of this specification].
11.2 URN Sub-Namespace Registration
This section registers a new XML namespace, per the guidelines in [6]
URI: The URI for this namespace is urn:ietf:params:xml:ns:ice.
Registrant Contact: IETF, MMUSIC working group, (mmusic@ietf.org),
Jonathan Rosenberg (jdrosen@jdrosen.net).
XML:
BEGIN
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN"
"http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="content-type"
content="text/html;charset=iso-8859-1"/>
<title>ICE Namespace</title>
</head>
<body>
<h1>Namespace for ICE Documents</h1>
<h2>urn:ietf:params:xml:ns:ice</h2>
<p>See <a href="[URL of published
RFC]">RFCXXXX</a>. [Note to RFC-ed: please replace XXXX with the RFC
number of this specification.]</p>
</body>
</html>
END
11.3 XML Schema Registration
This section registers an XML schema per the procedures in [6].
URI: urn:ietf:params:xml:schema:ice
Registrant Contact: IETF, MMUSIC working group, (mmusic@ietf.org),
Jonathan Rosenberg (jdrosen@jdrosen.net).
The XML for this schema can be found as the sole content of
Section 7.
12. IAB Considerations 14. IAB Considerations
The IAB has studied the problem of "Unilateral Self Address Fixing", The IAB has studied the problem of "Unilateral Self Address Fixing",
which is the general process by which a client attempts to determine which is the general process by which an agent attempts to determine
its address in another realm on the other side of a NAT through a its address in another realm on the other side of a NAT through a
collaborative protocol reflection mechanism [14]. ICE is an example collaborative protocol reflection mechanism [21]. ICE is an example
of a protocol that performs this type of function. Interestingly, of a protocol that performs this type of function. Interestingly,
the process for ICE is not unilateral, but bilateral, and the the process for ICE is not unilateral, but bilateral, and the
difference has a significant impact on the issues raised by IAB. The difference has a significant impact on the issues raised by IAB. The
IAB has mandated that any protocols developed for this purpose IAB has mandated that any protocols developed for this purpose
document a specific set of considerations. This section meets those document a specific set of considerations. This section meets those
requirements. requirements.
12.1 Problem Definition 14.1 Problem Definition
From RFC 3424 any UNSAF proposal must provide: From RFC 3424 any UNSAF proposal must provide:
Precise definition of a specific, limited-scope problem that is to Precise definition of a specific, limited-scope problem that is to
be solved with the UNSAF proposal. A short term fix should not be be solved with the UNSAF proposal. A short term fix should not be
generalized to solve other problems; this is why "short term generalized to solve other problems; this is why "short term
fixes usually aren't". fixes usually aren't".
The specific problems being solved by ICE are: The specific problems being solved by ICE are:
Provide a means for two peers to determine the set of transport Provide a means for two peers to determine the set of transport
addresses which can be used for communication. addresses which can be used for communication.
Provide a means for resolving many of the limitations of other Provide a means for resolving many of the limitations of other
UNSAF mechanisms by wrapping them in an additional layer of UNSAF mechanisms by wrapping them in an additional layer of
processing (the ICE methodology). processing (the ICE methodology).
Provide a means for a client to determine an address that is Provide a means for an agent to determine an address that is
reachable by another peer with which it wishes to communicate. reachable by another peer with which it wishes to communicate.
12.2 Exit Strategy 14.2 Exit Strategy
From RFC 3424, any UNSAF proposal must provide: From RFC 3424, any UNSAF proposal must provide:
Description of an exit strategy/transition plan. The better short Description of an exit strategy/transition plan. The better short
term fixes are the ones that will naturally see less and less use term fixes are the ones that will naturally see less and less use
as the appropriate technology is deployed. as the appropriate technology is deployed.
ICE itself doesn't easily get phased out. However, it is useful even ICE itself doesn't easily get phased out. However, it is useful even
in a globally connected Internet, to serve as a means for detecting in a globally connected Internet, to serve as a means for detecting
whether a router failure has temporarily disrupted connectivity, for whether a router failure has temporarily disrupted connectivity, for
skipping to change at page 42, line 5 skipping to change at page 43, line 43
other UNSAF mechanisms simply never get used, because higher priority other UNSAF mechanisms simply never get used, because higher priority
connectivity exists. Therefore, the servers get used less and less, connectivity exists. Therefore, the servers get used less and less,
and can eventually be removed when their usage goes to zero. and can eventually be removed when their usage goes to zero.
Indeed, ICE can assist in the transition from IPv4 to IPv6. It can Indeed, ICE can assist in the transition from IPv4 to IPv6. It can
be used to determine whether to use IPv6 or IPv4 when two dual-stack be used to determine whether to use IPv6 or IPv4 when two dual-stack
hosts communicate with SIP (IPv6 gets used). It can also allow a hosts communicate with SIP (IPv6 gets used). It can also allow a
network with both 6to4 and native v6 connectivity to determine which network with both 6to4 and native v6 connectivity to determine which
address to use when communicating with a peer. address to use when communicating with a peer.
12.3 Brittleness Introduced by ICE 14.3 Brittleness Introduced by ICE
From RFC3424, any UNSAF proposal must provide: From RFC3424, any UNSAF proposal must provide:
Discussion of specific issues that may render systems more Discussion of specific issues that may render systems more
"brittle". For example, approaches that involve using data at "brittle". For example, approaches that involve using data at
multiple network layers create more dependencies, increase multiple network layers create more dependencies, increase
debugging challenges, and make it harder to transition. debugging challenges, and make it harder to transition.
ICE actually removes brittleness from existing UNSAF mechanisms. In ICE actually removes brittleness from existing UNSAF mechanisms. In
particular, traditional STUN (the usage described in RFC 3489) has particular, traditional STUN (the usage described in RFC 3489) has
several points of brittleness. One of them is the discovery process several points of brittleness. One of them is the discovery process
which requires a client to try and classify the type of NAT it is which requires an agent to try and classify the type of NAT it is
behind. This process is error-prone. With ICE, that discovery behind. This process is error-prone. With ICE, that discovery
process is simply not used. Rather than unilaterally assessing the process is simply not used. Rather than unilaterally assessing the
validity of the address, its validity is dynamically determined by validity of the address, its validity is dynamically determined by
measuring connectivity to a peer. The process of determining measuring connectivity to a peer. The process of determining
connectivity is very robust. The only potential problem is that connectivity is very robust. The only potential problem is that
bilaterally fixed addresses through STUN can expire if traffic does bilaterally fixed addresses through STUN can expire if traffic does
not keep them alive. However, that is substantially less brittleness not keep them alive. However, that is substantially less brittleness
than the STUN discovery mechanisms. than the STUN discovery mechanisms.
Another point of brittleness in STUN, TURN, and any other unilateral Another point of brittleness in STUN, TURN, and any other unilateral
mechanism is its absolute reliance on an additional server. ICE mechanism is its absolute reliance on an additional server. ICE
makes use of a server for allocating unilateral addresses, but allows makes use of a server for allocating unilateral addresses, but allows
clients to directly connect if possible. Therefore, in some cases, agents to directly connect if possible. Therefore, in some cases,
the failure of a STUN or TURN server would still allow for a call to the failure of a STUN or TURN server would still allow for a call to
progress when ICE is used. progress when ICE is used.
Another point of brittleness in traditional STUN is that it assumes Another point of brittleness in traditional STUN is that it assumes
that the STUN server is on the public Internet. Interestingly, with that the STUN server is on the public Internet. Interestingly, with
ICE, that is not necessary. There can be a multitude of STUN servers ICE, that is not necessary. There can be a multitude of STUN servers
in a variety of address realms. ICE will discover the one that has in a variety of address realms. ICE will discover the one that has
provided a usable address. provided a usable address.
The most troubling point of brittleness in traditional STUN is that The most troubling point of brittleness in traditional STUN is that
it doesn't work in all network topologies. In cases where there is a it doesn't work in all network topologies. In cases where there is a
shared NAT between each client and the STUN server, traditional STUN shared NAT between each agent and the STUN server, traditional STUN
may not work. With ICE, that restriction can be lifted. may not work. With ICE, that restriction can be lifted.
Traditional STUN also introduces some security considerations. Traditional STUN also introduces some security considerations.
Fortunately, those security considerations are also mitigated by ICE. Fortunately, those security considerations are also mitigated by ICE.
12.4 Requirements for a Long Term Solution 14.4 Requirements for a Long Term Solution
From RFC 3424, any UNSAF proposal must provide: From RFC 3424, any UNSAF proposal must provide:
Identify requirements for longer term, sound technical solutions Identify requirements for longer term, sound technical solutions
-- contribute to the process of finding the right longer term -- contribute to the process of finding the right longer term
solution. solution.
Our conclusions from STUN remain unchanged. However, we feel ICE Our conclusions from STUN remain unchanged. However, we feel ICE
actually helps because we believe it can be part of the long term actually helps because we believe it can be part of the long term
solution. solution.
12.5 Issues with Existing NAPT Boxes 14.5 Issues with Existing NAPT Boxes
From RFC 3424, any UNSAF proposal must provide: From RFC 3424, any UNSAF proposal must provide:
Discussion of the impact of the noted practical issues with Discussion of the impact of the noted practical issues with
existing, deployed NA[P]Ts and experience reports. existing, deployed NA[P]Ts and experience reports.
A number of NAT boxes are now being deployed into the market which A number of NAT boxes are now being deployed into the market which
try to provide "generic" ALG functionality. These generic ALGs hunt try to provide "generic" ALG functionality. These generic ALGs hunt
for IP addresses, either in text or binary form within a packet, and for IP addresses, either in text or binary form within a packet, and
rewrite them if they match a binding. This will interfere with rewrite them if they match a binding. This will interfere with
proper operation of any UNSAF mechanism, including ICE. proper operation of any UNSAF mechanism, including ICE.
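The rewriting behavior described above can be illustrated with a minimal sketch. It is not taken from the draft and does not model any particular NAT product; the function name, the bindings table, and the example addresses are assumptions made for illustration. It shows how a naive generic ALG that replaces both the dotted-text and the 4-byte binary form of a bound address would silently alter an address that an agent deliberately placed in its payload.

```python
import ipaddress


def generic_alg_rewrite(payload: bytes, bindings: dict[str, str]) -> bytes:
    """Hypothetical generic ALG: replace each bound internal IPv4 address,
    in dotted-text or 4-byte binary form, with its external address."""
    for internal, external in bindings.items():
        # Text form, such as an address carried in an SDP c= line.
        payload = payload.replace(
            internal.encode("ascii"), external.encode("ascii")
        )
        # Binary form, such as a 4-byte address field in a binary protocol.
        payload = payload.replace(
            ipaddress.IPv4Address(internal).packed,
            ipaddress.IPv4Address(external).packed,
        )
    return payload


# Assumed binding: internal host 10.0.0.1 is mapped to 192.0.2.1.
bindings = {"10.0.0.1": "192.0.2.1"}

# A payload carrying the internal address once as text and once as raw bytes.
sent = b"c=IN IP4 10.0.0.1\r\n" + bytes([10, 0, 0, 1])
received = generic_alg_rewrite(sent, bindings)
```

In this sketch the receiver no longer sees the address the sender wrote, which is the kind of interference the paragraph above warns about. The matching is deliberately naive (a plain substring search), so it would also corrupt unrelated bytes that happen to match.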
13. Acknowledgements 15. Acknowledgements
The authors would like to thank Douglas Otis, Francois Audet and The authors would like to thank Douglas Otis, Francois Audet and
Magnus Westerlund for their comments and input. Magnus Westerlund for their comments and input.
14. References 16. References
14.1 Normative References 16.1 Normative References
[1] Rosenberg, J., Weinberger, J., Huitema, C. and R. Mahy, "STUN - [1] Rosenberg, J., Weinberger, J., Huitema, C., and R. Mahy, "STUN
Simple Traversal of User Datagram Protocol (UDP) Through Network - Simple Traversal of User Datagram Protocol (UDP) Through
Address Translators (NATs)", RFC 3489, March 2003. Network Address Translators (NATs)", RFC 3489, March 2003.
[2] Huitema, C., "Real Time Control Protocol (RTCP) attribute in [2] Huitema, C., "Real Time Control Protocol (RTCP) attribute in
Session Description Protocol (SDP)", RFC 3605, October 2003. Session Description Protocol (SDP)", RFC 3605, October 2003.
[3] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., [3] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP: Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP:
Session Initiation Protocol", RFC 3261, June 2002. Session Initiation Protocol", RFC 3261, June 2002.
[4] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with [4] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with
Session Description Protocol (SDP)", RFC 3264, June 2002. Session Description Protocol (SDP)", RFC 3264, June 2002.
[5] Zopf, R., "Real-time Transport Protocol (RTP) Payload for [5] Zopf, R., "Real-time Transport Protocol (RTP) Payload for
Comfort Noise (CN)", RFC 3389, September 2002. Comfort Noise (CN)", RFC 3389, September 2002.
[6] Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688, January [6] Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688,
2004. January 2004.
[7] Camarillo, G., "The Alternative Network Address Types Semantics [7] Handley, M. and V. Jacobson, "SDP: Session Description
Protocol", RFC 2327, April 1998.
[8] Casner, S., "Session Description Protocol (SDP) Bandwidth
Modifiers for RTP Control Protocol (RTCP) Bandwidth", RFC 3556,
July 2003.
[9] Rosenberg, J. and H. Schulzrinne, "Reliability of Provisional
Responses in Session Initiation Protocol (SIP)", RFC 3262,
June 2002.
[10] Camarillo, G., Marshall, W., and J. Rosenberg, "Integration of
Resource Management and Session Initiation Protocol (SIP)",
RFC 3312, October 2002.
[11] Camarillo, G., "The Alternative Network Address Types Semantics
(ANAT) for the Session Description Protocol (SDP) Grouping (ANAT) for the Session Description Protocol (SDP) Grouping
Framework", draft-ietf-mmusic-anat-02 (work in progress), Framework", draft-ietf-mmusic-anat-02 (work in progress),
October 2004. October 2004.
[8] Rosenberg, J., "Traversal Using Relay NAT (TURN)", [12] Andreasen, F., "Connectivity Preconditions for Session
draft-rosenberg-midcom-turn-06 (work in progress), October 2004. Description Protocol Media Streams",
draft-ietf-mmusic-connectivity-precon-00 (work in progress),
May 2005.
14.2 Informative References [13] Yon, D., "Connection-Oriented Media Transport in the Session
Description Protocol (SDP)", draft-ietf-mmusic-sdp-comedia-10
(work in progress), November 2004.
[9] Schulzrinne, H., Rao, A. and R. Lanphier, "Real Time Streaming [14] Rosenberg, J., "Traversal Using Relay NAT (TURN)",
draft-rosenberg-midcom-turn-07 (work in progress),
February 2005.
[15] Andreasen, F., "A No-Op Payload Format for RTP",
draft-ietf-avt-rtp-no-op-00 (work in progress), May 2005.
16.2 Informative References
[16] Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time Streaming
Protocol (RTSP)", RFC 2326, April 1998. Protocol (RTSP)", RFC 2326, April 1998.
[10] Senie, D., "Network Address Translator (NAT)-Friendly [17] Senie, D., "Network Address Translator (NAT)-Friendly
Application Design Guidelines", RFC 3235, January 2002. Application Design Guidelines", RFC 3235, January 2002.
[11] Srisuresh, P., Kuthan, J., Rosenberg, J., Molitor, A. and A. [18] Srisuresh, P., Kuthan, J., Rosenberg, J., Molitor, A., and A.
Rayhan, "Middlebox communication architecture and framework", Rayhan, "Middlebox communication architecture and framework",
RFC 3303, August 2002. RFC 3303, August 2002.
[12] Borella, M., Lo, J., Grabelsky, D. and G. Montenegro, "Realm [19] Borella, M., Lo, J., Grabelsky, D., and G. Montenegro, "Realm
Specific IP: Framework", RFC 3102, October 2001. Specific IP: Framework", RFC 3102, October 2001.
[13] Borella, M., Grabelsky, D., Lo, J. and K. Taniguchi, "Realm [20] Borella, M., Grabelsky, D., Lo, J., and K. Taniguchi, "Realm
Specific IP: Protocol Specification", RFC 3103, October 2001. Specific IP: Protocol Specification", RFC 3103, October 2001.
[14] Daigle, L. and IAB, "IAB Considerations for UNilateral [21] Daigle, L. and IAB, "IAB Considerations for UNilateral Self-
Self-Address Fixing (UNSAF) Across Network Address Address Fixing (UNSAF) Across Network Address Translation",
Translation", RFC 3424, November 2002. RFC 3424, November 2002.
[15] Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson, [22] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson,
"RTP: A Transport Protocol for Real-Time Applications", RFC "RTP: A Transport Protocol for Real-Time Applications",
3550, July 2003. RFC 3550, July 2003.
[16] Baugher, M., McGrew, D., Naslund, M., Carrara, E. and K. [23] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
Norrman, "The Secure Real-time Transport Protocol (SRTP)", RFC Norrman, "The Secure Real-time Transport Protocol (SRTP)",
3711, March 2004. RFC 3711, March 2004.
[17] Carpenter, B. and K. Moore, "Connection of IPv6 Domains via [24] Carpenter, B. and K. Moore, "Connection of IPv6 Domains via
IPv4 Clouds", RFC 3056, February 2001. IPv4 Clouds", RFC 3056, February 2001.
[18] Huitema, C., "Teredo: Tunneling IPv6 over UDP through NATs", [25] Huitema, C., "Teredo: Tunneling IPv6 over UDP through NATs",
draft-huitema-v6ops-teredo-04 (work in progress), January 2005. draft-huitema-v6ops-teredo-05 (work in progress), April 2005.
[19] Hellstrom, G., "RTP Payload for Text Conversation", [26] Kohler, E., "Datagram Congestion Control Protocol (DCCP)",
draft-ietf-dccp-spec-11 (work in progress), March 2005.
[27] Lazzaro, J., "Framing RTP and RTCP Packets over Connection-
Oriented Transport", draft-ietf-avt-rtp-framing-contrans-05
(work in progress), January 2005.
[28] Hellstrom, G., "RTP Payload for Text Conversation",
draft-ietf-avt-rfc2793bis-09 (work in progress), August 2004. draft-ietf-avt-rfc2793bis-09 (work in progress), August 2004.
Author's Address Author's Address
Jonathan Rosenberg Jonathan Rosenberg
Cisco Systems Cisco Systems
600 Lanidex Plaza 600 Lanidex Plaza
Parsippany, NJ 07054 Parsippany, NJ 07054
US US
Phone: +1 973 952-5000 Phone: +1 973 952-5000
EMail: jdrosen@cisco.com Email: jdrosen@cisco.com
URI: http://www.jdrosen.net URI: http://www.jdrosen.net
Intellectual Property Statement Intellectual Property Statement
The IETF takes no position regarding the validity or scope of any The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information made any independent effort to identify any such rights. Information
 End of changes. 

This html diff was produced by rfcdiff 1.25, available from http://www.levkowetz.com/ietf/tools/rfcdiff/