draft-ietf-mmusic-ice-08.txt   draft-ietf-mmusic-ice-09.txt 
MMUSIC J. Rosenberg MMUSIC J. Rosenberg
Internet-Draft Cisco Systems Internet-Draft Cisco Systems
Expires: September 30, 2006 March 29, 2006 Expires: December 28, 2006 June 26, 2006
Interactive Connectivity Establishment (ICE): A Methodology for Network Interactive Connectivity Establishment (ICE): A Methodology for Network
Address Translator (NAT) Traversal for Offer/Answer Protocols Address Translator (NAT) Traversal for Offer/Answer Protocols
draft-ietf-mmusic-ice-08 draft-ietf-mmusic-ice-09
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79. aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
skipping to change at page 1, line 34 skipping to change at page 1, line 34
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on September 30, 2006. This Internet-Draft will expire on December 28, 2006.
Copyright Notice Copyright Notice
Copyright (C) The Internet Society (2006). Copyright (C) The Internet Society (2006).
Abstract Abstract
This document describes a protocol for Network Address Translator This document describes a protocol for Network Address Translator
(NAT) traversal for multimedia session signaling protocols based on (NAT) traversal for multimedia session signaling protocols based on
the offer/answer model, such as the Session Initiation Protocol the offer/answer model, such as the Session Initiation Protocol
(SIP). This protocol is called Interactive Connectivity (SIP). This protocol is called Interactive Connectivity
Establishment (ICE). ICE makes use of the Simple Traversal of UDP Establishment (ICE). ICE makes use of the Simple Traversal of UDP
through NAT (STUN), applying its binding discovery, connectivity through NAT (STUN), applying its binding discovery, connectivity
check and relay usages. check and relay usages.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Overview of ICE . . . . . . . . . . . . . . . . . . . . . . . 4
3. Overview of ICE . . . . . . . . . . . . . . . . . . . . . . 8 3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 15
4. Sending the Initial Offer . . . . . . . . . . . . . . . . . 11 4. Sending the Initial Offer . . . . . . . . . . . . . . . . . . 18
5. Receipt of the Offer and Generation of the Answer . . . . . 11 5. Receipt of the Offer and Generation of the Answer . . . . . . 19
6. Processing the Answer . . . . . . . . . . . . . . . . . . . 12 6. Processing the Answer . . . . . . . . . . . . . . . . . . . . 19
7. Common Procedures . . . . . . . . . . . . . . . . . . . . . 12 7. Common Procedures . . . . . . . . . . . . . . . . . . . . . . 20
7.1 Gathering Candidates . . . . . . . . . . . . . . . . . . . 12 7.1. Gathering Candidates . . . . . . . . . . . . . . . . . . 20
7.2 Prioritizing the Candidates and Choosing an Active One . . 18 7.2. Prioritizing the Candidates and Choosing an Operating
7.3 Encoding Candidates into SDP . . . . . . . . . . . . . . . 20 One . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
7.4 Forming Candidate Pairs . . . . . . . . . . . . . . . . . 23 7.3. Encoding Candidates into SDP . . . . . . . . . . . . . . 27
7.5 Ordering the Candidate Pairs . . . . . . . . . . . . . . . 25 7.4. Forming Candidate Pairs . . . . . . . . . . . . . . . . . 31
7.6 Performing the Connectivity Checks . . . . . . . . . . . . 28 7.5. Ordering the Candidate Pairs . . . . . . . . . . . . . . 33
7.7 Sending a Binding Request for Connectivity Checks . . . . 32 7.6. Performing the Connectivity Checks . . . . . . . . . . . 36
7.8 Receiving a Binding Request for Connectivity Checks . . . 33 7.7. Sending a Binding Request for Connectivity Checks . . . . 42
7.9 Promoting a Candidate to Active . . . . . . . . . . . . . 35 7.8. Receiving a Binding Request for Connectivity Checks . . . 44
7.10 Learning New Candidates from Connectivity Checks . . . . 36 7.9. Promoting a Candidate to Operating . . . . . . . . . . . 46
7.10.1 On Receipt of a Binding Request . . . . . . . . . . 36 7.10. Learning New Candidates from Connectivity Checks . . . . 47
7.10.2 On Receipt of a Binding Response . . . . . . . . . . 40 7.10.1. On Receipt of a Binding Request . . . . . . . . . . 47
7.11 Subsequent Offer/Answer Exchanges . . . . . . . . . . . 42 7.10.2. On Receipt of a Binding Response . . . . . . . . . . 51
7.11.1 Sending of a Subsequent Offer . . . . . . . . . . . 42 7.11. Subsequent Offer/Answer Exchanges . . . . . . . . . . . . 53
7.11.2 Receiving the Offer and Sending an Answer . . . . . 45 7.11.1. Sending of a Subsequent Offer . . . . . . . . . . . 53
7.11.3 Receiving the Answer . . . . . . . . . . . . . . . . 47 7.11.2. Receiving the Offer and Sending an Answer . . . . . 56
7.12 Binding Keepalives . . . . . . . . . . . . . . . . . . . 48 7.11.3. Receiving the Answer . . . . . . . . . . . . . . . . 59
7.13 Sending Media . . . . . . . . . . . . . . . . . . . . . 49 7.12. Binding Keepalives . . . . . . . . . . . . . . . . . . . 59
7.14 Receiving Media . . . . . . . . . . . . . . . . . . . . 51 7.13. Sending Media . . . . . . . . . . . . . . . . . . . . . . 61
8. Guidelines for Usage with SIP . . . . . . . . . . . . . . . 52 7.14. Receiving Media . . . . . . . . . . . . . . . . . . . . . 63
9. Interactions with Forking . . . . . . . . . . . . . . . . . 54 8. Guidelines for Usage with SIP . . . . . . . . . . . . . . . . 64
10. Interactions with Preconditions . . . . . . . . . . . . . . 54 9. Interactions with Forking . . . . . . . . . . . . . . . . . . 66
11. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 55 10. Interactions with Preconditions . . . . . . . . . . . . . . . 67
11.1 Basic Example . . . . . . . . . . . . . . . . . . . . . 56 11. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 67
11.2 Advanced Example . . . . . . . . . . . . . . . . . . . . 60 11.1. Basic Example . . . . . . . . . . . . . . . . . . . . . . 68
12. Grammar . . . . . . . . . . . . . . . . . . . . . . . . . . 80 11.2. Advanced Example . . . . . . . . . . . . . . . . . . . . 72
13. Security Considerations . . . . . . . . . . . . . . . . . . 82 12. Grammar . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
13.1 Attacks on Connectivity Checks . . . . . . . . . . . . . 82 13. Security Considerations . . . . . . . . . . . . . . . . . . . 95
13.2 Attacks on Address Gathering . . . . . . . . . . . . . . 85 13.1. Attacks on Connectivity Checks . . . . . . . . . . . . . 95
13.3 Attacks on the Offer/Answer Exchanges . . . . . . . . . 86 13.2. Attacks on Address Gathering . . . . . . . . . . . . . . 98
13.4 Insider Attacks . . . . . . . . . . . . . . . . . . . . 86 13.3. Attacks on the Offer/Answer Exchanges . . . . . . . . . . 99
13.4.1 The Voice Hammer Attack . . . . . . . . . . . . . . 86 13.4. Insider Attacks . . . . . . . . . . . . . . . . . . . . . 99
13.4.2 STUN Amplification Attack . . . . . . . . . . . . . 86 13.4.1. The Voice Hammer Attack . . . . . . . . . . . . . . 99
14. IANA Considerations . . . . . . . . . . . . . . . . . . . . 87 13.4.2. STUN Amplification Attack . . . . . . . . . . . . . 99
14.1 candidate Attribute . . . . . . . . . . . . . . . . . . 87 14. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 100
14.2 remote-candidate Attribute . . . . . . . . . . . . . . . 87 14.1. candidate Attribute . . . . . . . . . . . . . . . . . . . 100
14.3 ice-pwd Attribute . . . . . . . . . . . . . . . . . . . 88 14.2. remote-candidate Attribute . . . . . . . . . . . . . . . 100
15. IAB Considerations . . . . . . . . . . . . . . . . . . . . . 88 14.3. ice-pwd Attribute . . . . . . . . . . . . . . . . . . . . 101
15.1 Problem Definition . . . . . . . . . . . . . . . . . . . 89 15. IAB Considerations . . . . . . . . . . . . . . . . . . . . . 101
15.2 Exit Strategy . . . . . . . . . . . . . . . . . . . . . 89 15.1. Problem Definition . . . . . . . . . . . . . . . . . . . 102
15.3 Brittleness Introduced by ICE . . . . . . . . . . . . . 90 15.2. Exit Strategy . . . . . . . . . . . . . . . . . . . . . . 102
15.4 Requirements for a Long Term Solution . . . . . . . . . 91 15.3. Brittleness Introduced by ICE . . . . . . . . . . . . . . 103
15.5 Issues with Existing NAPT Boxes . . . . . . . . . . . . 91 15.4. Requirements for a Long Term Solution . . . . . . . . . . 104
16. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 91 15.5. Issues with Existing NAPT Boxes . . . . . . . . . . . . . 104
17. References . . . . . . . . . . . . . . . . . . . . . . . . . 92 16. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 104
17.1 Normative References . . . . . . . . . . . . . . . . . . 92 17. References . . . . . . . . . . . . . . . . . . . . . . . . . 105
17.2 Informative References . . . . . . . . . . . . . . . . . 93 17.1. Normative References . . . . . . . . . . . . . . . . . . 105
Author's Address . . . . . . . . . . . . . . . . . . . . . . 94 17.2. Informative References . . . . . . . . . . . . . . . . . 106
Intellectual Property and Copyright Statements . . . . . . . 96 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 108
Intellectual Property and Copyright Statements . . . . . . . . . 109
1. Introduction 1. Introduction
RFC 3264 [4] defines a two-phase exchange of Session Descrption RFC 3264 [4] defines a two-phase exchange of Session Description
Protocol (SDP) messages [5] for the purposes of establishment of Protocol (SDP) messages [5] for the purposes of establishment of
multimedia sessions. This offer/answer mechanism is used by multimedia sessions. This offer/answer mechanism is used by
protocols such as the Session Initiation Protocol (SIP) [2]. protocols such as the Session Initiation Protocol (SIP) [2].
Protocols using offer/answer are difficult to operate through Network Protocols using offer/answer are difficult to operate through Network
Address Translators (NAT). Because their purpose is to establish a Address Translators (NAT). Because their purpose is to establish a
flow of media packets, they tend to carry IP addresses within their flow of media packets, they tend to carry IP addresses within their
messages, which is known to be problematic through NAT [17]. The messages, which is known to be problematic through NAT [15]. The
protocols also seek to create a media flow directly between protocols also seek to create a media flow directly between
participants, so that there is no application layer intermediary participants, so that there is no application layer intermediary
between them. This is done to reduce media latency, decrease packet between them. This is done to reduce media latency, decrease packet
loss, and reduce the operational costs of deploying the application. loss, and reduce the operational costs of deploying the application.
However, this is difficult to accomplish through NAT. A full However, this is difficult to accomplish through NAT. A full
treatment of the reasons for this is beyond the scope of this treatment of the reasons for this is beyond the scope of this
specification. specification.
Numerous solutions have been proposed for allowing these protocols to Numerous solutions have been proposed for allowing these protocols to
operate through NAT. These include Application Layer Gateways operate through NAT. These include Application Layer Gateways
(ALGs), the Middlebox Control Protocol [19], Simple Traversal of UDP (ALGs), the Middlebox Control Protocol [17], Simple Traversal of UDP
through NAT (STUN) [16] and its revision [13], the STUN Relay Usage through NAT (STUN) [14] and its revision [12], the STUN Relay Usage
[14], and Realm Specific IP [20] [21] along with session description [13], and Realm Specific IP [18] [19] along with session description
extensions needed to make them work, such as the Session Description extensions needed to make them work, such as the Session Description
Protocol (SDP) [5] attribute for the Real Time Control Protocol Protocol (SDP) [5] attribute for the Real Time Control Protocol
(RTCP) [1]. Unfortunately, these techniques all have pros and cons (RTCP) [1]. Unfortunately, these techniques all have pros and cons
which make each one optimal in some network topologies, but a poor which make each one optimal in some network topologies, but a poor
choice in others. The result is that administrators and implementors choice in others. The result is that administrators and implementors
are making assumptions about the topologies of the networks in which are making assumptions about the topologies of the networks in which
their solutions will be deployed. This introduces complexity and their solutions will be deployed. This introduces complexity and
brittleness into the system. What is needed is a single solution brittleness into the system. What is needed is a single solution
which is flexible enough to work well in all situations. which is flexible enough to work well in all situations.
This specification provides that solution for media streams This specification provides that solution for media streams
established by signaling protocols based on the offer-answer model. established by signaling protocols based on the offer-answer model.
It is called Interactive Connectivity Establishment, or ICE. ICE It is called Interactive Connectivity Establishment, or ICE. ICE
makes use of STUN and its relay extension, commonly called TURN, but makes use of STUN and its relay extension, commonly called TURN, but
uses them in a specific methodology which avoids many of the pitfalls uses them in a specific methodology which avoids many of the pitfalls
of using any one alone. of using any one alone.
2. Terminology 2. Overview of ICE
A typical architecture for an ICE deployment is shown in Figure 1.
The figure shows two endpoints (known as agents in RFC 3264
terminology) which we call L and R (for left and right, which helps
visualize call flows). Both L and R are behind a NAT. The type of
NAT and its properties are unknown. Indeed, it is not known whether
the agent is behind a NAT at all, or whether there are multiple NATs
between it and the network. Agents L and R are capable of engaging
in an offer/answer exchange [4] by which they can exchange SDP
messages, whose purpose is to set up a media session between L and R.
Of course, the offer/answer exchange itself must be capable of
traversing the NAT. Such traversal is facilitated through signaling
elements such as SIP servers, and is outside the scope of this
specification. Different solutions are applied for traversal of the
signaling that carries the offer/answer exchange, and for the media
set up by that offer/answer exchange. This is because of the vastly
different requirements on latency, packet loss, and overall bandwidth
between the signaling and media. For example, usage of a signaling
intermediary, such as a SIP proxy, as a relay for all signaling at
all times, is acceptable, whereas usage of relays at all times for
media is highly undesirable.
In addition to the agents, a SIP server and NATs, ICE is typically
used in concert with STUN servers in the network. Each agent can
have its own STUN server, or they can be the same.
                   +-------+
                   | SIP   |
+-------+          | Srvr  |          +-------+
| STUN  |          |       |          | STUN  |
| Srvr  |          +-------+          | Srvr  |
|       |                             |       |
+-------+                             +-------+
   +--------+                     +--------+
   |  NAT   |                     |  NAT   |
   +--------+                     +--------+
+-------+                             +-------+
| Agent |                             | Agent |
|   L   |                             |   R   |
|       |                             |       |
+-------+                             +-------+
Figure 1
Prior to initiating an offer, the offering agent (L in this example)
starts by performing a process known as address gathering. This
process allows the client to obtain one or more transport addresses,
one or more of which might be viable addresses at which the agent can
receive incoming media packets from the other agent, which we call
its peer. A transport address is just the combination of an IP
address and port. With ICE, an agent will actually provide its peer
with all of its possible transport addresses, and ICE will figure out
which one to actually use.
Naturally, one viable transport address is one obtained directly from
a local interface the client has towards the network. Such a
transport address is called a local transport address. The local
interface could be one on a local layer 2 network technology, such as
ethernet or WiFi, or it could be one that is obtained through a
tunnel mechanism, such as a Virtual Private Network (VPN) or Mobile
IP (MIP). In all cases, these appear to the agent as a local
interface from which ports (and thus transport addresses) can be
allocated.
If an agent is multihomed, it can obtain a transport address from
each interface. Depending on the location of the peer on the IP
network, the agent may be reachable through one of those interfaces,
or through another. Consider, for example, an agent which has a
local interface to a private net 10 network, and also to the public
Internet. A transport address from the net 10 interface will be
directly reachable when communicating with a peer on the same private
net 10 network, while a transport address from the public interface
will be directly reachable when communicating with a peer on the
public Internet. Rather than trying to guess which interface will
work prior to sending an offer, the offering agent includes both
transport addresses in its offer.
Indeed, when using a media technology like the Real Time Transport
Protocol (RTP), an agent needs two transport addresses on each
interface - one for the RTP, and one for the Real Time Control
Protocol (RTCP). Other media technologies may require a multiplicity
of transport addresses to be used and treated as a bundle. Each of
these transport addresses is called a component. There are two
components in an RTP stream - the RTP itself, and the RTCP. In ICE,
the set of transport addresses that represent an atomic grouping on
which communications is possible is called a candidate. In the
example so far, the agent would obtain two candidates - one from the
net 10 interface, and one from the interface on the public Internet.
Each candidate would contain two transport addresses, corresponding
to each of the two components.
Once the agent has obtained local transport addresses, it uses STUN
to obtain additional transport addresses. To do this, it would send
a STUN Binding Request, using the Binding Discovery Usage [12] or the
Relay Usage [13] from a local transport address, to its STUN server.
It is assumed that the address of the STUN server is configured, or
learned in some way. Indeed, an agent might even have multiple STUN
servers. As a consequence of communicating with the STUN server, the
agent can learn potentially two new types of transport addresses -
server reflexive transport addresses and relayed transport addresses.
The relationship of these addresses to the local transport address is
shown in Figure 2.
    To Internet
         |
         |
         |  /------------ Relayed
         | /              Address
     +--------+
     |        |
     |  STUN  |
     | Server |
     |        |
     +--------+
         |
         |
         | /------------ Server
         |/              Reflexive
   +------------+        Address
   |    NAT     |
   +------------+
         |
         | /------------ Local
         |/              Address
     +--------+
     |        |
     | Agent  |
     |        |
     +--------+
Figure 2
The local transport address is resident on the agent itself. Through
either the Binding Discovery Usage or the Relay Usage, the agent can
discover its server reflexive transport address. This is the address
on the public side of the NAT, facing the STUN server. It is the
transport address allocated to the agent on the public side of the
NAT as a consequence of the transmission of the STUN request through
the NAT, to the STUN server. The NAT will allocate a binding,
mapping this server reflexive transport address to the local
transport address. Packets received at the NAT, targeted towards the
server reflexive transport address, will have their destination
address rewritten to the local transport address by the NAT, and then
be forwarded to the agent. When there are multiple NATs between the
agent and the STUN server, the STUN request will create a binding on
each NAT, but only the outermost server reflexive transport address
will be discovered by the agent.
In addition, through the Relay Usage, the agent can request that the
STUN server itself allocate a transport address from one of its local
interfaces, and establish a binding that maps that transport address
(called a relayed transport address, naturally) towards the source
transport address of the STUN request, which will actually be equal
to the server reflexive transport address allocated by the outermost
NAT. Consequently, packets sent to the relayed transport address
will be routed by the IP network towards the STUN server. The STUN
server will receive them, rewrite the destination address to be equal
to the server reflexive transport address, and forward them. They
will then arrive at the NAT, where the destination address is
rewritten once again, and the packet is finally forwarded to the agent at
its local address.
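The two rewriting steps described above can be sketched as follows. This is an illustration only, written in Python, which is not part of this specification. The function name trace_delivery, the dictionary representation of the bindings, and the example addresses are all assumptions of the sketch.

```python
def trace_delivery(destination, relay_bindings, nat_bindings):
    """Return the successive destination addresses a packet carries.

    relay_bindings maps a relayed transport address to the server
    reflexive transport address it is bound to on the STUN server.
    nat_bindings maps a server reflexive transport address to the
    local transport address it is bound to on the NAT.
    """
    path = [destination]
    # Step 1: the STUN server rewrites relayed -> server reflexive.
    if destination in relay_bindings:
        destination = relay_bindings[destination]
        path.append(destination)
    # Step 2: the NAT rewrites server reflexive -> local.
    if destination in nat_bindings:
        destination = nat_bindings[destination]
        path.append(destination)
    return path


# Hypothetical addresses chosen for the example.
relayed = ("192.0.2.10", 5000)
reflexive = ("192.0.2.20", 6000)
local = ("10.0.1.1", 2444)

path = trace_delivery(relayed, {relayed: reflexive}, {reflexive: local})
print(path)
```

A packet sent directly to the server reflexive transport address skips the first step and is rewritten only once, by the NAT.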
Since the server reflexive transport addresses and relayed transport
addresses are obtained from a local transport address, they are said
to be derived transport addresses, since they are derived from (and
ultimately map to) their associated local transport address. During
the process of address gathering, the agent will obtain as many
transport addresses of a given type as are needed for the media
session. For example, with RTP, two transport addresses are needed
for a candidate. The agent will obtain two server reflexive
transport addresses (each derived from a local transport address),
and they would be used to constitute a server reflexive candidate.
The local transport addresses make up a local candidate, and the
relayed transport addresses make up a relayed candidate.
  Server                          Server
  Reflexive                       Reflexive
  Candidate                       Candidate
..............                  ..............
.            .                  .            .
.  +-+  +-+  .                  .  +-+  +-+  .
.  | |  | |  .                  .  | |  | |  .
.  +-+  +-+  .                  .  +-+  +-+  .
.   ^    ^   .                  .   ^    ^   .
....|....|....                  ....|....|....
    |    |                          |    |
    |    |                          |    |
....|....|....                  ....|....|....
.   |    |   .                  .   |    |   .
.  +-+  +-+  .  Local           .  +-+  +-+  .  Local
.  | |  | |  .  Candidate       .  | |  | |  .  Candidate
.  +-+  +-+  .                  .  +-+  +-+  .
.   |    |   .                  .   |    |   .
....|....|....                  ....|....|....
    |    |                          |    |
    |    |                          |    |
....|....|....                  ....|....|....
.   V    V   .                  .   V    V   .
.  +-+  +-+  .                  .  +-+  +-+  .
.  | |  | |  .                  .  | |  | |  .
.  +-+  +-+  .                  .  +-+  +-+  .
.            .                  .            .
..............                  ..............
   Relayed                         Relayed
   Candidate                       Candidate
Legend
------
+-+
| | Transport Address
+-+
---> Derived From
...
. . Candidate
...
Figure 3
The relationship between these various transport addresses and
candidates is shown pictorially in Figure 3. The figure shows our
example agent with two local interfaces, each of which provides two
transport addresses that make up one local candidate. From each of
those two local candidates, a server reflexive candidate and a relayed
candidate are derived.
Once the agent has completed gathering its candidates, it assigns
each a candidate identifier, called the candidate ID. The candidate
ID is a random number used to uniquely identify each candidate, and
is used in the connectivity checks discussed below. The components
of each candidate are ordered numerically, starting at one, such that
each transport address has a component ID. For example, in an RTP
candidate there are two components, component ID 1 and component ID
2. Each transport address is therefore uniquely identified by a
combination of its candidate ID and its component ID. The
combination of the two is called, unsurprisingly, a transport address
ID, or tid for short.
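As a non-normative illustration, the relationship between candidates, components and transport address IDs might be modeled as shown here. The class name Candidate, the 32-bit size of the random candidate ID, and the "-" separator used to combine the two IDs are assumptions of this sketch; the actual encoding is defined later in this specification.

```python
import secrets
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Candidate:
    kind: str                          # "local", "server reflexive" or "relayed"
    components: List[Tuple[str, int]]  # one (IP, port) per component, in order
    # Random number identifying the candidate (32 bits is an assumption).
    candidate_id: int = field(default_factory=lambda: secrets.randbits(32))

    def tid(self, component_id: int) -> str:
        """Transport address ID: candidate ID combined with component ID."""
        if not 1 <= component_id <= len(self.components):
            raise ValueError("unknown component ID")
        # The "-" separator is an assumption of this sketch.
        return f"{self.candidate_id}-{component_id}"


# An RTP candidate has two components: RTP (ID 1) and RTCP (ID 2).
rtp_candidate = Candidate("local", [("192.0.2.1", 2444), ("192.0.2.1", 2445)])
print(rtp_candidate.tid(1), rtp_candidate.tid(2))
```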
The agent will place all of its candidates in an offer, using a new
SDP attribute called the candidate attribute. This attribute
contains the actual transport address, the candidate ID and component
ID, and a q-value. The q-value is used for the agent to prioritize
its candidates. An agent will typically prefer to receive media at
particular candidates over other candidates, based on local policy.
For example, an agent would normally prefer to receive interactive
voice RTP packets at its local candidate as opposed to its relayed
candidate, due to the extra latency incurred by traveling through the
relay.
The candidate attribute will also include an indicator of the type of
candidate (server reflexive, local, relayed), and its related
transport address. For server reflexive transport addresses, the
related transport address is the local transport address from which
it was derived. For relayed transport addresses, the related
transport address is the server reflexive address towards the relay.
The related transport address for reflexive candidates is used by the
ICE algorithm itself, as explained below. For relayed candidates,
the related transport address is not used by ICE directly; it is
useful for diagnostic purposes and for Quality of Service mechanisms
that require knowledge of addresses closer to the agent.
Finally, the agent chooses one of its candidates for inclusion in the
m and c lines (called the m/c-line collectively). Assuming that
candidate is verified as functional by the ICE connectivity checks
described below, this is the actual IP address and port to which
media will be sent. The candidate selected for inclusion in the m/c-
line of an offer (or an answer) is called the operating candidate,
since it is the one that is the in-use destination for receipt of
media traffic.
Once the operating candidate is chosen, the agent sends the offer.
Through the wonders of SIP or other signaling protocols, this offer
is delivered to the peer, which must now select its answer. To
create the answer, the agent starts by gathering addresses, in
exactly the same way the offerer did. It includes those as
candidates in its answer, and selects one as the operating candidate,
just like the offerer did. It then sends the answer.
Each agent then pairs up each of its candidates with the candidates
of its peer. From the perspective of the offerer, the set of
candidates it sent in its offer are called its native candidates, and
the ones received in the answer are the remote candidates.
Similarly, from the perspective of the answerer, the set of
candidates it sent in its answer are the native candidates, and the
ones received in the offer are the remote candidates. Both agents
pair up each of their native candidates with each of the remote
candidates, producing a set of candidate pairs. If there were N
native candidates and M remote candidates, there will be N*M
candidate pairs. Within each candidate pair, the transport addresses
themselves are paired up one for one, resulting in transport address
pairs as well. The transport addresses are paired up such that they
have identical component IDs. Each transport address pair has an ID,
called the transport address pair ID, formed by concatenating the
transport address IDs of its two transport addresses.
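The pairing procedure above can be sketched as follows. The function name form_pairs, the dictionary layout, and the "-" and ":" separators are assumptions made for illustration. The sketch also assumes that both candidates in a pair carry the same number of components.

```python
from itertools import product


def form_pairs(native, remote):
    """Pair every native candidate with every remote candidate.

    native and remote map a candidate ID to its list of transport
    addresses, ordered by component ID (starting at one).
    Returns one list of transport address pairs per candidate pair.
    """
    candidate_pairs = []
    for (n_id, n_addrs), (r_id, r_addrs) in product(native.items(), remote.items()):
        address_pairs = []
        # Transport addresses are matched on identical component IDs.
        for component_id, (n_addr, r_addr) in enumerate(zip(n_addrs, r_addrs), start=1):
            n_tid = f"{n_id}-{component_id}"
            r_tid = f"{r_id}-{component_id}"
            address_pairs.append({
                "id": f"{n_tid}:{r_tid}",   # transport address pair ID
                "component": component_id,
                "native": n_addr,
                "remote": r_addr,
            })
        candidate_pairs.append(address_pairs)
    return candidate_pairs


# Hypothetical example: N = 2 native and M = 3 remote RTP candidates.
native = {
    "n1": [("10.0.1.1", 2444), ("10.0.1.1", 2445)],
    "n2": [("192.0.2.1", 3000), ("192.0.2.1", 3001)],
}
remote = {
    "r1": [("10.0.2.1", 4000), ("10.0.2.1", 4001)],
    "r2": [("192.0.2.2", 5000), ("192.0.2.2", 5001)],
    "r3": [("192.0.2.3", 6000), ("192.0.2.3", 6001)],
}
pairs = form_pairs(native, remote)
print(len(pairs))  # N*M candidate pairs
```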
Once the pairing is done, the transport address pairs are ordered in
such a way that both the offerer and answerer will end up with the
same order. This ordering is done by using the q-values each side
provided, along with the candidate IDs to help break ties. Then,
each side begins a process known as connectivity checks.
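The essential property of the ordering is that it is symmetric: the offerer and the answerer must arrive at the same list even though each sees the pair from its own side. The sketch below is not the ordering algorithm specified in Section 7.5. It uses an assumed sort key (higher q-value first, lower q-value second, sorted candidate IDs as tie-breaker) only to show how a key that ignores which side is native yields the same order at both ends.

```python
def order_pairs(pairs):
    """Order candidate pairs identically at the offerer and the answerer.

    Each pair is ((q_value, candidate_id), (q_value, candidate_id)),
    given as (native, remote). The key below is an assumption of this
    sketch and does not depend on which element is native.
    """
    def key(pair):
        (q_a, id_a), (q_b, id_b) = pair
        high_q, low_q = max(q_a, q_b), min(q_a, q_b)
        tie_break = tuple(sorted((id_a, id_b)))
        # Negate the q-values so that larger values sort first.
        return (-high_q, -low_q, tie_break)

    return sorted(pairs, key=key)


# Hypothetical candidates: two at L, one at R.
l_a, l_b, r_a = (1.0, "L-a"), (0.5, "L-b"), (0.8, "R-a")

offerer_view = order_pairs([(l_b, r_a), (l_a, r_a)])
answerer_view = order_pairs([(r_a, l_a), (r_a, l_b)])
print(offerer_view)
print(answerer_view)
```

Swapping native and remote in the offerer's list gives exactly the answerer's list, which is what allows both agents to work through the checks in the same sequence.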
Connectivity checks are STUN transactions, using the connectivity
check usage of STUN, sent from the native transport address to the
remote transport address of a particular transport address pair. If
an agent sends a STUN request and gets a successful response, the
transport address pair is said to be Receive Valid, or Recv Valid for
short, since the agent knows that its peer was able to receive a
packet. If an agent receives a request and sends a response, the
transport address pair is said to be Send Valid, since the agent
knows that its peer was able to send it a packet. When transactions
in both directions complete, the transport address pair is said to be
Valid. The idea behind ICE is that if a transport address pair is
valid, it means that agents were able to successfully exchange IP
packets in both directions. Consequently, any media packets, which
are sent to and from exactly the same IP addresses and ports, should
also work, since they don't differ in their IP addresses or ports.
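The validity states described above can be tracked with two flags per transport address pair, as in this illustrative sketch. The class and method names, and the name "Unchecked" for the initial state, are assumptions of the sketch.

```python
class TransportAddressPair:
    """Tracks connectivity check results for one transport address pair."""

    def __init__(self):
        self.recv_valid = False  # our request got a successful response
        self.send_valid = False  # we received a request and answered it

    def on_response_to_our_request(self):
        # The peer was able to receive a packet from us.
        self.recv_valid = True

    def on_request_from_peer(self):
        # The peer was able to send us a packet.
        self.send_valid = True

    @property
    def state(self):
        if self.recv_valid and self.send_valid:
            return "Valid"
        if self.recv_valid:
            return "Recv Valid"
        if self.send_valid:
            return "Send Valid"
        return "Unchecked"  # assumed name for the initial state


pair = TransportAddressPair()
pair.on_response_to_our_request()
print(pair.state)
pair.on_request_from_peer()
print(pair.state)
```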
It's important to point out that, when used with ICE, an agent will
always send and receive media on the same transport address. That
is, if an agent includes a transport address of 192.0.2.1:2444
(meaning an IP address of 192.0.2.1 and port of 2444) in its SDP for
receiving RTP packets (and also STUN connectivity checks), it will not
only receive STUN requests and RTP packets on this transport address,
it will also send STUN requests and RTP packets from this transport
address. This property, known as symmetric RTP, is essential for
proper operation of ICE. Peer reflexive transport addresses,
discussed further below, will generally only work when symmetric RTP
is used. Symmetric RTP is also key for keeping NAT bindings alive.
Since there can be quite a few transport address pairs to check,
performing all of the connectivity checks in parallel can cause
substantial load on the network. Instead, each agent will start at
the top of the ordered list they each created, and every 50ms, begin
a new connectivity check.
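The pacing described above amounts to assigning each transport address pair a start time that is a multiple of 50ms, following the ordered list. The sketch below only computes that schedule; the function name and the use of integer milliseconds are assumptions, and retransmissions within a check are not modeled.

```python
PACING_INTERVAL_MS = 50  # one new connectivity check every 50ms


def check_schedule(ordered_pairs, start_ms=0, interval_ms=PACING_INTERVAL_MS):
    """Return (start time in ms, pair) for each pair, top of the list first."""
    return [
        (start_ms + index * interval_ms, pair)
        for index, pair in enumerate(ordered_pairs)
    ]


# Hypothetical transport address pair IDs, already ordered.
schedule = check_schedule(["pair-1", "pair-2", "pair-3"])
for start, pair in schedule:
    print(f"t={start}ms: begin check for {pair}")
```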
In order to successfully process a STUN connectivity check, an agent
must be able to correlate the STUN request or response with the
transport address pair whose connectivity the STUN message is meant
to validate. To perform this correlation, the STUN connectivity
checks contain a USERNAME attribute formed in a special way. In
particular, the USERNAME contains the actual transport address pair
ID, which, as described above, is formed by concatenating the
transport address IDs of each of the candidates. The USERNAME is
used in conjunction with an authentication and message integrity
operation on the STUN message that requires a password. This
password is conveyed in the offer/answer exchange, and is a random
number valid only for the duration of the media session. This
ensures that, if the signaling channel carrying the offer/answer
exchange is secure, the agent can be certain that its STUN
connectivity checks are taking place with the agent which responded
to the signaling.
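The correlation and integrity steps can be illustrated as follows. This sketch assumes that the USERNAME carries the transport address pair ID verbatim and that the integrity value is an HMAC-SHA1 computed over the whole message with the password from the offer/answer exchange. The real STUN attribute encoding and the exact bytes covered by the integrity check are defined by STUN and are not reproduced here; all names and values are hypothetical.

```python
import hashlib
import hmac


def integrity_value(password: bytes, message: bytes) -> bytes:
    """Keyed hash over the message (simplified stand-in for STUN integrity)."""
    return hmac.new(password, message, hashlib.sha1).digest()


def correlate(pairs_by_id, password, username, message, received_mac):
    """Return the transport address pair a STUN message belongs to, or None."""
    expected = integrity_value(password, message)
    # Constant-time comparison; reject messages with a bad integrity value.
    if not hmac.compare_digest(expected, received_mac):
        return None
    # The USERNAME is assumed to be the transport address pair ID.
    return pairs_by_id.get(username)


# Hypothetical session data.
password = b"random-session-password"
pairs_by_id = {"n1-1:r1-1": "RTP pair", "n1-2:r1-2": "RTCP pair"}
message = b"binding request for n1-1:r1-1"
mac = integrity_value(password, message)

print(correlate(pairs_by_id, password, "n1-1:r1-1", message, mac))
```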
Because each agent is receiving STUN requests on the same IP address
and port that media will later be sent to, each agent is effectively
acting as its own mini STUN server, implementing the connectivity
check usage described in [12]. Like all STUN servers, when the agent
sends a STUN response to a request, the response includes the XOR-
MAPPED-ADDRESS attribute that contains the source IP address and port
that the request came from. In certain deployment scenarios, and in
particular where one of the agents is behind a NAT whose address and
port mapping properties are address and port dependent [32], this
source IP address and port may differ from the server reflexive ones
allocated by the peer during the address gathering phase. This
source IP address and port, conveyed in the XOR-MAPPED-ADDRESS
attribute of the STUN response, therefore constitutes a new transport
address, called a peer reflexive transport address, which can be used
for communications.
+-------+
| STUN |
| Srvr |
| |
+-------+
^
|
|
|
|
+--------------------------+ |
| NAT-2| |NAT-1
| +-----------+
| | APD NAT |
| +-----------+
| | |
| \ |
VL1 \|R1
+-------+ +-------+
| Agent | | Agent |
| L | | R |
| | | |
+-------+ +-------+
Figure 4
Consider the example of Figure 4. The agent on the left, agent L,
has a single interface and is not behind a NAT. Consequently, it
ends up with a single candidate with a single transport address
(normally two for RTP, but we'll consider just one for ease of
explanation), transport address L1. It sends an offer to agent R,
which is behind one of these Address and Port Dependent (APD) mapping
NATs. Agent R has a local transport address R1, and obtains a server
reflexive transport address from its STUN server, transport address
NAT-1. Now, when agent R sends a connectivity check from its local
transport address (R1) to L's local transport address (L1), this
check will traverse the NAT. The connectivity check itself will
create a new mapping in the NAT and be allocated a new binding on the
NAT - NAT-2. This STUN request arrives at L, which generates a STUN
response containing transport address NAT-2. Agent R, noticing that
this is not the same as its other two transport addresses, treats
this as a new peer reflexive transport address.
This new peer reflexive transport address is paired up with the
remote transport address containing the STUN server from which that
transport address was learned (transport address L1 in the example
above). This becomes a new transport address pair, and connectivity
checks are run on it as well.
Once all of the transport address pairs in a candidate pair have been
validated, that candidate pair is ready to be used. Media starts
being sent on it immediately, and the offerer will send an updated
offer, now containing the agent's half of the validated candidate pair
in the m/c-line. This is called "promoting a candidate to
operating". The updated offer only contains a single candidate
attribute - the one for the operating candidate. It also contains an
attribute, called the remote-candidate attribute, which tells the
answerer the remote candidate in the validated candidate pair. The
answerer uses this attribute, along with its own view on the states
of the candidate pairs, to place a candidate in the m/c-line and
populate the candidate attributes in its answer.
It is important to understand that, when ICE is in use, media is not
sent to a candidate without validation, even if that candidate
appears in the m/c-line. This is in order to avoid denial-of-service
attacks. In particular, without ICE, an offerer can send an offer to
another agent, and list the IP address and port of a target in the
offer. If the agent is an automaton that answers a call
automatically, it will do so and then proceed to send media to the
target. This provides substantial packet amplification. ICE fixes
this by requiring that an agent never send media packets unless it
has sent a STUN message towards the target of the RTP packets, and
received a reply from that target. See Section 7.13 for details.
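
The rule above can be pictured as a small gate in front of the media sender. The class below is a hypothetical sketch, not the procedure of Section 7.13: it only records which targets were sent a STUN message and which of those replied, and permits media solely toward the latter.

```python
class MediaGate:
    """Tracks which targets have answered a STUN check."""

    def __init__(self):
        self._checked = set()    # targets we sent a STUN request to
        self._validated = set()  # targets that replied

    def stun_sent(self, target):
        self._checked.add(target)

    def stun_reply_received(self, source):
        # A reply only counts if we actually sent a check to that address.
        if source in self._checked:
            self._validated.add(source)

    def may_send_media(self, target):
        return target in self._validated


gate = MediaGate()
victim = ("198.51.100.7", 6000)   # address listed in an offer (example value)
peer = ("192.0.2.10", 7000)       # address that really answers checks

gate.stun_sent(victim)            # no reply ever arrives from the victim
gate.stun_sent(peer)
gate.stun_reply_received(peer)
```

With this state, `gate.may_send_media(peer)` is true and `gate.may_send_media(victim)` is false, so an address that merely appears in the m/c-line receives no media.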
A summary of this overall behavior is shown in the basic call flow in
Figure 5.
Agent A STUN Servers Agent B
|(1) Gather Addresses | |
|-------------------->| |
|(2) Offer | |
|------------------------------------------>|
| |(3) Gather Addresses |
| |<--------------------|
|(4) Answer | |
|<------------------------------------------|
|(5) STUN Check | |
|<------------------------------------------|
|(6) STUN Check | |
|------------------------------------------>|
|(7) Media | |
|<------------------------------------------|
|(8) Media | |
|------------------------------------------>|
|(9) Offer | |
|------------------------------------------>|
|(10) Answer | |
|<------------------------------------------|
Figure 5
3. Terminology
Several new terms are introduced in this specification: Several new terms are introduced in this specification:
Agent: As defined in RFC 3264, an agent is the protocol Agent: As defined in RFC 3264, an agent is the protocol
implementation involved in the offer/answer exchange. There are implementation involved in the offer/answer exchange. There are
two agents involved in an offer/answer exchange. two agents involved in an offer/answer exchange.
Peer: From the perspective of one of the agents in a session, its Peer: From the perspective of one of the agents in a session, its
peer is the other agent. Specifically, from the perspective of peer is the other agent. Specifically, from the perspective of
the offerer, the peer is the answerer. From the perspective of the offerer, the peer is the answerer. From the perspective of
the answerer, the peer is the offerer. the answerer, the peer is the offerer.
Transport Address: The combination of an IP address and port. Transport Address: The combination of an IP address and port.
Local Transport Address: A local transport address is a transport Local Transport Address: A local transport address is a transport
address that has been allocated from the operating system on the address that has been allocated from the operating system on the
host. This includes transport addresses obtained through Virtual host. This includes transport addresses obtained through Virtual
Private Networks (VPNs) and transport addresses obtained through Private Networks (VPNs) and transport addresses obtained through
Realm Specific IP (RSIP) [20] (which lives at the operating system Realm Specific IP (RSIP) [18] (which lives at the operating system
level). Transport addresses are typically obtained by binding to level). Transport addresses are typically obtained by binding to
an interface. an interface.
m/c line: The media and connection lines in the SDP, which together m/c line: The media and connection lines in the SDP, which together
hold the transport address used for the receipt of media. hold the transport address used for the receipt of media.
Derived Transport Address: A derived transport address is a transport Derived Transport Address: A derived transport address is a transport
address which is derived from a local transport address. The address which is obtained from a local transport address. The
derived transport address is related to the associated local derived transport address is related to the associated local
transport address in that packets sent to the derived transport transport address in that packets sent to the derived transport
address are received on the socket bound to its associated local address are received on the socket bound to its associated local
transport address. Derived addresses are obtained using protocols transport address. Derived addresses are obtained using protocols
like STUN, and more generally, any UNSAF protocol [22]. like STUN, and more generally, any UNSAF protocol [20].
Reflexive Transport Address: As defined in [13], a transport address Reflexive Transport Address: As defined in [12], a derived transport
learned by a client which identifies that client as seen by address learned by a client which identifies that client as seen
another host on an IP network, typically a STUN server. When by another host on an IP network, typically a STUN server. When
there is an intervening NAT between the client and the other host, there is an intervening NAT between the client and the other host,
the reflexive transport address represents the binding allocated the reflexive transport address represents the binding allocated
to the client on the public side of the NAT. Reflexive transport to the client on the public side of the NAT. Reflexive transport
addresses are learned from the XOR-MAPPED-ADDRESS attribute in addresses are learned from the XOR-MAPPED-ADDRESS attribute in
STUN Binding Responses and Allocate Responses [14], and are a type STUN Binding Responses and Allocate Responses [13].
of derived transport address.
Server Reflexive Transport Address: A server reflexive transport Server Reflexive Transport Address: A server reflexive transport
address is a reflexive address that is reflected off of a server, address is a reflexive address that is reflected off of a server,
distinct from the peer, whose address is configured or learned by distinct from the peer, whose address is configured or learned by
the client prior to an offer/answer exchange. the client prior to an offer/answer exchange.
Peer Reflexive Transport Address: A peer reflexive transport address Peer Reflexive Transport Address: A peer reflexive transport address
is a reflexive address that is reflected off of the peer. Peer is a reflexive address that is reflected off of the peer. Peer
reflexive transport addresses are learned by connectivity checks. reflexive transport addresses are learned by connectivity checks.
Relayed Transport Address: A transport address that terminates on a Relayed Transport Address: A derived transport address that
server, and is forwarded towards the client. The STUN Allocate terminates on a server, and is forwarded towards the client. The
Request can be used to obtain a relayed transport address, for STUN Allocate Request, defined as part of the STUN relay usage
[13] can be used to obtain a relayed transport address, for
example. example.
Associated Local Transport Address: When a peer sends a packet to a Associated Local Transport Address: When a peer sends a packet to a
transport address, the associated local transport address is the transport address, the associated local transport address is the
local transport address at which those packets will actually local transport address at which those packets will actually
arrive. For a local transport address, its associated local arrive. For a local transport address, its associated local
transport address is the same as the local transport address transport address is the same as the local transport address
itself. For reflexive and relayed transport addresses, however, itself. For reflexive and relayed transport addresses, however,
they are not the same. The associated local transport address is they are not the same. The associated local transport address is
the one from which the reflexive or relayed transport was derived. the one from which the reflexive or relayed transport was derived.
Candidate: A sequence of transport addresses that form an atomic set Candidate: A sequence of transport addresses that form an atomic set
for usage with a particular media session. Here, atomic means for usage with a particular media session. Here, atomic means
that all of the transport addresses in the candidate need to work that all of the transport addresses in the candidate need to work
before the candidate will be used for actual media transport. In before the candidate will be used for actual media transport. In
the case of RTP, there can be one or more transport addresses per the case of RTP, there can be one or more transport addresses per
candidate. In the most common case, there are two - one for RTP, candidate. In the most common case, there are two - one for RTP,
and another for RTCP. If the agent doesn't use RTCP, there would and another for RTCP. If the agent doesn't use RTCP, there would
be just one. If Generic Forward Error Correction (FEC) [18] is in be just one. If Generic Forward Error Correction (FEC) [16] is in
use, there may be more than two. The transport addresses that use, there may be more than two. The transport addresses that
compose a candidate are all of the same type - local, server compose a candidate are all of the same type - local, server
reflexive, peer reflexive or relayed. reflexive, peer reflexive or relayed.
Local Candidate: A candidate whose transport addresses are local Local Candidate: A candidate whose transport addresses are local
transport addresses. transport addresses.
Server Reflexive Candidate: A candidate whose transport addresses are Server Reflexive Candidate: A candidate whose transport addresses are
server reflexive transport addresses. server reflexive transport addresses.
Peer Reflexive Candidate: A candidate whose transport addresses are Peer Reflexive Candidate: A candidate whose transport addresses are
peer reflexive transport addresses. peer reflexive transport addresses.
Relayed Candidate: A candidate whose transport addresses are relayed Relayed Candidate: A candidate whose transport addresses are relayed
transport addresses. transport addresses.
Generating Candidate: The candidate from which a peer reflexive Generating Candidate: The candidate from which a peer reflexive
candidate is derived. candidate is derived.
Active Candidate: The candidate that is in use for exchange of media. Operating Candidate: The candidate that is in use for exchange of
This is the one that an agent places in the m/c line of an offer media. This is the one that an agent places in the m/c line of an
or answer. offer or answer.
Candidate ID: An identifier for a candidate. Candidate ID: An identifier for a candidate.
Component: When a media stream, and as a consequence, its candidate, Component: When a media stream, and as a consequence, its candidate,
require several IP addresses and ports to work atomically, each of require several IP addresses and ports to work atomically, each of
the constituent IP addresses and ports represents a component of the constituent IP addresses and ports represents a component of
that media stream. For example, RTP-based media streams typically that media stream. For example, RTP-based media streams typically
have two components - one for RTP, and one for RTCP. have two components - one for RTP, and one for RTCP.
Component ID: An integer, starting with one within each candidate and Component ID: An integer, starting with one within each candidate and
skipping to change at page 8, line 7 skipping to change at page 18, line 32
received on a local transport address, the matching transport received on a local transport address, the matching transport
address pair is the transport address pair whose connectivity is address pair is the transport address pair whose connectivity is
being checked by that Binding Request. being checked by that Binding Request.
Candidate Pair Priority Ordering: An ordering of candidate pairs Candidate Pair Priority Ordering: An ordering of candidate pairs
based on a combination of the qvalues of each candidate and the based on a combination of the qvalues of each candidate and the
candidate IDs of each candidate. candidate IDs of each candidate.
Candidate Pair Check Ordering: An ordering of candidate pairs that is Candidate Pair Check Ordering: An ordering of candidate pairs that is
similar to the candidate pair priority ordering, except that the similar to the candidate pair priority ordering, except that the
active candidate appears at the top of the list, regardless of its operating candidate appears at the top of the list, regardless of
priority. its priority.
Transport Address Pair Check Ordering: An ordering of transport Transport Address Pair Check Ordering: An ordering of transport
address pairs that determines the sequence of connectivity checks address pairs that determines the sequence of connectivity checks
performed for the pairs. performed for the pairs.
Transport Address Pair Count: The number of transport address pairs Transport Address Pair Count: The number of transport address pairs
in a candidate pair. This is equal to the minimum of the number in a candidate pair. This is equal to the minimum of the number
of transport addresses in the native candidate and the number of of transport addresses in the native candidate and the number of
transport addresses in the remote candidate. transport addresses in the remote candidate.
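
As a small worked example of the transport address pair count, the sketch below applies the minimum rule from the definition. The function name is hypothetical.

```python
def transport_address_pair_count(native_addresses: int,
                                 remote_addresses: int) -> int:
    """Number of transport address pairs in a candidate pair."""
    return min(native_addresses, remote_addresses)


# Native candidate has RTP and RTCP (2); remote candidate has RTP only (1).
count = transport_address_pair_count(2, 1)
print(count)
```

In this case only one transport address pair exists, so only one pair needs to be checked for the candidate pair.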
3. Overview of ICE
ICE makes the fundamental assumption that clients exist in a network
of segmented connectivity. This segmentation is the result of a
number of addressing realms in which a client can simultaneously be
connected. We use "realms" here in the broadest sense. A realm is
defined purely by connectivity. Two clients are in the same realm
if, when they exchange the addresses each has in that realm, they are
able to send packets to each other. This includes IPv6 and IPv4
realms, which actually use different address spaces, in addition to
private networks connected to the public Internet through NAT.
The key assumption in ICE is that a client cannot know, a priori,
which address realms it shares with any peer it may wish to
communicate with. Therefore, in order to communicate, it has to try
connecting to addresses in all of the realms.
Agent A STUN Servers Agent B
|(1) Gather Addresses | |
|-------------------->| |
|(2) Offer | |
|------------------------------------------>|
| |(3) Gather Addresses |
| |<--------------------|
|(4) Answer | |
|<------------------------------------------|
|(5) STUN Check | |
|<------------------------------------------|
|(6) STUN Check | |
|------------------------------------------>|
|(7) Media | |
|<------------------------------------------|
|(8) Media | |
|------------------------------------------>|
|(9) Offer | |
|------------------------------------------>|
|(10) Answer | |
|<------------------------------------------|
Figure 1
The basic flow of operation for ICE is shown in Figure 1. Before the
offerer establishes a session, it obtains local transport addresses
from its operating system on as many interfaces as it has access to.
These interfaces can include IPv4 and IPv6 interfaces, in addition to
Virtual Private Network (VPN) interfaces or ones associated with
RSIP. It then obtains transport addresses for the media from each
interface. Though ICE can support any type of transport protocol,
this specification only defines mechanisms for UDP. In addition, the
agent obtains server reflexive and relayed transport addresses.
These are usually obtained through a single STUN Allocate request,
which provides both. These requests are paced at a fixed rate in
order to limit network load and avoid NAT overload. The local,
server reflexive and relayed transport addresses are formed into
candidates, each of which represents a possible set of transport
addresses that might be viable for a media stream.
Each candidate is listed in a set of a=candidate attributes in the
offer. Each candidate is given a priority. Priority is a matter of
local policy, but typically, lowest priority would be given to
relayed transport addresses. Each candidate is also assigned a
distinct ID, called a candidate ID.
The agent will choose one of its candidates as its active candidate
for inclusion in the connection and media lines in the offer. Media
can be sent to this candidate immediately following its validation.
Media can also be sent to a candidate that is not active but has been
validated. Media is not sent without validation in order to avoid
denial-of-service attacks. In particular, without ICE, an offerer
can send an offer to another agent, and list the IP address and port
of a target in the offer. If the agent is an automaton that answers a
call automatically, it will do so and then proceed to send media to
the target. This provides substantial packet amplification. ICE
fixes this by requiring that an agent never send media packets unless
it has sent a STUN message towards the target of the RTP packets, and
received a reply from that target (see Section 7.13).
The offer is then sent to the answerer. This specification does not
address the issue of how the signaling messages themselves traverse
NAT. It is assumed that signaling protocol specific mechanisms are
used for that purpose. The answerer follows a similar process as the
offerer followed; it obtains addresses from local interfaces, obtains
derived transport addresses from those, and then groups them into
candidates for inclusion in a=candidate attributes in the answer. It
picks one candidate as its active candidate and places it into the
m/c line in the answer.
Once the offer/answer exchange has completed, both agents pair up the
candidates, and then determine an ordered set of transport address
pairs. This ordering is based primarily on the priority of the
candidates, with the exception of the active candidate, whose
addresses are at the top of the list. Both agents start at the top
of this list, beginning a connectivity check for that transport
address pair. At a fixed interval, checks for the next transport
address on the list begin. This results in a pacing of the
connectivity checks. These connectivity checks are performed through
peer-to-peer STUN requests, sent from one agent to the other. In
addition to pacing the checks out at regular intervals, the offerer
will generate a connectivity check for a transport address pair when
it receives one from its peer. As soon as the active candidate has
been verified by the STUN checks, media can begin to flow. Once a
higher priority candidate has been verified by the offerer, it ceases
additional connectivity checks, begins using that candidate for
media, and sends an updated offer which promotes this higher priority
candidate to the m/c-line. That candidate is also listed in
a=candidate attributes, resulting in periodic STUN keepalives through
the duration of the media session.
If an agent receives a STUN connectivity check with a new source IP
address and port, or a response to such a check with a new reflexive
transport address (obtained from the XOR-MAPPED-ADDRESS attribute),
this new address might be a viable candidate for the receipt of
media. This happens when there is a NAT with an address dependent or
address and port dependent mapping property [37] between the agents.
In such a case, the agents algorithmically construct a new candidate.
Like other candidates, connectivity checks begin for it, and if they
succeed, its transport addresses can be used for receipt of media by
promoting it to the m/c-line.
The gathering of addresses and connectivity checks take time. As a
consequence, in order to have minimal impact on the call setup time
or post-pickup delay for SIP, these offer/answer exchanges and checks
happen while the call is ringing.
4. Sending the Initial Offer 4. Sending the Initial Offer
When an agent wishes to begin a session by sending an initial offer, When an agent wishes to begin a session by sending an initial offer,
it starts by gathering transport addresses, as described in it starts by gathering transport addresses, as described in
Section 7.1. This will produce a set of candidates, including local Section 7.1. This will produce a set of candidates, including local
ones, server reflexive ones, and relayed ones. ones, server reflexive ones, and relayed ones.
This process of gathering candidates can actually happen at any time This process of gathering candidates can actually happen at any time
before sending the initial offer. An agent can pre-gather transport before sending the initial offer. An agent can pre-gather transport
addresses, using a user interface cue (such as picking up the phone, addresses, using a user interface cue (such as picking up the phone,
or entry into an address book) as a hint that communications is or entry into an address book) as a hint that communications is
imminent. Doing so eliminates any additional perceivable call setup imminent. Doing so eliminates any additional perceivable call setup
delays due to address gathering. delays due to address gathering.
When it comes time to offer communications, the agent determines a When it comes time to offer communications, the agent determines a
priority for each candidate and identifies the active candidate that priority for each candidate and identifies the operating candidate
will be used for receipt of media, as described in Section 7.2. that will be used for receipt of media, as described in Section 7.2.
The next step is to construct the offer message. For each media The next step is to construct the offer message. For each media
stream, it places its candidates into a=candidate attributes in the stream, it places its candidates into a=candidate attributes in the
offer and puts its active candidate into the m/c line. The process offer and puts its operating candidate into the m/c line. The
for doing this is described in Section 7.3. The offer is then sent. process for doing this is described in Section 7.3. The offer is
then sent.
5. Receipt of the Offer and Generation of the Answer 5. Receipt of the Offer and Generation of the Answer
Upon receipt of the offer message, the agent checks if the offer Upon receipt of the offer message, the agent checks if the offer
contains any a=candidate attributes. If the offer does, the offerer contains any a=candidate attributes. If the offer does, the offerer
supports ICE. In that case, it starts gathering candidates, as supports ICE. In that case, it starts gathering candidates, as
described in Section 7.1, and prioritizes them as described in described in Section 7.1, and prioritizes them as described in
Section 7.2. This processing is done immediately on receipt of the Section 7.2. This processing is done immediately on receipt of the
offer, to prepare for the case where the user should accept the call, offer, to prepare for the case where the user should accept the call,
or early media needs to be generated. By gathering candidates (and or early media needs to be generated. By gathering candidates (and
performing connectivity checks) while the user is being alerted to performing connectivity checks) while the user is being alerted to
the request for communications, session establishment delays are the request for communications, session establishment delays are
reduced. reduced.
The agent then constructs its answer, encoding its candidates into The agent then constructs its answer, encoding its candidates into
a=candidate attributes and including the active one in the m/c-line, a=candidate attributes and including the operating one in the m/c-
as described in Section 7.3. The agent then forms candidate pairs as line, as described in Section 7.3. The agent then forms candidate
described in Section 7.4. These are ordered as described in pairs as described in Section 7.4. These are ordered as described in
Section 7.5. The agent then begins connectivity checks, as described Section 7.5. The agent then begins connectivity checks, as described
in Section 7.6. It follows the logic in Section 7.10 on receipt of in Section 7.6. It follows the logic in Section 7.10 on receipt of
Binding Requests and responses to learn new candidates from the Binding Requests and responses to learn new candidates from the
checks themselves. checks themselves.
Transmission of media is performed according to the procedures in Transmission of media is performed according to the procedures in
Section 7.13. Section 7.13.
6. Processing the Answer 6. Processing the Answer
skipping to change at page 12, line 42 skipping to change at page 20, line 24
and responses to learn new candidates from the checks themselves. and responses to learn new candidates from the checks themselves.
Transmission of media is performed according to the procedures in Transmission of media is performed according to the procedures in
Section 7.13. Section 7.13.
7. Common Procedures 7. Common Procedures
This section discusses procedures that are common between offerer and This section discusses procedures that are common between offerer and
answerer. answerer.
7.1 Gathering Candidates 7.1. Gathering Candidates
An agent gathers candidates when it believes that communications is An agent gathers candidates when it believes that communications is
imminent. For offerers, this occurs before sending an offer imminent. For offerers, this occurs before sending an offer
(Section 4). For answerers, it occurs before sending an answer (Section 4). For answerers, it occurs before sending an answer
(Section 5). (Section 5).
Each candidate has one or more components, each of which is Each candidate has one or more components, each of which is
associated with a sequence number, starting at 1 for the first associated with a sequence number, starting at 1 for the first
component of each candidate, and incrementing by 1 for each component of each candidate, and incrementing by 1 for each
additional component within that candidate. These components additional component within that candidate. These components
represent a set of transport addresses for which connectivity must be represent a set of transport addresses for which connectivity must be
validated. For a particular media stream, all of the candidates validated. For a particular media stream, all of the candidates
SHOULD have the same number of components. The number of components SHOULD have the same number of components. The number of components
that are needed are a function of the type of media stream. All of that are needed are a function of the type of media stream. All of
the components in a candidate MUST be of the same type - server the components in a candidate MUST be of the same type - server
reflexive, relayed, or local, and obtained from the same server in reflexive, relayed, or local, and obtained from the same server in
the case of server reflexive or relayed candidates. For local the case of server reflexive or relayed candidates. For local
candidates, each component MUST be obtained from the same interface. candidates, each component MUST be obtained from the same interface.
For server reflexive and relayed candidates, each component MUST be
derived from a component with the same component ID, all of which
come from a single local candidate.
For traditional RTP-based media streams, it is RECOMMENDED that there For traditional RTP-based media streams, it is RECOMMENDED that there
be two components per candidate - one for RTP and one for RTCP. The be two components per candidate - one for RTP and one for RTCP. The
component with the component ID of 1 MUST be RTP, and the one with component with the component ID of 1 MUST be RTP, and the one with
component ID of 2 MUST be RTCP. If an agent doesn't implement RTCP, component ID of 2 MUST be RTCP. If an agent doesn't implement RTCP,
it SHOULD have a single component for the RTP stream (which will have it SHOULD have a single component for the RTP stream (which will have
a component ID of 1 by definition). Each component of a candidate a component ID of 1 by definition). Each component of a candidate
has a single transport address. has a single transport address.
The first step is to gather local candidates. Local candidates are The first step is to gather local candidates. Local candidates are
obtained by binding to ephemeral ports on an interface (physical or obtained by binding to ports (typically ephemeral) on an interface
virtual, including VPN interfaces) on the host. The process for (physical or virtual, including VPN interfaces) on the host. The
gathering local candidates depends on the transport protocol. process for gathering local candidates depends on the transport
Procedures are specified here for UDP. Extensions to ICE that define protocol. Procedures are specified here for UDP. Extensions to ICE
procedures for other transport protocols MUST specify how local that define procedures for other transport protocols MUST specify how
transport addresses are gathered. local transport addresses are gathered.
For each UDP media stream the agent wishes to use, the agent SHOULD For each UDP media stream the agent wishes to use, the agent SHOULD
obtain a set of candidates (one for each interface) by binding to N obtain a set of candidates (one for each interface) by binding to N
ephemeral UDP ports on each interface, where N is the number of UDP ports on each interface, where N is the number of components
components needed for the candidate. For RTP, N is typically two. needed for the candidate. For RTP, N is typically two. If a host
If a host has K local interfaces, this will result in K candidates has K local interfaces, this will result in K candidates for each UDP
for each UDP stream, requiring K*N local transport addresses. stream, requiring K*N local transport addresses.
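The gathering step above can be sketched roughly as follows. This is a minimal illustration, not the draft's procedure: the function names are hypothetical, IPv4 is assumed, and the interface addresses are assumed to be known already. Binding to port 0 asks the operating system for an ephemeral port, which is the typical case described above.

```python
import socket


def gather_local_candidate(interface_ip, n_components):
    """Bind N UDP sockets on one interface (hypothetical helper).

    Returns the open sockets and a dict mapping component ID to the
    (address, port) the socket was bound to.
    """
    sockets = []
    for _ in range(n_components):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind((interface_ip, 0))  # port 0: let the OS pick an ephemeral port
        sockets.append(s)
    # Component IDs start at 1; for RTP streams, 1 is RTP and 2 is RTCP.
    components = {i + 1: s.getsockname() for i, s in enumerate(sockets)}
    return sockets, components


def gather_local_candidates(interface_ips, n_components=2):
    """One candidate per interface: K interfaces yield K*N transport addresses."""
    return [gather_local_candidate(ip, n_components) for ip in interface_ips]
```

With K interfaces and N components this opens K*N sockets, which the caller is responsible for closing once the candidates are no longer needed.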
Once the agent has obtained local candidates, it obtains candidates Once the agent has obtained local candidates, it obtains candidates
with derived transport addresses. The process for gathering derived with derived transport addresses. The process for gathering derived
candidates depends on the transport protocol. Procedures are candidates depends on the transport protocol. Procedures are
specified here for UDP. Extensions to ICE that define procedures for specified here for UDP. Extensions to ICE that define procedures for
other transport protocols MUST specify how derived transport other transport protocols MUST specify how derived transport
addresses are gathered. addresses are gathered.
Agents which serve end users directly, such as softphones, Agents which serve end users directly, such as softphones,
hardphones, terminal adapters and so on, MUST implement the STUN hardphones, terminal adapters and so on, MUST implement the STUN
Binding Discovery usage and SHOULD use it to obtain server reflexive Binding Discovery usage and SHOULD use it to obtain server reflexive
candidates. These devices SHOULD implement the STUN Relay usage, and candidates. These devices SHOULD implement the STUN Relay usage, and
SHOULD use its Allocate request to obtain both server reflexive and SHOULD use its Allocate request to obtain both server reflexive and
relayed candidates. They MAY implement and MAY use other protocols relayed candidates. They MAY implement and MAY use other protocols
that provide server reflexive or relayed transport addresses, such as that provide derived transport addresses, such as TEREDO [29].
TEREDO [33].
The requirement to use the relay Usage is at SHOULD strength to allow The requirement to use the relay Usage is at SHOULD strength to allow
for provider variation. If it is not to be used, it is RECOMMENDED for provider variation. If it is not to be used, it is RECOMMENDED
that it be implemented and just disabled through configuration, so that it be implemented and just disabled through configuration, so
that it can be re-enabled through configuration if conditions change in that it can be re-enabled through configuration if conditions change in
the future. the future.
Agents which represent network servers under the control of a service Agents which represent network servers under the control of a service
provider, such as gateways to the telephone network, media servers, provider, such as gateways to the telephone network, media servers,
or conferencing servers that are targeted at deployment only in or conferencing servers that are targeted at deployment only in
networks with public IP addresses MAY use the STUN Binding Discovery networks with public IP addresses MAY use the STUN Binding Discovery
usage and relay usage, or other similar protocols to obtain usage and relay usage, or other similar protocols to obtain
candidates. candidates.
Why would these types of endpoints even bother to implement ICE? Why would these types of endpoints even bother to implement ICE?
The answer is that such an implementation greatly facilitates NAT The answer is that such an implementation greatly facilitates NAT
traversal for clients that connect to it. The ability to process traversal for clients that connect to it. Consider a PC softphone
STUN connectivity checks allows for clients to obtain peer behind a NAT whose mapping policy is address and port dependent.
reflexive transport addresses that can be used by the network The softphone initiates a call through a gateway that implements
server to reach them without a relay, even through NATs with ICE. The gateway doesn't obtain any server reflexive or relayed
restrictive mapping and filtering policies. Furthermore, transport addresses, but it implements ICE, and consequently, is
implementation of the STUN connectivity checks allows for NAT prepared to receive STUN connectivity checks on its local
bindings along the way to be kept open. ICE also provides transport addresses. The softphone will send a STUN connectivity
numerous security properties that are independent of NAT to check to that local transport address, causing the NAT to
traversal, and would benefit any multimedia endpoint. See allocate a new binding for the softphone. The connectivity check
Section 13 for a discussion on these benefits. will inform the softphone of this address, allowing it to be used
by the gateway as a peer reflexive remote candidate. This allows
direct media transmission between the gateway and softphone,
without the need for relays. Furthermore, implementation of the
STUN connectivity checks allows for NAT bindings along the way to
be kept open. ICE also provides numerous security properties that
are independent of NAT traversal, and would benefit any multimedia
endpoint. See Section 13 for a discussion on these benefits.
Obtaining derived candidates requires transmission of packets which Obtaining derived candidates requires transmission of packets which
have the effect of creating bindings on NAT devices between the have the effect of creating bindings on NAT devices between the
client and the STUN servers. Experience has shown that many NAT client and the STUN servers. Experience has shown that many NAT
devices have upper limits on the rate at which they will create new devices have upper limits on the rate at which they will create new
bindings. Furthermore, transmission of these packets on the network bindings. Furthermore, transmission of these packets on the network
makes use of bandwidth and needs to be rate limited by the agent. As makes use of bandwidth and needs to be rate limited by the agent. As
a consequence, a client SHOULD pace its STUN transactions, such that a consequence, a client SHOULD pace its STUN transactions, such that
the start of each new transaction occurs at least Ta seconds after the start of each new transaction occurs at least Ta seconds after
the start of the previous transaction. The value of Ta SHOULD be the start of the previous transaction. The value of Ta SHOULD be
skipping to change at page 15, line 10 skipping to change at page 22, line 52
transport address with a single transaction. It is possible that transport address with a single transaction. It is possible that
some STUN servers will only support the Relay usage or only the some STUN servers will only support the Relay usage or only the
Binding Discovery usage, in which case a client might be configured Binding Discovery usage, in which case a client might be configured
with different servers depending on the usage. with different servers depending on the usage.
To obtain both server reflexive and relayed candidates using the STUN To obtain both server reflexive and relayed candidates using the STUN
Relay Usage, the client takes a local UDP candidate, and for each Relay Usage, the client takes a local UDP candidate, and for each
configured STUN server, produces both candidates. It is anticipated configured STUN server, produces both candidates. It is anticipated
that clients may have a multiplicity of STUN servers configured or that clients may have a multiplicity of STUN servers configured or
discovered in network environments where there are multiple layers of discovered in network environments where there are multiple layers of
NAT, and that layering is known to the provider of the client. To NAT, and where that layering is known to the provider of the client.
obtain these candidates, for each configured STUN server, the client
initiates an Allocate Request transaction using the procedures of To obtain these candidates, for each configured STUN server, the
Section 8.1.2 of [14] from each transport address of a particular client initiates an Allocate Request transaction using the procedures
of Section 8.1.2 of [13] from each transport address of a particular
local candidate. The Allocate Response will provide the client with local candidate. The Allocate Response will provide the client with
its server reflexive transport address (obtained from the XOR-MAPPED- its server reflexive transport address (obtained from the XOR-MAPPED-
ADDRESS attribute) and its relayed transport address in the RELAY- ADDRESS attribute) and its relayed transport address in the RELAY-
ADDRESS attribute. Once the Allocate requests have given a client a ADDRESS attribute. Indeed, these two transport addresses are related
relayed transport address for all transport addresses in a relayed to each other. The relay will forward packets received on the
candidate, there is no reason for a client to obtain further relayed relayed transport address towards that server reflexive transport
candidates through the same STUN server. Thus, if there are other address. As such, the server reflexive transport address is said to
local candidates from which the client has not yet obtained relayed be the associated server reflexive transport address for that relayed
address. Once the Allocate requests have given a client a relayed
transport address for all transport addresses in a relayed candidate,
there is no reason for a client to obtain further relayed candidates
through the same STUN server. Thus, if there are other local
candidates from which the client has not yet obtained relayed
transport address, the client SHOULD NOT bother to obtain them. transport address, the client SHOULD NOT bother to obtain them.
Instead, it SHOULD use the STUN Binding Discovery usage and obtain Instead, it SHOULD use the STUN Binding Discovery usage and obtain
just server reflexive addresses from that STUN server. The order in just server reflexive addresses from that STUN server. The order in
which local candidates are tried against the STUN server to obtain which local candidates are tried against the STUN server to obtain
relayed candidates is a matter of local policy. relayed candidates is a matter of local policy.
To obtain server reflexive candidates using the STUN Binding To obtain server reflexive candidates using the STUN Binding
Discovery usage, the client takes a local UDP candidate, and for each Discovery usage, the client takes a local UDP candidate, and for each
configured STUN server, produces a server reflexive candidate. To configured STUN server, produces a server reflexive candidate. To
produce the server reflexive candidate from the local candidate, it produce the server reflexive candidate from the local candidate, it
follows the procedures of Section 12.2 of [13] for each local follows the procedures of Section 12.2 of [12] for each local
transport address in the local candidate. The Binding Response will transport address in the local candidate. The Binding Response will
provide the client with its server reflexive transport address. If provide the client with its server reflexive transport address. If
the client had K local candidates, this will produce S*K server the client had K local candidates, this will produce S*K server
reflexive candidates, where S is the number of STUN servers. reflexive candidates, where S is the number of STUN servers.
Since a client will pace its STUN transactions (both Binding and Since a client will pace its STUN transactions (both Binding and
Allocate requests) at a total rate of one new transaction every Ta Allocate requests) at a total rate of one new transaction every Ta
seconds, it will take a certain amount of time to complete the seconds, it will take a certain amount of time to complete the
address gathering phase. It is RECOMMENDED that implementations have address gathering phase. It is RECOMMENDED that implementations have
a configurable upper bound on the total amount of time allotted to a configurable upper bound on the total amount of time allotted to
skipping to change at page 16, line 8 skipping to change at page 24, line 7
of STUN servers and local interfaces) might exceed this value, of STUN servers and local interfaces) might exceed this value,
clients SHOULD prioritize their local candidates and STUN servers, clients SHOULD prioritize their local candidates and STUN servers,
performing transactions from the highest priority local candidates to performing transactions from the highest priority local candidates to
the highest priority STUN servers first. A STUN server would the highest priority STUN servers first. A STUN server would
typically be higher priority if it supports the STUN Relay Usage, typically be higher priority if it supports the STUN Relay Usage,
since such a server provides two transport addresses with one since such a server provides two transport addresses with one
transaction. transaction.
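One possible way to order and pace the transactions is sketched below. It rests on several assumptions: the names are hypothetical, each entry stands for a single transport address, priorities are plain numbers chosen by local policy, and the pacing interval `ta_ms` and the bound `max_gather_ms` are caller-supplied values in milliseconds. The interleaving order (all servers for the best local candidate first) is only one reading of the text above.

```python
from itertools import product


def schedule_transactions(local_candidates, stun_servers, ta_ms, max_gather_ms):
    """Plan start times for STUN transactions (hypothetical helper).

    local_candidates and stun_servers are lists of (name, priority).
    Returns a list of (start_ms, local_name, server_name), one new
    transaction every ta_ms, dropping any that would start at or after
    the configured upper bound.
    """
    locals_sorted = sorted(local_candidates, key=lambda c: c[1], reverse=True)
    servers_sorted = sorted(stun_servers, key=lambda s: s[1], reverse=True)
    schedule = []
    for i, (local, server) in enumerate(product(locals_sorted, servers_sorted)):
        start_ms = i * ta_ms
        if start_ms >= max_gather_ms:
            break  # gathering budget exhausted; lower priorities are skipped
        schedule.append((start_ms, local[0], server[0]))
    return schedule
```

A server that supports the Relay usage could simply be given a higher number in `stun_servers`, so that it is tried before servers offering only Binding Discovery.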
Once the allocations are complete, any redundant candidates are Once the allocations are complete, any redundant candidates are
discarded. Candidate A is redundant with candidate B if the discarded. Candidate A is redundant with candidate B if the
transport addresses for each component of each component match, and transport addresses of each component match, and each component of
each component of their associated local candidates match. For their associated local candidates match. For example, consider a set
example, consider a set of candidates with a single component. One of candidates with a single component. One candidate is a local
candidate is a local candidate, and its one component has a transport candidate, and its one component has a transport address of 10.0.1.1:
address of 10.0.1.1:4458. A reflexive transport address is derived 4458. A reflexive transport address is derived from this local
from this local transport address, producing a 10.0.1.1:4458. These transport address, producing a 10.0.1.1:4458. These two candidates
two candidates are identical, and also have identical associated are identical, and also have identical associated local transport
local transport addresses, so they are redundant. addresses, so they are redundant.
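The redundancy rule can be illustrated with a short sketch. The `Candidate` type and its field names are assumptions made for this example. Two candidates are treated as redundant when both their per-component transport addresses and the addresses of their associated local candidates match. Keeping the first one seen is an arbitrary choice, since the text does not say which of the two to discard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    kind: str               # e.g. "local" or "srflx" (illustrative labels)
    addresses: tuple        # (ip, port) per component, ordered by component ID
    local_addresses: tuple  # addresses of the associated local candidate


def discard_redundant(candidates):
    """Drop candidates that duplicate an earlier one (hypothetical helper)."""
    seen = set()
    kept = []
    for cand in candidates:
        key = (cand.addresses, cand.local_addresses)
        if key not in seen:
            seen.add(key)
            kept.append(cand)
    return kept
```

For the example above, a local candidate at 10.0.1.1:4458 and a reflexive candidate with the same address derived from it share the same key, so only one survives. A candidate with the same address but a different associated local candidate has a different key and is kept.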
+----------+ +----------+
| STUN Srvr| | STUN Srvr|
+----------+ +----------+
| |
| |
----- -----
// \\ // \\
| | | |
| B:net10 | | B:net10 |
skipping to change at page 17, line 40 skipping to change at page 25, line 4
----- -----
| |
| |
|192.168.1.1 ----- |192.168.1.1 -----
+----------+ // \\ +----------+ +----------+ // \\ +----------+
| | | | | | | | | | | |
| Offerer |---------| C:net10 |---------| Answerer | | Offerer |---------| C:net10 |---------| Answerer |
| |10.0.1.1 | | 10.0.1.2 | | | |10.0.1.1 | | 10.0.1.2 | |
+----------+ \\ // +----------+ +----------+ \\ // +----------+
----- -----
Figure 6
Figure 2 Consider the more complicated case of Figure 6. In this case, the
Consider the more complicated case of Figure 2. In this case, the
offerer is multi-homed. It has one interface, 10.0.1.1, on network offerer is multi-homed. It has one interface, 10.0.1.1, on network
C, which is a net 10 private network. The Answerer is on this same C, which is a net 10 private network. The Answerer is on this same
network. The offerer is also connected to network A, which is network. The offerer is also connected to network A, which is
192.168/16. The offerer has an interface of 192.168.1.1 on this 192.168/16. The offerer has an interface of 192.168.1.1 on this
network. There is a NAT on this network, natting into network B, network. There is a NAT on this network, natting into network B,
which is another net10 private network, but not connected to network which is another net10 private network, but not connected to network
C. There is a STUN server on network B. C. There is a STUN server on network B.
The offerer obtains a local transport address on its interface on The offerer obtains a local transport address on its interface on
network C (10.0.1.1:2498) and a local transport address on its network C (10.0.1.1:2498) and a local transport address on its
interface on network A (192.168.1.1:3344). It performs a STUN query interface on network A (192.168.1.1:3344). It performs a STUN query
to its configured STUN server from 192.168.1.1:3344. This query to its configured STUN server from 192.168.1.1:3344. This query
passes through the NAT, which happens to assign the binding 10.0.1.1: passes through the NAT, which happens to assign the binding 10.0.1.1:
2498. The STUN server reflects this in the STUN Binding Response. 2498. The STUN server reflects this in the STUN Binding Response.
Now, the offerer has obtained a candidate with a transport address it Now, the offerer has obtained a candidate with a transport address it
already has (10.0.1.1:2498), but from a new interface. It therefore already has (10.0.1.1:2498), but from a new interface. It therefore
keeps it. When it performs its connectivity checks, the offerer will keeps it. When it performs its connectivity checks, the offerer will
end up sending packets from both interfaces, and those sent from its end up sending packets from both interfaces, and those sent from its
interface on network C will succeed. interface on network C will succeed.
7.2 Prioritizing the Candidates and Choosing an Active One 7.2. Prioritizing the Candidates and Choosing an Operating One
The prioritization process takes the set of candidates and associates The prioritization process takes the set of candidates for a
each with a priority. This priority reflects the desire that the particular media stream and associates each with a priority. This
agent has to receive media at that candidate, and is assigned as a priority reflects the desire that the agent has to receive media at
value from 0 to 1 (1 being most preferred). Priorities are ordinal, that candidate, and is assigned as a value from 0 to 1 (1 being most
preferred). Priorities are a property of a candidate, and thus
shared across all components of a candidate. Priorities are ordinal,
so that their significance is only meaningful relative to other so that their significance is only meaningful relative to other
candidates from that agent for a particular media stream. Candidates candidates from that agent for a particular media stream. Candidates
MAY have the same priority. However, it is RECOMMENDED that each MAY have the same priority. However, it is RECOMMENDED that each
candidate have a distinct priority. Doing so improves the efficiency candidate have a distinct priority. Doing so improves the efficiency
of ICE. of ICE.
This specification makes no normative statements on how the This specification makes no normative statements on how the
prioritization is done. However, some useful guidelines are prioritization is done. However, some useful guidelines are
suggested on how such a prioritization can be determined. suggested on how such a prioritization can be determined.
skipping to change at page 18, line 52 skipping to change at page 26, line 16
may increase the cost of providing service, since media will be may increase the cost of providing service, since media will be
routed in and right back out of an intermediary run by the provider. routed in and right back out of an intermediary run by the provider.
If these concerns are important, candidates with this property can be If these concerns are important, candidates with this property can be
listed with lower priority. listed with lower priority.
Another criterion for choosing one candidate over another is IP Another criterion for choosing one candidate over another is IP
address family. ICE works with both IPv4 and IPv6. It therefore address family. ICE works with both IPv4 and IPv6. It therefore
provides a transition mechanism that allows dual-stack hosts to provides a transition mechanism that allows dual-stack hosts to
prefer connectivity over IPv6, but to fall back to IPv4 in case the prefer connectivity over IPv6, but to fall back to IPv4 in case the
v6 networks are disconnected (due, for example, to a failure in a v6 networks are disconnected (due, for example, to a failure in a
6to4 relay) [25]. It can also help with hosts that have both a 6to4 relay) [23]. It can also help with hosts that have both a
native IPv6 address and a 6to4 address. In such a case, higher native IPv6 address and a 6to4 address. In such a case, higher
priority could be afforded to the native v6 address, followed by the priority could be afforded to the native v6 address, followed by the
6to4 address, followed by a native v4 address. This allows a site to 6to4 address, followed by a native v4 address. This allows a site to
obtain and begin using native v6 addresses immediately, yet still obtain and begin using native v6 addresses immediately, yet still
fall back to 6to4 addresses when communicating with agents in other fall back to 6to4 addresses when communicating with agents in other
sites that do not yet have native v6 connectivity. sites that do not yet have native v6 connectivity.
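The address-family guideline could be turned into priorities along the following lines. This is only a sketch of one local policy: the function names are hypothetical, 6to4 addresses are recognized by the 2002::/16 prefix, and the way priorities are spread over the range up to 1 is an arbitrary choice that merely keeps them distinct.

```python
import ipaddress

SIX_TO_FOUR = ipaddress.ip_network("2002::/16")  # 6to4 prefix


def family_rank(addr):
    """Lower rank means more preferred: native v6, then 6to4, then v4."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 6:
        return 1 if ip in SIX_TO_FOUR else 0
    return 2


def assign_priorities(addresses):
    """Map each address to a distinct priority in (0, 1], 1 most preferred."""
    ordered = sorted(addresses, key=family_rank)
    n = len(ordered)
    return {addr: (n - i) / n for i, addr in enumerate(ordered)}
```

Because priorities are ordinal, only the resulting order matters; the specific fractions carry no meaning of their own.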
Another criterion for choosing one candidate over another is security. Another criterion for choosing one candidate over another is security.
If a user is a telecommuter, and therefore connected to their If a user is a telecommuter, and therefore connected to their
corporate network and a local home network, they may prefer their corporate network and a local home network, they may prefer their
skipping to change at page 19, line 30 skipping to change at page 26, line 43
awareness. This is most useful for candidates that make use of awareness. This is most useful for candidates that make use of
relays. In those cases, if an agent has preconfigured or dynamically relays. In those cases, if an agent has preconfigured or dynamically
discovered knowledge of the topological proximity of the relays to discovered knowledge of the topological proximity of the relays to
itself, it can use that to select closer relays with higher priority. itself, it can use that to select closer relays with higher priority.
There may be transport-specific reasons for preferring one candidate There may be transport-specific reasons for preferring one candidate
over another. In such a case, specifications defining usage of ICE over another. In such a case, specifications defining usage of ICE
with other transport protocols SHOULD document such considerations. with other transport protocols SHOULD document such considerations.
Once the candidates have been prioritized, one may be selected as the Once the candidates have been prioritized, one may be selected as the
active one. This is the candidate that will be used for actual operating one. This is the candidate that will be used for actual
exchange of media if and when it is validated, until a higher priority exchange of media if and when it is validated, until a higher priority
candidate is validated. The active candidate will also be used to candidate is validated. The operating candidate will also be used to
receive media from ICE-unaware peers. As such, it is RECOMMENDED receive media from ICE-unaware peers. As such, it is RECOMMENDED
that one be chosen based on the likelihood of that candidate to work that one be chosen based on the likelihood of that candidate to work
with the peer that is being contacted. Unfortunately, it is with the peer that is being contacted. Unfortunately, it is
difficult to ascertain which candidate that might be. As an example, difficult to ascertain which candidate that might be. As an example,
consider a user within an enterprise. To reach non-ICE capable consider a user within an enterprise. To reach non-ICE capable
agents within the enterprise, a local candidate has to be used, since agents within the enterprise, a local candidate has to be used, since
the enterprise policies may prevent communication between elements the enterprise policies may prevent communication between elements
using a relay on the public network. However, when communicating to using a relay on the public network. However, when communicating to
peers outside of the enterprise, a relayed candidate from a peers outside of the enterprise, a relayed candidate from a
publicly accessible STUN server is needed. publicly accessible STUN server is needed.
Indeed, the difficulty in picking just one address that will work is Indeed, the difficulty in picking just one address that will work is
the whole problem that motivated the development of this the whole problem that motivated the development of this
specification in the first place. As such, it is RECOMMENDED that specification in the first place. As such, it is RECOMMENDED that
the active candidate be a relayed candidate from a STUN server the operating candidate be a relayed candidate from a STUN server
providing public IP addresses in response to an Allocate request. providing public IP addresses in response to an Allocate request.
Furthermore, ICE is only truly effective when it is supported on both Furthermore, ICE is only truly effective when it is supported on both
sides of the session. It is therefore most prudent to deploy it to sides of the session. It is therefore most prudent to deploy it to
close-knit communities as a whole, rather than piecemeal. In the close-knit communities as a whole, rather than piecemeal. In the
example above, this would mean that ICE would ideally be deployed example above, this would mean that ICE would ideally be deployed
completely within the enterprise, rather than just to parts of it. completely within the enterprise, rather than just to parts of it.
An additional consideration for selection of the active candidate is An additional consideration for selection of the operating candidate
the switching of media stream destinations between the initial offer is the switching of media stream destinations between the initial
and the subsequent offer. If the active candidate pair in the offer and the subsequent offer. The operating candidate pair in the
initial offer is being validated, media will flow to that pair once initial offer is validated first, and if that validation succeeds,
it is validated. When the ICE checks complete and yield a higher media will immediately begin to flow between the pair. When the ICE
priority candidate pair, media will begin to flow to it (there will checks complete and yield a higher priority candidate pair, media
also be an updated offer/answer exchange that changes the active will begin to flow to it (there will also be an updated offer/answer
candidate). This will result in a change in the destination of the exchange that changes the operating candidate). This will result in
media packets. This may also cause a different path for the media a change in the destination of the media packets. This may also
packets. That path might have different delay and jitter cause a different path for the media packets. That path might have
characteristics. As a consequence, the jitter buffers may see a different delay and jitter characteristics. As a consequence, the
glitch, causing possible media artifacts. If these issues are a jitter buffers may see a glitch, causing possible media artifacts.
concern, the initial offer MAY omit an active candidate. In such a If these issues are a concern, the initial offer MAY omit an
case, an updated offer will need to be sent immediately when operating candidate. This is done by including an m/c-line with an
communicating with an ICE-unaware agent, setting an active candidate. a=inactive attribute. In such a case, an updated offer will need to
be sent immediately when communicating with an ICE-unaware agent,
setting an operating candidate.
There may be transport-specific reasons for selection of an active There may be transport-specific reasons for selection of an operating
candidate. In such a case, specifications defining usage of ICE with candidate. In such a case, specifications defining usage of ICE with
other transport protocols SHOULD document such considerations. other transport protocols SHOULD document such considerations.
7.3 Encoding Candidates into SDP 7.3. Encoding Candidates into SDP
For each candidate for a media stream, the agent includes a series of For each candidate for a media stream, the agent includes a series of
a=candidate attributes as media-level attributes, one for each a=candidate attributes as media-level attributes, one for each
component in the candidate. Each candidate has a unique identifier, component in the candidate. Each candidate has a unique identifier,
called the candidate-id. The candidate-id MUST be chosen randomly called the candidate ID. The candidate ID MUST be chosen randomly
and contain at least 24 bits of randomness (this does not mean that and contain at least 24 bits of randomness. This means that a
the candidate-id is 24 bits long; just that it has at least 24 bits candidate ID must be at least 4 characters long, since each character
of randomness). It is chosen only when the candidate is placed into in the base64 alphabet used for candidate IDs contains at most 6 bits
the SDP for the first time; subsequent offers or answers within the of randomness. A candidate ID MAY be longer than 4 characters, and
same session containing that same candidate MUST use the same different candidate IDs MAY have different lengths. It is chosen
candidate-id used previously. 24 bits is sufficient because the only when the candidate is placed into the SDP for the first time;
candidate-id is not providing security (the much more random password subsequent offers or answers within the same session containing that
is). It is needed only to prevent a possible simultaneous selection same candidate MUST use the same candidate ID used previously. 24
by two agents within a private network for the useful lifetime of the bits is sufficient because the candidate ID is not providing security
software or hardware. (the much more random password is). Its sole purpose is to make it
highly unlikely that both the offerer and answerer select the same
value for a candidate for the same media stream. Different values
for the candidate ID are required to break ties in the procedure that
is used to order the candidate pairs.
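A candidate ID meeting the requirement in the newer text might be generated as sketched here. The function name is hypothetical, and the standard base64 alphabet is assumed; the exact character set allowed for candidate IDs is defined by the grammar in Section 12, which should be checked before relying on this. Three random bytes give 24 bits, which encode to exactly four base64 characters.

```python
import base64
import secrets


def new_candidate_id(random_bytes=3):
    """Return a random candidate ID with at least 24 bits of randomness."""
    if random_bytes < 3:
        raise ValueError("at least 3 bytes (24 bits) of randomness required")
    encoded = base64.b64encode(secrets.token_bytes(random_bytes))
    # Padding characters carry no randomness, so they are stripped.
    return encoded.decode("ascii").rstrip("=")
```

The ID would be generated once, when the candidate is first placed into the SDP, and then stored with the candidate so that later offers and answers reuse it.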
Each component of the candidate has an identifier, called the Each component of the candidate has an identifier, called the
component-id. The component-id is a sequence number. For each component ID. The component ID is a sequence number. For each
candidate, it starts at one, and increments by one for each candidate, it starts at one, and increments by one for each
component. As discussed below, ICE will perform connectivity checks component. As discussed below, ICE will perform connectivity checks
such that, between a pair of candidates, checks only occur between such that, between a pair of candidates, checks only occur between
transport addresses with the same component-id. As a consequence, if transport addresses with the same component ID. As a consequence, if
one candidate has three components, and it is paired with a candidate one candidate has three components, and it is paired with a candidate
that has two, there will only be two transport address pairs and two that has two, there will only be two transport address pairs and two
connectivity checks. connectivity checks.
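The pairing rule can be shown with a small sketch, assuming each candidate is represented as a dict from component ID to transport address (a representation chosen only for this example). Only component IDs present in both candidates produce a transport address pair.

```python
def pair_components(candidate_a, candidate_b):
    """Pair transport addresses that share a component ID.

    Each candidate is a dict {component_id: (ip, port)}. Returns a list
    of (component_id, address_a, address_b), ordered by component ID.
    """
    shared = sorted(set(candidate_a) & set(candidate_b))
    return [(cid, candidate_a[cid], candidate_b[cid]) for cid in shared]
```

Pairing a three-component candidate with a two-component one yields two pairs, for component IDs 1 and 2, and therefore two connectivity checks.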
ICE will work without a standardized mapping between the components ICE will work without a standardized mapping between the components
of a media stream and the numerical value of the component-id. This of a media stream and the numerical value of the component ID. This
allows ICE to be used with media streams with multiple components allows ICE to be used with media streams with multiple components
without development of standards around such a mapping. However, a without development of standards around such a mapping. However, a
specific mapping has been defined in this specification for RTP - specific mapping has been defined in this specification for RTP -
component-id 1 corresponds to RTP, and component-id of 2 corresponds component ID 1 corresponds to RTP, and component ID of 2 corresponds
to RTCP. Like the candidate-id, the component-id is assigned at the to RTCP. Like the candidate ID, the component ID is assigned at the
time the candidate is first placed into the SDP; subsequent offers or time the candidate is first placed into the SDP; subsequent offers or
answers within the same session containing that same candidate MUST answers within the same session containing that same candidate MUST
use the same component-id used previously. use the same component ID used previously.
The transport, addr and port of the a=candidate attribute (all The transport, addr and port of the a=candidate attribute (all
defined in Section 12) are set to the transport protocol, unicast defined in Section 12) are set to the transport protocol, unicast
address and port of the transport address. A Fully Qualified Domain address and port of the transport address. A Fully Qualified Domain
Name (FQDN) for a host MAY be used in place of a unicast address. In Name (FQDN) for a host MAY be used in place of a unicast address. In
that case, when receiving an offer or answer containing an FQDN in an that case, when receiving an offer or answer containing an FQDN in an
a=candidate attribute, the FQDN is looked up in the DNS using an A or a=candidate attribute, the FQDN is looked up in the DNS using an A or
AAAA record, and the resulting IP address is used for the remainder AAAA record, and the resulting IP address is used for the remainder
of ICE processing. The qvalue is set to the priority of the of ICE processing. The qvalue is set to the priority of the
candidate, and MUST be the same for all components of the candidate. candidate, and MUST be the same for all components of the candidate.
The agent MUST include a type for the transport address by populating
the candidate-types production with the appropriate value - "local"
for local transport addresses, "srflx" for server reflexive
candidates, and "relay" for relayed candidates. If the transport
address is server reflexive, the agent MUST include the rel-addr and
rel-port productions containing the associated local transport
address for that server reflexive transport address. There are
environments in which the policy of an agent is such that it never
provides local transport addresses in its offers or answers, for fear
of revealing internal topology to external hosts. In such cases, an
agent MAY include a random transport address instead, as long as it
is the same transport address for all server reflexive candidates
derived from the same actual local transport address. This is
because the transport address in the rel-addr and rel-port productions
is used by the ICE algorithm itself for correlation purposes.
If the transport address is relayed, the agent SHOULD include the
rel-addr and rel-port productions, containing the associated server
reflexive transport address. When a relayed address is obtained from
a STUN relay, the associated server reflexive transport address is
the value from the XOR-MAPPED-ADDRESS that was returned in the same
STUN response which provided the relayed address to the agent.
Though not used directly with ICE, the rel-addr and rel-port
attributes are essential for proper functioning of QoS mechanisms,
such as those defined by 3GPP and PacketCable.
The rel-addr and rel-port production MUST NOT be present for a local
transport address.
All of the candidates for a media stream share a password that is All of the candidates for a media stream share a password that is
used for securing the STUN connectivity checks. Furthermore, the used for securing the STUN connectivity checks. The password will be
password for candidates for different media streams MAY be the same, used to process the MESSAGE-INTEGRITY attribute for STUN requests
or MAY be different. This password MUST be chosen randomly with 128 received by the agent. The password for candidates for different
bits of randomness (though it can be longer than 128 bits). This media streams MAY be the same, or MAY be different. This password
password is contained in the a=ice-pwd attribute, present as a MUST be chosen randomly with 128 bits of randomness (though it can be
session or media level attribute. New passwords MUST be selected for longer than 128 bits). This password is contained in the a=ice-pwd
each new session, even if the transport address from a previous attribute, present as a session or media level attribute. Since each
session was being recycled. character of the ice-pwd attribute can represent six bits of
randomness, the ice-pwd attribute will always be at least 22
characters long. New passwords MUST be selected for each new
session, even if the transport address from a previous session is
being recycled.
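The length arithmetic above can be sketched in Python as follows. This is only an illustration: the 64-character alphabet (letters, digits, "+" and "/") and the function name new_ice_pwd are assumptions made for the example, and the actual grammar of the ice-pwd attribute is defined elsewhere in the specification.

```python
import math
import secrets
import string

# Assumed 64-symbol alphabet, so each character carries 6 bits of randomness.
ALPHABET = string.ascii_letters + string.digits + "+/"
BITS_PER_CHAR = 6
MIN_BITS = 128

# 128 bits divided by 6 bits per character, rounded up.
MIN_CHARS = math.ceil(MIN_BITS / BITS_PER_CHAR)


def new_ice_pwd(length=MIN_CHARS):
    """Return a fresh random password (hypothetical helper).

    A new value is meant to be generated for every new session.
    """
    if length < MIN_CHARS:
        raise ValueError("password would carry fewer than 128 bits")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

The secrets module is used rather than random because the password protects the STUN connectivity checks and should come from a cryptographically strong source.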
The combination of candidate-id and component-id uniquely identifies The combination of candidate ID and component ID uniquely identifies
each transport address. As a consequence, each transport address has each transport address. As a consequence, each transport address has
a unique identifier, called the tid. The tid is formed by a unique identifier, called the transport address ID. The transport
concatenating the candidate-id with the component-id, separated by address ID is formed by concatenating the candidate ID with the
the colon (":"). The tid is not explicitly encoded in the SDP; it is component ID, separated by the colon (":"). The transport address ID
derived from the candidate-id and component-id, which are present in is not explicitly encoded in the SDP; it is derived from the
the SDP. The usage of the colon as a separator allows the candidate ID and component ID, which are present in the SDP. The
candidate-id and component-id to be extracted from the tid, since the usage of the colon as a separator allows the candidate ID and
colon is not a valid character for the candidate-id. component ID to be extracted from the transport address ID, since the
colon is not a valid character for the candidate ID.
The tid gets combined, through further concatenation, with the tid of The transport address ID gets combined, through further
a transport address from the remote candidate (separated again by concatenation, with the transport address ID of a transport address
another colon) to form the username that is placed in the STUN checks from the remote candidate (separated again by another colon) to form
between the peers. This allows the STUN message to uniquely identify the username that is placed in the STUN checks between the peers.
the pairing whose connectivity it is checking. The tid is needed as This allows the STUN message to uniquely identify the pairing whose
a unique identifier because the IP address within the candidate fails connectivity it is checking. The transport address ID is needed as a
unique identifier because the IP address within the candidate fails
to provide that uniqueness as a consequence of NAT. to provide that uniqueness as a consequence of NAT.
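The construction of these identifiers can be sketched in Python as follows. The function names are hypothetical, and the order in which the two transport address IDs appear in the username is an assumption of this sketch, since the passage above does not state which side comes first.

```python
def transport_address_id(candidate_id, component_id):
    """Join a candidate ID and a component ID with a colon."""
    if ":" in candidate_id:
        # The colon is reserved as the separator.
        raise ValueError("colon is not a valid character in a candidate ID")
    return f"{candidate_id}:{component_id}"


def split_transport_address_id(tid):
    """Recover (candidate ID, component ID) from a transport address ID."""
    candidate_id, component_id = tid.split(":")
    return candidate_id, int(component_id)


def join_for_username(first_tid, second_tid):
    """Concatenate two transport address IDs with another colon.

    Which agent's ID goes first is an assumption of this sketch.
    """
    return f"{first_tid}:{second_tid}"


def split_username(username):
    """Split a username back into its two transport address IDs."""
    cand_a, comp_a, cand_b, comp_b = username.split(":")
    return f"{cand_a}:{comp_a}", f"{cand_b}:{comp_b}"
```

Because neither candidate IDs nor component IDs contain a colon, a username always splits into exactly four fields, which is what makes the extraction unambiguous.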
Consider agents A, B, and C. A and B are within private enterprise 1, Consider agents A, B, and C. A and B are within private enterprise 1,
which is using 10.0.0.0/8. C is within private enterprise 2, which which is using 10.0.0.0/8. C is within private enterprise 2, which
is also using 10.0.0.0/8. As it turns out, B and C both have IP is also using 10.0.0.0/8. As it turns out, B and C both have IP
address 10.0.1.1. A sends an offer to C. C, in its answer, provides address 10.0.1.1. A sends an offer to C. C, in its answer, provides
A with its transport addresses. In this case, thats 10.0.1.1:8866 A with its transport addresses. In this case, that is 10.0.1.1:8866
and 8877. As it turns out, B is in a session at that same time, and and 10.0.1.1:8877. As it turns out, B is in a session at that same
is also using 10.0.1.1:8866 and 8877. This means that B is prepared time, and is also using 10.0.1.1:8866 and 10.0.1.1:8877. This means
to accept STUN messages on those ports, just as C is. A will send a that B is prepared to accept STUN messages on those ports, just as C
STUN request to 10.0.1.1:8866 and 8877. However, these do not go to is. A will send a STUN request to 10.0.1.1:8866 and another to
C as expected. Instead, they go to B. If B just replied to them, A 10.0.1.1:8877. However, these do not go to C as expected. Instead,
would believe it has connectivity to C, when in fact it has they go to B. If B just replied to them, A would believe it has
connectivity to a completely different user, B. To fix this, tid connectivity to C, when in fact it has connectivity to a completely
takes on the role of a unique identifier. C provides A with an different user, B. To fix this, the transport address ID takes on the
identifier for its transport address, and A provides one to C. A role of a unique identifier. C provides A with an identifier for its
concatenates these two identifiers (with a colon between) and uses transport address, and A provides one to C. A concatenates these two
the result as the username in its STUN query to 10.0.1.1:8866. This identifiers (with a colon between) and uses the result as the
STUN query arrives at B. However, the username is unknown to B, and username in its STUN query to 10.0.1.1:8866. This STUN query arrives
so the request is rejected. A treats the rejected STUN request as if at B. However, the username is unknown to B, and so the request is
there were no connectivity to C (which is actually true). Therefore, rejected. A treats the rejected STUN request as if there were no
the error is avoided. connectivity to C (which is actually true). Therefore, the error is
avoided.
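The rejection step in this example can be sketched in Python as follows. All identifiers below are hypothetical values chosen for the illustration, and the order of the two IDs inside the username is an assumption; the point is only that B has no record of the username that A and C agreed on.

```python
def answers_check(expected_usernames, username):
    """Return True if the receiver recognizes the username, False to reject."""
    return username in expected_usernames


# Hypothetical transport address IDs exchanged between A and C.
c_tid = "Cx7Pq:1"
a_tid = "Ak2Zr:1"
username = f"{c_tid}:{a_tid}"  # order assumed for this sketch

# C learned this username from its offer/answer exchange with A.
c_expected = {username}

# B is in an unrelated session and only knows its own usernames.
b_expected = {"Bm4Ts:1:Qw8Ln:1"}

delivered_to_c = answers_check(c_expected, username)
delivered_to_b = answers_check(b_expected, username)
```

Here delivered_to_b is False, so A sees a rejection and correctly concludes that it has no connectivity to C on that address.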
An unfortunate consequence of the non-uniqueness of IP addresses is An unfortunate consequence of the non-uniqueness of IP addresses is
that, in the above example, B might not even be an ICE agent. It that, in the above example, B might not even be an ICE agent. It
could be any host, and the port to which the STUN packet is directed could be any host, and the port to which the STUN packet is directed
could be any ephemeral port on that host. If there is an application could be any ephemeral port on that host. If there is an application
listening on this socket for packets, and it is not prepared to listening on this socket for packets, and it is not prepared to
handle malformed packets for whatever protocol is in use, the handle malformed packets for whatever protocol is in use, the
operation of that application could be affected. Fortunately, since operation of that application could be affected. Fortunately, since
the ports exchanged in SDP are ephemeral and ususally drawn from the the ports exchanged in SDP are ephemeral and usually drawn from the
dynamic or registered range, the odds are good that the port is not dynamic or registered range, the odds are good that the port is not
used to run a server on host B, but rather is the agent side of some used to run a server on host B, but rather is the agent side of some
protocol. This decreases the probability of hitting a port in-use, protocol. This decreases the probability of hitting a port in-use,
due to the transient nature of port usage in this range. However, due to the transient nature of port usage in this range. However,
the possibility of a problem does exist, and network deployers should the possibility of a problem does exist, and network deployers should
be prepared for it. Note that this is not a problem specific to ICE; be prepared for it. Note that this is not a problem specific to ICE;
stray packets can arrive at a port at any time for any type of stray packets can arrive at a port at any time for any type of
protocol, especially ones on the public Internet. As such, this protocol, especially ones on the public Internet. As such, this
requirement is just restating a general design guideline for Internet requirement is just restating a general design guideline for Internet
applications - be prepared for unknown packets on any port. applications - be prepared for unknown packets on any port.
The active candidate, if there is one, is placed into the m/c lines The operating candidate, if there is one, is placed into the m/c
of the SDP. For RTP streams, this is done by placing the RTP address lines of the SDP. For RTP streams, this is done by placing the RTP
and port into the c and m lines in the SDP respectively. If the address and port into the c and m lines in the SDP respectively. If
agent is utilizing RTCP, it MUST encode its address and port using the agent is utilizing RTCP, it MUST encode its address and port
the a=rtcp attribute as defined in RFC 3605 [1]. If RTCP is not in using the a=rtcp attribute as defined in RFC 3605 [1]. If RTCP is
use, the agent MUST signal that using b=RS:0 and b=RR:0 as defined in not in use, the agent MUST signal that using b=RS:0 and b=RR:0 as
RFC 3556 [6]. defined in RFC 3556 [6].
If there is no active candidate, the agent MUST include an a=inactive If there is no operating candidate, the agent MUST include an
attribute. The RTP address and port in the m/c-line is a=inactive attribute. The media address and port in the m/c-line is
inconsequential, since it won't be used. inconsequential, since it won't be used.
Encoding of candidates may involve transport protocol specific Encoding of candidates may involve transport protocol specific
considerations. There are none for UDP. However, extensions that considerations. There are none for UDP. However, extensions that
define usage of ICE with other transport protocols SHOULD specify any define usage of ICE with other transport protocols SHOULD specify any
special encoding considerations. special encoding considerations.
Once an offer or answer is sent, an agent MUST be prepared to Once an offer or answer is sent, an agent MUST be prepared to
receive both STUN and media packets on each candidate. As discussed receive both STUN and media packets on each candidate. As discussed
in Section 7.13, media packets can be sent to a candidate prior to in Section 7.13, media packets can be sent to a candidate prior to
its promotion to active. its promotion to operating.
7.4 Forming Candidate Pairs 7.4. Forming Candidate Pairs
Once the offer/answer exchange has completed, both agents will have a Once the offer/answer exchange has completed, both agents will have a
set of candidates for each media stream. Each agent forms a set of set of candidates for each media stream. Each agent forms a set of
candidate pairs for each media stream by combining each of its candidate pairs for each media stream by combining each of its
candidates with each of the candidates of its peer. Candidates can candidates with each of the candidates of its peer. Candidates can
be paired up only if their transport protocols are identical. If an be paired up only if their transport protocols are identical. Each
offer/answer exchange took place for a session comprised of an audio candidate has a number of components, each of which has a transport
and a video stream, and each agent had two candidates per media address. Within a candidate pair, the components themselves are
stream, there would be 8 candidate pairs, 4 for audio and 4 for paired up such that transport addresses with the same component ID
video. One agent can offer two candidates for a media stream, and are combined to form a transport address pair. If one candidate has
the answer can contain three candidates for the same media stream. more components than the other, those extra components will not be
In that case, there would be six candidate pairs. part of a transport address pair, won't be validated, and will
effectively be treated as if they weren't included in the candidate
pair in the first place.
Each candidate has a number of components, each of which has a For example, if an offer/answer exchange took place for a session
transport address. Within a candidate pair, the components comprised of an audio and a video stream, and each agent had two
themselves are paired up such that transport addresses with the same candidates per media stream, there would be 8 candidate pairs, 4 for
component ID are combined to form a transport address pair. audio and 4 for video. For each of the 8 candidate pairs, there
Returning to the previous example, for each of the 8 candidate pairs, would be two transport address pairs - one for RTP, and one for RTCP.
there would be two transport address pairs - one for RTP, and one for
RTCP. If one candidate has more components than the other, those
extra components will not be part of a transport address pair, won't
be validated, and will effectively be treated as if they weren't
included in the candidate pair in the first place.
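The pairing rules above can be sketched in Python as follows. The dictionary layout used for a candidate ('id', 'transport', 'components') and the function names are assumptions made for the illustration, not structures defined by this specification.

```python
from itertools import product


def form_candidate_pairs(native, remote):
    """Combine every native candidate with every remote candidate.

    Each candidate is a dict with 'id', 'transport' and 'components',
    where 'components' maps a component ID to a transport address.
    Only candidates with identical transport protocols are paired.
    """
    return [(n, r) for n, r in product(native, remote)
            if n["transport"] == r["transport"]]


def transport_address_pairs(native_cand, remote_cand):
    """Pair up transport addresses that share a component ID.

    Components present on only one side are left out.
    """
    shared = sorted(set(native_cand["components"]) &
                    set(remote_cand["components"]))
    return [(cid,
             native_cand["components"][cid],
             remote_cand["components"][cid])
            for cid in shared]
```

With two candidates on each side of a stream, form_candidate_pairs yields four pairs per stream, so an audio plus video session yields eight. A candidate with three components paired with one that has two yields two transport address pairs.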
The relationship between a candidate, candidate pair, transport The relationship between a candidate, candidate pair, transport
address, transport address pair and component are shown in Figure 3. address, transport address pair and component are shown in Figure 7.
This figure shows the relationships as seen by the agent that owns This figure shows the relationships as seen by the agent that owns
the candidate with candidate ID "L". This candidate has two the candidate with candidate ID "L". This candidate has two
components with transport addresses A and B respectively. This components with transport addresses A and B respectively. This
candidate is called the native candidate, since it is the one owned candidate is called the native candidate, since it is the one owned
by the agent in question. The candidate owned by its peer is called by the agent in question. The candidate owned by its peer is called
the remote candidate. As the figure shows, there is a single the remote candidate. As the figure shows, there is a single
candidate pair, and two components in each candidate. The native candidate pair, and two components in each candidate. The native
candidate has a candidate-id of "L", and the remote candidate has a candidate has a candidate ID of "L", and the remote candidate has a
candidate-id of "R". Since the two component-ids are 1 and 2, candidate ID of "R". Since the two component IDs are 1 and 2,
candidate "L" has two transport addresses with transport address IDs candidate "L" has two transport addresses with transport address IDs
of "L:1" and "L:2" respectively. Similarly, candidate "R" has two of "L:1" and "L:2" respectively. Similarly, candidate "R" has two
transport addresses with transport address IDs of "R:1" and "R:2" transport addresses with transport address IDs of "R:1" and "R:2"
respectively. respectively. Note that these candidate IDs are not actually legal
since they are not sufficiently random. However, we use "L" and "R"
to keep the figures readable.
Furthermore, each transport address pair is associated with an ID, Furthermore, each transport address pair is associated with an ID,
the transport address pair ID. This ID is equal to the concatenation the transport address pair ID. This ID is equal to the concatenation
of the tid of the native transport address with the tid of the remote of the transport address ID of the native transport address with the
transport address, separated by a colon. This means that the transport address ID of the remote transport address, separated by a
identifiers are seen differently for each agent. For the agent that colon. This means that the identifiers are seen differently for each
owns candidate "L", there are two transport address pairs. One agent. For the agent that owns candidate "L", there are two
contains transport address "L:1" and "R:1", with a transport address transport address pairs. One contains transport address "L:1" and
pair ID of "L:1:R:1". The other contains transport address "L:2" and "R:1", with a transport address pair ID of "L:1:R:1". The other
"R:2", with a transport address pair ID of "L:2:R:2". For the agent contains transport address "L:2" and "R:2", with a transport address
that owns candidate "R", the identifiers for these two transport pair ID of "L:2:R:2". For the agent that owns candidate "R", the
address pairs are reversed; it would be "R:1:L:1" for the first one identifiers for these two transport address pairs are reversed; it
and "R:2:L:2" for the second. would be "R:1:L:1" for the first one and "R:2:L:2" for the second.
............................................... ...............................................
. . . .
. . . .
. ............. ............. . . ............. ............. .
. . tid=L:1 . . tid=R:1 . . . . tid=L:1 . . tid=R:1 . .
. . -- . . -- . . component . . -- . . -- . . component
component. . | A|------------------------| C| . . id=1 component. . | A|------------------------| C| . . id=1
id=1 . . -- . Transport . -- . . id=1 . . -- . Transport . -- . .
. . . Address . . . . . . Address . . .
skipping to change at page 25, line 36 skipping to change at page 33, line 36
. ............. ............. . . ............. ............. .
. Native Remote . . Native Remote .
. Candidate Candidate . . Candidate Candidate .
. id=L id=R . . id=L id=R .
. . . .
. . . .
............................................... ...............................................
Candidate Pair Candidate Pair
Figure 3 Figure 7
If a candidate pair was created as a consequence of an offer If a candidate pair was created as a consequence of an offer
generated by an agent, then that agent is said to be the offerer of generated by an agent, then that agent is said to be the offerer of
that candidate pair and all of its transport address pairs. that candidate pair and all of its transport address pairs.
Similarly, the other agent is said to be the answerer of that Similarly, the other agent is said to be the answerer of that
candidate pair and all of its transport address pairs. As a candidate pair and all of its transport address pairs. As a
consequence, each agent has a particular role, either offerer or consequence, each agent has a particular role, either offerer or
answerer, for each transport address pair. This role is important; answerer, for each transport address pair. This role is important;
when a candidate pair is to be promoted to active, the offerer is the when a candidate pair is to be promoted to operating, the offerer is
one which performs the updated offer. the one which performs the updated offer.
7.5 Ordering the Candidate Pairs
For the same reason that the STUN transactions during address 7.5. Ordering the Candidate Pairs
gathering are paced at a rate of Ta transactions per second, so too
are the connectivity checks paced, also at a rate of Ta transactions
per second. However, in order to rapidly converge on a valid
candidate pair that is mutually desirable, the candidate pairs are
ordered, and the checks start with the candidate pair at the top of
the list. Rapid convergence of ICE depends on both the offerer and
answerer coming to the same conclusion on the ordering of candidate
pairs.
Recall that when each candidate is encoded into SDP, it contains a Recall that when each candidate is encoded into SDP, it contains a
qvalue between 1 and 0, with 1 being the highest priority. Peer qvalue between 1 and 0, with 1 being the highest priority. Peer
reflexive candidates, learned through the procedures described in reflexive candidates, learned through the procedures described in
Section 7.10 also have a priority between 0 and 1. For each media Section 7.10 also have a priority between 0 and 1. For each media
stream, the native candidates are ordered based on their qvalues, stream, the native candidates are ordered based on their qvalues,
with higher q-values coming first. Amongst candidates with the same with higher q-values coming first. Amongst candidates with the same
qvalue, they are ordered based on candidate ID, using reverse qvalue, they are ordered based on candidate ID, using reverse ASCII
lexicographic order, where C1 is placed before C2, if C2 precedes C1 sort order. For example, the candidate with candidate ID "lagDx"
lexicographically. Lexicographic order can be viewed as a numerical sorts before the candidate with ID "bad79", and both of those follow
ordering where each "digit" is actually a number in numerical base the candidate with ID "m8zz".
256, with the mapping of characters to numerical value being defined
by their ASCII encoding. For example, the candidate with candidate
ID agD is greater than the candidate with ID ad7, and both of those
are greater than the candidate with ID zz. Consequently, if these
three candidates had equal q-values, they would be ordered as agD,
ad7, zz - reverse of their lexicographic order.
The usage of a reverse lexicographic order is important; as discussed The usage of a reverse ASCII sort order is important; as discussed in
in Section 13, it allows peer-derived candidates to be preferred over Section 13, it allows peer-derived candidates to be preferred over
native ones. native ones.
The result of these ordering rules will be an ordered list of The result of these ordering rules will be an ordered list of
candidates. The first candidate in this list is given a sequence candidates. The first candidate in this list is given a sequence
number of 1, the next is given a sequence number of 2, and so on. number of 1, the next is given a sequence number of 2, and so on.
This same procedure is done for the remote candidates. The result is This same procedure is done for the remote candidates. The result is
that each candidate pair has two sequence numbers, one for the native that each candidate pair has two sequence numbers, one for the native
candidate, and one for the remote candidate. candidate, and one for the remote candidate.
First, all of the candidate pairs for whom the smaller of the two First, all of the candidate pairs for whom the smaller of the two
sequence numbers equals 1 are taken first. Then, all of those for sequence numbers equals 1 are taken first. Then, all of those for
whom the smaller of the two sequence numbers equals 2 are taken next, whom the smaller of the two sequence numbers equals 2 are taken next,
and so on. Amongst those pairs that share the same value for their and so on. Amongst those pairs that share the same value for their
smaller sequence number, they are ordered by the larger of their two smaller sequence number, they are ordered by the larger of their two
sequence numbers (smallest first). Amongst those pairs that share sequence numbers (smallest first). Amongst those pairs that share
the same value for their smaller sequence number and the same value the same value for their smaller sequence number and the same value
for their larger sequence number, the larger of the two candidate IDs for their larger sequence number, the larger of the two candidate IDs
in each pair are selected, and the pairs are lexicographically in each pair are selected, and the pairs are ordered in reverse ASCII
ordered in reverse by that candidate ID, largest first. order of the candidate ID, largest first.
The resulting ordering of candidate pairs is called the candidate
pair priority ordered list.
As an example, consider two agents, A and B. One offers two As an example, consider two agents, A and B. One offers two
candidates for a media stream with candidate IDs of "g9" and "88", candidates for a media stream with candidate IDs of "g9g9" and
with q-values of 1.0 and 0.8 respectively. The other answers with "8888", with q-values of 1.0 and 0.8 respectively. The other answers
three candidates with candidate IDs of "h8", "65" and "kl", with with three candidates with candidate IDs of "h8h8", "6565" and
q-values of 0.3, 0.2 and 0.1 respectively. The following table shows "klkl", with q-values of 0.3, 0.2 and 0.1 respectively. The
the rank ordering of the six candidate pairs. The column labeled following table shows the rank ordering of the six candidate pairs.
"Max SN" is the larger of the two sequence numbers in the candidate The column labeled "Max SN" is the larger of the two sequence numbers
pair, and "Min SN" is the minimum. The column labeled "Max Cand. in the candidate pair, and "Min SN" is the minimum. The column
ID" is the value of the larger of the two candidate IDs in the labeled "Max Cand. ID" is the value of the larger of the two
candidate pair. candidate IDs in the candidate pair.
Order A A A B B B Max Order A A A B B B Max
Cand. Cand. Cand. Cand. Cand. Cand. Max Min Cand. Cand. Cand. Cand. Cand. Cand. Cand. Max Min Cand.
ID q-value SN ID q-value SN SN SN ID ID q-value SN ID q-value SN SN SN ID
--------------------------------------------------------------------- ---------------------------------------------------------------------
1 g9 1.0 1 h8 0.3 1 1 1 h8 1 g9g9 1.0 1 h8h8 0.3 1 1 1 h8h8
2 88 0.8 2 h8 0.3 1 2 1 h8 2 8888 0.8 2 h8h8 0.3 1 2 1 h8h8
3 g9 1.0 1 65 0.2 2 2 1 g9 3 g9g9 1.0 1 6565 0.2 2 2 1 g9g9
4 g9 1.0 1 kl 0.1 3 3 1 kl 4 g9g9 1.0 1 klkl 0.1 3 3 1 klkl
5 88 0.8 2 65 0.2 2 2 2 88 5 8888 0.8 2 6565 0.2 2 2 2 8888
6 88 0.8 2 kl 0.1 3 3 2 kl 6 8888 0.8 2 klkl 0.1 3 3 2 klkl
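The ordering rules and the table above can be reproduced with the following Python sketch. The function names and the representation of a candidate as an (ID, q-value) tuple are assumptions made for the illustration. Python compares ASCII strings by character code, which matches the ASCII sort order described in the text.

```python
def sequence_numbers(candidates):
    """Map each candidate ID to its sequence number.

    candidates is a list of (candidate_id, qvalue) tuples. Higher
    q-values come first; ties are broken by reverse ASCII order of
    the candidate ID.
    """
    # Two stable sorts: tie-break key first, primary key second.
    ordered = sorted(candidates, key=lambda c: c[0], reverse=True)
    ordered.sort(key=lambda c: c[1], reverse=True)
    return {cid: sn for sn, (cid, _q) in enumerate(ordered, start=1)}


def priority_ordered_pairs(side_a, side_b):
    """Return candidate pairs as (id_a, id_b) in priority order."""
    sn_a = sequence_numbers(side_a)
    sn_b = sequence_numbers(side_b)
    pairs = [(a, b) for a in sn_a for b in sn_b]

    def sn_key(pair):
        first, second = sn_a[pair[0]], sn_b[pair[1]]
        return (min(first, second), max(first, second))

    # Last tie-break first: larger of the two candidate IDs, largest first.
    pairs.sort(key=lambda p: max(p), reverse=True)
    # Then smaller sequence number, then larger sequence number.
    pairs.sort(key=sn_key)
    return pairs


agent_a = [("g9g9", 1.0), ("8888", 0.8)]
agent_b = [("h8h8", 0.3), ("6565", 0.2), ("klkl", 0.1)]
ordered_pairs = priority_ordered_pairs(agent_a, agent_b)
```

For the example candidates, ordered_pairs lists the six pairs in the same order as the table rows 1 through 6.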
This ordering is then modified slightly by taking the candidate pair The candidate pair priority ordered list is then used to obtain an
corresponding to the active candidate, if there is one, and promoting ordered list of transport address pairs, on which the agent will, in
it to the top of the list. To find this candidate pair, the agent order, attempt to send STUN connectivity checks. This list, called
looks for candidate pairs whose native and remote transport addresses the transport address pair check ordered list, is very similar to the
match the native and remote transport addresses in the m/c-line. It candidate pair priority ordered list, but differs in two important
is possible that multiple candidates match; this happens in the case respects. Firstly, the candidate pairs matching the operating
where an agent obtained the same derived transport address from candidate pair (there can actually be more than one) get promoted to
different local transport addresses. In such a case, the agent the top of the list. This allows the operating candidate pair to be
should pick one of the matching candidates. validated first. Secondly, many of the checks would be redundant,
and a filtering algorithm is used to eliminate these redundant
checks.
Putting the active candidate at the top of the list allows it to be Ordering of candidates may involve transport protocol specific
tested first. As discussed below, media is not sent until the considerations. There are none for UDP. However, extensions that
corresponding candidate is verified, necessitating rapid verification define usage of ICE with other transport protocols SHOULD specify any
of the active candidate. This modified ordering is called the special ordering considerations.
candidate pair check ordering, since it reflects the order in which
connectivity checks will be done. If there was no active candidate, To form the transport address pair check ordered list, the candidate
the candidate pair check ordering and the candidate pair priority list is first modified by taking the candidate pairs corresponding to
ordering will be identical. the operating candidate pair, and promoting them to the top of the
list. A candidate pair matches the operating candidate pair when its
native and remote transport addresses match the native and remote
transport addresses in the m/c-line, respectively. In unusual
circumstances, there may be more than one such candidate pair. In
such a case, they should be promoted such that the higher priority
candidate pairs appear first. In addition, it is possible that none
of the candidate pairs match the operating candidate pair. In that
case, no candidate pairs are promoted.
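The promotion step can be sketched in Python as follows. The dictionary keys 'native_addr' and 'remote_addr', holding the (IP address, port) that would be compared against the m/c-line, are assumptions made for the illustration.

```python
def promote_operating(priority_list, operating_native, operating_remote):
    """Move pairs matching the operating candidate pair to the top.

    priority_list is the candidate pair priority ordered list. Matching
    pairs keep their relative order, so higher priority ones stay first.
    If nothing matches, the list is returned in its original order.
    """
    def matches(pair):
        return (pair["native_addr"] == operating_native and
                pair["remote_addr"] == operating_remote)

    promoted = [p for p in priority_list if matches(p)]
    rest = [p for p in priority_list if not matches(p)]
    return promoted + rest
```

Building two lists and concatenating them keeps the sketch free of in-place mutation, so the original priority ordered list remains available for later use.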
Within each candidate pair there will be a set of transport address Within each candidate pair there will be a set of transport address
pairs, one for each component ID. Those pairs are ordered by pairs, one for each component ID. Those pairs are ordered by
component ID. The result is an absolute ordering of all transport component ID. The result is an absolute ordering of all transport
address pairs for a media stream, sorted first by the order of their address pairs for a media stream, sorted first by the order of their
candidate pairs (with the exception of the active candidate), candidate pairs (with the exception of the operating candidate),
followed by the order of their component IDs. This ordering is followed by the order of their component IDs. This ordering is used
called the transport address pair check ordering. as the start of the transport address pair check ordering.
Ordering of candidates may involve transport protocol specific The next step is to remove redundant transport addresses. Starting
considerations. There are none for UDP. However, extensions that at the top of the list, the agent moves down from one transport
define usage of ICE with other transport protocols SHOULD specify any address pair to the next. If a transport address pair under
special ordering considerations. consideration has the same remote transport address as a previous
pair, based on transport address pair ID comparisons, and the native
transport address from that previous pair has the same origination
transport address as the one under consideration (based on IP address
and port comparison), the one under consideration is removed from the
list.
7.6 Performing the Connectivity Checks The origination transport address is the address that the agent would
send from in order to emit a packet with that native transport
address as a source transport address. For a local transport
address, the origination transport address is equal to that local
transport address. For a server reflexive transport address, the
origination transport address is equal to the local transport address
from which it was derived. For relayed addresses, packets are
emitted by explicitly sending them through the relay. Consequently,
the origination transport address is equal to the relayed address.
Connectivity checks are a STUN usage defined in [13]. They are After the agent has gone through the entire list, the result is the
transport address pair check ordered list.
The pairs that get removed are redundant since the agent would send a
STUN connectivity check using the same source and destination
addresses as a previous check. Consequently, the connectivity check
will provide no information to the remote agent except for the
transport address pair ID it is associated with. These turn out to be
unnecessary due to the STUN processing rules outlined below.
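A minimal sketch of the redundancy filter, under stated assumptions: the NativeAddress and AddressPair classes, the "kind" strings, and the use of (IP, port) tuples and string IDs are hypothetical names introduced for illustration. The origination rule follows the three cases described above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class NativeAddress:
    kind: str                           # "local", "server-reflexive" or "relayed"
    address: Tuple[str, int]            # (IP, port) of this transport address
    base: Optional[Tuple[str, int]] = None  # local address it was derived from


def origination(native):
    """Address the agent actually sends from for this native address."""
    if native.kind == "server-reflexive":
        return native.base      # sent from the local address it derives from
    return native.address       # local and relayed map to themselves


@dataclass(frozen=True)
class AddressPair:
    native: NativeAddress
    remote_id: str              # transport address ID of the remote side


def prune_redundant(ordered_pairs):
    """Drop pairs whose (remote, origination) was already seen higher up."""
    seen = set()
    kept = []
    for pair in ordered_pairs:
        key = (pair.remote_id, origination(pair.native))
        if key in seen:
            continue            # would repeat an earlier check on the wire
        seen.add(key)
        kept.append(pair)
    return kept
```

Tracking only the kept pairs is sufficient here, since any removed pair shares its key with a pair that was kept earlier in the list.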
7.6. Performing the Connectivity Checks
Connectivity checks are a STUN usage defined in [12]. They are
performed by sending peer-to-peer STUN Binding Requests. These performed by sending peer-to-peer STUN Binding Requests. These
checks result in a candidate progressing through a state machine that checks result in a transport address pair progressing through a state
captures the progress of connectivity checks. The specific state machine that captures the progress of the connectivity checks. The
machine and the procedures for the connectivity checks are specific specific state machine and the procedures for the connectivity checks
to the transport protocol. This specification defines rules for UDP. are specific to the transport protocol. This specification defines
Extensions to ICE that describe other transport protocols SHOULD rules for UDP. The state machine processing described in this
describe the state machine and the procedures for connectivity section MUST be followed by agents. Extensions to ICE that describe
checks. other transport protocols SHOULD describe the state machine and the
procedures for connectivity checks.
The set of states visited by the offerer and answerer are depicted The set of states for a transport address pair visited by the offerer
graphically in Figure 5 and answerer are depicted graphically in Figure 9. Note that this
state machine exists for all transport address pairs, including ones
pruned from the transport address pair check ordered list.
[Figure: state machine for a transport address pair. States: Waiting, Testing, Recv-Valid, Send-Valid, Valid, Invalid. Events: Selected (Timer Ta in the earlier revision), Match Req, Match Res (Get Req, Get Res in the earlier revision), Miss, Error, Timer Tr.]
Figure 5 Figure 9
The state machine has six states - waiting, testing, Recv-Valid, The state machine has six states - Waiting, Testing, Recv-Valid,
Send-Valid, Valid and Invalid. Initially, all transport address Send-Valid, Valid and Invalid. In the Waiting state, the agent is
pairs start in the waiting state. In this state, the agent waits for waiting to send or receive a connectivity check for the pair. In the
one of two events - a chance to send a Binding Request, or receipt of Testing state, the agent has sent a connectivity check and is
a Binding Request. awaiting a response. In the Recv-Valid state, the agent knows that
its peer can receive packets from it on this transport address pair.
In the Send-Valid state, the agent knows that its peer can send
packets to it. In the Valid state, the agent knows that its peer can
both send packets to it and receive packets from it.
Initially, all transport address pairs start in the Waiting state.
In this state, the agent waits for one of three events - a chance to
send a Binding Request, receipt of a Binding Request, or receipt of a
Binding Response.
Since there is an instance of the state machine for each transport Since there is an instance of the state machine for each transport
address pair, Binding Requests and responses need to be matched to address pair, Binding Requests and responses need to be matched to
the specific state machine for which they apply. This is done by the specific state machine for which they were meant to apply. As
computing the matching transport address pair for each Binding described below, the Binding Request may not be a match for the
Request. This is done by examining the USERNAME of the incoming transport address pair it was meant to validate. To find the
Binding Request. The USERNAME directly contains the transport transport address pair it was meant to validate, called the target
address pair ID. Requests that are sent by an agent as part of the transport address pair, the agent examines the USERNAME of the
processing described here encode the transport address pair in the incoming Binding Request. The USERNAME directly contains the
USERNAME. Binding Responses are matched to their requests using the transport address pair ID for the pair it was meant to validate.
STUN transaction ID, and then mapped to the transport address pair Binding Responses are matched to their requests using the STUN
from that. transaction ID, and then mapped to the transport address pair from
that.
Every Ta seconds, the agent starts a new connectivity check for a For each media stream, the agent starts a new connectivity check for
transport address pair. The check is started for the first transport a transport address pair every Tb*RND seconds. Tb SHOULD scale
address pair in the transport address pair check ordered list (which linearly with the number of media streams, so that the pace of
will be part of the active candidate) that is in the Waiting state. connectivity checks overall is invariant to the number of media
The state machine for this transport address pair is moved to the streams. Consequently, it is RECOMMENDED that Tb have a default
Testing state, and the agent sends a connectivity check using a STUN value of N*50ms, where N is the number of media streams. RND is a
Binding Request, as outlined in Section 7.7. Once a STUN random number chosen uniformly between 0.7 and 1.3, and it helps to
connectivity check begins, the processing of the check follows the avoid synchronization between the transmission of connectivity checks
rules for STUN. Specifically, retransmits of STUN requests are done for different media streams. On average, if there are N media
as specified in [13], and furthermore, if a transaction fails and streams, the checks across all media streams will be paced out at a
needs to be retried, that retry can happen rapidly, as described total of N/Tb checks per second. The check is started for the first
below. It doesn't "count" against the rate limit of 1/Ta checks per transport address pair in the transport address pair check ordered
second. In addition, the keepalives that are generated for a valid list that is in the Waiting state. The "Selected" event is passed to
pair do not count against the rate limit either. The rate limit the state machine for this transport address pair, causing it to be
applies strictly to the start of connectivity checks for a transport moved to the Testing state. The agent then sends a connectivity
address pair that has been newly signaled through an offer/answer check using a STUN Binding Request, as outlined in Section 7.7.
exchange.
In addition, if, while in the Waiting state, an agent receives a Once a STUN connectivity check begins, the processing of the check
Binding Request matching that transport address pair, and this follows the rules for STUN. Specifically, retransmits of STUN
Binding Request generates a successful response, the transport requests are done as specified in [12], and furthermore, if a
address pair moves into the Send-Valid state, and the agent sends a transaction fails and needs to be retried, that retry can happen
connectivity check of its own using a STUN Binding Request, as rapidly, as described below. It doesn't "count" against the average
outlined in Section 7.7. If the Binding Request didn't generate a rate limit of 1/Tb checks per second per media stream. In addition,
success response, there is no change in state or generation of a the keepalives that are generated for a valid pair do not count
Binding Request. against the rate limit either. The rate limit applies strictly to
the start of connectivity checks for a transport address pair that
has been newly signaled through an offer/answer exchange.
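The pacing rule can be illustrated with a short sketch. The function names and the base_ms parameter are assumptions for the example; the 50 ms default and the 0.7 to 1.3 range for RND come from the text above.

```python
import random


def per_stream_interval(num_media_streams, base_ms=50):
    """Tb in seconds: scales linearly with the number of media streams."""
    return num_media_streams * base_ms / 1000.0


def next_check_delay(num_media_streams, base_ms=50, rng=random):
    """Delay before one media stream starts its next check: Tb * RND."""
    rnd = rng.uniform(0.7, 1.3)   # jitter to avoid cross-stream synchronization
    return per_stream_interval(num_media_streams, base_ms) * rnd


def average_total_rate(num_media_streams, base_ms=50):
    """Average checks per second summed over all streams: N / Tb."""
    return num_media_streams / per_stream_interval(num_media_streams, base_ms)
```

With the recommended default, the total rate stays at about 20 new checks per second regardless of the number of media streams, since each stream individually slows down as streams are added.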
If, while in the Testing state, the agent receives a successful When an agent receives a Binding Request, which per the processing
response to its STUN request, the transport address pair moves into rules of Section 7.8 produces a successful response, the agent
the Recv-Valid state. In this state, the agent knows that packets examines the source transport address of the request. If the native
can flow in both directions. However, its peer agent doesn't yet transport address was relayed, this would be the source as seen by
know that; all it knows is that it has been able to receive a packet. the relay. For the STUN relay usage, that source transport address
Thus, in this state, the agent awaits receipt of the Binding Request will be present in the REMOTE-ADDRESS attribute of a STUN Data
sent by its peer, as the response to that request is what informs its Indication message, if the Binding Request was delivered through a
peer that packets can flow in both directions. Data Indication. If the Binding Request was not encapsulated in a
Data Indication, that source transport address is equal to the
current active destination for the STUN relay session.
If, while in the Testing state, the agent receives a Binding Request If the source transport address matches the remote transport address
matching that transport address pair, and this Binding Request of the target transport address pair, the Binding Request is
generates a successful response, the transport address pair moves considered to be a match for the target transport address pair.
into the Send-Valid state. In addition, the agent retransmits a Consequently, a Match Req event is passed to the state machine for
Binding Request for the transaction in progress. This helps speed up the target transport address pair. If the state machine was in the
bidirectional connectivity verification when one agent is behind a Waiting or Testing state, the state machine moves into the Send-Valid
symmetric NAT. If the Binding Request didn't generate a success state. If it was previously in the Waiting state, the agent sends a
response, there is no change in state or generation of a Binding connectivity check of its own for the target transport address pair,
Request. as outlined in Section 7.7. If it was in the Testing state, it
retransmits a Binding Request for the transaction in progress. This
retransmission is one that would not normally occur based on the
procedures in [12]. ICE "prods" the STUN transaction state machine
to send an extra retransmit, in addition to the one which is
scheduled to be sent next. This helps speed up bidirectional
connectivity verification when one agent is behind a NAT with an
address and port dependent filtering behavior [32].
If, while in the Send-Valid state, the agent receives a successful If the source transport address in the Binding Request was not a
response to its STUN request, the transport address pair moves to the match for the remote transport address, the Binding Request is
Valid state. In this state, the agent knows that packets can flow in considered to be a miss for the target transport address pair.
each direction. It also knows that its peer has sent it the STUN Consequently, a Miss event is passed to the state machine of the
Request whose response will demonstrate to the peer that packets can target transport address pair, and it immediately moves into the
flow in each direction. Invalid state. Typically, the source transport address won't match
when there was a NAT between the sender and receiver with an address
and port dependent mapping property, though there are other cases in
which this can happen.
If, while in the Recv-Valid state, the agent receives a STUN Binding Though it was a miss for the target transport address pair, the
Request from its peer that results in a successful response, the connectivity check may have been a match for a different transport
transport address pair moves into the Valid state. Receipt of a address pair. To determine this, the agent checks the source
request whose response was not a successful one does not result in a transport address of the Binding Request against all of the other
change in state. remote transport addresses of transport address pairs for the same
media stream that use the same transport protocol and share the same
native transport address (based on transport address ID comparison)
of the target. Of those that match (assuming at least one matches),
it refines the set further by selecting only those for whom the
origination transport address of the remote transport address matches
the origination transport address of the remote transport address in
the target transport address pair. The origination transport address
for a remote transport address is obtained from information signaled
in the SDP, and depends on the type. For a local transport address,
the origination address equals that local transport address. For a
server reflexive transport address, the origination address is
obtained from the related address information provided in the SDP.
For a relayed transport address, the origination transport address
equals that relayed transport address. For these three types, the
type is signaled in the SDP. For a peer derived transport address,
the origination address is the same as the origination address of the
generating transport address.
If there was a match (there can only be either one or zero matches),
this match is called the alternate. In many cases, the alternate
transport address pair will not be in the transport address pair
check ordered list; it will have been one of the ones pruned.
Indeed, this is why it was pruned - a check on the remaining
transport address pairs can serve to validate it. The state machine
for the alternate is passed the Match Req event. If it was in the
Waiting state, this causes it to move into the Send-Valid state, and
a connectivity check is generated for the alternate transport address
pair. It may have been in the Testing state, in which case it moves
into the Send-Valid state, and the agent retransmits the
Binding Request for the transaction in progress. If it was in
the Recv-Valid state, this causes it to move into the Valid state.
If no alternate could be found, it means that a new remote transport
address and corresponding origination transport address have been
discovered. In this case, the agent follows the procedures of
Section 7.10.1 to create a new transport address pair and state
machine for it.
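The search for an alternate after a missed Binding Request might be sketched as below. The Pair class and its field names are hypothetical, the list is assumed to hold only pairs of one media stream, and the remote origination address is assumed to have been computed from the SDP beforehand.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Pair:
    # Hypothetical fields for illustration only.
    native_id: str                      # transport address ID of native side
    protocol: str                       # e.g. "udp"
    remote_addr: Tuple[str, int]        # remote transport address (IP, port)
    remote_origination: Tuple[str, int] # origination address of the remote side


def find_alternate(stream_pairs, target, source_addr):
    """Return the alternate pair for a missed request, or None.

    The alternate shares the target's native address and protocol, has a
    remote address equal to the observed source, and has the same remote
    origination address as the target.
    """
    for pair in stream_pairs:
        if pair is target:
            continue
        if (pair.protocol == target.protocol
                and pair.native_id == target.native_id
                and pair.remote_addr == source_addr
                and pair.remote_origination == target.remote_origination):
            return pair
    return None   # caller creates a new pair per Section 7.10.1
```

The search covers every pair known for the stream, not only the check ordered list, because the alternate is often one of the pruned pairs.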
If the Binding Request didn't generate a success response, an Error
event is passed to the state machine of the target, causing it to
move into the Invalid state.
If the agent receives a successful response to its STUN request, it
examines the transport address in the XOR-MAPPED-ADDRESS
attribute of the response. This will be a peer reflexive transport
address. If the peer reflexive transport address matches (based on
IP address and port comparison) the native transport address of the
target transport address pair, a Match Res event is passed to the
state machine of the target. If the state machine was in the Testing
state, the state machine moves into the Recv-Valid state. If it was
in the Send-Valid state, it moves into the Valid state.
If, however, the transport addresses didn't match, a Miss event is
passed to the state machine of the target, and it immediately moves
into the Invalid state. The agent checks the peer reflexive
transport address against all of the other native transport addresses
for transport address pairs for the same media stream with the same
transport protocol and the same remote transport address (based on
comparison of transport address ID) as the target. Of those that
match (assuming at least one matches), it refines the set further by
selecting only those for whom the origination transport address of
the native transport address matches the origination address of the
native transport address in the target transport address pair. The
resulting transport address pair (there can be only zero or one) is
called the alternate. In many cases, the alternate transport address
pair will not be in the transport address pair check ordered list; it
will have been one of the ones pruned. The state machine for the
alternate is passed the Match Res event. If it was in the Waiting
state, this causes it to move into the Recv-Valid state. It may have
been in the Testing state, in which case it moves into the Recv-Valid
state. If it was in the Send-Valid state, this causes it
to move into the Valid state.
If no alternate could be found, the Binding Response will create a
new peer reflexive transport address, and the procedures of
Section 7.10.2 are followed to create a new transport address pair
and state machine for it.
In any state, if the STUN transaction results in an error, the state In any state, if the STUN transaction results in an error, the state
machine moves into the invalid state. A STUN transaction produces an machine moves into the Invalid state. A STUN transaction produces an
"error" based on the processing in Section 7.7, which indicates which "error" based on the processing in Section 7.7, which indicates which
STUN response codes constitute an error as far as ICE processing is STUN response codes constitute an error as far as ICE processing is
concerned. concerned.
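The transitions described in this section can be collected into a small table-driven sketch. The enum and function names are assumptions made for the example, side effects such as sending or retransmitting a Binding Request are omitted, and leaving the state unchanged for combinations the text does not mention is also an assumption.

```python
from enum import Enum


class State(Enum):
    WAITING = "Waiting"
    TESTING = "Testing"
    RECV_VALID = "Recv-Valid"
    SEND_VALID = "Send-Valid"
    VALID = "Valid"
    INVALID = "Invalid"


class Event(Enum):
    SELECTED = "Selected"
    MATCH_REQ = "Match Req"
    MATCH_RES = "Match Res"
    MISS = "Miss"
    ERROR = "Error"


TRANSITIONS = {
    (State.WAITING, Event.SELECTED): State.TESTING,
    (State.WAITING, Event.MATCH_REQ): State.SEND_VALID,
    (State.WAITING, Event.MATCH_RES): State.RECV_VALID,  # alternate pairs
    (State.TESTING, Event.MATCH_REQ): State.SEND_VALID,
    (State.TESTING, Event.MATCH_RES): State.RECV_VALID,
    (State.SEND_VALID, Event.MATCH_RES): State.VALID,
    (State.RECV_VALID, Event.MATCH_REQ): State.VALID,
}


def next_state(state, event):
    """Return the state of a transport address pair after an event."""
    if event in (Event.MISS, Event.ERROR):
        return State.INVALID          # Miss and Error invalidate from any state
    # Assumption: combinations not described in the text leave the state as is.
    return TRANSITIONS.get((state, event), state)
```

For instance, a pair that is selected, then receives a matching request, then a matching response walks through Testing and Send-Valid to Valid.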
If a transport address pair is in the Recv-Valid or Valid state, an If a transport address pair is in the Recv-Valid or Valid state, an
agent MUST generate a new STUN Binding Request transaction every Tr agent MUST generate a new STUN Binding Request transaction every Tr
seconds. This transaction ensures that NAT bindings for the seconds. This transaction ensures that NAT bindings for the
transport address pair remain open while the candidate is under transport address pair remain open while the candidate is under
consideration. The transaction is performed as outlined in consideration. The transaction is performed as outlined in
Section 7.7. These transactions can also be used to keep the NAT Section 7.7. These transactions can also be used to keep the NAT
bindings alive when the candidate is promoted to active, as described bindings alive when the candidate is promoted to operating, as
in Section 7.12. Tr SHOULD be configurable, and SHOULD default to 15 described in Section 7.12. Tr SHOULD be configurable, and SHOULD
seconds. If the transaction results in an error, the state machine default to 15 seconds. These STUN transactions are processed in the
moves to the invalid state. This happens in cases where the NAT same way as any other, and can result in new peer derived transport
bindings expire (e.g., due to binding timeouts or NAT failures). addresses, or can fail and cause the transport address pair to be
invalidated.
The candidate pair itself has a state, which is derived from the The candidate pair itself has a state, which is derived from the
states of its transport address pairs. If at least one of the states of its transport address pairs. If at least one of the
transport address pairs in a candidate pair is in the invalid state, transport address pairs in a candidate pair is in the invalid state,
the state of the candidate pair is considered to be invalid. If the the state of the candidate pair is considered to be invalid. If the
candidate pair enters this state, an agent SHOULD move the state candidate pair enters this state, an agent moves the state machines
machines for all of the other transport address pairs in this for all of the other transport address pairs in this candidate pair
candidate pair into the invalid state as well. This will ensure that into the invalid state as well. This will ensure that connectivity
connectivity checks never start for those transport address pairs. checks never start for those transport address pairs. Furthermore,
Furthermore, if checks are already in progress for one of those if checks are already in progress for one of those transport address
transport address pairs, the agent SHOULD cease them. pairs, the agent ceases them.
If all of the transport address pairs making up the candidate pair If all of the transport address pairs making up the candidate pair
are Valid, the candidate pair is considered valid. If all of the are Valid, the candidate pair is considered valid. If all of the
transport address pairs making up the candidate pair are either Valid transport address pairs making up the candidate pair are either Valid
or Recv-Valid, and at least one is Recv-Valid, the candidate pair is or Recv-Valid, and at least one is Recv-Valid, the candidate pair is
considered to be Recv-Valid. If all of the transport address pairs considered to be Recv-Valid. If all of the transport address pairs
making up the candidate pair are either Valid or Send-Valid, and at making up the candidate pair are either Valid or Send-Valid, and at
least one is Send-Valid, the candidate pair is considered to be Send- least one is Send-Valid, the candidate pair is considered to be Send-
Valid. If all of the transport address pairs in a candidate pair are Valid. If all of the transport address pairs in a candidate pair are
in the Waiting state, the candidate pair is in the waiting state. If in the Waiting state, the candidate pair is in the waiting state. If
skipping to change at page 32, line 20 skipping to change at page 42, line 49
in the Waiting or Testing states, and at least one is in the Testing in the Waiting or Testing states, and at least one is in the Testing
state, the state of the candidate pair is Testing. Otherwise, the state, the state of the candidate pair is Testing. Otherwise, the
state of the candidate pair is considered Indeterminate. state of the candidate pair is considered Indeterminate.
A candidate itself also has a state. If a candidate is present in at A candidate itself also has a state. If a candidate is present in at
least one valid candidate pair, that candidate is said to be valid. least one valid candidate pair, that candidate is said to be valid.
If all of the candidate pairs containing that candidate are invalid, If all of the candidate pairs containing that candidate are invalid,
the candidate itself is invalid. Otherwise, the candidate's state is the candidate itself is invalid. Otherwise, the candidate's state is
Indeterminate. Indeterminate.
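The derived states can be sketched as two small functions. State names are plain strings chosen for the example. The condition for the Testing state is partly elided in the text above; the sketch assumes it reads "all transport address pairs are Waiting or Testing, and at least one is Testing".

```python
def candidate_pair_state(tap_states):
    """Derive a candidate pair's state from its transport address pairs."""
    states = set(tap_states)
    if "Invalid" in states:
        return "Invalid"
    if states == {"Valid"}:
        return "Valid"
    if states <= {"Valid", "Recv-Valid"} and "Recv-Valid" in states:
        return "Recv-Valid"
    if states <= {"Valid", "Send-Valid"} and "Send-Valid" in states:
        return "Send-Valid"
    if states == {"Waiting"}:
        return "Waiting"
    # Assumed reading of the partly elided Testing rule.
    if states <= {"Waiting", "Testing"} and "Testing" in states:
        return "Testing"
    return "Indeterminate"


def candidate_state(pair_states):
    """Derive a candidate's state from the candidate pairs containing it."""
    pair_states = list(pair_states)
    if "Valid" in pair_states:
        return "Valid"
    if pair_states and all(s == "Invalid" for s in pair_states):
        return "Invalid"
    return "Indeterminate"
```

Note that a mix of Recv-Valid and Send-Valid transport address pairs falls through to Indeterminate, since neither of the single-direction rules covers it.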
7.7 Sending a Binding Request for Connectivity Checks 7.7. Sending a Binding Request for Connectivity Checks
An agent performs a connectivity check on a transport address pair by An agent performs a connectivity check on a transport address pair by
sending a STUN Binding Request from its native transport address, and sending a STUN Binding Request from its native transport address, and
sending it to the remote transport address. The meaning of "sending sending it to the remote transport address. Sending from its native
from its native transport address" depends on the type of transport transport address is done by sending it from its origination
protocol and the type of transport address (local, reflexive, or transport address. As mentioned above, the origination transport
relayed). This specification defines the meaning for UDP. address depends on the type of transport protocol and the type of
Specifications defining other transport protocols must define what transport address (local, reflexive, or relayed). This specification
this means for them. defines the meaning for UDP. Specifications defining other transport
protocols must define what this means for them.
For UDP-based local transport addresses, sending from the local For UDP-based local transport addresses, sending from the local
transport address has the meaning one would expect - the request is transport address has the meaning one would expect - the request is
sent such that the source IP address and port equal that of the local sent such that the source IP address and port equal that of the local
transport address. For reflexive ransport addresses, it is sent by transport address. For reflexive transport addresses, it is sent by
sending from the associated local transport address used to derive sending from the associated local transport address used to derive
that reflesive address. For relayed transport addresses, it is sent that reflexive address. For relayed transport addresses, it is sent
by using STUN mechanisms to send the request through the STUN relay by using STUN mechanisms to send the request through the STUN relay
(using the Send request). Sending the request through the STUN relay (using the Send request). Sending the request through the STUN relay
server neccesarily requires that the request be sent from the client, server necessarily requires that the request be sent from the client,
using the local transport address used to derive the relayed using the local transport address used to derive the relayed
transport address. transport address.
The Binding Request sent by the agent MUST contain the USERNAME The Binding Request sent by the agent MUST contain the USERNAME
attribute. This attribute MUST be set to the transport address pair attribute. This attribute MUST be set to the transport address pair
ID of the corresponding transport address pair as seen by its peer. ID of the corresponding transport address pair as seen by its peer.
Thus, for the first transport address pair in Figure 3, if the agent Thus, for the first transport address pair in Figure 7, if the agent
on the left sends the STUN Binding Request, the USERNAME will have on the left sends the STUN Binding Request, the USERNAME will have
the value R:1:L:1. If the agent on the right sends the STUN Binding the value R:1:L:1. If the agent on the right sends the STUN Binding
Request, the USERNAME will have the value L:1:R:1. To be clear, the Request, the USERNAME will have the value L:1:R:1. To be clear, the
USERNAME that is used is NOT the one seen locally, but rather the one USERNAME that is used is NOT the one seen locally, but rather the one
as seen by its peer. The request SHOULD contain the MESSAGE- as seen by its peer. The request SHOULD contain the MESSAGE-
INTEGRITY attribute, computed according to [13]. The key used as INTEGRITY attribute, computed according to [12]. The key used as
input to the HMAC is the password provided by the peer for this input to the HMAC is the password provided by the peer for this
remote transport address. This password will be identical for all remote transport address. This password will be identical for all
remote transport addresses for the same media stream. remote transport addresses for the same media stream.
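The USERNAME rule can be shown with a short sketch. It assumes, for illustration only, that a transport address pair ID consists of four colon-separated fields (native candidate ID, native component ID, remote candidate ID, remote component ID) and that candidate IDs contain no colons, as in the L:1:R:1 example above. The function name is hypothetical.

```python
def username_for_peer(local_pair_id):
    """Convert a locally seen pair ID into the ID as seen by the peer.

    The peer's native side is our remote side, so the two halves of the
    ID are swapped.
    """
    parts = local_pair_id.split(":")
    if len(parts) != 4:
        raise ValueError("expected native:comp:remote:comp, got %r"
                         % local_pair_id)
    native_cand, native_comp, remote_cand, remote_comp = parts
    return ":".join([remote_cand, remote_comp, native_cand, native_comp])
```

For the agent on the left, whose local view of the pair is L:1:R:1, this yields R:1:L:1, which matches the value given in the text.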
Note that all ICE implementations are required to be compliant to Note that all ICE implementations are required to be compliant to
[13], as opposed to the older [16]. Consequently, all connectivity [12], as opposed to the older [14]. Consequently, all connectivity
checks will contain the magic cookie in the STUN header, and cause checks will contain the magic cookie in the STUN header, and cause
the STUN server embedded in each ICE implementation to include XOR- the STUN server embedded in each ICE implementation to include XOR-
MAPPED-ADDRESS attributes in the response, rather than MAPPED- MAPPED-ADDRESS attributes in the response, rather than MAPPED-
ADDRESS. ADDRESS.
Once created, the STUN transaction is linked to the transport address
pair so that, when the response is received, the state machine on the
linked transport address pair can be updated.
The STUN transaction will generate either a timeout, or a response. The STUN transaction will generate either a timeout, or a response.
If the response is a 420, 500, or 401, the agent should try again as If the response is a 420, 500, or 401, the agent should try again as
described in [13] (as mentioned above, it need not wait Ta seconds to described in [12] (as mentioned above, it need not wait the roughly
try again). Either initially, or after such a retry, the STUN Tb seconds to try again). Either initially, or after such a retry,
transaction might produce a non-recoverable failure response (error the STUN transaction might produce a non-recoverable failure response
codes 400, 430, 431, or 600) or a failure result inapplicable to this or a failure result inapplicable to this usage of STUN and thus
usage of STUN and thus unrecoverable (432, 433). If this happens, an unrecoverable. If this happens, an error event is generated into the
error event is generated into the state machine, and the transport state machine, and the transport address pair enters the invalid
address pair enters the invalid state. state.
If the STUN transaction times out, the client SHOULD NOT retry. The If the STUN transaction times out, the client SHOULD NOT retry. The
only reason a retry might succeed is if there was severe packet loss only reason a retry might succeed is if there was severe packet loss
during the duration of the check, or the answer was significantly during the duration of the check, or the answer was significantly
delayed, also due to packet loss. However, STUN Binding Request delayed, also due to packet loss. However, STUN Binding Request
transactions run for 9.5 seconds, which is well beyond the typical transactions run for 9.5 seconds, which is well beyond the typical
tolerance for a session establishment. The retries come with a tolerance for a session establishment. The retries come with a
penalty of additional traffic, which can be used to launch DoS penalty of additional traffic, which can be used to launch DoS
attacks Section 13.4.2. The only reason to not follow the SHOULD NOT attacks (see Section 13.4.2). The only reason to not follow the
is if the agent has adjusted the STUN transaction timers to be more SHOULD NOT is if the agent has adjusted the STUN transaction timers
aggressive. to be more aggressive.
If the Binding Response is a 200, the agent SHOULD check for the If the Binding Response is a 200, the agent SHOULD check for the
MESSAGE-INTEGRITY attribute and verify it, as discussed in [13]. MESSAGE-INTEGRITY attribute and verify it, as discussed in [12].
Indeed, this check SHOULD be done for all responses. This will Indeed, this check SHOULD be done for all responses. This will
result in the response being discarded (eventually leading to a result in the response being discarded (eventually leading to a
timeout), if the integrity check fails. timeout), if the integrity check fails.
7.8 Receiving a Binding Request for Connectivity Checks 7.8. Receiving a Binding Request for Connectivity Checks
As a result of providing a list of candidates in its offer or answer, As a result of providing a list of candidates in its offer or answer,
an agent will receive STUN Binding Request messages. An agent MUST an agent will receive STUN Binding Request messages. An agent MUST
be prepared to receive STUN Binding Requests on each local transport be prepared to receive STUN Binding Requests on each local transport
address from the moment it sends an offer or answer that contains a address from the moment it sends an offer or answer that contains a
candidate with that local transport address. Similarly, it MUST be candidate with that local transport address. Similarly, it MUST be
prepared to receive STUN Binding Requests on a local transport prepared to receive STUN Binding Requests on a local transport
address the moment it sends an offer or answer that contains a address the moment it sends an offer or answer that contains a
reflexive or relayed candidate derived from a local candidate with derived candidate derived from that local transport address. It can
that local transport address. It can cease listening for STUN cease listening for STUN messages on that local transport address
messages on that local transport address after sending an updated after sending an updated offer or answer which does not include any
offer or answer which does not include any candidates with transport candidates with transport addresses that are equal to or derived from
addresses that are equal to or derived from that local transport that local transport address.
address.
As discussed in [13], since the username and password for STUN As discussed in [12], since the username and password for STUN
requests are exchanged through another mechanism - here, ICE - the requests are exchanged through another mechanism - here, ICE - the
Shared Secret Request mechanism is not needed and need not be Shared Secret Request mechanism is not needed and need not be
implemented by agents that provide the connectivity check usage. implemented by agents that provide the connectivity check usage.
One of the candidates may be in use as the active candidate, or may One of the candidates may be in use as the operating candidate, or
become promoted to the active candidate in the next offer/answer may become promoted to the operating candidate in the next offer/
exchange as a consequence of a successful validation. In either answer exchange as a consequence of a successful validation. In
case, both media and STUN packets will be sent to the transport either case, both media and STUN packets will be sent to the
addresses comprising that candidate, causing both to receive on their transport addresses comprising that candidate, causing both to
associated local transport addresses. The agent MUST be able to receive on their associated local transport addresses. The agent
disambiguate them. This is done trivially by looking for the STUN MUST be able to disambiguate them. This is done trivially by looking
magic cookie as the value of the second 32-bit word in the packet. for the STUN magic cookie as the value of the second 32-bit word in
If present, it identifies a STUN packet. the packet. If present, it identifies a STUN packet.
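As an illustration only, the disambiguation step above can be sketched as follows. The sketch assumes the 20-byte STUN header and the magic cookie value 0x2112A442 from the published STUN specification; the function name looks_like_stun is hypothetical and is not defined by this document.

```python
import struct

# Assumption: magic cookie value and header length as in the published
# STUN specification. The cookie is the second 32-bit word (bytes 4..7).
STUN_MAGIC_COOKIE = 0x2112A442
STUN_HEADER_LEN = 20


def looks_like_stun(packet: bytes) -> bool:
    """Return True if the second 32-bit word of the packet is the magic cookie."""
    if len(packet) < STUN_HEADER_LEN:
        # Too short to carry a complete STUN header.
        return False
    (cookie,) = struct.unpack_from("!I", packet, 4)
    return cookie == STUN_MAGIC_COOKIE
```

An agent could apply such a test to every datagram arriving on a local transport address and hand anything that fails it to the media stack.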
Processing of the Binding Request proceeds in two steps. The first Processing of the Binding Request proceeds in two steps. The first
is generation of the response, and the second ICE-specific is generation of the response, and the second is ICE-specific
processing. Generation of the response follows the general processing. Generation of the response follows the general
procedures of [13]. The USERNAME is considered valid if one of the procedures of [12], and is independent of the state machinery
candidate IDs sent in an offer or answer is a prefix of the USERNAME described in Section 7.6. The USERNAME is considered valid if one of
(this will always be the case, even for peer reflexive candidates). the candidate IDs sent in an offer or answer is a prefix of the
The password associated with that candidate ID is used to verify the USERNAME (this will always be the case, even for peer reflexive
MESSAGE-INTEGRITY attribute, if one was present in the request. If candidates), and for the component indicated in the USERNAME, the
the USERNAME was not valid, the agent generates a 430. Otherwise, associated local transport address matches the local transport
the success response will include the XOR-MAPPED-ADDRESS attribute, address on which the request was received. The password associated
which is used for learning new candidates, as described in with that candidate ID, which was provided by the agent to its peer,
Section 7.10. The XOR-MAPPED-ADDRESS attribute is constructed using is used to verify the MESSAGE-INTEGRITY attribute, if one was present
the source IP address and port of the Binding Request. For Binding in the request. If the USERNAME is not valid, the agent generates a
Requests received over relayed transport addresses, this MUST be the 430. Otherwise, the success response will include the XOR-MAPPED-
source IP address and port of the Binding Request when it arrived at ADDRESS attribute, which is used for learning new candidates, as
the relay, prior to forwarding towards the agent. That source described in Section 7.10. The XOR-MAPPED-ADDRESS attribute is
transport address will be present in the REMOTE-ADDRESS attribute of constructed using the source IP address and port of the Binding
a STUN Data Indication message, if the Binding Request was delivered Request. For Binding Requests received over relayed transport
through a Data Indication. If the Binding Request was not addresses, this MUST be the source IP address and port of the Binding
encapsulated in a Data Indication, that source address is equal to Request when it arrived at the relay, prior to forwarding towards the
the current active destination for the STUN relay session. agent. That source transport address will be present in the REMOTE-
ADDRESS attribute of a STUN Data Indication message, if the Binding
Request was delivered through a Data Indication. If the Binding
Request was not encapsulated in a Data Indication, that source
address is equal to the current active destination for the STUN relay
session.
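The following sketch shows one way the XOR-MAPPED-ADDRESS value could be derived from the source IP address and port of a Binding Request. It assumes IPv4 and the XOR rule of the published STUN specification (port XORed with the most significant 16 bits of the magic cookie, address XORed with the full cookie); the function names are hypothetical, and attribute framing is deliberately omitted.

```python
import ipaddress
from typing import Tuple

# Assumption: magic cookie value from the published STUN specification.
STUN_MAGIC_COOKIE = 0x2112A442


def xor_mapped_ipv4(source_ip: str, source_port: int) -> Tuple[int, int]:
    """Return (x_port, x_address) for an IPv4 source transport address."""
    x_port = source_port ^ (STUN_MAGIC_COOKIE >> 16)
    x_address = int(ipaddress.IPv4Address(source_ip)) ^ STUN_MAGIC_COOKIE
    return x_port, x_address


def unxor_mapped_ipv4(x_port: int, x_address: int) -> Tuple[str, int]:
    """Invert xor_mapped_ipv4; XOR with the same value restores the input."""
    port = x_port ^ (STUN_MAGIC_COOKIE >> 16)
    ip = str(ipaddress.IPv4Address(x_address ^ STUN_MAGIC_COOKIE))
    return ip, port
```

For a request received on a relayed transport address, the input would be the source seen by the relay (for example, the REMOTE-ADDRESS of the Data Indication), not the address of the relay itself.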
The ICE processing involves changes to the state machine for a The ICE processing involves changes to the state machine for a
transport address pair. This processing cannot be done until the transport address pair. This processing cannot be done until the
initial offer/answer exchange has completed. As a consequence, if initial offer/answer exchange has completed. As a consequence, if
the oferrer received a Binding Request that generated a success the offerer received a Binding Request that generated a success
response, but had not yet received the answer to its offer, it waits response, but had not yet received the answer to its offer, it waits
for the answer, and when it arrives, then performs the ICE for the answer, and when it arrives, then performs the ICE
processing. processing.
The agent takes the entire contents of the USERNAME, and compares The agent takes the entire contents of the USERNAME, and compares
them against the transport address pair identifiers as seen by that them against the transport address pair identifiers as seen by that
agent for each transport address pair. If there is no match, nothing agent for each transport address pair. If there is no match, nothing
is done - this should never happen for compliant implementations. If is done - this should never happen for compliant implementations. If
there is a match, the resulting transport address pair is called the there is a match, the resulting transport address pair is called the
matching transport address pair. The state machine for the matching matching transport address pair. The state machine for the matching
transport address pair is then updated based on the receipt of a STUN transport address pair is then updated based on the receipt of a STUN
Binding Request, and the resulting actions described in Section 7.6 Binding Request, and the resulting actions described in Section 7.6
are undertaken. are undertaken.
An agent will continue to receive periodic STUN connectivity checks An agent will continue to receive periodic STUN connectivity checks
on a local transport address as long as it had listed that transport on a local transport address as long as it had listed that transport
address, or one derived from it, in an a=candidate attribute in its address, or one derived from it, in an a=candidate attribute in its
most recent offer or answer, the state machine for that transport most recent offer or answer and the transport address is for UDP.
address is in the Recv-Valid or Valid states, and the transport Whether STUN keepalives are used for other transport protocols is
address is for UDP. Whether STUN keepalives are used for other defined by the specifications for that transport protocol. The agent
transport protocols is defined by the specifications for that processes any such transactions according to this section. It is
transport protocol. The agent processes any such transactions possible that a transport address pair that was previously valid may
according to this section. It is possible that a transport address become invalidated as a result of a subsequent failed STUN
pair that was previously valid may become invalidated as a result of transaction.
a subsequent failed STUN transaction.
7.9 Promoting a Candidate to Active 7.9. Promoting a Candidate to Operating
As a consequence of the connectivity checks, each agent will change As a consequence of the connectivity checks, each agent will change
the states for each transport address pair, and consequently, for the the states for each transport address pair, and consequently, for the
candidate pairs. When a candidate pair becomes valid, and the agent candidate pairs. When a candidate pair enters the valid state, and
is in the role of offerer for that candidate pair, the agent follows the agent is in the role of offerer for that candidate pair, the
the logic in this section. The rules only apply to the offerer of a agent follows the logic in this section. The rules only apply to the
candidate pair in order to eliminate the possibility of both agents offerer of a candidate pair in order to eliminate the possibility of
simultaneously offering an update to promote a candidate to active. both agents simultaneously offering an update to promote a candidate
to operating.
If this candidate pair is the first one in the candidate pair
priority ordered list, the agent SHOULD send an updated offer as
described in Section 7.11.1. If this candidate pair is not the first
on that list, but it is the first on the candidate pair check ordered
list, it means that this candidate pair is the active one, and its
connectivity has been verified. This is good news; the currently
active candidate is working. Media can now flow as described in
Section 7.13 (media will never flow prior to validation). However,
no updated offer is sent at this time.
If this candidate pair is not the first on the candidate pair
priority ordered list or the candidate pair check ordered list, and
the wait-state timer has not yet been set, the agent sets this timer
to Tws seconds. Tws SHOULD be configurable, and SHOULD have a
default of 100ms. This timer allows for a higher priority
connectivity check to complete, in the event its STUN Binding Request
was lost or delayed in the network. If, prior to the wait-state
timer firing, another connectivity check completes and a candidate
pair is validated, there is no need to reset or cancel the timer.
Once the timer fires, the agent SHOULD issue an updated offer as
described in Section 7.11.1.
In addition, in order to speed up ICE processing, once the agent has The agent locates the candidate pair in the candidate pair priority
determined the candidate that is to be promoted, it will send and ordered list. If it is the highest priority candidate pair, the
receive media using that candidate in expectation of an updated agent SHOULD send an updated offer immediately as described in
offer. This is discussed in Section 7.13. Section 7.11.1. If it is not the highest priority candidate pair,
and the states of all lower priority candidate pairs are Invalid, the
agent SHOULD send an updated offer immediately. If it is not the
highest priority candidate pair, and the state of at least one of the
lower priority candidate pairs is Indeterminate, the agent does
nothing. Tests have yet to begin for higher priority candidate
pairs. If it is not the highest priority candidate pair, and none of
the lower priority candidate pairs have a state of Indeterminate, the
agent starts a timer, called the wait-state timer, but only if this
timer is not already running. The timer is set to fire in Tws
seconds. Tws SHOULD be configurable, and SHOULD have a default of
Tws = max(0, 200ms - N*Tb), where N is the number of components for
the candidates for this media stream. The 200ms allows for a single
STUN retransmission (which takes 100ms) and an RTT of 100ms. This
timer allows for a higher priority connectivity check to complete, in
the event its STUN Binding Request was lost or delayed in the
network. Note that the timer goes to zero as the number of
components increases. If, prior to the wait-state timer firing,
another connectivity check completes and a candidate pair is
validated, there is no need to reset or cancel the timer. Once the
timer fires, the agent SHOULD issue an updated offer as described in
Section 7.11.1. This updated offer will use the highest priority
candidate pair in Valid state when the timer fires.
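The default wait-state timer value in the revised text can be worked through with a short sketch. The function name is hypothetical, and the Tb value of 50ms used in the usage note is purely an illustrative assumption; this section does not state a value for Tb.

```python
def wait_state_timer_ms(num_components: int, tb_ms: float) -> float:
    """Default Tws in milliseconds: max(0, 200ms - N*Tb)."""
    return max(0.0, 200.0 - num_components * tb_ms)
```

With the assumed Tb of 50ms, a stream with two components (N = 2) gives Tws = 100ms, while four or more components drive the timer to zero, matching the note that the timer goes to zero as the number of components increases.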
7.10 Learning New Candidates from Connectivity Checks 7.10. Learning New Candidates from Connectivity Checks
ICE makes use of reflexive addresses, which are addresses that inform ICE makes use of reflexive addresses, which are addresses that inform
an agent of its transport address as seen by another host. An an agent of its transport address as seen by another host. An
initial offer or answer generated by an agent includes server initial offer or answer generated by an agent includes server
reflexive addresses, which are learned from a configured or reflexive addresses, which are learned from a configured or
discovered STUN server in the network. However, the connectivity discovered STUN server in the network. However, the connectivity
checks themselves can inform an agent of reflexive addresses, and in checks themselves can inform an agent of reflexive addresses, and in
particular, ones that are reflexive towards its peer. These are particular, ones that are reflexive towards its peer. These are
called peer reflexive candidates. A new peer reflexive candidate is called peer reflexive candidates. A new peer reflexive candidate is
typically observed when two agents are separated by a NAT with the typically observed when two agents are separated by a NAT with the
address-dependent or address and port dependent mapping properties address-dependent or address and port dependent mapping properties
[37]. When the agent behind such a NAT sends a Binding Request to [32]. However, in unusual topologies, peer reflexive candidates can
the other agent (assuming it is reachable), the NAT will create a new be observed even when there are only NATs with the endpoint
mapping for this Binding Request. Because STUN and the media packets independent mapping property. Because STUN and the media packets are
are sent on the same port, regardless of the filtering properties of sent on the same port, regardless of the filtering properties of the
the NAT (whether endpoint independent, address dependent, or address NAT (whether endpoint independent, address dependent, or address and
and port dependent), this reflexive address can be used by the peer port dependent), this reflexive address can be used by the peer for
for sending STUN and media packets back towards the agent. sending STUN and media packets back towards the agent.
To obtain and use these peer reflexive transport addresses, ICE To obtain and use these peer reflexive transport addresses, ICE
agents perform additional processing on the receipt of STUN Binding agents MUST perform the additional processing on the receipt of STUN
Requests and responses, beyond the logic described in Section 7.7 and Binding Requests and responses described in the following two
Section 7.8. This logic is described below. subsections. These procedures are not just applied in the (hopefully
increasingly rare) case of address and port dependent mapping NATs.
They are also needed for behave-compliant NATs [32].
7.10.1 On Receipt of a Binding Request 7.10.1. On Receipt of a Binding Request
When a STUN Binding Request is received which generates a success The procedures in this section are followed when an agent receives a
response, that Binding Request would have been associated with a STUN Binding Request matched to a target transport address pair whose
matching transport address pair and corresponding candidate pair. source transport address (where the source is the one seen by the
The source IP and port of this Binding Request are compared to the IP relay for requests received on a relayed transport address) doesn't
address and port of the remote transport address in the matching match any of the existing remote transport addresses, or where the
transport address pair. Note that, in this case, we are comparing source matches, but the origination transport address does not. This
actual IP addresses and ports - not tids. In addition, if the source address and its associated origination transport address
Binding Request arrived through a relayed transport address, the become a new remote transport address.
source IP and port of this binding request used for the comparison
are those in the Binding Request when it arrived at the relay, prior
to forwarding towards the agent. That source transport address will
be present in the REMOTE-ADDRESS attribute of a STUN Data Indication
message, if the Binding Request were delivered through a Data
Indication. If the Binding Request was not encapsulated in a Data
Indication, that source address is equal to the current active
destination for the STUN relay session.
The comparison of the source IP and port of the Binding Request and To use it, that source transport address needs to be associated with
the IP address and port of the remote transport address in the a candidate (called a peer-derived candidate). In this case,
matching transport address pair may indicate inequality. In that however, the candidate isn't signaled through an offer/answer
case, the source IP and port of the Binding Request (and again, for exchange; it is constructed dynamically from information in the STUN
relayed transport address, this refers to the source IP address and request. Like all other candidates, the peer-derived candidate has a
port of the packet when it arrived at the relay) are compared to the candidate ID. The candidate ID is derived from the candidate IDs of
IP address and ports across the transport address pairs in *all* the target candidate pair. In particular, the candidate ID is
remote candidates. If there is a match to another remote candidate constructed by concatenating the remote candidate ID with the native
(called the alternate remote candidate), this is not a new candidate; candidate ID (without the colon). The password for the new candidate
however, the Binding Request has effectively helped validate the equals that of the remote candidate ID in the target candidate pair
alternate remote candidate. The agent SHOULD select the candidate (note that, this password would be the same for all remote candidates
pair corresponding to the combination of the alternate remote for the same media line).
candidate and the native candidate from the original matching
candidate pair. A "Get Req" event is passed to the state machine for
that candidate pair. Consequently, if this candidate pair was in the
Waiting state, a connectivity check will be generated for it.
If, when the source IP and port of the STUN packet, when compared When the STUN Binding Request is received, the agent constructs the
against all remote candidates, was not a match to any of them, it candidate ID for the peer reflexive candidate, and checks to see if
means that the source IP and port might represent another valid that candidate exists. It may already exist if it had been
remote transport address - a peer derived one. constructed as a consequence of a previous application of this logic
on receipt of a Binding Request from a different remote transport
address of the same new peer reflexive candidate. If there is not
yet a peer reflexive candidate with that candidate ID, the agent
creates it, and assigns it the newly computed candidate ID. The
priority of the peer-derived candidate is set to the priority of its
generating candidate. The generating candidate is the one that the
new peer derived candidate comes from - the remote candidate in the
target candidate. Note that, at this time, the peer derived
candidate has no transport addresses in it.
To use it, that address needs to be associated with a candidate The remote candidate is then paired up with a native candidate.
(called a peer-derived candidate). In this case, however, the However, unlike the procedures of Section 7.5, which pair up each
candidate isn't signaled through an offer/answer exchange; it is remote candidate with each native candidate, this peer reflexive
constructed dynamically from information in the STUN request. Like candidate is only paired up with the native candidate from the
all other candidates, the peer-derived candidate has a candidate ID. candidate pair from which it was derived. This creates a new
The candidate ID is derived from the candidate IDs of the matching candidate pair. This new candidate pair is inserted into the
candidate pair. In particular, the candidate ID is constructed by candidate pair priority ordered list based on the ordering rules
concatenating the remote candidate ID with the native candidate ID defined in Section 7.5. Note that no entries are added to the
(without the colon). The password for the new candidate equals that transport address pair check ordered list.
of the remote candidate ID in the matching candidate pair (note that,
this password would be the same for all remote candidates for the
same media line).
On receipt of a STUN Binding Request whose source IP and port don't Recall that, for each candidate pair, one agent plays the role of
match the transport address in any remote candidate, the agent offerer, and the other of answerer. For a peer-reflexive candidate,
constructs the candidate ID that represents the peer reflexive the role is identical to that of its generating candidate.
candidate, and checks to see if that candidate exists. It may
already exist if it had been constructed as a consequence of a
previous application of this logic on receipt of a Binding Request
for a different transport address pair of the same candidate pair.
If there is not yet a peer reflexive candidate with that candidate
ID, the agent creates it, and assigns it the newly computed candidate
ID. The priority of the peer-derived candidate MUST be set to the
priority of its generating candidate - the remote candidate in the
matching transport address pair. Note that, at this time, the peer
derived candidate has no transport addresses in it.
Newly created or not, the agent extracts the component ID from the Newly created or not, the agent extracts the component ID from the
matching transport address pair, and sees if a transport address with matching transport address pair, and sees if a transport address with
that same component ID exists in the peer reflexive candidate. If that same component ID exists in the peer reflexive candidate. If it
not (and it shouldn't), the agent adds a transport address to the does, the agent does nothing further. This can happen in unusual
peer reflexive candidate. This transport address is equal to the cases when there is a NAT reboot in the middle of a STUN transaction,
source IP address and port from the incoming STUN Binding Request causing two requests in the same transaction to produce two
(and in the case of a relayed transport address, the one seen by the different transport addresses. If there is no transport address with
relay). It is assigned the component ID equal to the component ID in the same component ID in the peer reflexive candidate, the agent adds
the matching transport address pair. This transport address will a transport address to the peer reflexive candidate. This transport
have a tid, equal to the concatenation of the candidate ID for this address is equal to the source IP address and port from the incoming
new candidate, and the component ID, separated by a colon. STUN Binding Request (and in the case of Binding Request received on
a relayed transport address, the one seen by the relay), and has a
The peer reflexive candidate becomes usable once the number of transport protocol equal to that of the incoming STUN request. It is
transport addresses in it equals the transport address pair count of assigned the component ID equal to the component ID in the target
the candidate pair from which it is derived. Initially, the peer transport address pair. This new transport address will have a
reflexive candidate will start with a single transport address. More transport address ID, equal to the concatenation of the candidate ID
are added as the connectivity checks for the original candidate pair for this new candidate, and the component ID, separated by a colon.
take place. Once the peer reflexive candidate becomes usable, it has The type of the transport address is considered to be peer reflexive,
to be paired up with native candidates. However, unlike the though this is never signaled through SDP and so there is no
procedures of Section 7.5, which pair up each remote candidate with candidate-types value defined for it. Recall that each transport
each native candidate, this peer reflexive candidate is only paired address is associated with an origination transport address. For
up with the native candidate from the candidate pair from which it server reflexive candidates, the origination transport address is
was derived. This creates a new candidate pair, and a set of new signaled through SDP. For peer reflexive transport addresses, it is
transport address pairs. inherited from the origination transport address of the generating
transport address. If the generating transport address was a local
transport address, then the origination transport address is that
transport address. If the generating transport address was server
reflexive, the origination transport address is the related transport
address that was signaled for that server reflexive candidate. If
the generating transport address was relayed, the origination
transport address is the relayed transport address itself. Whether
and how other candidate attributes defined by extensions are
inherited depends on the extension.
Recall that, for each candidate pair, one agent plays the role of The newly added transport address is paired up with the native
offerer, and the other of answerer. For a peer-reflexive candidate, transport address with the same component ID. Initially, the peer
the role is identical to that of its generating candidate. reflexive candidate will start with a single transport address and a
transport address pair. More are added as the connectivity checks
for the original candidate pair take place.
Figure 6 provides a pictorial representation of the peer reflexive Figure 10 provides a pictorial representation of the peer reflexive
candidate (the one with id=RL) and its pairing with the native candidate (the one with id=RL) and its pairing with the native
candidate with id L. The candidate with ID R is referred to as the candidate with ID L. The candidate with ID R is the generating
generating candidate. The peer reflexive candidate is effectively an candidate. The peer reflexive candidate is effectively an alternate
alternate for that generating candidate, but is only paired with a for that generating candidate, but is only paired with a specific
specific native candidate. Note that, for a particular generating native candidate. Note that, for a particular generating candidate,
candidate, there can be many peer derived candidates, up to one for there can be many peer derived candidates, up to one for each native
each native candidate. candidate. Also note that candidate IDs with values "L" and "R" and
"RL" are not actually permitted, since all candidate IDs must be at
least four characters long. These shortened candidate IDs are used
to keep the figure readable.
............. ............. ............. .............
. tid=L:1 . . tid=R:1 . . tid=L:1 . . tid=R:1 .
component. -- . id=L:1:R:1 . -- .component component. -- . id=L:1:R:1 . -- .component
id=1 . | A|-------------------------| C| . id=1 id=1 . | A|-------------------------| C| . id=1
. -- -------+ . -- . . -- -------+ . -- .
. . | . . Generating . . | . . Generating
. . | . . Candidate . . | . . Candidate
. tid=L:2 . | . tid=R:2 . . tid=L:2 . | . tid=R:2 .
component. -- . | id=L:2:R:2 . -- .component component. -- . | id=L:2:R:2 . -- .component
skipping to change at page 39, line 38 skipping to change at page 50, line 37
| . . Candidate | . . Candidate
| . tid=RL:2 . | . tid=RL:2 .
| id=L:2:RL:2 . -- .component | id=L:2:RL:2 . -- .component
+-------------------| D| . id=2 +-------------------| D| . id=2
. -- . . -- .
............. .............
Remote Remote
Candidate Candidate
id=RL id=RL
Figure 6 Figure 10
The new transport address pairs have a state machine associated with The new transport address pair has a state machine associated with
them. The state that is entered, and actions to take as a it. The state that is entered, and actions to take as a consequence,
consequence, are specific to the transport protocol. For UDP, the are specific to the transport protocol. For UDP, the procedures are
procedures are defined here. Extensions that define processing for defined here. Extensions that define processing for other transport
other transport protocols SHOULD describe the behavior. protocols SHOULD describe the behavior.
For UDP, the state machine enters the Send-Valid state. Effectively, For UDP, the state machine enters the Send-Valid state. Effectively,
the Binding Request just received "counts" as a validation in this the Binding Request just received "counts" as a validation in this
direction, even though it was formally done for a different candidate direction, even though it was formally done for a different transport
pair. In addition, the agent SHOULD generate a Binding Request for address pair. In addition, the agent generates a Binding Request for
each transport address in this new candidate pair, as described in the new transport address pair, as described in Section 7.7.
Section 7.7. The transport address pairs are inserted into the Processing of the response follows the logic described in
ordered list of pairs based on the ordering described in Section 7.5 Section 7.6.
and processing follows the logic described in Section 7.6.
7.10.2 On Receipt of a Binding Response As with all candidate pairs, the state of this new candidate pair is
derived from the states of its transport address pairs. Until the
number of transport address pairs in the candidate pair equals the
transport address pair count of the candidate pair from which it is
derived, the state of the candidate pair is Indeterminate. Once they
are equal, the state is derived just like any other candidate pair.
7.10.2. On Receipt of a Binding Response
The procedures on receipt of a Binding Response are nearly identical The procedures on receipt of a Binding Response are nearly identical
to those for receipt of a Binding Request as described above. to those for receipt of a Binding Request as described above.
When a successful STUN Binding Response is received, it will be The procedures in this section are followed when an agent receives a
associated with a matching transport address pair and corresponding STUN Binding Response matched to a transport address pair whose XOR-
candidate pair. This matching is done based on comparison of MAPPED-ADDRESS doesn't match any of the existing native transport
candidate IDs. The reflexive transport address from the Binding addresses. The XOR-MAPPED-ADDRESS becomes a new native transport
Response is compared to the IP address and port of the native address.
transport address in the matching transport address pair. Note that,
in this case, we are comparing actual IP addresses and ports - not
tids. These may not match if there was a NAT between the two agents.
If they do not match, the reflexive transport address is compared to
the IP address and ports across the transport address pairs in *all*
native candidates. If there is a match to another native candidate
(called the alternate native candidate), this is not a new candidate;
however, the Binding Response has effectively helped validate the
alternate native candidate. The agent SHOULD select the candidate
pair corresponding to the combination of the alternate native
candidate and the remote candidate from the original matching
candidate pair. If the candidate pair is in the Waiting state, it
moves directly to the Recv Valid state.
If, when the reflexive transport address, when compared against all
native candidates, was not a match to any of them, it means that the
reflexive transport address might represent another valid native
transport address - a peer derived one.
To use it, that address needs to be associated with a candidate. In
this case, however, the candidate isn't signaled through an offer/
answer exchange; it is constructed dynamically from information in
the STUN response. Such a candidate is called a peer reflexive
candidate. Like all other candidates, the peer reflexive candidate
has a candidate ID. The candidate ID is derived from the candidate
IDs of the matching candidate pair. In particular, the candidate ID
is constructed by concatenating the native candidate ID with the
remote candidate ID (without the colon). The password for the new
candidate equals that of the native candidate ID in the matching
candidate pair.
On receipt of a STUN Binding Response whose reflexive transport
address didn't match the transport address in any native candidate,
the agent constructs the candidate ID that represents the peer
reflexive candidate, and checks to see if that candidate exists. It
may already exist if it had been constructed as a consequence of a
previous application of this logic on receipt of a Binding Response
for a different transport address pair of the same candidate pair.
If there is not yet a peer derived candidate with that candidate ID,
the agent creates it, and assigns it the newly computed candidate ID.
The priority of the new candidate MUST be set to the priority of the
generating candidate - the native candidate in the matching transport
address pair. Note that, at this time, the peer derived candidate
has no transport addresses in it.
Newly created or not, the agent extracts the component ID from the To use it, the XOR-MAPPED-ADDRESS needs to be associated with a
matching transport address pair, and sees if a transport address with candidate (called a peer-derived candidate). In this case, however,
that same component ID exists in the peer reflexive candidate. If the candidate isn't signaled through an offer/answer exchange; it is
not (and it shouldn't), the agent adds a transport address to the constructed dynamically from information in the STUN response. Like
peer reflexive candidate. This transport address is equal to the all other candidates, the peer-derived candidate has a candidate ID.
reflexive transport address from the STUN Binding Response. It is The candidate ID is derived from the candidate IDs of the target
assigned the component ID equal to the component ID in the matching candidate pair. In particular, the candidate ID is constructed by
transport address pair. This transport address will have a tid, concatenating the native candidate ID with the remote candidate ID
equal to the concatenation of the candidate ID for this new (without the colon). The password for the new candidate equals that
candidate, and the component ID, separated by a colon. of the native candidate ID in the matching candidate pair (note that,
this password would be the same for all native candidates for the
same media line).
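The naming rules above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample candidate IDs are hypothetical, and the sample IDs are simply chosen to satisfy the four-character minimum.

```python
def peer_derived_candidate_id(native_id: str, remote_id: str) -> str:
    """Concatenate the native and remote candidate IDs, with no colon."""
    return native_id + remote_id


def transport_address_id(candidate_id: str, component_id: int) -> str:
    """Build a tid: candidate ID, a colon, then the component ID."""
    return f"{candidate_id}:{component_id}"


# Hypothetical IDs for the native and remote candidates of the pair.
new_id = peer_derived_candidate_id("Lab1", "Rcd2")
new_tid = transport_address_id(new_id, 2)
```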
The peer-derived candidate becomes usable once the number of When the Binding Response is received, the agent constructs the
transport addresses in it equals the transport address pair count of candidate ID that represents the peer reflexive candidate, and checks
candidate pair from which it is derived. Initially, the peer-derived to see if that candidate exists. It may already exist if it had been
candidate will start with a single transport address. More are added constructed as a consequence of a previous application of this logic
as the connectivity checks for the original candidate pair take on receipt of a Binding Response for a different transport address
place. Once the peer-derived candidate becomes usable, it has to be pair of the same candidate pair. If there is not yet a peer
paired up with remote candidates. However, unlike the procedures of reflexive candidate with that candidate ID, the agent creates it, and
Section 7.5, which pair up each remote candidate with each native assigns it the newly computed candidate ID. The priority of the
candidate, the peer-derived candidate is only paired up with the peer-derived candidate is set to the priority of its generating
remote candidate from the matching candidate pair. This creates a candidate - the native candidate in the target transport address
new candidate pair, and a set of new transport address pairs. pair. Note that, at this time, the peer derived candidate has no
transport addresses in it. The native candidate is then paired up
with a remote candidate. However, unlike the procedures of
Section 7.5, which pair up each native candidate with each remote
candidate, this peer reflexive candidate is only paired up with the
remote candidate from the target candidate pair. This creates a new
candidate pair. This new candidate pair is inserted into the
candidate pair priority ordered list based on the ordering rules
defined in Section 7.5. Note that no entries are added to the
transport address pair check ordered list.
Recall that, for each candidate pair, one agent plays the role of Recall that, for each candidate pair, one agent plays the role of
offerer, and the other of answerer. For a peer-reflexive candidate, offerer, and the other of answerer. For a peer-reflexive candidate,
the role is identical to that of its generating candidate. the role is identical to that of its generating candidate.
The new transport address pairs have a state machine associated with Newly created or not, the agent extracts the component ID from the
them. The state that is entered, and actions to take as a target transport address pair, and sees if a transport address with
consequence, are specific to the transport protocol. For UDP, the that same component ID exists in the peer reflexive candidate. If it
procedures are defined here. Extensions that define processing for does, the agent does nothing further. This can happen in unusual
other transport protocols SHOULD describe the behavior. cases when there is a NAT reboot in the middle of a STUN transaction,
causing two requests in the same transaction to produce two
different transport addresses. If there is no transport address with
the same component ID in the peer reflexive candidate, the agent adds
a transport address to the peer reflexive candidate. This transport
address is equal to the XOR-MAPPED-ADDRESS from the incoming STUN
Binding Response, and has a transport protocol equal to the one used
for the Binding Response. It is assigned the component ID equal to
the component ID in the matching transport address pair. This
transport address will have a transport address ID, equal to the
concatenation of the candidate ID for this new candidate, and the
component ID, separated by a colon. The type of the transport
address is considered to be peer reflexive, though this is never
signaled through SDP and so there is no candidate-types value defined
for it. Recall that each transport address is associated with an
origination transport address. For server reflexive candidates, the
origination transport address is signaled through SDP. For peer
reflexive transport addresses, it is inherited from the origination
transport address of the generating transport address. If the
generating transport address was a local transport address, then the
origination transport address is that transport address. If the
generating transport address was server reflexive, the origination
transport address is the related transport address that was signaled
for that server reflexive candidate. If the generating transport
address was relayed, the origination transport address is the relayed
transport address itself. Whether and how other candidate attributes
defined by extensions are inherited depends on the extension.
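The inheritance rule for the origination transport address might be expressed as follows. This is a minimal sketch: the TransportAddress class, its field names, and the string labels for the address types are assumptions made for illustration and are not defined by this specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TransportAddress:
    ip: str
    port: int
    kind: str  # assumed labels: "local", "server-reflexive", "relayed"
    related: Optional["TransportAddress"] = None  # signaled related address


def origination_for_peer_reflexive(generating: TransportAddress) -> TransportAddress:
    """Pick the origination address inherited from the generating address."""
    if generating.kind == "local":
        # A local generating address is its own origination address.
        return generating
    if generating.kind == "server-reflexive":
        # Use the related transport address signaled for that candidate.
        if generating.related is None:
            raise ValueError("server reflexive address lacks a related address")
        return generating.related
    if generating.kind == "relayed":
        # The relayed transport address itself is the origination address.
        return generating
    raise ValueError(f"unknown transport address type: {generating.kind}")
```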
The newly added transport address is paired up with the remote
transport address with the same component ID. Initially, the peer
reflexive candidate will start with a single transport address and a
single transport address pair. More are added as the connectivity checks
for the original candidate pair take place.
The new transport address pair has a state machine associated with
it. The state that is entered, and actions to take as a consequence,
are specific to the transport protocol. For UDP, the procedures are
defined here. Extensions that define processing for other transport
protocols SHOULD describe the behavior.
For UDP, the state machine enters the Recv-Valid state. Effectively, For UDP, the state machine enters the Recv-Valid state. Effectively,
the Binding Response just received "counts" as a validation in this the Binding Response just received "counts" as a validation in this
direction, even though it was formally done for a different candidate direction, even though it was formally done for a different candidate
pair. The transport address pairs are inserted into the ordered list pair. The peer will likely generate a Binding Request for this
of pairs based on the ordering described in Section 7.5, and candidate pair; processing of the request follows the logic described
processing follows the logic described in Section 7.6. in Section 7.6.
7.11 Subsequent Offer/Answer Exchanges As with all candidate pairs, the state of this new candidate pair is
derived from the states of its transport address pairs. Until the
number of transport address pairs in the candidate pair equals the
transport address pair count of the candidate pair from which it is
derived, the state of the candidate pair is Indeterminate. Once they
are equal, the state is derived just like any other candidate pair.
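The Indeterminate rule can be sketched as a small guard in front of the normal state derivation. The function name and the derive callback are hypothetical; the callback stands in for whatever rule an implementation uses to derive a candidate pair state from its transport address pair states.

```python
from typing import Callable, Sequence


def candidate_pair_state(
    pair_states: Sequence[str],
    expected_count: int,
    derive: Callable[[Sequence[str]], str],
) -> str:
    """State of a peer reflexive candidate pair.

    pair_states holds the states of the transport address pairs created so
    far; expected_count is the transport address pair count of the candidate
    pair from which this one is derived.
    """
    if len(pair_states) < expected_count:
        # Not every component has produced a transport address pair yet.
        return "Indeterminate"
    return derive(pair_states)
```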
7.11. Subsequent Offer/Answer Exchanges
An agent MAY issue an updated offer at any time. This updated offer An agent MAY issue an updated offer at any time. This updated offer
may be sent for reasons having nothing to do with ICE processing (for may be sent for reasons having nothing to do with ICE processing (for
example, the addition of a video stream in a multimedia session), or example, the addition of a video stream in a multimedia session), or
it may be due to a change in ICE-related parameters. For example, if it may be due to a change in ICE-related parameters. For example, if
an agent acquires a new candidate after the initial offer/answer an agent acquires a new candidate after the initial offer/answer
exchange, it may seek to add it. exchange, it may seek to add it.
However, agents SHOULD follow the logic described in Section 7.9 to However, agents SHOULD follow the logic described in Section 7.9 to
determine when to send an updated offer as a consequence of promoting determine when to send an updated offer as a consequence of promoting
a candidate to active. a candidate to operating.
If there are any aspects of this processing that are specific to the If there are any aspects of this processing that are specific to the
transport protocol, those SHOULD be called out in ICE extensions that transport protocol, those SHOULD be called out in ICE extensions that
define operation with other transport protocols. There are no define operation with other transport protocols. There are no
additional considerations for UDP. additional considerations for UDP.
7.11.1 Sending of a Subsequent Offer 7.11.1. Sending of a Subsequent Offer
The offer MAY contain a new active candidate in the m/c line. This The offer MAY contain a new operating candidate in the m/c line.
candidate SHOULD be the native candidate from the highest candidate This candidate SHOULD be the native candidate from the highest
pair in the candidate pair priority ordered list whose state is priority candidate pair in the candidate pair priority ordered list
Valid. If there are no candidate pairs in this state, the highest whose state is Valid. If there are no candidate pairs in this state,
one whose state is Send-Valid or Recv-Valid SHOULD be used. If there the highest one whose state is Send-Valid or Recv-Valid SHOULD be
are no candidate pairs in these states, the candidate pair that is used. If there are no candidate pairs in these states, the candidate
most likely to work with this peer, as described in Section 7.2, pair that is most likely to work with this peer, as described in
SHOULD be used. The candidate is encoded into the m/c line in an Section 7.2, SHOULD be used. The candidate is encoded into the m/c
updated offer as described in Section 7.3. Note that, while peer- line in an updated offer as described in Section 7.3. Note that,
derived candidates never appear in a=candidate attributes (only their while peer-derived candidates never appear in a=candidate attributes
generating candidates appear there), a peer-derived candidate can (only their generating candidates appear there), a peer-derived
appear in the m/c line if it has been selected for usage for media. candidate can appear in the m/c line if it has been selected for
usage for media.
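The selection order described above could be sketched as follows. The CandidatePair class and the fallback argument are illustrative only. The sketch assumes that a larger numeric priority means a higher priority, and it takes the Section 7.2 choice as an input rather than computing it.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class CandidatePair:
    native_id: str
    priority: float  # assumption: larger value means higher priority
    state: str       # e.g. "Valid", "Send-Valid", "Recv-Valid", "Waiting"


def select_operating_native(pairs: Iterable[CandidatePair], fallback_native_id: str) -> str:
    """Return the native candidate ID to encode into the m/c line."""
    pairs = list(pairs)
    # Prefer Valid pairs; only then consider Send-Valid or Recv-Valid ones.
    for wanted in ({"Valid"}, {"Send-Valid", "Recv-Valid"}):
        matching = [p for p in pairs if p.state in wanted]
        if matching:
            return max(matching, key=lambda p: p.priority).native_id
    # No usable pair: fall back to the candidate chosen per Section 7.2.
    return fallback_native_id
```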
If the candidate pair whose native candidate was encoded into the If the candidate pair whose native candidate was encoded into the
m/c-line was Valid, Send-Valid or Recv-Valid, the agent MUST include m/c-line was Valid, Send-Valid or Recv-Valid, the agent MUST include
an a=remote-candidate attribute into the offer. This attribute MUST an a=remote-candidate attribute into the offer. This attribute MUST
contain the candidate ID of the remote candidate in the candidate contain the candidate ID of the remote candidate in the candidate
pair. It is used by the recipient of the offer in selecting its pair. It is used by the recipient of the offer in selecting its
candidate for the answer. candidate for the answer. Because the native candidate in the m/c-
line will typically be Valid, Send-Valid or Recv-Valid in every offer
after the initial one, the a=remote-candidate attribute will
typically be used in all subsequent offers.
The a=candidate attributes within a subsequent offer have The a=candidate attributes within a subsequent offer have
the same meaning as they do in an initial offer. They are a request the same meaning as they do in an initial offer. They are a request
for the peer to attempt (or continue to attempt if the candidate was for the peer to attempt (or continue to attempt if the candidate was
provided previously) a connectivity check using STUN from each of its provided previously) a connectivity check using STUN from each of its
own candidates. When an updated offer is sent, there are several own candidates. When an updated offer is sent, there are several
dispositions regarding the candidates: dispositions regarding the candidates:
retained: A candidate is retained if the candidate ID for the retained: A candidate is retained if the candidate ID for the
candidate is included in the new offer, and matches the candidate candidate is included in the new offer, and matches the candidate
skipping to change at page 43, line 31 skipping to change at page 54, line 50
added: A candidate is added if its candidate ID appeared in the new added: A candidate is added if its candidate ID appeared in the new
offer, but was not present in a previous offer or answer from that offer, but was not present in a previous offer or answer from that
agent. agent.
The following rules are used to determine the disposition of each The following rules are used to determine the disposition of each
of the current native candidates in the new offer: of the current native candidates in the new offer:
o If a candidate is invalid, and all peer reflexive candidates o If a candidate is invalid, and all peer reflexive candidates
generated from it are invalid as well, it SHOULD be removed. generated from it are invalid as well, it SHOULD be removed.
o If the candidate in the m/c-line is valid, all other candidates o If the candidate in the m/c-line is valid, all other lower
SHOULD be removed. This has the effect of stopping connectivity priority candidates SHOULD be removed. This has the effect of
checks of other candidates. This SHOULD would not be followed if stopping connectivity checks of other candidates. This SHOULD
an agent wanted to keep a candidate ready for usage should, for would not be followed if an agent wanted to keep a candidate ready
some reason, the active candidate later become invalid. for usage if, for some reason, the operating candidate later
becomes invalid.
o If the candidate in the m/c-line is valid, and it is not peer o If the candidate in the m/c-line is valid, and it is not peer
reflexive, that candidate MUST be retained. If the candidate in reflexive, that candidate MUST be retained. If the candidate in
the m/c-line is peer reflexive, its generating candidate MUST be the m/c-line is peer reflexive, its generating candidate MUST be
retained, even if it is itself invalid. retained, even if it is itself invalid.
o If the candidate in the m/c-line has not been validated, all other o If the candidate in the m/c-line has not been validated, all other
candidates that are not invalid, or candidates for whom their candidates that are not invalid, or candidates for whom their
derived candidates are not invalid, SHOULD be retained. derived candidates are not invalid, SHOULD be retained.
o Peer reflexive candidates MUST NOT be added; they continue to be o Peer reflexive candidates MUST NOT be added; they continue to be
used as long as their generating candidate was retained. Peer used as long as their generating candidate was retained. Peer
derived candidates are learned exclusively through the STUN derived candidates are learned exclusively through the STUN
connectivity checks. connectivity checks.
A new candidate MAY be added. This can happen when the candidate is A new candidate MAY be added. This can happen when the candidate is
a new one, learned since the previous offer/answer exchange, and it a new one, learned since the previous offer/answer exchange, and it
has a higher priority than the currently active candidate. It can has a higher priority than the currently operating candidate. It can
also occur when an agent wishes to restart checks for a transport also occur when an agent wishes to restart checks for a transport
address it had tried previously. Effectively, changing the candidate address it had tried previously. Effectively, changing the candidate
ID value in an updated offer will "restart" connectivity checks for ID value in an updated offer will "restart" connectivity checks for
that candidate. that candidate.
If a candidate is removed, the agent takes the following steps once If a candidate is removed, the agent takes the following steps once
the offer is sent: the offer is sent:
1. The agent eliminates any candidate pairs whose native candidate 1. The agent eliminates any candidate pairs whose native candidate
equalled the candidate that was removed. Equality is based on equalled the candidate that was removed. Equality is based on
comparison of candidate IDs. comparison of candidate IDs.
2. The agent eliminates any candidate pairs that had a native 2. The agent eliminates any candidate pairs that had a native
candidate that is a peer reflexive candidate generated from the candidate that is a peer reflexive candidate generated from the
candidate that was removed. candidate that was removed.
3. The candidate pairs that are eliminated are removed from the 3. The candidate pairs that are eliminated are removed from the
candidate pair priority ordered list and candidate pair check candidate pair priority ordered list. Their corresponding
ordered list. As a consequence of this, if connectivity checks transport address pairs are removed from the transport address
had not yet begun for the candidate pair, they won't. pair check ordered list. As a consequence of this, if
connectivity checks had not yet begun for the candidate pair,
they won't. If a transport address pair had been pruned from the
transport address pair check ordered list because it was
redundant with one of the transport address pairs which was just
removed, that transport address pair is added back to the list.
4. If connectivity checks were already in progress for transport 4. If connectivity checks were already in progress for transport
addresses in a candidate pair that was removed, the agent SHOULD addresses in a candidate pair that was removed, the agent SHOULD
immediately terminate them. No further retransmissions take immediately terminate them. No further retransmissions take
place, and no further transactions from that candidate will be place, and no further transactions from that candidate will be
made. made.
5. If the removed candidate was a relayed candidate, the agent 5. If the removed candidate was a relayed candidate, the agent
SHOULD de-allocate its transport addresses from the STUN relay if SHOULD de-allocate its transport addresses from the STUN relay if
it is not using those resources elsewhere. If a local candidate it is not using those resources elsewhere. If a local candidate
skipping to change at page 44, line 48 skipping to change at page 56, line 26
resources for each of the transport addresses in the local resources for each of the transport addresses in the local
candidate SHOULD be de-allocated, as long as it is not using candidate SHOULD be de-allocated, as long as it is not using
those resources elsewhere. The resources may be in use elsewhere those resources elsewhere. The resources may be in use elsewhere
if they were included in an initial offer which generated if they were included in an initial offer which generated
multiple answers (as can happen with SIP forking). In such a multiple answers (as can happen with SIP forking). In such a
case, a subsequent offer which removes the candidate will not case, a subsequent offer which removes the candidate will not
imply its removal with the other branches; each becomes a imply its removal with the other branches; each becomes a
separate offer/answer relationship. separate offer/answer relationship.
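Steps 1 through 3 above amount to filtering the candidate pair priority ordered list. The sketch below covers only that filtering; it does not model the transport address pair check ordered list, the restoration of pruned pairs, or the release of relay resources. The class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class CandidatePair:
    native_id: str
    remote_id: str
    # Set only when the native candidate is peer reflexive: the candidate ID
    # of the native candidate that generated it.
    generating_native_id: Optional[str] = None


def eliminate_pairs(
    priority_list: List[CandidatePair], removed_native_id: str
) -> Tuple[List[CandidatePair], List[CandidatePair]]:
    """Split the list into (kept, eliminated), preserving order."""
    kept, eliminated = [], []
    for pair in priority_list:
        is_removed = pair.native_id == removed_native_id                # step 1
        from_removed = pair.generating_native_id == removed_native_id  # step 2
        (eliminated if is_removed or from_removed else kept).append(pair)
    return kept, eliminated
```

An implementation would then stop any connectivity checks in progress for the eliminated pairs, as described in step 4.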
Subsequent offers MUST contain a=ice-pwd attributes that specify the Subsequent offers MUST contain a=ice-pwd attributes that specify the
password for the candidates for each media stream. The password for password for the candidates for each media stream. If any of the
the candidates for a particular media stream SHOULD have the same candidates for a particular m-line are the same as the previous
value as in previous offers. However, an agent MAY change it if, for offer, the ICE password for that m-line MUST be the same. If all of
some reason, the agent believes that the password may have been the candidates for a particular m-line are different from the
compromised. Note that it is permissible to use a session-level previous offer, the ICE password for that m-line MAY be different.
attribute in one offer, and in a subsequent offer, provide the same Note that it is permissible to use a session-level attribute in one
password as a media-level attribute. This is not a change in the offer, but to provide the same password as a media-level attribute in
password; merely a change in its representation. An agent MUST be a subsequent offer. This is not a change in password, just a change
prepared to receive connectivity checks that use either the new or in its representation.
old password until Tpw seconds after it receives the answer. Tpw
SHOULD be configurable, and SHOULD default to 2 seconds.
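The password rule in the newer text (same password if any candidate is carried over, a different one permitted only if all candidates differ) can be checked mechanically. The sketch below assumes that "the same candidate" means "the same candidate ID", which is an interpretation and not something the text states; the function name is hypothetical.

```python
from typing import Set


def password_change_allowed(previous_ids: Set[str], new_ids: Set[str]) -> bool:
    """True only if no candidate ID for this m-line is carried over."""
    return previous_ids.isdisjoint(new_ids)
```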
7.11.2 Receiving the Offer and Sending an Answer 7.11.2. Receiving the Offer and Sending an Answer
To generate the answer, the answerer has to decide which transport To generate the answer, the answerer has to decide which transport
addresses to include in the m/c line, and which to include in addresses to include in the m/c line, and which to include in
candidate attributes. candidate attributes.
The first step in the process is to look for the a=remote-candidate The first step in the process is to look for the a=remote-candidate
attribute in the offer. The a=remote-candidate exists to eliminate a attribute in the offer. The a=remote-candidate exists to eliminate a
race condition between the updated offer and the response to the STUN race condition between the updated offer and the response to the STUN
Binding Request that moved a candidate into the Valid state. This Binding Request that moved a candidate into the Valid state. This
race condition is shown in Figure 7. On receipt of message 5, agent race condition is shown in Figure 11. On receipt of message 5, agent
A can move its transport address pair state machine into the Valid A can move its transport address pair state machine into the Valid
state. It sends a STUN response to the request (message 6), but this state. It sends a STUN response to the request (message 6), but this
is lost. Agent A proceeds with an updated offer (message 7), which is lost. Agent A proceeds with an updated offer (message 7), which
is received at agent B. As far as agent B is concerned, the transport is received at agent B. As far as agent B is concerned, the transport
address pair is still in the Send-Valid state. It will move into the address pair is still in the Send-Valid state. It will move into the
Valid state only on receipt of the STUN response in message 10. Valid state only on receipt of the STUN response in message 10.
Thus, upon receipt of the offer, agent B cannot determine which Thus, upon receipt of the offer, agent B cannot determine which
candidate to include in its answer. To eliminate this condition, the candidate to include in its answer. To eliminate this condition, the
identity of the validated candidate is included in the offer itself. identity of the validated candidate is included in the offer itself.
Note, however, that the answerer will not send media until it has Note, however, that the answerer will not send media until it has
received this STUN response. received this STUN response.
Agent A Network Agent B Agent A Network Agent B
|(1) Offer | | |(1) Offer | |
|------------------------------------------>| |------------------------------------------>|
|(2) Answer | | |(2) Answer | |
skipping to change at page 46, line 28 skipping to change at page 57, line 34
| |Lost | | |Lost |
|(7) Offer | | |(7) Offer | |
|------------------------------------------>| |------------------------------------------>|
|(8) Answer | | |(8) Answer | |
|<------------------------------------------| |<------------------------------------------|
|(9) STUN Req. | | |(9) STUN Req. | |
|<------------------------------------------| |<------------------------------------------|
|(10) STUN Res. | | |(10) STUN Res. | |
|------------------------------------------>| |------------------------------------------>|
Figure 7 Figure 11
If the a=remote-candidate attribute is present, the agent examines If the a=remote-candidate attribute is present, the agent examines
the transport addresses in the m/c-line of the offer. It compares the transport addresses in the m/c-line of the offer. It compares
these with the transport addresses in the remote candidates of all these with the transport addresses in the remote candidates of all
candidate pairs. If there is at least one match, the agent compares candidate pairs. If there is no match, no further processing of the
the native candidate ID of each matching pair with the value of the a=remote-candidate attribute is done. If there is at least one
a=remote-candidate attribute. If there is a match, that candidate match, the agent compares the native candidate ID of each matching
pair is selected. For each transport address pair in that candidate pair with the value of the a=remote-candidate attribute. If there is
pair, if the state of the transport address pair is Send-Valid, the a match, that candidate pair is selected. For each transport address
agent considers the state to be Valid just for the purpose of pair in that candidate pair, if the state of the transport address
selecting the m/c-line as discussed in the paragraph below. The pair is Send-Valid, the agent considers the state to be Valid just
actual state MUST remain Send-Valid. This is necessary to protect for the purpose of constructing the answer. In particular, it will
against DoS attacks. impact selection of the candidate for the m/c-line and the set of
additional candidates to include or exclude from the answer.
However, the actual state MUST remain Send-Valid. This state will be
used to determine when it is safe to send media. Keeping it at Send-
Valid is necessary to prevent against DoS attacks.
Note that the a=remote-candidate attribute SHOULD NOT be included in
the answer, and if included, will just be ignored by the offerer,
since it is not used in any processing of the answer.
Rules for choosing transport addresses for the m/c-line are as Rules for choosing transport addresses for the m/c-line are as
follows. The agent examines the transport addresses in the m/c-line follows. The agent examines the transport addresses in the m/c-line
of the offer. It compares these with the transport addresses in the of the offer. It compares these with the transport addresses in the
remote candidates of candidate pairs whose states are Valid. If remote candidates of candidate pairs whose states are Valid. If
there is a matching candidate pair in that state, the pair with the there is a matching candidate pair in that state, the pair with the
highest priority MUST be chosen, and the native candidate from that highest priority MUST be chosen, and the native candidate from that
pair used as the active candidate. If there were no matching pair used as the operating candidate. If there were no matching
candidate pairs in the Valid state, the candidate that is most likely candidate pairs in the Valid state (possibly because the transport
to work with this peer, as described in Section 7.2, SHOULD be used. addresses in the m/c-line in the offer didn't match any of the remote
candidates), the candidate that is most likely to work with this
peer, as described in Section 7.2, SHOULD be used. Note that this
candidate may be Valid as a consequence of being temporarily changed
to such by the a=remote-candidate attribute.
Like the offerer, the answerer can decide, for each of its Like the offerer, the answerer can decide, for each of its
candidates, whether they are retained or removed. The same rules candidates, whether they are retained or removed. The same rules
defined in Section 7.11.1 for determining their disposition apply to defined in Section 7.11.1 for determining their disposition apply to
the answerer. Similarly, if a candidate is removed, the same rules the answerer. Similarly, if a candidate is removed, the same rules
in Section 7.11.1 regarding removal of candidate pairs and freeing in Section 7.11.1 regarding removal of candidate pairs and freeing
of resources apply. of resources apply. As with selection of the candidate for the m/c-
line, the state of one of the candidates may be Valid as a
consequence of being temporarily changed to such by the a=remote-
candidate attribute.
Once the answer is sent, the answerer will have the set of native and Once the answer is sent, the answerer will have the set of native and
remote candidates before this offer/answer exchange, and the set of remote candidates before this offer/answer exchange, and the set of
native and remote candidates afterwards. A peer derived candidate native and remote candidates afterwards. A peer derived candidate
continues to be used as long as its generating parent continues to be continues to be used as long as its generating parent continues to be
used. The agent then pairs up the native and remote candidates which used. The agent then pairs up the native and remote candidates which
were added or retained. This leads to a set of current candidate were added or retained. This leads to a set of current candidate
pairs. pairs.
If a candidate pair existed previously, but as a consequence of the If a candidate pair existed previously, but as a consequence of the
offer/answer exchange, it no longer exists, the agent takes the offer/answer exchange, it no longer exists, the agent takes the
following steps: following steps:
1. The candidate pair is removed from the candidate pair priority 1. The candidate pair is removed from the candidate pair priority
ordered list and candidate pair check ordered list. As a ordered list. Their corresponding transport address pairs are
removed from the transport address pair check ordered list. As a
consequence of this, if connectivity checks had not yet begun for consequence of this, if connectivity checks had not yet begun for
the candidate pair, they won't. the candidate pair, they won't. If a transport address pair had
been pruned from the transport address pair check ordered list
because it was redundant with one of the transport address pairs
which was just removed, that transport address pair is added back
to the list.
2. If connectivity checks were already in progress for that 2. If connectivity checks were already in progress for that
candidate pair, the agent SHOULD immediately terminate any STUN candidate pair, the agent SHOULD immediately terminate any STUN
transactions in progress from that candidate. No further transactions in progress from that candidate. No further
retransmissions take place, and no further transactions from that retransmissions take place, and no further transactions from that
candidate will be made. candidate will be made.
3. If the agent receives a STUN Binding Request for that candidate 3. If the agent receives a STUN Binding Request for that candidate
pair, the agent SHOULD generate a 430 response. pair, however, processing occurs as defined in Section 7.8.
If a candidate pair existed previously, and continues to exist, no If a candidate pair existed previously, and continues to exist, no
changes are made; any STUN transactions in progress for that changes are made; any STUN transactions in progress for that
candidate pair continue, and it remains on the candidate pair candidate pair continue, it remains on the candidate pair priority
priority ordered list and candidate pair check ordered list. ordered list, and its transport address pairs remain on the transport
address pair check ordered list.
If a candidate pair is new (because either its native candidate is If a candidate pair is new (because either its native candidate is
new, or its remote candidate is new, or both), the agent takes the new, or its remote candidate is new, or both), the agent takes the
role of answerer for this candidate pair. The new candidate pair is role of answerer for this candidate pair. The new candidate pair is
inserted into the candidate pair priority ordered list and candidate inserted into the candidate pair priority ordered list, and the
pair check ordered list. STUN connectivity checks will start for transport address pair check ordered list is rederived. STUN
them based on the logic described in Section 7.6. connectivity checks will start for them based on the logic described
in Section 7.6.
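The three cases above (a pair that no longer exists, a pair that continues to exist, and a pair that is new) can be sketched as a set comparison. This is only an illustrative sketch: the names classify_pairs and PairChanges, and the representation of a candidate pair as a tuple of candidate IDs, are assumptions made here and do not appear in the draft. It does not model peer derived candidates or the rederivation of the transport address pair check ordered list.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical representation: (native candidate ID, remote candidate ID).
Pair = Tuple[str, str]


class PairChanges(NamedTuple):
    removed: Set[Pair]   # existed before, gone now: stop checks, drop from lists
    retained: Set[Pair]  # existed before and still exists: no changes are made
    added: Set[Pair]     # new: agent takes the answerer role, checks start


def all_pairs(native_ids: Set[str], remote_ids: Set[str]) -> Set[Pair]:
    """Pair every native candidate with every remote candidate."""
    return {(n, r) for n in native_ids for r in remote_ids}


def classify_pairs(old_native: Set[str], old_remote: Set[str],
                   new_native: Set[str], new_remote: Set[str]) -> PairChanges:
    """Compare the pair sets before and after an offer/answer exchange."""
    before = all_pairs(old_native, old_remote)
    after = all_pairs(new_native, new_remote)
    return PairChanges(removed=before - after,
                       retained=before & after,
                       added=after - before)


if __name__ == "__main__":
    # Native candidate L2 was removed and remote candidate R2 was added.
    print(classify_pairs({"L1", "L2"}, {"R1"}, {"L1"}, {"R1", "R2"}))
```

Using plain set differences keeps the three dispositions mutually exclusive, which matches the way the text treats each pair in exactly one of the three ways.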
7.11.3 Receiving the Answer 7.11.3. Receiving the Answer
Once the answer is received, the offerer will have the set of native Once the answer is received, the offerer will have the set of native
and remote candidates before this offer/answer exchange, and the set and remote candidates before this offer/answer exchange, and the set
of native and remote candidates afterwards. It then follows the same of native and remote candidates afterwards. It then follows the same
logic described in Section 7.11.2, pairing up the candidate pairs, logic described in Section 7.11.2, pairing up the candidate pairs,
removing ones that are no longer in use, and beginning of processing removing ones that are no longer in use, and beginning of processing
for ones that are new. for ones that are new.
7.12 Binding Keepalives 7.12. Binding Keepalives
Once a candidate is promoted to active, and media begins flowing, it Once a candidate is promoted to operating, and media begins flowing,
is still necessary to keep the bindings alive at intermediate NATs it is still necessary to keep the bindings alive at intermediate NATs
for the duration of the session. Normally, the media stream packets for the duration of the session. Normally, the media stream packets
themselves (e.g., RTP) meet this objective. However, several cases themselves (e.g., RTP) meet this objective. However, several cases
merit further discussion. Firstly, in some RTP usages, such as SIP, merit further discussion. Firstly, in some RTP usages, such as SIP,
the media streams can be "put on hold". This is accomplished by the media streams can be "put on hold". This is accomplished by
using the SDP "sendonly" or "inactive" attributes, as defined in RFC using the SDP "sendonly" or "inactive" attributes, as defined in RFC
3264 [4]. RFC 3264 directs implementations to cease transmission of 3264 [4]. RFC 3264 directs implementations to cease transmission of
media in these cases. However, doing so may cause NAT bindings to media in these cases. However, doing so may cause NAT bindings to
time out, and media won't be able to come off hold. time out, and media won't be able to come off hold.
Secondly, some RTP payload formats, such as the payload format for Secondly, some RTP payload formats, such as the payload format for
text conversation [36], may send packets so infrequently that the text conversation [31], may send packets so infrequently that the
interval exceeds the NAT binding timeouts. interval exceeds the NAT binding timeouts.
Thirdly, if silence suppression is in use, long periods of silence Thirdly, if silence suppression is in use, long periods of silence
may cause media transmission to cease sufficiently long for NAT may cause media transmission to cease sufficiently long for NAT
bindings to time out. bindings to time out.
To prevent these problems, ICE implementations MUST continue to list To prevent these problems, ICE implementations MUST continue to list
their active candidate in a=candidate lines for UDP-based media their operating candidate in a=candidate lines for UDP-based media
streams. As a consequence of this, STUN packets will be transmitted streams. As a consequence of this, STUN packets will be transmitted
periodically independently of the transmission (or lack thereof) of periodically independently of the transmission (or lack thereof) of
media packets. This provides a media independent, RTP independent, media packets. These will be received on the same IP address and
and codec independent solution for keeping the NAT bindings alive. port as the media streams. The agent determines whether the packet
is media or STUN by looking for the magic cookie in bits 32-63 of the
data. If present, it indicates that the packet is STUN, and if not,
indicates that it is media. This provides a media independent, RTP
independent, and codec independent solution for keeping the NAT
bindings alive. However, an ICE implementation MUST be prepared for
the transport address received in an m/c-line to not correspond to
any a=candidate attributes.
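The demultiplexing rule in the -09 text (look for the magic cookie in bits 32-63 of the data) can be sketched as follows. Bits 32-63 correspond to bytes 4 through 7 of the packet. The cookie value 0x2112A442 is not stated in this section; it is assumed here from the STUN revision that defines the magic cookie, and the function name is_stun_packet is hypothetical.

```python
import struct

# Assumed value of the STUN magic cookie; not given in this section.
STUN_MAGIC_COOKIE = 0x2112A442


def is_stun_packet(data: bytes) -> bool:
    """Return True if bits 32-63 (bytes 4..7) hold the magic cookie."""
    if len(data) < 8:
        return False  # too short to contain the cookie field
    (cookie,) = struct.unpack_from("!I", data, 4)  # network byte order
    return cookie == STUN_MAGIC_COOKIE


if __name__ == "__main__":
    # A 20-byte STUN-style header: type, length, cookie, 12-byte transaction ID.
    stun = struct.pack("!HHI12s", 0x0001, 0, STUN_MAGIC_COOKIE, bytes(12))
    # A 12-byte RTP-style header: bytes 4..7 carry the timestamp instead.
    rtp = bytes([0x80, 0x00, 0x00, 0x01]) + struct.pack("!II", 1000, 0x12345678)
    print(is_stun_packet(stun), is_stun_packet(rtp))
```

In an RTP packet the same four bytes carry the timestamp, so the check classifies ordinary media as non-STUN unless the timestamp happens to equal the cookie.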
If an ICE implementation is communicating with one that does not If an ICE implementation is communicating with one that does not
support ICE, keepalives MUST still be sent. Indeed, these keepalives support ICE, keepalives MUST still be sent. Indeed, these keepalives
are essential even if neither endpoint implements ICE. As such, this are essential even if neither endpoint implements ICE. As such, this
specification defines keepalive behavior generally, for endpoints specification defines keepalive behavior generally, for endpoints
that support ICE, and those that do not. that support ICE, and those that do not.
All endpoints MUST send keepalives for each media session. These All endpoints MUST send keepalives for each media session. These
keepalives MUST be sent regardless of whether the media stream is keepalives MUST be sent regardless of whether the media stream is
currently inactive, sendonly, recvonly or sendrecv. The keepalive currently inactive, sendonly, recvonly or sendrecv. The keepalive
SHOULD be sent using a format which is supported by its peer. ICE SHOULD be sent using a format which is supported by its peer. ICE
endpoints allow for STUN-based keepalives for UDP streams, and as endpoints allow for STUN-based keepalives for UDP streams, and as
such, STUN keepalives MUST be used when an agent is communicating such, STUN keepalives MUST be used when an agent is communicating
with a peer that supports ICE. An agent can determine that its peer with a peer that supports ICE. An agent can determine that its peer
supports ICE by the presence of the a=candidate attributes for each supports ICE by the presence of the a=candidate attributes for each
media session. If the peer does not support ICE, the choice of a media session. If the peer does not support ICE, the choice of a
packet format for keepalives is a matter of local implementation. A packet format for keepalives is a matter of local implementation. A
format which allows packets to easily be sent in the absence of format which allows packets to easily be sent in the absence of
actual media content is RECOMMENDED. Examples of formats which actual media content is RECOMMENDED. Examples of formats which
readily meet this goal are RTP No-Op [31] and RTP comfort noise [26]. readily meet this goal are RTP No-Op [28] and RTP comfort noise [24].
If the peer doesn't support any formats that are particularly well
suited for keepalives, an agent SHOULD send RTP packets with an
incorrect version number, or some other form of error which would
cause them to be discarded by the peer.
STUN-based keepalives will be sent periodically every Tr seconds as a STUN-based keepalives will be sent periodically every Tr seconds as a
consequence of the rules in Section 7.7. If STUN keepalives are consequence of the rules in Section 7.7. If STUN keepalives are
not in use (because the peer does not support ICE), an agent SHOULD not in use (because the peer does not support ICE), an agent SHOULD
ensure that a media packet is sent every Tr seconds. If one is not ensure that a media packet is sent every Tr seconds. If one is not
sent as a consequence of normal media communications, a keepalive sent as a consequence of normal media communications, a keepalive
packet using one of the formats discussed above SHOULD be sent. packet using one of the formats discussed above SHOULD be sent.
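The keepalive rules in this section can be sketched as two small decisions: which format to use, and whether a keepalive is due. This is a minimal sketch under stated assumptions. The format labels, the function names, and the preference for RTP No-Op over comfort noise are choices made for illustration; the draft leaves the non-ICE format to local implementation. The final fallback reflects the -09 text about RTP packets with an incorrect version number.

```python
from typing import Set


def choose_keepalive_format(peer_supports_ice: bool,
                            peer_formats: Set[str]) -> str:
    """Pick a keepalive format label (labels are hypothetical)."""
    if peer_supports_ice:
        return "stun"  # STUN keepalives MUST be used with an ICE peer
    # Assumed preference order among the formats the text names as examples.
    for fmt in ("rtp-no-op", "rtp-comfort-noise"):
        if fmt in peer_formats:
            return fmt
    # -09 fallback: RTP with an incorrect version number, which the peer
    # is expected to discard.
    return "rtp-bad-version"


def keepalive_due(now: float, last_packet_sent: float,
                  tr_seconds: float) -> bool:
    """True if no media or keepalive packet was sent in the last Tr seconds."""
    return now - last_packet_sent >= tr_seconds


if __name__ == "__main__":
    print(choose_keepalive_format(False, {"rtp-comfort-noise"}))
    print(keepalive_due(now=40.0, last_packet_sent=10.0, tr_seconds=15.0))
```

The value of Tr is passed in rather than fixed, since this section does not state it.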
7.13 Sending Media 7.13. Sending Media
When an agent receives an offer and sends an answer, or when it When an agent receives an offer and sends an answer, or when it
receives an answer to an offer it sent, it begins connectivity receives an answer to an offer it sent, it begins connectivity
checks. These checks will include validation of the active candidate checks. If there is a candidate that corresponds to the m/c-line,
pair, if there was one. An agent SHOULD NOT send media on the active these checks will include validation of the operating candidate pair.
In that case, an agent SHOULD NOT send media on the operating
candidate pair until that candidate pair has reached the Valid or candidate pair until that candidate pair has reached the Valid or
Recv-Valid state. This is to help prevent a denial-of-service Recv-Valid state. This is to help prevent a denial-of-service
attack, described in Section 13. Once the active candidate pair attack, described in Section 13. Once the operating candidate pair
reaches the Valid or Recv-Valid state, an agent MAY start sending reaches the Valid or Recv-Valid state, an agent MAY start sending
media to that candidate pair. media to that candidate pair. If there is no candidate that
corresponds to the m/c-line, the m/c-line cannot be validated, and
media is sent to it as described in RFC 3264 [4]. Under normal
conditions, there will be a candidate for the m/c-line. Indeed - ICE
itself requires that an agent include one. However, actual SIP
deployments have seen usage of network intermediaries which
manipulate the m/c-line of offers and answers. Should such elements
ignore the candidate attributes, it would manifest itself like an
agent which did not include a candidate for the m/c-line. For this
reason, this use case is explicitly supported by ICE.
However, offer/answer exchanges are used with protocols, like SIP, Offer/answer exchanges are used with protocols, like SIP, which
which require media to be sent "early", from the answerer to the require media to be sent "early", from the answerer to the offerer,
offerer, prior to completion of the initial offer/answer exchange. It prior to completion of the initial offer/answer exchange. It is
is highly desirable (and sometimes necessary) for this early media to highly desirable (and sometimes necessary) for this early media to
use the candidate pair ultimately selected by ICE connectivity use the candidate pair ultimately selected by ICE connectivity
checks. For this reason, ICE provides an early media mechanism that checks. For this reason, ICE provides an early media mechanism that
allows for a candidate pair to be used in one direction prior to its allows for a candidate pair to be used in one direction prior to its
promotion to active in a subsequent offer/answer exchange. Note promotion to operating in a subsequent offer/answer exchange. Note
that, with ICE, early media pertains to media sent to a candidate that, with ICE, early media pertains to media sent to a candidate
pair until its promotion to active in a subsequent offer/answer pair until its promotion to operating in a subsequent offer/answer
exchange. This is a broader definition than is used in [29], which exchange. This is a broader definition than is used in [26], which
defines early media as media sent prior to acceptance of a call. defines early media as media sent prior to acceptance of a call.
As a consequence of the connectivity checks, an agent will change the As a consequence of the connectivity checks, an agent will change the
states for each transport address pair, and consequently, for the states for each transport address pair, and consequently, for the
candidate pairs. When a candidate pair becomes Valid or Recv-Valid, candidate pairs. When a candidate pair becomes Valid or Recv-Valid,
and the candidate pair is not equal to the active candidate pair, and and there is a candidate pair for the m/c-line, and the candidate
the agent is in the role of answerer for that candidate pair, the pair is not equal to the operating candidate pair, and the agent is
agent checks the position of that pair in the candidate pair priority in the role of answerer for that candidate pair, the agent checks the
ordered list. If it is the first, the agent selects this candidate position of that pair in the candidate pair priority ordered list.
pair for early media. If this candidate pair is not the first on the If it is the first, the agent selects this candidate pair for early
candidate pair priority ordered list, but is higher priority than the media. If this candidate pair is not the first on the candidate pair
active candidate pair, and the early media wait-state timer has not priority ordered list, but is higher priority than the operating
yet been set, the agent sets this timer to Tws seconds. Tws SHOULD candidate pair, and the early media wait-state timer has not yet been
be configurable, and SHOULD have a default of 100ms. This timer set, the agent sets this timer to Tws seconds. Though the early
allows for a higher priority connectivity check to complete, in the media wait state timer has the same value as the wait state timer
event its STUN Binding Request or Response was lost or delayed in the described in Section 7.9, these are different timers and indeed are
network. If, prior to the wait-state timer firing, another set by different entities. The early media wait state timer allows
connectivity check completes and a candidate pair enters the Valid or for a higher priority connectivity check to complete, in the event
Recv-Valid states, there is no need to reset or cancel the timer. its STUN Binding Request or Response was lost or delayed in the
Once the timer fires, the agent SHOULD select the highest priority network. If, prior to the early media wait-state timer firing,
candidate pair in the Valid or Recv-Valid state for which the agent another connectivity check completes and a candidate pair enters the
has the role of answerer, and use that candidate pair for early Valid or Recv-Valid states, there is no need to reset or cancel the
media. timer. Once the timer fires, the agent SHOULD select the highest
priority candidate pair in the Valid or Recv-Valid state for which
the agent has the role of answerer, and use that candidate pair for
early media.
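The early media selection logic described above can be sketched as two event handlers: one for a candidate pair entering the Valid or Recv-Valid state, and one for the early media wait-state timer firing. This is an illustrative sketch only. The Pair class, the handler names, and the returned action strings are hypothetical, and the priority ordered list is modelled as a Python list with the highest priority pair first.

```python
from dataclasses import dataclass
from typing import List, Optional

USABLE_STATES = {"Valid", "Recv-Valid"}


@dataclass
class Pair:
    name: str
    state: str
    answerer: bool  # True if this agent has the answerer role for the pair


def on_pair_usable(pair: Pair, ordered: List[Pair],
                   operating: Optional[Pair], timer_set: bool) -> str:
    """Decide what to do when `pair` becomes Valid or Recv-Valid.

    Returns "select" (use for early media now), "start-timer"
    (start the early media wait-state timer), or "none".
    """
    if operating is None:  # -09: requires a candidate pair for the m/c-line
        return "none"
    if pair.state not in USABLE_STATES or not pair.answerer:
        return "none"
    if pair is operating:
        return "none"
    if ordered[0] is pair:
        return "select"
    higher_than_operating = ordered.index(pair) < ordered.index(operating)
    if higher_than_operating and not timer_set:
        return "start-timer"
    return "none"


def on_timer_fired(ordered: List[Pair]) -> Optional[Pair]:
    """Pick the highest priority usable pair for which we are answerer."""
    for candidate in ordered:
        if candidate.state in USABLE_STATES and candidate.answerer:
            return candidate
    return None


if __name__ == "__main__":
    a = Pair("A", "Waiting", True)
    b = Pair("B", "Valid", True)
    c = Pair("C", "Valid", True)  # the operating pair
    print(on_pair_usable(b, [a, b, c], operating=c, timer_set=False))
    print(on_timer_fired([a, b, c]).name)
```

Because the timer is never reset when further pairs become usable, the choice is deferred to the moment it fires, at which point the best pair available is taken.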
ICE processing will ensure that, under almost all circumstances, the ICE processing will ensure that, under almost all circumstances, the
candidate pair selected by the answerer for early media will also be candidate pair selected by the answerer for early media will also be
the one selected by the offerer for eventual promotion to active. the one selected by the offerer for eventual promotion to operating.
The early media state implies that the answerer knows that this The early media state implies that the answerer knows that this
candidate pair is to be used, but the offerer doesn't know yet that candidate pair is to be used, but the offerer doesn't know yet that
it will eventually be validated. It is for this reason that the it will eventually be validated. It is for this reason that the
candidate pair can be used for early media. candidate pair can be used for early media.
If a candidate pair is selected for early media, an agent MAY send If a candidate pair is selected for early media, an agent MAY send
media on that candidate pair, even if it is not the same as the media on that candidate pair, even if it is not the same as the
active candidate pair. However, to deal with cases in which the operating candidate pair. However, to deal with cases in which the
offerer and answerer do not agree on the eventual selection of this offerer and answerer do not agree on the eventual selection of this
candidate for promotion to active (a rare but possible case), the candidate for promotion to operating (a rare but possible case), the
agent MUST discontinue using the candidate pair for sending media Tlo agent MUST discontinue using the candidate pair for sending media Tlo
seconds after the answer has been reliably delivered. An answer is seconds after the next opportunity its peer would have to send an
considered reliably delivered when the agent receives a confirmation updated offer. In the case of an answer delivered in a 200 OK to an
that is has been delivered. In the case of an answer delivered in a offer in a SIP INVITE (regardless of whether that same answer
200 OK to an offer in an INVITE (in the SIP case), the answer is appeared in an earlier unreliable provisional response), this would
considered reliably delivered upon receipt of the ACK. Tlo SHOULD be be Tlo seconds after receipt of the ACK. Tlo SHOULD be configurable
configurable and SHOULD have a default of 5 seconds. This time and SHOULD have a default of 5 seconds. This time represents the
represents the amount of time it should take the offerer to perform amount of time it should take the offerer to perform its connectivity
its connectivity checks, arrive at the same conclusion about the checks, arrive at the same conclusion about the viability of the
viability of the early candidate, and then generate an updated offer early candidate, and then generate an updated offer promoting it to
promoting it to active. If, after Tlo seconds, no updated offer operating. If, after Tlo seconds, no updated offer arrives, the
arrives, the answerer MUST cease using the early candidate. Media answerer MUST cease using the early candidate. Media MAY be sent to
MAY be sent to the active candidate pair if it is in the Valid or the operating candidate pair if it is in the Valid or Recv-Valid
Recv-Valid state. state.
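The Tlo cutoff for early media can be sketched as a simple deadline check. This sketch assumes the SIP case described above, where the clock starts on receipt of the ACK, and it assumes that no updated offer has arrived in the meantime (an updated offer restarts the procedures instead). The function and parameter names are hypothetical; the 5 second default is the one the text gives for Tlo.

```python
from typing import Optional

TLO_DEFAULT_SECONDS = 5.0  # default stated in the text; SHOULD be configurable


def may_send_early_media(now: float, ack_received_at: Optional[float],
                         tlo: float = TLO_DEFAULT_SECONDS) -> bool:
    """True while the early candidate pair may still be used for sending.

    ack_received_at is None until the ACK has been received, in which
    case the Tlo clock has not started yet.
    """
    if ack_received_at is None:
        return True
    return now - ack_received_at < tlo


if __name__ == "__main__":
    print(may_send_early_media(now=12.0, ack_received_at=10.0))
    print(may_send_early_media(now=15.0, ack_received_at=10.0))
```

Once this check turns false, the answerer stops using the early candidate and may fall back to the candidate pair named in the m/c-line if that pair is Valid or Recv-Valid.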
If an updated offer does arrive prior to the expiration of the timer, If an updated offer does arrive prior to the expiration of the timer,
the agent MUST execute the procedures in Section 7.11.2, which will the agent MUST execute the procedures in Section 7.11.2, which will
result in the selection of a candidate for the m/c-line in the result in the selection of a candidate for the m/c-line in the
answer. At that point, the procedures of this section SHOULD be answer. At that point, the procedures of this section SHOULD be
restarted by the answerer. This implies that the active candidate restarted by the answerer. This implies that the operating candidate
pair, if Valid or Recv-Valid, will be used. If a higher priority pair, if Valid or Recv-Valid, will be used. If a higher priority
candidate pair subsequently enters the Valid or Recv-Valid state, it candidate pair subsequently enters the Valid or Recv-Valid state, it
may end up being used as an early candidate. may end up being used as an early candidate.
To use a candidate pair, whether it is early or active, media is sent To use a candidate pair, whether it is early or operating, media is
to the IP addresses and ports of the components in the remote sent to the IP addresses and ports of the components in the remote
candidate, and is sent from the IP addresses and ports of candidate, and is sent from the IP addresses and ports of
the components in the native candidate. Transport addresses are the components in the native candidate. Transport addresses are
paired up based on component ID. For example, if a remote candidate paired up based on component ID. For example, if a remote candidate
has two components R1 and R2, and the native candidate has two has two components R1 and R2, and the native candidate has two
components L1 and L2, media packets are sent from L1 to R1 and from components L1 and L2, media packets are sent from L1 to R1 and from
L2 to R2. This provides a property known as symmetry. This L2 to R2. This provides a property known as symmetry. This
symmetric behavior MUST be followed by an agent even if its peer in symmetric behavior MUST be followed by an agent even if its peer in
the session doesn't support ICE. the session doesn't support ICE.
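The pairing of transport addresses by component ID can be sketched as below. This is a minimal illustration; the function name and the representation of a candidate as a mapping from component ID to an (IP address, port) tuple are assumptions made here, and the addresses in the example are placeholders.

```python
from typing import Dict, List, Tuple

Address = Tuple[str, int]  # (IP address, port)


def pair_components(native: Dict[int, Address],
                    remote: Dict[int, Address]) -> List[Tuple[Address, Address]]:
    """Return (source, destination) address pairs matched by component ID."""
    shared_ids = sorted(native.keys() & remote.keys())
    return [(native[cid], remote[cid]) for cid in shared_ids]


if __name__ == "__main__":
    native = {1: ("10.0.0.1", 5000), 2: ("10.0.0.1", 5001)}      # L1, L2
    remote = {1: ("192.0.2.1", 6000), 2: ("192.0.2.1", 6001)}    # R1, R2
    for src, dst in pair_components(native, remote):
        print(src, "->", dst)
```

Media for component 1 is sent from L1 to R1 and media for component 2 from L2 to R2, which is the symmetry property the text requires.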
The definition of sending media "from" a particular transport address The definition of sending media "from" a particular transport address
skipping to change at page 51, line 39 skipping to change at page 63, line 44
The newer candidate may result in RTP packets taking a different path The newer candidate may result in RTP packets taking a different path
through the network - one with different delay characteristics. As through the network - one with different delay characteristics. As
discussed below, agents are encouraged to re-adjust jitter buffers discussed below, agents are encouraged to re-adjust jitter buffers
when there are changes in source or destination address. when there are changes in source or destination address.
Furthermore, many audio codecs use the marker bit to signal the Furthermore, many audio codecs use the marker bit to signal the
beginning of a talkspurt, for the purposes of jitter buffer beginning of a talkspurt, for the purposes of jitter buffer
adaptation. For such codecs, it is RECOMMENDED that the sender adaptation. For such codecs, it is RECOMMENDED that the sender
change the marker bit when an agent switches transmission of media change the marker bit when an agent switches transmission of media
from one candidate pair to another. from one candidate pair to another.
7.14 Receiving Media 7.14. Receiving Media
ICE implementations MUST be prepared to receive media on a candidate ICE implementations MUST be prepared to receive media on a candidate
pair if it is in the role of offerer for that candidate pair, even if pair if it is in the role of offerer for that candidate pair, even if
that candidate pair is not currently active. This is a consequence that candidate pair is not currently operating. This is a
of the early media mechanism described in the previous section. consequence of the early media mechanism described in the previous
section.
If an agent determines that its peer supports ICE (an offerer knows If an agent determines that its peer supports ICE (an offerer knows
this when the answer contains a=candidate attributes), it SHOULD this when the answer contains a=candidate attributes), it SHOULD
discard any media packets received on a candidate pair prior to the discard any media packets received on a candidate pair prior to the
candidate pair entering the Send-Valid state. This helps eliminate candidate pair entering the Send-Valid state. This helps eliminate
certain attacks, as discussed in Section 13. certain attacks, as discussed in Section 13. Note that, in cases of
forking, an agent may get multiple answers to its offer, each for a
different peer. Consequently, it would only discard media packets
received on a candidate pair once it has determined that all forked
targets support ICE.
It is RECOMMENDED that, when an agent receives an RTP packet with a It is RECOMMENDED that, when an agent receives an RTP packet with a
new source or destination IP address for a particular media stream, new source or destination IP address for a particular media stream,
that the agent re-adjust its jitter buffers. that the agent re-adjust its jitter buffers.
RFC 3550 [23] describes an algorithm in Section 8.2 for detecting RFC 3550 [21] describes an algorithm in Section 8.2 for detecting
SSRC collisions and loops. These algorithms are based, in part, on SSRC collisions and loops. These algorithms are based, in part, on
seeing different source IP addresses and ports with the same SSRC. seeing different source IP addresses and ports with the same SSRC.
However, when ICE is used, such changes will naturally occur as the However, when ICE is used, such changes will naturally occur as the
media streams switch between candidates. An agent will be able to media streams switch between candidates. An agent will be able to
determine that a media stream is from the same peer as a consequence determine that a media stream is from the same peer as a consequence
of the STUN exchange that precedes media transmission. Thus, if of the STUN exchange that precedes media transmission. Thus, if
there is a change in source IP address and port, but the media there is a change in source IP address and port, but the media
packets come from the same peer agent, this SHOULD NOT be treated as packets come from the same peer agent, this SHOULD NOT be treated as
an SSRC collision. an SSRC collision.
8. Guidelines for Usage with SIP 8. Guidelines for Usage with SIP
SIP [2] makes use of the offer/answer model, and is one of the SIP [2] makes use of the offer/answer model, and is one of the
primary targets for usage of ICE. SIP allows for offer/answer primary targets for usage of ICE. SIP allows for offer/answer
exchanges to occur in many different combinations of messages, exchanges to occur in many different combinations of messages,
including INVITE/200 OK and 200 OK/ACK. When support for reliable including INVITE/200 OK and 200 OK/ACK. When support for reliable
provisional responses (RFC 3262 [11]) and UPDATE (RFC 3311 [27]) are provisional responses (RFC 3262 [11]) and UPDATE (RFC 3311 [25]) are
added, additional combinations of messages that can be used for added, additional combinations of messages that can be used for
offer/answer exchanges are added. As such, this section provides offer/answer exchanges are added. As such, this section provides
some guidance on good ways to make use of SIP with ICE. some guidance on good ways to make use of SIP with ICE.
ICE requires a series of STUN-based connectivity checks to take place ICE requires a series of STUN-based connectivity checks to take place
between endpoints. These checks start from the answerer on between endpoints. These checks start from the answerer on
generation of its answer, and start from the offerer when it receives generation of its answer, and start from the offerer when it receives
the answer. These checks can take time to complete, and as such, the the answer. These checks can take time to complete, and as such, the
selection of messages to use with offers and answers can affect selection of messages to use with offers and answers can affect
perceived user latency. Two latency of figures are of particular perceived user latency. Two latency figures are of particular
interest. These are the post-pickup delay and the post-dial delay. interest. These are the post-pickup delay and the post-dial delay.
The post-pickup delay refers to the time between when a user "answers The post-pickup delay refers to the time between when a user "answers
the phone" and when any speech they utter can be delivered to the the phone" and when any speech they utter can be delivered to the
caller. The post-dial delay refers to the time between when a user caller. The post-dial delay refers to the time between when a user
enters the destination address for the user, and ringback begins as a enters the destination address for the user, and ringback begins as a
consequence of having successfully started ringing the phone of the consequence of having successfully started ringing the phone of the
called party. called party.
To reduce post-dial delays, it is RECOMMENDED that the caller begin To reduce post-dial delays, it is RECOMMENDED that the caller begin
gathering candidates prior to actually sending its initial INVITE. gathering candidates prior to actually sending its initial INVITE.
This can be started upon user interface cues that a call is pending, This can be started upon user interface cues that a call is pending,
such as activity on a keypad or the phone going offhook. such as activity on a keypad or the phone going offhook.
To reduce post-pickup delays, ICE allows for media to be sent from To reduce post-pickup delays, ICE allows for media to be sent from
the answerer to the offerer on a candidate pair, prior to its the answerer to the offerer on a candidate pair, prior to its
promotion to active. However, this requires the answerer to have promotion to operating. However, this requires the answerer to have
generated its answer and sent it. In most cases, it will require generated its answer and sent it. In most cases, it will require
this answer to be received by the offerer. The reason is that this answer to be received by the offerer. The reason is that
connectivity checks or RTP packets from the answerer to the offerer connectivity checks or RTP packets from the answerer to the offerer
will not be forwarded by NATs towards the offerer until the offerer will not be forwarded by NATs towards the offerer until the offerer
has established a permission in the NAT by generating a packet has established a permission in the NAT by generating a packet
towards the answerer. towards the answerer.
For this reason, if an offer is received in an INVITE request, the For this reason, if an offer is received in an INVITE request, the
UAS SHOULD immediately gather its candidates and then generate an UAS SHOULD immediately gather its candidates and then generate an
answer in a provisional response. When reliable provisional answer in a provisional response. When reliable provisional
responses are not used, the SDP in the provisional response is not responses are not used, the SDP in the provisional response is the
formally the answer; the value in the 200 OK is the actual answer. answer, and that exact same answer reappears in the 200 OK. To deal
However, RFC 3261 allows for SDP to appear in an unreliable
provisional response, in which case its value has to be identical to
the value placed in the 200 OK. Thus, we refer to the SDP in the
provisional response, even when unreliable, as the answer. To deal
with possible losses of the provisional response, it SHOULD be with possible losses of the provisional response, it SHOULD be
retransmitted until some indication of receipt. This indication can retransmitted until some indication of receipt. This indication can
either be through PRACK [11], or through the receipt of a STUN either be through PRACK [11], or through the receipt of a STUN
Binding Request with a correct username and password. Even if PRACK Binding Request with a correct username and password. Even if PRACK
is not used, the provisional response SHOULD be retransmitted using is not used, the provisional response SHOULD be retransmitted using
the exponential backoff described in [11]. Furthermore, once the the exponential backoff described in [11]. Furthermore, once the
answer has been sent, the agent SHOULD begin its connectivity checks. answer has been sent, the agent SHOULD begin its connectivity checks.
Once a candidate reaches the Valid or Recv-Valid state, the UAS has a Once a candidate reaches the Valid or Recv-Valid state, the UAS has a
known-valid path for media packets towards the UAC. This point is known-valid path for media packets towards the UAC. This point is
called the connected point in ICE. called the connected point in ICE.
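The retransmission behavior described above can be sketched as follows. This is an illustrative sketch only, not part of the specification. It assumes the schedule of RFC 3262 [11]: the first interval is T1, each interval doubles, and the UAS gives up once 64*T1 has elapsed. The T1 value of 0.5 seconds and the names retransmit_intervals and receipt_indicated are assumptions made for this example.

```python
def retransmit_intervals(t1=0.5, give_up_factor=64):
    """Return the waits (in seconds) between retransmissions of the
    provisional response: T1, 2*T1, 4*T1, ... for as long as the total
    elapsed time stays within give_up_factor * T1 (assumed limit)."""
    limit = give_up_factor * t1
    intervals = []
    elapsed = 0.0
    interval = t1
    while elapsed + interval <= limit:
        intervals.append(interval)
        elapsed += interval
        interval *= 2
    return intervals


def receipt_indicated(prack_received, stun_request_with_valid_credentials):
    """Either indication named in the text stops retransmission."""
    return prack_received or stun_request_with_valid_credentials


schedule = retransmit_intervals()
```

With these assumed defaults the schedule holds six waits, after which the sketch stops retransmitting. A real agent would stop earlier as soon as receipt_indicated returns true.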
Once the UAS reaches the connected point, media can be sent from the Once the UAS reaches the connected point, media can be sent from the
UAS towards the UAC without any additional delays. However, between UAS towards the UAC without any additional delays. However, between
the receipt of the INVITE and the connected point, any media that the receipt of the INVITE and the connected point, any media that
needs to be sent towards the caller (such as SIP early media [29]) needs to be sent towards the caller (such as SIP early media [26])
cannot be transmitted. For this reason, implementations MAY choose cannot be transmitted. For this reason, implementations MAY choose
to delay alerting the called party until the connected point is to delay alerting the called party until the connected point is
reached. In the case of a PSTN gateway, this would mean that the reached. In the case of a PSTN gateway, this would mean that the
setup message into the PSTN is delayed until the connected point. setup message into the PSTN is delayed until the connected point.
Doing this increases the post-dial delay, but has the effect of Doing this increases the post-dial delay, but has the effect of
eliminating 'ghost rings'. Ghost rings are cases where the called eliminating 'ghost rings'. Ghost rings are cases where the called
party hears the phone ring, picks up, but hears nothing and cannot be party hears the phone ring, picks up, but hears nothing and cannot be
heard. This technique works without requiring support for, or usage heard. This technique works without requiring support for, or usage
of, preconditions [7], since it's a localized decision. It also has of, preconditions [7], since it's a localized decision. It also has
the benefit of guaranteeing that not a single packet of early media the benefit of guaranteeing that not a single packet of early media
will get clipped. If an agent chooses to delay local alerting in will get clipped. If an agent chooses to delay local alerting in
this way, it SHOULD generate a 180 response once alerting begins. this way, it SHOULD generate a 180 response once alerting begins.
A slight variation of this approach is to wait for a connectivity A slight variation of this approach is to wait for a connectivity
check to succeed to a higher priority candidate pair than the active check to succeed to a higher priority candidate pair than the
one. This allows for the agent to only ever send media, early or operating one. This allows for the agent to only ever send media,
otherwise, to a single candidate, which will work better with jitter early or otherwise, to a single candidate, which will work better
buffers, at the expense of even greater post-dial delays. with jitter buffers, at the expense of even greater post-dial delays.
Note that, prior to the promotion of a candidate pair to active, the Note that, prior to the promotion of a candidate pair to operating,
offerer will not be able to send using the candidate pair. When used the offerer will not be able to send using the candidate pair. When
with SIP, if the initial offer is sent in the INVITE, and the answer used with SIP, if the initial offer is sent in the INVITE, and the
is sent in both the provisional and final 200 OK response, the answer is sent in both the provisional and final 200 OK response, the
offerer will not be able to send media until it sends a re-INVITE and offerer will not be able to send media until it sends a re-INVITE and
receives the 200 OK response to that re-INVITE. This can take receives the 200 OK response to that re-INVITE. This can take
several hundred milliseconds. If this latency is an issue (it is several hundred milliseconds. If this latency is an issue (it is
generally not considered an issue for voice systems), reliable generally not considered an issue for voice systems), reliable
provisional responses [11] MAY be used, in which case an UPDATE [27] provisional responses [11] MAY be used, in which case an UPDATE [25]
can be used to send an updated offer prior to the call being can be used to send an updated offer prior to the call being
answered. answered.
As discussed in Section 13, offer/answer exchanges SHOULD be secured As discussed in Section 13, offer/answer exchanges SHOULD be secured
against eavesdropping and man-in-the-middle attacks. To do that, the against eavesdropping and man-in-the-middle attacks. To do that, the
usage of SIPS [2] is RECOMMENDED when used in concert with ICE. usage of SIPS [2] is RECOMMENDED when used in concert with ICE.
9. Interactions with Forking 9. Interactions with Forking
SIP allows INVITE requests carrying offers to fork, which means that SIP allows INVITE requests carrying offers to fork, which means that
skipping to change at page 55, line 9 skipping to change at page 67, line 21
[7] and RFC 4032 [8], apply only to the IP addresses and ports listed [7] and RFC 4032 [8], apply only to the IP addresses and ports listed
in the m/c lines in an offer/answer. If ICE changes the address and in the m/c lines in an offer/answer. If ICE changes the address and
port where media is received, this change is reflected in the m/c port where media is received, this change is reflected in the m/c
lines of a new offer/answer. As such, it appears like any other re- lines of a new offer/answer. As such, it appears like any other re-
INVITE would, and is fully treated in RFC 3312 and 4032, which INVITE would, and is fully treated in RFC 3312 and 4032, which
applies without regard to the fact that the m/c lines are changing applies without regard to the fact that the m/c lines are changing
due to ICE negotiations occurring "in the background". due to ICE negotiations occurring "in the background".
However, usage of early candidates with QoS preconditions is NOT However, usage of early candidates with QoS preconditions is NOT
RECOMMENDED, since QoS will only be reserved for the candidate pair RECOMMENDED, since QoS will only be reserved for the candidate pair
in the m/c-line. An agent SHOULD only send to the active candidate in the m/c-line. An agent SHOULD only send to the operating
(once it enters the Valid or Recv-Valid states) if QoS preconditions candidate (once it enters the Valid or Recv-Valid states) if QoS
are used for a media session. preconditions are used for a media session.
ICE also has (purposeful) interactions with connectivity ICE also has (purposeful) interactions with connectivity
preconditions [30]. Those interactions are described there. preconditions [27]. Those interactions are described there.
11. Examples 11. Examples
This section provides two examples. One is a very basic example, and This section provides two examples. One is a very basic example, and
the other is more elaborate. A common configuration and setup is the other is more elaborate. A common configuration and setup is
used in both cases. used in both cases.
Two agents, L and R, are using ICE. Both agents have a single IPv4 Two agents, L and R, are using ICE. Both agents have a single IPv4
interface, and are configured with a single STUN server each (indeed, interface. For agent L, it is 10.0.1.1, and for agent R, 192.0.2.1.
the same one for each). This STUN server supports both the Binding Both are configured with a single STUN server each (indeed, the same
Discovery usage and the Relay usage. Agent L is behind a NAT, and one for each), which is listening for STUN requests at an IP address
agent R is on the public Internet. of 192.0.2.2 and port 3478. This STUN server supports both the
Binding Discovery usage and the Relay usage. Agent L is behind a
To facilitate understanding, transport addresses are listed in a NAT, and agent R is on the public Internet. The public side of the
mnemonic form. This form is entity-type-seqno, where entity refers NAT has an IP address of 192.0.2.3.
to the entity whose interface the transport address is on, and is one
of "L", "R", "STUN", or "NAT". The type is either "PUB" for
transport addresses that are public, or "PRIV" for transport
addresses that are private. Finally, seq-no is a sequence number
that is different for each transport address of the same type on a
particular entity.
The STUN server has advertised transport address STUN-PUB-1 for both To facilitate understanding, transport addresses are listed using
the binding discovery usage and the relay usage. variables that have mnemonic names. The format of the name is
entity-type-seqno, where entity refers to the entity whose interface
the transport address is on, and is one of "L", "R", "STUN", or
"NAT". The type is either "PUB" for transport addresses that are
public, or "PRIV" for transport addresses that are private.
Finally, seqno is a sequence number that is different for each
transport address of the same type on a particular entity. Each
variable has an IP address and port, denoted by varname.IP and
varname.PORT, respectively, where varname is the name of the
variable.
In addition, candidate IDs are also listed in mnemonic form. Agent L In addition, candidate IDs are also listed using variables that have
uses candidate ID L1 for its local candidate, L2 for its server mnemonic names. Agent L uses candidate ID L1 for its local
reflexive candidate, and L3 for its relayed candidate. Agent R uses candidate, L2 for its server reflexive candidate, and L3 for its
R1 for its local candidate and R2 for its relayed candidate. The relayed candidate. Agent R uses R1 for its local candidate and R2
password is LPASS for each candidate from agent L, and RPASS for each for its relayed candidate. The password is LPASS for each candidate
candidate from agent R. from agent L, and RPASS for each candidate from agent R.
In example SDP messages, $TADDR.IP is used to refer to the value of The STUN server has advertised transport address STUN-PUB-1 (which is
the IP address of the transport address with mnemonic name "taddr". 192.0.2.2:3478) for both the binding discovery usage and the relay
Similarly, $TADDR.PORT is used to refer to the value of the port of usage.
the transport address with mnemonic name "TADDR".
In the call flow itself, STUN messages are annotated with several In the call flow itself, STUN messages are annotated with several
attributes. The "S=" attribute indicates the source transport attributes. The "S=" attribute indicates the source transport
address of the message. The "D=" attribute indicates the destination address of the message. The "D=" attribute indicates the destination
transport address of the message. The "MA=" attribute is used in transport address of the message. The "MA=" attribute is used in
STUN Binding Response messages, STUN Binding Response messages STUN Binding Response messages, STUN Binding Response messages
carried in a STUN Send Request or Data Indication, and in an Allocate carried in a STUN Send Request or Data Indication, and in an Allocate
Response, and refers to the reflexive transport address derived from Response, and refers to the reflexive transport address derived from
the XOR-MAPPED-ADDRESS attribute. The "RA=" attribute is used in the XOR-MAPPED-ADDRESS attribute. The "RA=" attribute is used in
STUN Data Indications, and refers to the value of the REMOTE-ADDRESS STUN Data Indications, and refers to the value of the REMOTE-ADDRESS
attribute. The "U=" attribute is used in STUN Requests, and attribute. The "U=" attribute is used in STUN Requests, and
corresponds to the STUN USERNAME. The "DA=" attribute is used in corresponds to the STUN USERNAME. The "DA=" attribute is used in
STUN Send requests, and refers to the value of the DESTINATION- STUN Send requests, and refers to the value of the DESTINATION-
ADDRESS attribute. The "R=" attribute is used in Allocate responses, ADDRESS attribute. The "R=" attribute is used in Allocate responses,
and it indicates the value of the RELAY-ADDRESS attribute. and it indicates the value of the RELAY-ADDRESS attribute.
The call flow examples omit STUN authentication operations. The call flow examples omit STUN authentication operations.
11.1 Basic Example 11.1. Basic Example
In this example, the NAT has the address and port independent mapping In this example, the NAT has an endpoint independent mapping property
property and the address dependent permission property. Neither and an address dependent filtering property. Neither agent is using
agent is using the STUN relay usage, only the binding discovery the STUN relay usage, only the binding discovery usage. As a
usage. As a consequence, agent L will end up with two candidates - a consequence, agent L will end up with two candidates - a local
local candidate and a server reflexive candidate. Agent R will have candidate and a server reflexive candidate. Agent R will have one -
one - a local candidate (the reflexive candidate will be identical to a local candidate (the reflexive candidate will be identical to the
the local one, and thus discarded). The agents are seeking to local one, and thus discarded). The agents are seeking to
communicate using a single RTP-based voice stream. RTCP is not used. communicate using a single RTP-based voice stream. RTCP is not used.
As a consequence, each candidate has one component. As a consequence, each candidate has one component.
L NAT STUN R L NAT STUN R
| | | |
| | | |
| | | |
|RTP STUN alloc. | | |RTP STUN alloc. | |
| | | |
| | | |
| | | |
|(1) STUN Req | | | |(1) STUN Req | | |
|S=L-PRIV-1 | | | |S=$L-PRIV-1 | | |
|D=STUN-PUB-1 | | | |D=$STUN-PUB-1 | | |
|------------->| | | |------------->| | |
| | | |
| | | |
| |(2) STUN Req | | | |(2) STUN Req | |
| |S=NAT-PUB-1 | | | |S=$NAT-PUB-1 | |
| |D=STUN-PUB-1 | | | |D=$STUN-PUB-1 | |
| |------------->| | | |------------->| |
| | | |
| |(3) STUN Res | | | |(3) STUN Res | |
| |S=STUN-PUB-1 | | | |S=$STUN-PUB-1 | |
| |D=NAT-PUB-1 | | | |D=$NAT-PUB-1 | |
| |MA=NAT-PUB-1 | | | |MA=$NAT-PUB-1 | |
| |<-------------| | | |<-------------| |
| | | |
|(4) STUN Res | | | |(4) STUN Res | | |
|S=STUN-PUB-1 | | | |S=$STUN-PUB-1 | | |
|D=L-PRIV-1 | | | |D=$L-PRIV-1 | | |
|MA=NAT-PUB-1 | | | |MA=$NAT-PUB-1 | | |
|<-------------| | | |<-------------| | |
| | | |
| | | |
| | | |
| | | |
|(5) Offer | | | |(5) Offer | | |
|------------------------------------------->| |------------------------------------------->|
| | | |
| | | |
| | | |
| | | |
| | | |RTP STUN alloc. | | | |RTP STUN alloc.
| | | |
| | | |
| | | |
| | |(6) STUN Req | | | |(6) STUN Req |
| | |S=R-PUB-1 | | | |S=$R-PUB-1 |
| | |D=STUN-PUB-1 | | | |D=$STUN-PUB-1 |
| | |<-------------| | | |<-------------|
| | | |
| | |(7) STUN Res | | | |(7) STUN Res |
| | |S=STUN-PUB-1 | | | |S=$STUN-PUB-1 |
| | |D=R-PUB-1 | | | |D=$R-PUB-1 |
| | |MA=R-PUB-1 | | | |MA=$R-PUB-1 |
| | |------------->| | | |------------->|
| | | |
| | | |
| | | |
| | | |
|(8) answer | | | |(8) answer | | |
|<-------------------------------------------| |<-------------------------------------------|
| | | | | |(9) Bind Req | |
| | | | | |S=$R-PUB-1 | |
|(9) Bind Req | | | | |D=$NAT-PUB-1 | |
|S=L-PRIV-1 | | | | |<----------------------------|
|D=R-PUB-1 | | | | |Dropped | |
|(10) Bind Req | | |
|S=$L-PRIV-1 | | |
|D=$R-PUB-1 | | |
|------------->| | | |------------->| | |
| | | | | |(11) Bind Req | |
| | | | | |S=$NAT-PUB-1 | |
| |(10) Bind Req | | | |D=$R-PUB-1 | |
| |S=NAT-PUB-1 | |
| |D=R-PUB-1 | |
| |---------------------------->| | |---------------------------->|
| | | | | |(12) Bind Res | |
| |(11) Bind Res | | | |S=$R-PUB-1 | |
| |S=R-PUB-1 | | | |D=$NAT-PUB-1 | |
| |D=NAT-PUB-1 | | | |MA=$NAT-PUB-1 | |
| |MA=NAT-PUB-1 | |
| |<----------------------------| | |<----------------------------|
| | | | |(13) Bind Res | | |
|(12) Bind Res | | | |S=$R-PUB-1 | | |
|S=R-PUB-1 | | | |D=$L-PRIV-1 | | |
|D=L-PRIV-1 | | | |MA=$NAT-PUB-1 | | |
|MA=NAT-PUB-1 | | |
|<-------------| | | |<-------------| | |
| | | |
| | | |
| | | |
| | | |
|RTP flows | | | |RTP flows | | |
| | | | | |(14) Bind Req | |
| | | | | |S=$R-PUB-1 | |
| | | | | |D=$NAT-PUB-1 | |
| |(13) Bind Req | |
| |S=R-PUB-1 | |
| |D=NAT-PUB-1 | |
| |<----------------------------| | |<----------------------------|
| | | | |(15) Bind Req | | |
| | | | |S=$R-PUB-1 | | |
|(14) Bind Req | | | |D=$L-PRIV-1 | | |
|S=R-PUB-1 | | |
|D=L-PRIV-1 | | |
|<-------------| | | |<-------------| | |
| | | | |(16) Bind Res | | |
|(15) Bind Res | | | |S=$L-PRIV-1 | | |
|S=L-PRIV-1 | | | |D=$R-PUB-1 | | |
|D=R-PUB-1 | | | |MA=$R-PUB-1 | | |
|MA=R-PUB-1 | | |
|------------->| | | |------------->| | |
| | | | | |(17) Bind Res | |
| |(16) Bind Res | | | |S=$NAT-PUB-1 | |
| |S=NAT-PUB-1 | | | |D=$R-PUB-1 | |
| |D=R-PUB-1 | | | |MA=$R-PUB-1 | |
| |MA=R-PUB-1 | |
| |---------------------------->| | |---------------------------->|
| | | |
| | | |
| | | |
| | | |
| | | |RTP flows | | | |RTP flows
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
Figure 8 Figure 12
First, agent L obtains a server reflexive transport address for its First, agent L obtains a server reflexive transport address for its
RTP packets (messages 1-4). Recall that the NAT has the address and RTP packets (messages 1-4). Recall that the NAT has the address and
port independent mapping property. Here, it creates a binding of port independent mapping property. Here, it creates a binding of
NAT-PUB-1 for this UDP request, and this becomes the server reflexive NAT-PUB-1 for this UDP request, and this becomes the server reflexive
transport address for RTP, the sole component of its server reflexive transport address for RTP, the sole component of its server reflexive
candidate. candidate.
With its two candidates, agent L prioritizes them, choosing the local With its two candidates, agent L prioritizes them, choosing the local
candidate as highest priority, followed by the server reflexive candidate as highest priority, followed by the server reflexive
candidate. It chooses its server reflexive candidate as the active candidate. It chooses its server reflexive candidate as the
candidate, and encodes it into the m/c-line. The resulting offer operating candidate, and encodes it into the m/c-line. The resulting
(message 5) looks like: offer (message 5) looks like (lines folded for clarity):
v=0 v=0
o=jdoe 2890844526 2890842807 IN IP4 $L-PRIV-1.IP o=jdoe 2890844526 2890842807 IN IP4 $L-PRIV-1.IP
s= s=
c=IN IP4 $STUN-PUB-1.IP c=IN IP4 $NAT-PUB-1.IP
t=0 0 t=0 0
a=ice-pwd:$LPASS a=ice-pwd:$LPASS
m=audio $STUN-PUB-1.PORT RTP/AVP 0 m=audio $NAT-PUB-1.PORT RTP/AVP 0
a=rtpmap:0 PCMU/8000 a=rtpmap:0 PCMU/8000
a=candidate $L1 1 UDP 1.0 $L-PRIV-1.IP $L-PRIV-1.PORT a=candidate:$L1 1 UDP 1.0 $L-PRIV-1.IP $L-PRIV-1.PORT typ local
a=candidate $L2 1 UDP 0.7 $NAT-PUB-1.IP $NAT-PUB-1.PORT a=candidate:$L2 1 UDP 0.7 $NAT-PUB-1.IP $NAT-PUB-1.PORT typ srflx raddr
$L-PRIV-1.IP rport $L-PRIV-1.PORT
The offer, with the variables replaced with their values, will look
like (lines folded for clarity):
v=0
o=jdoe 2890844526 2890842807 IN IP4 10.0.1.1
s=
c=IN IP4 192.0.2.3
t=0 0
a=ice-pwd:asd88fgpdd777uzjYhagZg
m=audio 45664 RTP/AVP 0
a=rtpmap:0 PCMU/8000
a=candidate:8hhY 1 UDP 1.0 10.0.1.1 8998 typ local
a=candidate:Bzo8 1 UDP 0.7 192.0.2.3 45664 typ srflx raddr
10.0.1.1 rport 8998
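The replacement of variables with their values can be sketched as follows. This is an illustrative sketch only. The helper name expand, the regular expression, and the two lookup tables are assumptions made for this example. The addresses, ports, password and candidate IDs are the ones shown in the offer above, and the folded candidate line is written unfolded.

```python
import re

# Transport address variables: name -> (IP, port), as used in the offer above.
TRANSPORT_ADDRESSES = {
    "L-PRIV-1": ("10.0.1.1", 8998),
    "NAT-PUB-1": ("192.0.2.3", 45664),
}
# Variables with a single value (password and candidate IDs).
SCALARS = {"LPASS": "asd88fgpdd777uzjYhagZg", "L1": "8hhY", "L2": "Bzo8"}

# Matches $NAME, $NAME.IP and $NAME.PORT.
VARIABLE = re.compile(r"\$([A-Z][A-Z0-9-]*)(?:\.(IP|PORT))?")


def expand(line):
    """Replace every variable reference in one SDP line with its value."""
    def substitute(match):
        name, field = match.group(1), match.group(2)
        if field is None:
            return SCALARS[name]
        ip, port = TRANSPORT_ADDRESSES[name]
        return ip if field == "IP" else str(port)
    return VARIABLE.sub(substitute, line)


OFFER_TEMPLATE = [
    "v=0",
    "o=jdoe 2890844526 2890842807 IN IP4 $L-PRIV-1.IP",
    "s=",
    "c=IN IP4 $NAT-PUB-1.IP",
    "t=0 0",
    "a=ice-pwd:$LPASS",
    "m=audio $NAT-PUB-1.PORT RTP/AVP 0",
    "a=rtpmap:0 PCMU/8000",
    "a=candidate:$L1 1 UDP 1.0 $L-PRIV-1.IP $L-PRIV-1.PORT typ local",
    "a=candidate:$L2 1 UDP 0.7 $NAT-PUB-1.IP $NAT-PUB-1.PORT typ srflx"
    " raddr $L-PRIV-1.IP rport $L-PRIV-1.PORT",
]

expanded_offer = [expand(line) for line in OFFER_TEMPLATE]
```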
This offer is received at agent R. Agent R will gather its server This offer is received at agent R. Agent R will gather its server
reflexive transport address (messages 6-7). Since R is not behind a reflexive transport address (messages 6-7). Since R is not behind a
NAT, this address is identical to its local transport address, and NAT, this address is identical to its local transport address, and
thus does not represent a separate candidate. It therefore ends up was obtained from its local transport address, and thus does not
with a single local candidate with a single component for RTP. Its represent a separate candidate. It therefore ends up with a single
resulting answer looks like: local candidate with a single component for RTP. Its resulting
answer looks like:
v=0 v=0
o=bob 2808844564 2808844564 IN IP4 $R-PUB-1.IP o=bob 2808844564 2808844564 IN IP4 $R-PUB-1.IP
s= s=
c=IN IP4 $R-PUB-1.IP c=IN IP4 $R-PUB-1.IP
t=0 0 t=0 0
a=ice-pwd:$RPASS a=ice-pwd:$RPASS
m=audio $R-PUB-1.PORT RTP/AVP 0 m=audio $R-PUB-1.PORT RTP/AVP 0
a=rtpmap:0 PCMU/8000 a=rtpmap:0 PCMU/8000
a=candidate $R1 1 UDP 1.0 $R-PUB-1.IP $R-PUB-1.PORT a=candidate:$R1 1 UDP 1.0 $R-PUB-1.IP $R-PUB-1.PORT typ local
Next, agents L and R form candidate pairs and the transport address With the variables filled in:
check ordered list. This list will start with the single component
in the currently active candidate pair, L2:1:R1:1. Agent L begins v=0
its connectivity checks (messages 9-12), which succeed, placing the o=bob 2808844564 2808844564 IN IP4 192.0.2.1
transport address pair and resulting candidate pair into the Recv- s=
Valid state. Media can now flow. When agent R receives this request c=IN IP4 192.0.2.1
(message 10), the state of the candidate pair moves to Send-Valid. t=0 0
Agent R begins its connectivity checks (messages 13-16). When the a=ice-pwd:YH75Fviy6338Vbrhrlp8Yh
check arrives at the NAT (message 13), it is permitted to pass since m=audio 3478 RTP/AVP 0
a permission was created towards $R-PUB-1 as a consequence of message a=rtpmap:0 PCMU/8000
a=candidate:9uB6 1 UDP 1.0 192.0.2.1 3478 typ local
Next, agents L and R form candidate pairs, the candidate pair
priority ordered list and transport address pair check ordered list.
The candidate pair priority ordered list will have two entries, and
be identical for L and R. The highest priority one will be the one
containing L2 and R1 (since it's the operating candidate pair), and
the second one will be L1 and R1. The transport address pair check
ordered list initially starts with two entries. For agent L, this
will be L2:1:R1:1 and L1:1:R1:1. However, after the trimming
operation, agent L will remove the second transport address pair,
since it shares the same origination transport address as the first
(L-PRIV-1 for both). However, R will keep both transport address
pairs.
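The trimming outcome described above can be sketched as follows. This is an illustrative sketch only. The text mentions only the shared origination transport address. Keying the comparison on the remote transport address as well is an assumption of this example, chosen because it reproduces the stated result that L drops one pair while R keeps both. The tuple layout and the function name trim are likewise hypothetical.

```python
def trim(check_list):
    """Keep the first (highest priority) pair for each combination of
    origination and remote transport address; drop later duplicates.

    check_list holds (pair_id, origination, remote) tuples in priority
    order. The (origination, remote) key is an assumption of this sketch.
    """
    seen = set()
    kept = []
    for pair_id, origination, remote in check_list:
        key = (origination, remote)
        if key not in seen:
            seen.add(key)
            kept.append(pair_id)
    return kept


# Agent L sends from L-PRIV-1 for both L1 and L2, always towards R-PUB-1.
agent_l = [
    ("L2:1:R1:1", "L-PRIV-1", "R-PUB-1"),
    ("L1:1:R1:1", "L-PRIV-1", "R-PUB-1"),
]
# Agent R sends from R-PUB-1, but towards two different remote addresses.
agent_r = [
    ("R1:1:L2:1", "R-PUB-1", "NAT-PUB-1"),
    ("R1:1:L1:1", "R-PUB-1", "L-PRIV-1"),
]

trimmed_l = trim(agent_l)
trimmed_r = trim(agent_r)
```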
Agent R begins its connectivity check (message 9) for transport
address pair L2:1:R1:1 (note that, from its perspective, the
transport address pair has the ID R1:1:L2:1, and this ID would appear
in the USERNAME of STUN requests it receives). Since the NAT has a
filtering policy of address dependent, the connectivity check is
discarded.
When agent L gets the answer, it begins its connectivity check for
L2:1:R1:1 (messages 10-13), which succeeds, placing the transport
address pair and resulting candidate pair into the Recv-Valid state.
L can now send media to R. When agent R receives the connectivity
check (message 11), it is a match for the transport address pair, and
the state of the transport address pair moves to Send-Valid. Agent R
begins its connectivity checks (messages 14-17). When the check
arrives at the NAT (message 14), it is permitted to pass since a
permission was created towards R-PUB-1 as a consequence of message
10. This check arrives at agent L, which generates a success 10. This check arrives at agent L, which generates a success
response (message 11), and updates the state of the candidate pair to response (message 16), and updates the state of the transport address
Valid. This response arrives at agent R, which also updates the pair to Valid. This response arrives at agent R, which also updates
state of the candidate pair to valid. Now, media can flow from agent the state of the transport address pair to Valid. Now, media can
R to agent L as well. flow from agent R to agent L as well.
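The effect of address dependent filtering on the checks above can be sketched as follows. This is an illustrative sketch only. It models a single internal host and tracks only which remote IP addresses that host has sent to. The class name and method names are assumptions made for this example, and the message numbers in the comments follow the newer numbering in the text.

```python
class AddressDependentFilterNat:
    """Toy NAT that admits an inbound packet only if the internal host
    has previously sent a packet to the packet's source IP address."""

    def __init__(self):
        self.permitted_remote_ips = set()

    def outbound(self, destination_ip):
        # Sending a packet creates a permission towards that IP address.
        self.permitted_remote_ips.add(destination_ip)

    def inbound_allowed(self, source_ip):
        return source_ip in self.permitted_remote_ips


STUN_SERVER_IP = "192.0.2.2"
AGENT_R_IP = "192.0.2.1"

nat = AddressDependentFilterNat()

# Messages 1-2: agent L queries the STUN server.
nat.outbound(STUN_SERVER_IP)

# Message 9: R's first check arrives before L has sent anything to R.
first_check_passes = nat.inbound_allowed(AGENT_R_IP)

# Message 10: L's own check towards R creates the permission.
nat.outbound(AGENT_R_IP)

# Message 14: R's next check now passes the NAT.
second_check_passes = nat.inbound_allowed(AGENT_R_IP)
```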
11.2 Advanced Example 11.2. Advanced Example
In this more advanced example, the NAT has address and port dependent In this more advanced example, the NAT has address and port dependent
mapping and filtering properties. Both agents use the STUN relay mapping and filtering properties. Both agents use the STUN relay
usage in addition to the binding discovery usage. As a consequence, usage in addition to the binding discovery usage. As a consequence,
agent L will end up with three candidates - a local candidate, a agent L will end up with three candidates - a local candidate, a
relayed candidate, and a server reflexive candidate. Agent R will relayed candidate, and a server reflexive candidate. Agent R will
have two - a local candidate and a relayed candidate (the server have two - a local candidate and a relayed candidate (the server
reflexive candidate will equal the local candidate and thus not be reflexive candidate will equal the local candidate and thus not be
used). The agents are seeking to communicate using a single RTP- used). The agents are seeking to communicate using a single RTP-
based voice stream, but are using RTCP. As a consequence, each based voice stream, but are using RTCP. As a consequence, each
skipping to change at page 73, line 27 skipping to change at page 85, line 46
| | | | | | | |
|(66) Answer | | | |(66) Answer | | |
|<-------------------------------------------| |<-------------------------------------------|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
Figure 11 Figure 17
First, agent L obtains both server reflexive and relayed transport First, agent L obtains both server reflexive and relayed transport
addresses for its RTP packets, using a STUN Allocate request, which addresses for its RTP packets, using a STUN Allocate request, which
will provide it with both types of addresses (messages 1-4). Recall will provide it with both types of addresses (messages 1-4). Recall
that the NAT has the address and port dependent mapping property. that the NAT has the address and port dependent mapping property.
Here, it creates a binding of NAT-PUB-1 for this UDP request, and Here, it creates a binding of NAT-PUB-1 for this UDP request, and
this becomes the server reflexive transport address for RTP. The this becomes the server reflexive transport address for RTP. The
relayed transport address is STUN-PUB-2, allocated by the STUN relayed transport address is STUN-PUB-2, allocated by the STUN
server. Agent L repeats this process for RTCP (messages 5-8) Ta server. Agent L repeats this process for RTCP (messages 5-8) Ta
seconds later, and obtains NAT-PUB-2 as its server reflexive seconds later, and obtains NAT-PUB-2 as its server reflexive
transport address for RTCP and STUN-PUB-3 for its relayed transport transport address for RTCP and STUN-PUB-3 for its relayed transport
address. address.
With its three candidates, agent L prioritizes them, choosing the With its three candidates, agent L prioritizes them, choosing the
local candidate as highest priority, followed by the server reflexive local candidate as highest priority, followed by the server reflexive
candidate, followed by the relayed candidate. It chooses its relayed candidate, followed by the relayed candidate. It chooses its relayed
candidate as the active candidate, and encodes it into the m/c-line. candidate as the operating candidate, and encodes it into the m/c-
The resulting offer (message 17) looks like: line. The resulting offer (message 17) looks like:
v=0 v=0
o=jdoe 2890844526 2890842807 IN IP4 $L-PRIV-1.IP o=jdoe 2890844526 2890842807 IN IP4 $L-PRIV-1.IP
s= s=
c=IN IP4 $STUN-PUB-2.IP c=IN IP4 $STUN-PUB-2.IP
t=0 0 t=0 0
a=ice-pwd:$LPASS a=ice-pwd:$LPASS
m=audio $STUN-PUB-2.PORT RTP/AVP 0 m=audio $STUN-PUB-2.PORT RTP/AVP 0
a=rtpmap:0 PCMU/8000 a=rtpmap:0 PCMU/8000
a=rtcp:$STUN-PUB-3.PORT a=rtcp:$STUN-PUB-3.PORT
a=candidate $L1 1 UDP 1.0 $L-PRIV-1.IP $L-PRIV-1.PORT a=candidate:$L1 1 UDP 1.0 $L-PRIV-1.IP $L-PRIV-1.PORT
a=candidate $L1 2 UDP 1.0 $L-PRIV-2.IP $L-PRIV-2.PORT a=candidate:$L1 2 UDP 1.0 $L-PRIV-2.IP $L-PRIV-2.PORT
a=candidate $L2 1 UDP 0.7 $NAT-PUB-1.IP $NAT-PUB-1.PORT a=candidate:$L2 1 UDP 0.7 $NAT-PUB-1.IP $NAT-PUB-1.PORT
a=candidate $L2 2 UDP 0.7 $NAT-PUB-2.IP $NAT-PUB-2.PORT a=candidate:$L2 2 UDP 0.7 $NAT-PUB-2.IP $NAT-PUB-2.PORT
a=candidate $L3 1 UDP 0.3 $STUN-PUB-2.IP $STUN-PUB-2.PORT a=candidate:$L3 1 UDP 0.3 $STUN-PUB-2.IP $STUN-PUB-2.PORT
a=candidate $L3 2 UDP 0.3 $STUN-PUB-3.IP $STUN-PUB-3.PORT a=candidate:$L3 2 UDP 0.3 $STUN-PUB-3.IP $STUN-PUB-3.PORT
This offer is received at agent R. Agent R will gather its server This offer is received at agent R. Agent R will gather its server
reflexive and relayed transport addresses for RTP from an Allocate reflexive and relayed transport addresses for RTP from an Allocate
request (messages 10-11). Since the server reflexive transport request (messages 10-11). Since the server reflexive transport
address matches its local transport address, no separate candidate is address matches its local transport address, no separate candidate is
used for it. The agent then gathers its server reflexive and relayed used for it. The agent then gathers its server reflexive and relayed
transport addresses for RTCP (messages 12-13). It prioritizes the transport addresses for RTCP (messages 12-13). It prioritizes the
local candidate with higher priority than the relayed candidate, and local candidate with higher priority than the relayed candidate, and
selects the relayed candidate as the active candidate. Its resulting selects the relayed candidate as the operating candidate. Its
answer looks like: resulting answer looks like:
v=0 v=0
o=bob 2808844564 2808844564 IN IP4 $R-PUB-1.IP o=bob 2808844564 2808844564 IN IP4 $R-PUB-1.IP
s= s=
c=IN IP4 $STUN-PUB-4.IP c=IN IP4 $STUN-PUB-4.IP
t=0 0 t=0 0
a=ice-pwd:$RPASS a=ice-pwd:$RPASS
m=audio $STUN-PUB-4.PORT RTP/AVP 0 m=audio $STUN-PUB-4.PORT RTP/AVP 0
a=rtpmap:0 PCMU/8000 a=rtpmap:0 PCMU/8000
a=rtcp:$STUN-PUB-5.PORT a=rtcp:$STUN-PUB-5.PORT
a=candidate $R1 1 UDP 1.0 $R-PUB-1.IP $R-PUB-1.PORT a=candidate:$R1 1 UDP 1.0 $R-PUB-1.IP $R-PUB-1.PORT
a=candidate $R1 2 UDP 1.0 $R-PUB-2.IP $R-PUB-2.PORT a=candidate:$R1 2 UDP 1.0 $R-PUB-2.IP $R-PUB-2.PORT
a=candidate $R2 1 UDP 0.3 $STUN-PUB-4.IP $STUN-PUB-4.PORT a=candidate:$R2 1 UDP 0.3 $STUN-PUB-4.IP $STUN-PUB-4.PORT
a=candidate $R2 2 UDP 0.3 $STUN-PUB-5.IP $STUN-PUB-5.PORT a=candidate:$R2 2 UDP 0.3 $STUN-PUB-5.IP $STUN-PUB-5.PORT
Next, agents L and R form candidate pairs and the transport address Next, agents L and R form candidate pairs and the transport address
check ordered list. This list will start with the two components in pair check ordered list. This list will start with the two
the currently active candidate pair - relayed candidates. Agent R components in the currently operating candidate pair - relayed
begins its checks (message 15). It will check connectivity between candidates. Agent R begins its checks (message 15). It will check
the active candidate pair, starting with the first component, which connectivity between the operating candidate pair, starting with the
is STUN-PUB-4 for agent R and STUN-PUB-2 for agent L. The state first component, which is STUN-PUB-4 for agent R and STUN-PUB-2 for
machine for that transport address pair moves to the Testing state. agent L. The state machine for that transport address pair moves to
the Testing state. Since this is a relayed transport address for
Since this is a relayed transport address for agent R, it utilizes agent R, it utilizes the STUN Send Indication to deliver the Binding
the STUN Send Indication to deliver the Binding Request. The Request. The DESTINATION-ADDRESS is STUN-PUB-2.
DESTINATION-ADDRESS is STUN-PUB-2.
The STUN server will extract the content of the Send indication, The STUN server will extract the content of the Send indication,
which is a STUN Binding Request, and deliver it to the destination, which is a STUN Binding Request, and deliver it to the destination,
STUN-PUB-2. This request will be sent from the relayed address STUN-PUB-2. This request will be sent from the relayed address
allocated to R, which is STUN-PUB-4. As both interfaces are on the allocated to R, which is STUN-PUB-4. As both interfaces are on the
STUN server, this message is sent to itself (and thus the lack of a STUN server, this message is sent to itself (and thus the lack of a
message number in the sequence diagram above). Note that the message number in the sequence diagram above). Note that the
USERNAME in the Binding Request is L3:1:R2:1, which represents the USERNAME in the Binding Request is L3:1:R2:1, which represents the
transport address pair ID. This message gets discarded by the STUN transport address pair ID. This message gets discarded by the STUN
server since, as of yet, there are no permissions established for the server since, as of yet, there are no permissions established for the
STUN-PUB-2 allocation. However, it did have the side effect of STUN-PUB-2 allocation. However, it did have the side effect of
establishing a permission on the STUN-PUB-4 binding, allowing establishing a permission on the STUN-PUB-4 binding, allowing
incoming packets from STUN-PUB-2. incoming packets from STUN-PUB-2.
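The USERNAME values in this walkthrough appear to follow one pattern: the peer's candidate ID and component ID, then the sender's own candidate ID and component ID, joined by colons. The following is a minimal sketch of that pattern in Python. The function name and argument names are hypothetical, and the layout is inferred from the examples in this call flow rather than taken from a normative definition.

```python
def check_username(peer_id: str, peer_component: int,
                   own_id: str, own_component: int) -> str:
    """Build the USERNAME for a Binding Request sent by the local agent.

    Assumed layout (inferred from the examples in this section):
    <peer candidate ID>:<peer component>:<own candidate ID>:<own component>
    """
    return f"{peer_id}:{peer_component}:{own_id}:{own_component}"


# Agent R checking its relayed candidate R2 against L's relayed candidate L3.
username_from_r = check_username("L3", 1, "R2", 1)

# The same transport address pair, as checked from agent L, is the flip.
username_from_l = check_username("R2", 1, "L3", 1)
```

Under this assumption, each agent can derive the USERNAME it expects to receive by swapping the two halves of the one it sends.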
Once L gets the offer, it will attempt to validate the first Once L gets the offer, it will attempt to validate the first
transport address pair in the transport address pair check ordered transport address pair in the transport address pair check ordered
list, which will be the active candidate. The state machine for this list, which will be the operating candidate. The state machine for
transport address pair moves into the Testing state. Like agent R this transport address pair moves into the Testing state. Like agent
did, it will use the STUN Send Indication to send a STUN Binding R did, it will use the STUN Send Indication to send a STUN Binding
Request from its relayed transport address, STUN-PUB-2, to STUN-PUB-4 Request from its relayed transport address, STUN-PUB-2, to STUN-PUB-4
(message 16). This packet traverses the NAT (message 17) and arrives (message 16). This packet traverses the NAT (message 17) and arrives
at the STUN server. The STUN server will unwrap the contents of the at the STUN server. The STUN server will unwrap the contents of the
packet and send them from STUN-PUB-2 to STUN-PUB-4. It will also, as packet and send them from STUN-PUB-2 to STUN-PUB-4. It will also, as
a consequence, add a permission for STUN-PUB-4. The contents of the a consequence, add a permission for STUN-PUB-4. The contents of the
packet are a STUN Binding Request with USERNAME R2:1:L3:1 (note how packet are a STUN Binding Request with USERNAME R2:1:L3:1 (note how
this is the flip of the USERNAME in the Binding Request sent by agent this is the flip of the USERNAME in the Binding Request sent by agent
R). This is also a packet from the STUN server to itself. However, R). This is also a packet from the STUN server to itself. However,
now, the packet is not discarded, as a permission had been installed now, the packet is not discarded, as a permission had been installed
as a consequence of the "suicide packet" from agent R (a suicide as a consequence of the "suicide packet" from agent R (a suicide
skipping to change at page 77, line 8 skipping to change at page 89, line 29
(message 25). This Send Indication traverses the NAT (message 26) (message 25). This Send Indication traverses the NAT (message 26)
and is received by the STUN server. Its contents are decapsulated, and is received by the STUN server. Its contents are decapsulated,
and sent to STUN-PUB-4, which is again a loop on the same host. This and sent to STUN-PUB-4, which is again a loop on the same host. This
packet is then sent towards agent R in a Data Indication (message packet is then sent towards agent R in a Data Indication (message
27). The contents of the DATA Indication are extracted, and the 27). The contents of the DATA Indication are extracted, and the
agent sees a successful Binding Response. It therefore moves the agent sees a successful Binding Response. It therefore moves the
state machine from the Send-Valid state to the Valid state. At this state machine from the Send-Valid state to the Valid state. At this
point, the transport address pair is in the Valid state for both point, the transport address pair is in the Valid state for both
agents. agents.
Approximately Ta seconds after agent R sent message 15, agent R will Approximately Tb seconds after agent R sent message 15, agent R will
start checks for the next transport address pair in its transport start checks for the next transport address pair in its transport
address pair check ordered list. This is the second component of the address pair check ordered list. This is the second component of the
same candidate pair, used for RTCP. This sequence, messages 28 same candidate pair, used for RTCP. This sequence, messages 28
through 40, is identical to the one for RTP, differing only in the through 40, is identical to the one for RTP, differing only in the
specific transport addresses. specific transport addresses.
Once that validation happens, the second transport address pair has Once that validation happens, the second transport address pair has
been validated. The candidate pair moves into the valid state, and been validated. The candidate pair moves into the valid state, and
both candidates are considered valid. The active candidate has now both candidates are considered valid. The operating candidate has
been validated, and media can begin to flow. It will do so through now been validated, and media can begin to flow. It will do so
the STUN server; indeed, it is relayed "twice" through the STUN through the STUN server; indeed, it is relayed "twice" through the
server. Even though there is a single STUN server, it is logically STUN server. Even though there is a single STUN server, it is
acting as two separate STUN servers. Indeed, had L and R used two logically acting as two separate STUN servers. Indeed, had L and R
separate STUN servers, media would be relayed through both STUN used two separate STUN servers, media would be relayed through both
servers in a trapezoid configuration. STUN servers in a trapezoid configuration.
The actual media flows are shown as well. It is important to note The actual media flows are shown as well. It is important to note
that, since the ICE checks have not yet concluded on the candidate that, since the ICE checks have not yet concluded on the candidate
that will ultimately be used, no STUN Set Active Destinations have that will ultimately be used, no STUN Set Active Destinations have
been sent. As a consequence, media that is sent through the STUN been sent. As a consequence, media that is sent through the STUN
servers has to be sent using STUN Send indications. This introduces servers has to be sent using STUN Send indications. This introduces
some overhead, but is a transient condition. In message 41, agent L some overhead, but is a transient condition. In message 41, agent L
sends an RTP packet to agent R using a Send indication. It is sent sends an RTP packet to agent R using a Send indication. It is sent
to STUN-PUB-4. This traverses the NAT (message 42), and arrives at to STUN-PUB-4. This traverses the NAT (message 42), and arrives at
the STUN server. It is decapsulated, looped to itself, and arrives the STUN server. It is decapsulated, looped to itself, and arrives
skipping to change at page 77, line 45 skipping to change at page 90, line 18
and sent to agent R (message 43). In the reverse direction, agent R and sent to agent R (message 43). In the reverse direction, agent R
will send an RTP packet using a STUN Send indication (message 44), will send an RTP packet using a STUN Send indication (message 44),
and send it to STUN-PUB-2. This is received by the STUN server, and send it to STUN-PUB-2. This is received by the STUN server,
decapsulated, and sent to STUN-PUB-2 from STUN-PUB-4. This is again decapsulated, and sent to STUN-PUB-2 from STUN-PUB-4. This is again
a loop within the same host, arriving at STUN-PUB-2. The contents of a loop within the same host, arriving at STUN-PUB-2. The contents of
the packet are sent to agent L through a STUN Data Indication the packet are sent to agent L through a STUN Data Indication
(message 45), which traverses the NAT (message 46) to arrive at agent (message 45), which traverses the NAT (message 46) to arrive at agent
L. Since this call flow is already long enough, RTCP packet L. Since this call flow is already long enough, RTCP packet
transmission is not shown. transmission is not shown.
Approximately Ta seconds after it sends message 29, agent L goes to Approximately Tb seconds after it sends message 29, agent L goes to
the next transport address pair in its transport address pair check the next transport address pair in its transport address pair check
ordered list that is in the Waiting state. This will be the RTP ordered list that is in the Waiting state. This will be the RTP
candidate for the top priority candidate pair, which is L-PRIV-1 on candidate for the top priority candidate pair, which is L-PRIV-1 on
agent L and R-PUB-1 on agent R. This is a local candidate for each agent L and R-PUB-1 on agent R. This is a local candidate for each
agent. To perform the check, agent L sends a STUN Binding Request agent. To perform the check, agent L sends a STUN Binding Request
from L-PRIV-1 to R-PUB-1 (message 47). Note the USERNAME of from L-PRIV-1 to R-PUB-1 (message 47). Note the USERNAME of
R1:1:L1:1, which identifies this transport address pair. This R1:1:L1:1, which identifies this transport address pair. This
traverses the NAT (message 48). Since the NAT has the address and traverses the NAT (message 48). Since the NAT has the address and
port dependent mapping property, and this is a new destination IP port dependent mapping property, and this is a new destination IP
address, the NAT allocates a new transport address on its public address, the NAT allocates a new transport address on its public
skipping to change at page 78, line 30 skipping to change at page 90, line 51
network. network.
Now, as a consequence of receiving message 48, agent R will have Now, as a consequence of receiving message 48, agent R will have
constructed a peer-derived candidate. The candidate ID for this constructed a peer-derived candidate. The candidate ID for this
candidate is L1R1, and it initially contains a single transport candidate is L1R1, and it initially contains a single transport
address pair, NAT-PUB-3 and R-PUB-1. However, the candidate isn't address pair, NAT-PUB-3 and R-PUB-1. However, the candidate isn't
yet usable until the other component gets added. Similarly, agent L yet usable until the other component gets added. Similarly, agent L
will have constructed the same peer-derived candidate, with the same will have constructed the same peer-derived candidate, with the same
candidate ID and the same transport address pair. candidate ID and the same transport address pair.
Some Ta seconds after sending message 28, agent R will move to the Some Tb seconds after sending message 28, agent R will move to the
next transport address pair in the transport address pair check next transport address pair in the transport address pair check
ordered list whose state is Waiting. This is the RTCP component of ordered list whose state is Waiting. This is the RTCP component of
the highest priority candidate pair. It will attempt a connectivity the highest priority candidate pair. It will attempt a connectivity
check, from R-PUB-2 to L-PRIV-2 (message 52). Since L-PRIV-2 is check, from R-PUB-2 to L-PRIV-2 (message 52). Since L-PRIV-2 is
private, this message is discarded. private, this message is discarded.
Some Ta seconds after sending message 47, agent L will move to the Some Tb seconds after sending message 47, agent L will move to the
next transport address pair in the transport address pair check next transport address pair in the transport address pair check
ordered list whose state is Waiting. This is the RTCP component of ordered list whose state is Waiting. This is the RTCP component of
the highest priority candidate pair. It will attempt a connectivity the highest priority candidate pair. It will attempt a connectivity
check, from L-PRIV-2 to R-PUB-2 (message 53), which operates nearly check, from L-PRIV-2 to R-PUB-2 (message 53), which operates nearly
identically to messages 47-50, with the exception of the specific identically to messages 47-50, with the exception of the specific
addresses. Here, the NAT will create a new binding for the RTCP, addresses. Here, the NAT will create a new binding for the RTCP,
NAT-PUB-4, and this transport address is new for both participants. NAT-PUB-4, and this transport address is new for both participants.
On receipt of this Binding Request at agent R (message 54), agent R On receipt of this Binding Request at agent R (message 54), agent R
constructs the candidate ID for the peer-derived candidate, L1R1, and constructs the candidate ID for the peer-derived candidate, L1R1, and
finds it already exists. As such, this new transport address is finds it already exists. As such, this new transport address is
added, and the peer-derived candidate becomes complete and usable. added, and the peer-derived candidate becomes complete and usable.
Agent L does the same thing on receipt of message 56. This candidate Agent L does the same thing on receipt of message 56. This candidate
will have the same priority as its generating candidate L1 (1.0), and will have the same priority as its generating candidate L1 (1.0), and
is paired up with R1 (also at priority 1.0). Since L1R1 has the same is paired up with R1 (also at priority 1.0). Since L1R1 has the same
priority as L1 itself, the ordering algorithm in Section 7.5 will use priority as L1 itself, the ordering algorithm in Section 7.5 will use
the reverse lexicographic order of the candidate ID itself to the reverse ASCII sort order of the candidate ID itself to determine
determine order. L1R1 is larger than L1, so that the peer-derived order. L1R1 is larger than L1, so that the peer-derived candidate
candidate will come before its generating candidate. As a will come before its generating candidate. As a consequence, the
consequence, the peer-derived candidate pair will have a higher peer-derived candidate pair will have a higher priority than its
priority than its generating candidate, and appear just before it in generating candidate, and appear just before it in the candidate pair
the candidate pair priority ordered list. priority ordered list.
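The tie-break described above can be sketched as a two-key sort: priority descending, then candidate ID in reverse ASCII order. This is an illustrative sketch only. The function name and the (candidate ID, priority) tuple layout are assumptions, and it covers just the tie-break discussed here, not the full ordering algorithm of Section 7.5.

```python
def order_candidates(candidates):
    """Order (candidate_id, priority) tuples for this example.

    Higher priority sorts first. Among equal priorities, the candidate ID
    that is larger in ASCII order sorts first. Python compares str values
    by code point, which matches ASCII order for base64-alphabet IDs.
    """
    return sorted(candidates, key=lambda c: (c[1], c[0]), reverse=True)


ordered = order_candidates([
    ("L1", 1.0),    # local candidate
    ("L3", 0.3),    # relayed candidate
    ("L1R1", 1.0),  # peer-derived candidate, same priority as L1
    ("L2", 0.7),    # server reflexive candidate
])
ordered_ids = [cid for cid, _ in ordered]
```

Because "L1" is a prefix of "L1R1", the longer string compares as larger, so the peer-derived candidate lands ahead of its generating candidate.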
As a consequence, after agent R sends message 55 and completes the As a consequence, after agent R sends message 55 and completes the
peer-derived candidate, it will move the two transport addresses in peer-derived candidate, it will move the two transport addresses in
the peer derived candidate into the Send-Valid state, and send a the peer derived candidate into the Send-Valid state, and send a
Binding Request for each in rapid succession (agent L will have moved Binding Request for each in rapid succession (agent L will have moved
both into the Recv-Valid state upon receipt of message 56). The both into the Recv-Valid state upon receipt of message 56). The
first of these connectivity checks is for the RTP component, from first of these connectivity checks is for the RTP component, from
R-PUB-1 to NAT-PUB-3 (message 57). Note the USERNAME in the STUN R-PUB-1 to NAT-PUB-3 (message 57). Note the USERNAME in the STUN
Binding Request, L1R1:1:R1:1, which identifies the peer-derived Binding Request, L1R1:1:R1:1, which identifies the peer-derived
transport address pair. This will successfully traverse the NAT and transport address pair. This will successfully traverse the NAT and
skipping to change at page 79, line 36 skipping to change at page 92, line 9
reflexive transport address, R-PUB-1, is not new to agent R and thus reflexive transport address, R-PUB-1, is not new to agent R and thus
does not result in the creation of a new peer-derived candidate. does not result in the creation of a new peer-derived candidate.
Messages 61 through 64 show the same basic flow for RTCP. Upon Messages 61 through 64 show the same basic flow for RTCP. Upon
receipt of message 64, both transport address pairs are Valid at both receipt of message 64, both transport address pairs are Valid at both
agents, causing the peer derived candidate to become valid. Timer agents, causing the peer derived candidate to become valid. Timer
Tws is set at agent L, and fires without any higher priority Tws is set at agent L, and fires without any higher priority
candidate pairs becoming validated. At agent R, media can now be candidate pairs becoming validated. At agent R, media can now be
sent on this candidate pair from answerer (agent R) to offerer (agent sent on this candidate pair from answerer (agent R) to offerer (agent
L). Agent L sends an updated offer to promote the peer-derived L). Agent L sends an updated offer to promote the peer-derived
candidate to active. This offer (message 65) looks like: candidate to operating. This offer (message 65) looks like:
v=0 v=0
o=jdoe 2890844526 2890842808 IN IP4 $L-PRIV-1.IP o=jdoe 2890844526 2890842808 IN IP4 $L-PRIV-1.IP
s= s=
c=IN IP4 $NAT-PUB-3.IP c=IN IP4 $NAT-PUB-3.IP
t=0 0 t=0 0
a=ice-pwd:$LPASS a=ice-pwd:$LPASS
m=audio $NAT-PUB-3.PORT RTP/AVP 0 m=audio $NAT-PUB-3.PORT RTP/AVP 0
a=rtpmap:0 PCMU/8000 a=rtpmap:0 PCMU/8000
a=rtcp:$NAT-PUB-4.PORT a=rtcp:$NAT-PUB-4.PORT
a=remote-candidate:R1 a=remote-candidate:R1
a=candidate $L1 1 UDP 1.0 $L-PRIV-1.IP $L-PRIV-1.PORT a=candidate:$L1 1 UDP 1.0 $L-PRIV-1.IP $L-PRIV-1.PORT
a=candidate $L1 2 UDP 1.0 $L-PRIV-2.IP $L-PRIV-2.PORT a=candidate:$L1 2 UDP 1.0 $L-PRIV-2.IP $L-PRIV-2.PORT
There are several important things to note in this offer. Firstly, There are several important things to note in this offer. Firstly,
note how the m/c-line now contains NAT-PUB-3 and NAT-PUB-4, the peer note how the m/c-line now contains NAT-PUB-3 and NAT-PUB-4, the peer
derived transport addresses it learned through the ICE processing. derived transport addresses it learned through the ICE processing.
Secondly, note how there remains a candidate encoded into the Secondly, note how there remains a candidate encoded into the
a=candidate attributes. This is candidate L1, NOT candidate L1R1. a=candidate attributes. This is candidate L1, NOT candidate L1R1.
Recall that the peer-derived candidates are never encoded into the Recall that the peer-derived candidates are never encoded into the
SDP. Rather, their generating candidate is encoded. This will cause SDP. Rather, their generating candidate is encoded. This will cause
keepalives to take place for the generating candidate if valid keepalives to take place for the generating candidate if valid
(though it's not) and any of its derived candidates, which is what we (though it's not) and any of its derived candidates, which is what we
want. Finally, notice the inclusion of the a=remote-candidate want. Finally, notice the inclusion of the a=remote-candidate
skipping to change at page 80, line 30 skipping to change at page 93, line 4
v=0 v=0
o=bob 2808844564 2808844565 IN IP4 $R-PUB-1.IP o=bob 2808844564 2808844565 IN IP4 $R-PUB-1.IP
s= s=
c=IN IP4 $R-PUB-1.IP c=IN IP4 $R-PUB-1.IP
t=0 0 t=0 0
a=ice-pwd:$RPASS a=ice-pwd:$RPASS
m=audio $R-PUB-1.PORT RTP/AVP 0 m=audio $R-PUB-1.PORT RTP/AVP 0
a=rtpmap:0 PCMU/8000 a=rtpmap:0 PCMU/8000
a=rtcp:$R-PUB-2.PORT a=rtcp:$R-PUB-2.PORT
a=candidate $R1 1 UDP 1.0 $R-PUB-1.IP $R-PUB-1.PORT a=candidate:$R1 1 UDP 1.0 $R-PUB-1.IP $R-PUB-1.PORT
a=candidate $R1 2 UDP 1.0 $R-PUB-2.IP $R-PUB-2.PORT a=candidate:$R1 2 UDP 1.0 $R-PUB-2.IP $R-PUB-2.PORT
With this, media can now flow directly between endpoints. The With this, media can now flow directly between endpoints. The
removal of the relayed candidates from the offer/answer exchange will removal of the relayed candidates from the offer/answer exchange will
cause the STUN relay allocations to be removed. cause the STUN relay allocations to be removed.
12. Grammar 12. Grammar
This specification defines three new SDP attributes - the This specification defines three new SDP attributes - the
"candidate", "remote-candidate" and "ice-pwd" attributes. "candidate", "remote-candidate" and "ice-pwd" attributes.
The candidate attribute is a media-level attribute only. It contains The candidate attribute is a media-level attribute only. It contains
a transport address for a candidate that can be used for connectivity a transport address for a candidate that can be used for connectivity
checks. There may be multiple candidate attributes in a media block. checks. There may be multiple candidate attributes in a media block.
There is no requirement that a=candidate attributes which indicate
components for the same candidate appear one right after the other or
in component ID order.
The syntax of this attribute is defined using Augmented BNF as The syntax of this attribute is defined using Augmented BNF as
defined in RFC 4234 [9]: defined in RFC 4234 [9]:
candidate-attribute = "candidate" ":" candidate-id SP component-id SP candidate-attribute = "candidate" ":" candidate-id SP component-id SP
transport SP transport SP
qvalue SP ;qvalue from RFC 3261 qvalue SP ;qvalue from RFC 3261
addr SP ;addr from RFC 3266 connection-address SP ;from RFC 4566
port ;port from RFC 2327 port ;port from RFC 4566
[SP cand-type]
[SP rel-addr]
[SP rel-port]
*(SP extension-att-name SP *(SP extension-att-name SP
extension-att-value) extension-att-value)
transport = "UDP" / transport-extension transport = "UDP" / transport-extension
transport-extension = token transport-extension = token ; from RFC 3261
candidate-id = 1*base64-char candidate-id = 1*base64-char
base64-char = ALPHANUM / DIGIT / "+" / "/" base64-char = ALPHA / DIGIT / "+" / "/"
;ALPHANUM from RFC 3261
component-id = 1*DIGIT component-id = 1*DIGIT
extension-att-name = byte-string ;from RFC 2327 cand-type = "typ" SP candidate-types
candidate-types = "local" / "srflx" / "relay" / token
rel-addr = "raddr" SP connection-address
rel-port = "rport" SP port
extension-att-name = byte-string ;from RFC 4566
extension-att-value = byte-string extension-att-value = byte-string
The candidate-id is used to group together the transport addresses The candidate-id is used to group together the transport addresses
for a particular candidate. It MUST be constructed with at least 24 for a particular candidate. It MUST be constructed with at least 24
bits of randomness. It MUST have the same value for all transport bits of randomness. It MUST have the same value for all transport
addresses within the same candidate. It MUST have a different value addresses within the same candidate. It MUST have a different value
for transport addresses within different candidates for the same for transport addresses within different candidates for the same
media stream. The candidate-id uses a syntax that is defined to be media stream. The candidate-id uses a syntax that is defined to be
equal to the base64 alphabet [3], which allows the candidate-id to be equal to the base64 alphabet [3], which allows the candidate-id to be
generated by performing a base64 encoding of a randomly generated generated by performing a base64 encoding of a randomly generated
value (note, however, that this does not mean that the candidate-id value (note, however, that this does not mean that the candidate-id
or password is base64 decoded when used in STUN messages). In or password is base64 decoded when used in STUN messages). In
skipping to change at page 81, line 43 skipping to change at page 94, line 24
addition, if content is base64 encoded to generate the candidate-id, addition, if content is base64 encoded to generate the candidate-id,
it MUST NOT be padded with '='. Section 2.2 of RFC 3548 indicates it MUST NOT be padded with '='. Section 2.2 of RFC 3548 indicates
that some base64 usages do not require padding, and it requests that that some base64 usages do not require padding, and it requests that
such usages call out that fact. ICE is one such usage. This is such usages call out that fact. ICE is one such usage. This is
because the data is never decoded. The component-id is a positive because the data is never decoded. The component-id is a positive
integer, which identifies the specific component of the candidate. integer, which identifies the specific component of the candidate.
It MUST start at 1 and MUST increment by 1 for each component of a It MUST start at 1 and MUST increment by 1 for each component of a
particular candidate. particular candidate.
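One way to satisfy the candidate-id rules above is to base64-encode random bytes and strip the padding. The sketch below assumes Python's standard secrets and base64 modules; the function name and the num_bytes parameter are hypothetical, and nothing here is mandated by the specification beyond the 24-bit minimum and the no-padding rule.

```python
import base64
import secrets


def new_candidate_id(num_bytes: int = 3) -> str:
    """Return an unpadded base64 string built from num_bytes random bytes.

    Three bytes give the 24 bits of randomness required for a
    candidate-id; a larger value can be passed where more is needed.
    """
    if num_bytes < 3:
        raise ValueError("at least 24 bits (3 bytes) of randomness required")
    raw = secrets.token_bytes(num_bytes)
    # Standard base64 uses the ALPHA / DIGIT / "+" / "/" alphabet;
    # any trailing "=" padding is removed, since the value is never decoded.
    return base64.b64encode(raw).decode("ascii").rstrip("=")


example_id = new_candidate_id()
```

With the default of three bytes the result is always four characters long and never needs padding; other lengths may produce padding, which the final strip removes.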
The addr production is taken from [10], allowing for IPv4 addresses, The addr production is taken from [10], allowing for IPv4 addresses,
IPv6 addresses and FQDNs. The port production is taken from RFC 2327 IPv6 addresses and FQDNs. The port production is taken from RFC 4566
[5]. The token production is taken from RFC 3261 [2]. The transport [5]. The token production is taken from RFC 3261 [2]. The transport
production indicates the transport protocol for the candidate. This production indicates the transport protocol for the candidate. This
specification only defines UDP. However, extensibility is provided specification only defines UDP. However, extensibility is provided
to allow for future transport protocols to be used with ICE, such as to allow for future transport protocols to be used with ICE, such as
TCP or the Datagram Congestion Control Protocol (DCCP) [34]. TCP or the Datagram Congestion Control Protocol (DCCP) [30].
The cand-type production encodes the type of transport address. This
specification defines the values "local" for a local transport
address, "srflx" for a server reflexive transport address, and
"relay" for a relayed transport address. The set of candidate types
is extensible for the future. Note that there is no value defined
for peer reflexive transport addresses. This is because these
transport addresses are never carried in the SDP itself; they are
learned implicitly through connectivity checks. Inclusion of the
candidate type is optional.
The rel-addr and rel-port productions convey information on related
transport addresses. For a server reflexive transport address, the
rel-addr and rel-port contain the associated local transport address.
For a relayed transport address, the rel-addr and rel-port contain
the server reflexive transport address towards the relay. If rel-
addr is present, rel-port MUST be present, and if rel-port is
present, rel-addr MUST be present. If the candidate type is "local",
rel-addr and rel-port MUST NOT be present. If the candidate type is
"srflx" or "relay", both rel-addr and rel-port MUST be present.
The a=candidate attribute can itself be extended. The grammar allows The a=candidate attribute can itself be extended. The grammar allows
for new name/value pairs to be added at the end of the attribute. An for new name/value pairs to be added at the end of the attribute. An
implementation MUST ignore any name/value pairs it doesn't implementation MUST ignore any name/value pairs it doesn't
understand. understand.
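To illustrate how a receiver might apply the newer grammar and its rules, here is a minimal parsing sketch in Python. It is not a full ABNF parser: it splits on single spaces, treats everything after the port as name/value pairs without enforcing their order, and uses the grammar token "relay" for relayed addresses. The function name and the dictionary keys are hypothetical.

```python
def parse_candidate(value: str) -> dict:
    """Parse the text that follows "a=candidate:" into a dictionary."""
    fields = value.split(" ")
    if len(fields) < 6:
        raise ValueError("candidate attribute has too few fields")
    cand = {
        "candidate_id": fields[0],
        "component_id": int(fields[1]),
        "transport": fields[2],
        "qvalue": float(fields[3]),
        "address": fields[4],
        "port": int(fields[5]),
        "type": None,
        "raddr": None,
        "rport": None,
        "extensions": {},
    }
    rest = fields[6:]
    if len(rest) % 2 != 0:
        raise ValueError("trailing fields must be name/value pairs")
    for name, val in zip(rest[0::2], rest[1::2]):
        if name == "typ":
            cand["type"] = val
        elif name == "raddr":
            cand["raddr"] = val
        elif name == "rport":
            cand["rport"] = int(val)
        else:
            # Unknown name/value pairs are kept but otherwise ignored.
            cand["extensions"][name] = val

    has_addr = cand["raddr"] is not None
    has_port = cand["rport"] is not None
    if has_addr != has_port:
        raise ValueError("raddr and rport must appear together")
    if cand["type"] == "local" and has_addr:
        raise ValueError("local candidates must not carry raddr/rport")
    if cand["type"] in ("srflx", "relay") and not has_addr:
        raise ValueError("srflx and relay candidates need raddr/rport")
    return cand


# Addresses below are documentation placeholders, not values from the draft.
parsed = parse_candidate(
    "L2 1 UDP 0.7 192.0.2.1 3478 typ srflx raddr 10.0.0.1 rport 5000 foo bar"
)
```

Storing unrecognized pairs instead of rejecting them keeps the sketch in line with the requirement to ignore extensions an implementation does not understand.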
The syntax of the "remote-candidate" attribute is defined using The syntax of the "remote-candidate" attribute is defined using
Augmented BNF as defined in RFC 4234 [9]: Augmented BNF as defined in RFC 4234 [9]:
remote-candidate-att = "remote-candidate" ":" candidate-id remote-candidate-att = "remote-candidate" ":" candidate-id
skipping to change at page 82, line 26 skipping to change at page 95, line 27
The syntax of the "ice-pwd" attribute is defined as: The syntax of the "ice-pwd" attribute is defined as:
ice-pwd-att = "ice-pwd" ":" password ice-pwd-att = "ice-pwd" ":" password
password = 1*base64-char password = 1*base64-char
The "ice-pwd" attribute can appear at either the session-level or The "ice-pwd" attribute can appear at either the session-level or
media-level. When present in both, the value in the media-level media-level. When present in both, the value in the media-level
takes precedence. Thus, the value at the session level is takes precedence. Thus, the value at the session level is
effectively a default that applies to all media streams, unless effectively a default that applies to all media streams, unless
overridden by a media-level value. It MUST have at least 128 bits of overridden by a media-level value. It MUST have at least 128 bits of
randomness. Like the candidate-ID, its syntax is taken from the randomness. Like the candidate ID, its syntax is taken from the
base64 alphabet, allowing the password to be generated from a base64 base64 alphabet, allowing the password to be generated from a base64
encoding of a 128-bit value. In addition, if content is base64 encoding of a 128-bit value. In addition, if content is base64
encoded to generate the candidate-id, it MUST NOT be padded with '='. encoded to generate the candidate ID, it MUST NOT be padded with '='.
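As a rough illustration of the paragraph above, the following Python sketch generates a password from 128 random bits and resolves the session-level versus media-level precedence. The function names are hypothetical, and the sketch assumes that the base64-char production excludes the '=' padding character, so the padding is stripped after encoding.

```python
import base64
import secrets
from typing import Optional


def generate_ice_pwd(num_bytes: int = 16) -> str:
    """Return an unpadded base64 string built from num_bytes random bytes."""
    # 16 bytes is the 128-bit minimum required for the password.
    if num_bytes < 16:
        raise ValueError("need at least 128 bits of randomness")
    raw = secrets.token_bytes(num_bytes)
    # Strip '=' padding so only base64 alphabet characters remain
    # (assumption: base64-char does not include '=').
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def effective_ice_pwd(session_pwd: Optional[str],
                      media_pwd: Optional[str]) -> Optional[str]:
    """Media-level value wins; the session-level value is the default."""
    return media_pwd if media_pwd is not None else session_pwd
```

The secrets module is used rather than random because the password needs cryptographic-quality randomness.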
13. Security Considerations 13. Security Considerations
There are several types of attacks possible in an ICE system. This There are several types of attacks possible in an ICE system. This
section considers these attacks and their countermeasures. section considers these attacks and their countermeasures.
13.1 Attacks on Connectivity Checks 13.1. Attacks on Connectivity Checks
An attacker might attempt to disrupt the STUN-based connectivity An attacker might attempt to disrupt the STUN-based connectivity
checks. Ultimately, all of these attacks fool an agent into thinking checks. Ultimately, all of these attacks fool an agent into thinking
something incorrect about the results of the connectivity checks. something incorrect about the results of the connectivity checks.
The possible false conclusions an attacker can try to cause are: The possible false conclusions an attacker can try to cause are:
False Invalid: An attacker can fool a pair of agents into thinking a False Invalid: An attacker can fool a pair of agents into thinking a
candidate pair is invalid, when it isn't. This can be used to candidate pair is invalid, when it isn't. This can be used to
cause an agent to prefer a different candidate (such as one cause an agent to prefer a different candidate (such as one
injected by the attacker), or to disrupt a call by forcing all injected by the attacker), or to disrupt a call by forcing all
skipping to change at page 83, line 23 skipping to change at page 96, line 23
the attacker, for eavesdropping or other purposes. the attacker, for eavesdropping or other purposes.
False Valid on False Candidate: An attacker has already convinced an False Valid on False Candidate: An attacker has already convinced an
agent that there is a candidate with an address that doesn't agent that there is a candidate with an address that doesn't
actually route to that agent (for example, by injecting a false actually route to that agent (for example, by injecting a false
peer-derived candidate or false STUN-derived candidate). It must peer-derived candidate or false STUN-derived candidate). It must
then launch an attack that forces the agents to believe that this then launch an attack that forces the agents to believe that this
candidate is valid. candidate is valid.
Of the various techniques for creating faked STUN messages described Of the various techniques for creating faked STUN messages described
in [13], many are not applicable for the connectivity checks. in [12], many are not applicable for the connectivity checks.
Compromises of STUN servers are not much of a concern, since the STUN Compromises of STUN servers are not much of a concern, since the STUN
servers are embedded in endpoints and distributed throughout the servers are embedded in endpoints and distributed throughout the
network. Thus, compromising the STUN server is equivalent to network. Thus, compromising the STUN server is equivalent to
compromising the endpoint, and if that happens, far more problematic compromising the endpoint, and if that happens, far more problematic
attacks are possible than those against ICE. Similarly, DNS attacks attacks are possible than those against ICE. Similarly, DNS attacks
are irrelevant since STUN servers are not discovered via DNS; they are irrelevant since STUN servers are not discovered via DNS; they
are signaled via SIP. Injecti