MMUSIC J. Rosenberg Internet-Draft Cisco Systems Expires:December 28, 2006 June 26,March 4, 2007 August 31, 2006 Interactive Connectivity Establishment (ICE): A Methodology for Network Address Translator (NAT) Traversal for Offer/Answer Protocolsdraft-ietf-mmusic-ice-09draft-ietf-mmusic-ice-10 Status of this Memo By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. This Internet-Draft will expire onDecember 28, 2006.March 4, 2007. Copyright Notice Copyright (C) The Internet Society (2006). Abstract This document describes a protocol for Network Address Translator (NAT) traversal for multimedia session signaling protocols based on the offer/answer model, such as the Session Initiation Protocol (SIP). This protocol is called Interactive Connectivity Establishment (ICE). ICE makes use of the Simple Traversalof UDP throughUnderneath NAT(STUN),(STUN) protocol, applying its bindingdiscovery, connectivity checkdiscovery and relayusages.usages, in addition to defining a new usage for checking connectivity between peers. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Overview of ICE . . . . . . . . . . . . . . . . . . . . . . . 43. Terminology .2.1. Gathering Candidate Addresses . . . . . . . . . . . . . . 6 2.2. Connectivity Checks . . . . . . . . . .15 4. Sending the Initial Offer. . . . . . . . . 8 2.3. Sorting Candidates . . . . . . . . .18 5. Receipt of the Offer and Generation of the Answer. . . . . .19 6. Processing the Answer. . . . . 10 2.4. Frozen Candidates . . . . . . . . . . . . . . .19 7. Common Procedures. . . . . 10 2.5. Security for Checks . . . . . . . . . . . . . . . . .20 7.1. Gathering Candidates. . 11 3. Terminology . . . . . . . . . . . . . . . .20 7.2. Prioritizing the Candidates and Choosing an Operating One. . . . . . . . . 11 4. Sending the Initial Offer . . . . . . . . . . . . . . . . . .25 7.3. Encoding13 4.1. Gathering Candidatesinto SDP. . . . . . . . . . . . . .27 7.4. Forming Candidate Pairs .. . . . . 13 4.2. Prioritizing Candidates . . . . . . . . . . .31 7.5. Ordering the Candidate Pairs. . . . . . 15 4.3. Choosing In-Use Candidates . . . . . . . .33 7.6. Performing the Connectivity Checks. . . . . . . . 18 4.4. Encoding the SDP . . .36 7.7. Sending a Binding Request for Connectivity Checks. . . .42 7.8. Receiving a Binding Request for Connectivity Checks. . .44 7.9. Promoting a Candidate to Operating. . . . . . . . . . .46 7.10. Learning New Candidates from Connectivity Checks19 5. Receiving the Initial Offer . . . .47 7.10.1. On Receipt of a Binding Request. . . . . . . . . .47 7.10.2. On Receipt of a Binding Response. . . 20 5.1. Verifying ICE Support . . . . . . .51 7.11. Subsequent Offer/Answer Exchanges. . . . . . . . . . . 20 5.2. Gathering Candidates .53 7.11.1. Sending of a Subsequent Offer. . . . . . . . . . .53 7.11.2. Receiving the Offer and Sending an Answer. . . . .56 7.11.3. Receiving the Answer. . 21 5.3. Prioritizing Candidates . . . . . . . . . . . . . .59 7.12. Binding Keepalives. . . 21 5.4. Choosing In Use Candidates . . . . . . . . . . . . . . . .59 7.13. Sending Media21 5.5. Encoding the SDP . . . . . . . . . . . . . . . . . . . . . 21 5.6. Forming the Check List .61 7.14. Receiving Media. . . . . . . . . . . . . . . . . 21 5.7. Performing Periodic Checks . . . .63 8. Guidelines for Usage with SIP. . . . . . . . . . . . 23 6. Receipt of the Initial Answer . . . .64 9. Interactions with Forking. . . . . . . . . . . . 24 6.1. Verifying ICE Support . . . . . .66 10. Interactions with Preconditions. . . . . . . . . . . . 24 6.2. Forming the Check List . . .67 11. Examples. . . . . . . . . . . . . . . 24 6.3. Performing Periodic Checks . . . . . . . . . . .67 11.1. Basic Example. . . . . 24 7. Connectivity Checks . . . . . . . . . . . . . . . . .68 11.2. Advanced Example. . . . 24 7.1. Applicability . . . . . . . . . . . . . . . .72 12. Grammar. . . . . . 24 7.2. Client Discovery of Server . . . . . . . . . . . . . . . . 25 7.3. Server Determination of Usage . . . . .93 13. Security Considerations. . . . . . . . . 25 7.4. New Requests or Indications . . . . . . . . . .95 13.1. Attacks on Connectivity Checks. . . . . 25 7.5. New Attributes . . . . . . . .95 13.2. Attacks on Address Gathering. . . . . . . . . . . . . .98 13.3. Attacks on the Offer/Answer Exchanges25 7.6. New Error Response Codes . . . . . . . . . .99 13.4. Insider Attacks. . . . . . . 25 7.7. Client Procedures . . . . . . . . . . . . . .99 13.4.1. The Voice Hammer Attack. . . . . . 25 7.7.1. Sending the Request . . . . . . . .99 13.4.2. STUN Amplification Attack. . . . . . . . . 25 7.7.2. Processing the Response . . . .99 14. IANA. . . . . . . . . . . 26 7.8. Server Procedures . . . . . . . . . . . . . . . . . . . . 27 7.9. Security Considerations for Connectivity Check . . . . . . 29 8. Completing the ICE Checks . . . . . . . . . . . . . . .100 14.1. candidate Attribute. . . 29 9. Subsequent Offer/Answer Exchanges . . . . . . . . . . . . . . 30 9.1. Generating the Offer . .100 14.2. remote-candidate Attribute. . . . . . . . . . . . . . .100 14.3. ice-pwd Attribute. . 30 9.2. Receiving the Offer and Generating an Answer . . . . . . . 31 9.3. Updating the Check and Valid Lists . . . . . . . . . . .101 15. IAB Considerations. 32 10. Keepalives . . . . . . . . . . . . . . . . . . . .101 15.1. Problem Definition. . . . . . 33 11. Media Handling . . . . . . . . . . . . .102 15.2. Exit Strategy. . . . . . . . . . . 34 11.1. Sending Media . . . . . . . . . . .102 15.3. Brittleness Introduced by ICE. . . . . . . . . . . 34 11.2. Receiving Media . . .103 15.4. Requirements for a Long Term Solution. . . . . . . . . .104 15.5. Issues. . . . . . . . 35 12. Usage withExisting NAPT BoxesSIP . . . . . . . . . . . . .104 16. Acknowledgements. . . . . . . . . . . 35 12.1. Latency Guidelines . . . . . . . . . . .104 17. References. . . . . . . . . 35 12.2. Interactions with Forking . . . . . . . . . . . . . . . .105 17.1. Normative References37 12.3. Interactions with Preconditions . . . . . . . . . . . . . 37 12.4. Interactions with Third Party Call Control . . . . .105 17.2. Informative References. . . 38 13. Grammar . . . . . . . . . . . . . .106 Author's Address. . . . . . . . . . . . . 38 14. Example . . . . . . . . . . .108 Intellectual Property and Copyright Statements. . . . . . . . .109 1. Introduction RFC 3264 [4] defines a two-phase exchange of Session Description Protocol (SDP) messages [5] for the purposes of establishment of multimedia sessions. This offer/answer mechanism is used by protocols such as the Session Initiation Protocol (SIP) [2]. Protocols using offer/answer are difficult to operate through Network. . . . . . . 40 15. Security Considerations . . . . . . . . . . . . . . . . . . . 46 15.1. Attacks on Connectivity Checks . . . . . . . . . . . . . . 46 15.2. Attacks on AddressTranslators (NAT). Because their purpose is to establish a flow of media packets, they tend to carry IP addresses within their messages, which is known to be problematic through NAT [15]. The protocols also seek to create a media flow directly between participants, so that there is no application layer intermediary between them. This is done to reduce media latency, decrease packet loss, and reduce the operational costs of deploying the application. However, this is difficult to accomplish through NAT. A full treatment of the reasons for this is beyond the scope of this specification. Numerous solutions have been proposed for allowing these protocols to operate through NAT. These include Application Layer Gateways (ALGs), the Middlebox Control Protocol [17], Simple Traversal of UDP through NAT (STUN) [14] and its revision [12], the STUN Relay Usage [13], and Realm Specific IP [18] [19] along with session description extensions needed to make them work, such as the Session Description Protocol (SDP) [5] attribute forGathering . . . . . . . . . . . . . . . 49 15.3. Attacks on theReal Time Control Protocol (RTCP) [1]. Unfortunately, these techniques all have pros and cons which make each one optimal in some network topologies, but a poor choice in others.Offer/Answer Exchanges . . . . . . . . . . 49 15.4. Insider Attacks . . . . . . . . . . . . . . . . . . . . . 50 15.4.1. Theresult is that administrators and implementors are making assumptions about the topologies of the networks in which their solutions will be deployed. This introduces complexity and brittleness into the system. What is needed is a single solution which is flexible enough to work well in all situations. This specification provides that solution for media streams established by signaling protocols based on the offer-answer model. It is called Interactive Connectivity Establishment, or ICE. ICE makes use ofVoice Hammer Attack . . . . . . . . . . . . . . . 50 15.4.2. STUNand its relay extension, commonly called TURN, but uses them in a specific methodology which avoids many of the pitfalls of using any one alone. 2. Overview of ICE A typical architecture for an ICE deployment is shown in Figure 1. The figure shows two endpoints (known as agents in RFC 3264 terminology) which we call L and R (for left and right, which helps visualize call flows). Both L and R are behind a NAT. The type of NAT and its properties are unknown. Indeed, it is not known whether the agent is behind a NAT at all, or whether there are multiple NATs between it and the network. Agents A and B are capable of engaging in an offer/answer exchange [4]Amplification Attack . . . . . . . . . . . . . . 50 16. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 51 16.1. candidate Attribute . . . . . . . . . . . . . . . . . . . 51 16.2. remote-candidates Attribute . . . . . . . . . . . . . . . 51 16.3. ice-pwd Attribute . . . . . . . . . . . . . . . . . . . . 52 16.4. ice-ufrag Attribute . . . . . . . . . . . . . . . . . . . 52 17. IAB Considerations . . . . . . . . . . . . . . . . . . . . . . 53 17.1. Problem Definition . . . . . . . . . . . . . . . . . . . . 53 17.2. Exit Strategy . . . . . . . . . . . . . . . . . . . . . . 53 17.3. Brittleness Introduced bywhich they can exchange SDP messages, whose purpose is to set upICE . . . . . . . . . . . . . . 54 17.4. Requirements for amedia session between A and B. Of course, the offer/answer exchange itself must be capable of traversing the NAT. Such traversal is facilitated through signaling elements such as SIP servers,Long Term Solution . . . . . . . . . . 55 17.5. Issues with Existing NAPT Boxes . . . . . . . . . . . . . 55 18. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 56 19. References . . . . . . . . . . . . . . . . . . . . . . . . . . 56 19.1. Normative References . . . . . . . . . . . . . . . . . . . 56 19.2. Informative References . . . . . . . . . . . . . . . . . . 57 Appendix A. Design Motivations . . . . . . . . . . . . . . . . . 58 A.1. Applicability to Gateways andis outside the scopeServers . . . . . . . . . . 59 A.2. Pacing ofthis specification. Different solutions are applied for traversalSTUN Transactions . . . . . . . . . . . . . . . 60 A.3. Candidates with Multiple Bases . . . . . . . . . . . . . . 61 A.4. Purpose of thesignaling that carries the offer/answer exchange, and for the media set up by that offer/answer exchange. This is becauseTranslation . . . . . . . . . . . . . . . . 63 A.5. Importance of thevastly different requirements on latency, packet loss,STUN Username . . . . . . . . . . . . . 63 A.6. The Candidate Pair Sequence Number Formula . . . . . . . . 64 A.7. The Frozen State . . . . . . . . . . . . . . . . . . . . . 65 A.8. The remote-candidates attribute . . . . . . . . . . . . . 65 A.9. Why are Keepalives Needed? . . . . . . . . . . . . . . . . 66 A.10. Why Prefer Peer Reflexive Candidates? . . . . . . . . . . 67 A.11. Why Can't Offerers Send Media When a Pair Validates . . . 67 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 69 Intellectual Property andoverall bandwidth betweenCopyright Statements . . . . . . . . . . 70 1. Introduction RFC 3264 [4] defines a two-phase exchange of Session Description Protocol (SDP) messages [10] for thesignaling and media. For example, usagepurposes ofa signaling intermediary,establishment of multimedia sessions. This offer/answer mechanism is used by protocols such asa SIP proxy, as a relay for all signaling at all times,the Session Initiation Protocol (SIP) [3]. Protocols using offer/answer are difficult to operate through Network Address Translators (NAT). Because their purpose isacceptable, whereas usageto establish a flow ofrelays at all times formediais highly undesirable. In additionpackets, they tend tothe agents, a SIP server and NATs, ICEcarry IP addresses within their messages, which istypically used in concert with STUN servers in the network. Each agent can have its own STUN server, or they canknown to bethe same. +-------+ | SIP | +-------+ | Srvr | +-------+ | STUN | | | | STUN | | Srvr | +-------+ | Srvr | | | | | +-------+ +-------+ +--------+ +--------+ | NAT | |problematic through NAT| +--------+ +--------+ +-------+ +-------+ | Agent | | Agent | | L | | R | | | | | +-------+ +-------+ Figure 1 Prior[14]. The protocols also seek toinitiating an offer, the offering agent (L in this example) starts by performingcreate aprocess known as address gathering.media flow directly between participants, so that there is no application layer intermediary between them. Thisprocess allows the clientis done toobtain one or more transport addresses, one more of which might be viable addresses at which the agent can receive incomingreduce mediapackets fromlatency, decrease packet loss, and reduce theother agent, which we call its peer.operational costs of deploying the application. However, this is difficult to accomplish through NAT. Atransport addressfull treatment of the reasons for this isjustbeyond thecombinationscope ofan IP addressthis specification. Numerous solutions have been proposed for allowing these protocols to operate through NAT. These include Application Layer Gateways (ALGs), the Middlebox Control Protocol [15], Simple Traversal Underneath NAT (STUN) [13] andport. With ICE, an agent will actually provide its peer with all ofitspossible transport addresses,revision [11], the STUN Relay Usage [12], andICE will figure out which one to actually use. Naturally, one viable transport address is one obtained directly from a local interfaceRealm Specific IP [17] [18] along with session description extensions needed to make them work, such as theclient has towardsSession Description Protocol (SDP) [10] attribute for thenetwork. Such a transport address is called a local transport address. The local interface could beReal Time Control Protocol (RTCP) [2]. Unfortunately, these techniques all have pros and cons which make each oneon a local layer 2optimal in some networktechnology, such as ethernet or WiFi, or it could be one that is obtained through a tunnel mechanism, such astopologies, but aVirtual Private Network (VPN) or Mobile IP (MIP). In all cases, these appear topoor choice in others. The result is that administrators and implementors are making assumptions about theagent as a local interface fromtopologies of the networks in whichports (and thus transport addresses) cantheir solutions will beallocated. If an agentdeployed. This introduces complexity and brittleness into the system. What is needed ismultihomed, it can obtainatransport address from each interface. Dependingsingle solution which is flexible enough to work well in all situations. This specification provides that solution for media streams established by signaling protocols based on thelocationoffer-answer model. It is called Interactive Connectivity Establishment, or ICE. ICE makes use of STUN and its relay extension, commonly called TURN, but uses them in a specific methodology which avoids many of thepeer on the IP network, the agent may be reachable throughpitfalls of using any one alone. 2. Overview ofthose interfaces, or through another. Consider, for example, an agent which hasICE In alocal interfacetypical ICE deployment, we have two endpoints (known as agents in RFC 3264 terminology) which want toa private net 10 network, and alsocommunicate. They are able to communicate indirectly via some signaling system such as SIP, by which they can perform an offer/answer exchange of SDP [4] messages. Note that ICE is not intended for NAT traversal for SIP, which is assumed tothe public Internet. A transport address from the net10 interface willbedirectly reachable when communicating with a peer onprovided via some other mechanism [31]. At thesame private net 10 network, while a transport address frombeginning of thepublic interface willICE process, the agents are ignorant of their own topologies. In particular, they might or might not bedirectly reachable when communicating withbehind apeer onNAT (or multiple tiers of NATs). ICE allows thepublic Internet. Rather than tryingagents toguess which interface will work priordiscover enough information about their topologies tosending an offer, the offering agent includes both transport addresses in its offer. Indeed, when usingfind amedia technology like the Real Time Transport Protocol (RTP), an agent needs two transport addresses on each interface - onepath or paths by which they can communicate. Figure Figure 1 shows a typical environment forthe RTP,ICE deployment. The two endpoints are labelled L andone for the Real Time Control Protocol (RTCP). Other media technologies may require a multiplicity of transport addresses to be usedR (for left andtreatedright, which helps visualize call flows). Both L and R are behind NATs -- though asa bundle. Eachmentioned before, they don't know that. The type ofthese transport addresses is called a component. ThereNAT and its properties aretwo components in an RTP stream - the RTP itself,also unknown. Agents A andthe RTCP. In ICE, the setB are capable oftransport addresses that representengaging in anatomic grouping onoffer/answer exchange by whichcommunications is possiblethey can exchange SDP messages, whose purpose iscalled a candidate. In the example so far, the agent would obtain two candidates - one from the net 10 interface, and one from the interface on the public Internet. Each candidate would contain two transport addresses, corresponding to each of the two components. Once the agent has obtained local transport addresses, it uses STUNtoobtain additional transport addresses. To do this, it would sendset up aSTUN Binding Request, using the Binding Discovery Usage [12] or the Relay Usage [13] frommedia session between A and B. Typically, this exchange will occur through alocal transport address, to its STUNSIP server.It is assumed that the address ofIn addition to theSTUNagents, a SIP server and NATs, ICE isconfigured, or learnedtypically used insome way. Indeed, an agent might even have multiple STUN servers. As a consequence of communicatingconcert withtheSTUNserver,servers in the network. Each agent canlearn potentially two new types of transport addresses - server reflexive transport addresses and relayed transport addresses. The relationship of these addresses tohave its own STUN server, or they can be thelocal transport address is shown in Figure 2. To Internetsame. +-------+ | SIP | +-------+ |/------------ RelayedSrvr | +-------+ | STUN |/ Address +--------+| | | STUN | |ServerSrvr | +-------+ | Srvr |+--------+| | / \ |/------------ Server |/ Reflexive +------------+ Address| +-------+ / \ +-------+ / \ / \ / \ / \ / <- Signalling -> \ / \ / \ +--------+ +--------+ | NAT |+------------+| NAT |/------------ Local |/ Address+--------+ +--------+ / \ / \ / \ +-------+ +-------+ | Agent | | Agent | | L | |+--------+R | | | | | +-------+ +-------+ Figure21 Thelocal transport addressbasic idea behind ICE isresident on the agent itself. Through either the Binding Discovery Usage or the Relay Usage, theas follows: each agentcan discover its server reflexivehas a variety of candidate transportaddress. This isaddresses it could use to communicate with theaddress onother agent. These might include: o It's directly attached network interface (or interfaces in thepublic sidecase ofthe NAT, facing the STUN server. It is the transporta multihomed machine o A translated addressallocated to the agenton the public side ofthe NAT asaconsequenceNAT (a "server reflexive" address) o The address of a media relay thetransmissionagent is using. Potentially, any ofthe STUN request through the NAT, to the STUN server. The NAT will allocate a binding, mapping this server reflexiveL's candidate transportaddressaddresses can be used tothe local transport address. Packets received at the NAT, targeted towards the server reflexivecommunicate with any of R's transportaddress,addresses. In practice, however, many combinations willhave their destination address rewritten to the local transport address by the NAT,not work. For instance, if L and R are both behind NATs then their directly interface addresses are unlikely to beforwardedable tothe agent. When there are multiple NATs between the agent and the STUN server, the STUN request will create a binding on each NAT, but only the outermost server reflexive transport addresscommunicate directly (this is why ICE is needed, after all!). The purpose of ICE is to discover which pairs of addresses willbe discovered by the agent. In addition, through the Relay Usage, the agent can requestwork. The way thatthe STUN server itself allocateICE does this is to systematically try all possible pairs (in atransport address fromcarefully sorted order) until it finds one or more that works. 2.1. Gathering Candidate Addresses In order to execute ICE, an agent has to identify all of itslocal interfaces, and establish a binding that maps that transportaddress(calledcandidates. Naturally, one viable candidate is one obtained directly from arelayed transport address, naturally) towardslocal interface thesource transport address ofclient has towards theSTUN request, which will actuallynetwork. Such a candidate is called a HOST CANDIDATE. The local interface could beequalone on a local layer 2 network technology, such as ethernet or WiFi, or it could be one that is obtained through a tunnel mechanism, such as a Virtual Private Network (VPN) or Mobile IP (MIP). In all cases, these appear to theserver reflexive transport address allocated byagent as a local interface from which ports (and thus a candidate) can be allocated. If an agent is multihomed, it can obtain a candidate from each interface. Depending on theoutermost NAT. Consequently, packets sent tolocation of therelayed transport address will be routed bypeer on the IP networktowardsrelative to theSTUN server. The STUN server will receive them, rewriteagent, thedestination address toagent may beequal toreachable by theserver reflexive transport address,peer through one of those interfaces, or through another. Consider, for example, an agent which has a local interface to a private net 10 network, andforward them. Theyalso to the public Internet. A candidate from the net10 interface willthen arrive atbe directly reachable when communicating with a peer on theNAT, wheresame private net 10 network, while a candidate from thedestination address is rewritten once again, andpublic interface will be directly reachable when communicating with a peer on thepacket forward finallypublic Internet. Rather than trying to guess which interface will work prior to sending an offer, the offering agentatincludes both candidates in itslocal address. Sinceoffer. Once theserver reflexive transport addresses and relayed transport addresses andagent has obtainedfrom a local transport address, they are saidhost candidates, it uses STUN tobe derived transport addresses, since they are derived from (and ultimately map to) their associated local transport address. During the process of address gathering, the agent willobtainas many transportadditional candidates. These come in two flavors: translated addresses on the public side of agiven type as are needed for the media session. For example, with RTP, two transportNAT (SERVER REFLEXIVE CANDIDATES) and addressesare needed for a candidate.of media relays (RELAYED CANDIDATES). Theagent will obtain two server reflexive transport addresses (each derived from a local transport address), and they would be usedrelationship of these candidates toconstitute a server reflexive candidate. The local transport addresses make up a local candidate, andtherelayed transport addresses make up a relayed candidate. Server Server Reflexive Reflexive Candidate Candidate .............. .............. . . . . . +-+ +-+ . . +-+ +-+ . . | | | | . . | | | | . . +-+ +-+ . . +-+ +-+ . . ^ ^ . . ^ ^ . ....|....|.... ....|....|.... |host candidate is shown in Figure 2. Both types of candidates are discovered using STUN. To Internet | | | /------------ Relayed | / Candidate +--------+ | | |....|....|.... ....|....|.... .STUN | |. .Server | |. . +-+ +-+ . Local . +-+ +-+ . Local .| +--------+ | | |./------------ Server |/ Reflexive +------------+ Candidate.| NAT | +------------+ | |./------------ Host |/ Candidate. +-+ +-+ . . +-+ +-+ . . | | . . | | . ....|....|.... ....|....|.... | | | | | | | | ....|....|.... ....|....|.... . V V . . V V . . +-+ +-+ . . +-+ +-+ . . | | | | . .+--------+ | | | Agent |. . +-+ +-+ . . +-+ +-+ . . . . . .............. .............. Relayed Relayed Candidate Candidate Legend ------ +-+| |Transport Address +-+ ---> Derived From ... . . Candidate ... Figure 3 The relationship between these various transport addresses and candidates is shown pictorially in+--------+ Figure3. The figure shows our example agent with two local interfaces, each of which provides two transport address pairs to make up two candidates. From those two local candidates,2 To find a server reflexiveand relayed candidate are derived. Oncecandidate, the agenthas completed gathering its candidates, it assigns eachsends acandidate identifier, calledSTUN Binding Request, using thecandidate ID. The candidate ID is a random number used to uniquely identifyBinding Discovery Usage [11] from each host candidate,andto its STUN server. (It isused in the connectivity checks discussed below. The components of each candidate are ordered numerically, starting at one, suchassumed thateach transportthe addresshas a component ID. For example,of the STUN server is configured, or learned inan RTP candidate there are two components, component ID 1 and component ID 2. Each transport address pair is therefore uniquely identified by a combination of its candidate ID and its component ID. The combination ofsome way.) When thetwoagents sends the Binding Request, the NAT (assuming there iscalled, unsurprisingly, a transport address ID, or tid for short. The agentone) willplace all of its candidates in an offer, usingallocate anew SDP attribute called thebinding, mapping this server reflexive candidateattribute. This attribute containsto theactual transport address,host candidate. Outgoing packets sent from the host candidateID and component ID, and a q-value. The q-value is used for the agent to prioritize its candidates. An agentwilltypically prefer to receive media at particular candidates over other candidates, based on local policy. For example, an agent would normally preferbe translated by the NAT toreceive interactive voice RTPthe server reflexive candidate. Incoming packetsat its local candidate as opposed to its relayed candidate, duesent to theextra latency incurred by traveling through the relay. Theserver relexive candidateattributewillalso include an indicator ofbe translated by thetype ofNAT to the host candidate(server reflexive, local, relayed),andits related transport address. For server reflexive transport addresses, the related transport address is the local transport address from which it was derived. For relayed transport addresses,forwarded to therelated transport address isagent. We call the host candidate associated with a given server reflexiveaddress towardscandidate the BASE. Note "Base" refers to therelay. The related transportaddress you'd send from forreflexivea particular candidate. Thus, as a degenerate case host candidatesis used byalso have a base, but it's theICE algorithm itself,same asexplained below. For relayed candidates,therelated transport address is not used by ICE directly; it is useful for diagnostic purposes and for Quality of Service mechanisms that require knowledge of addresses closer to the agent. Finally,host candidate. When there are multiple NATs between the agentchooses one of its candidates for inclusion in the mandc lines (calledthem/c-line collectively). Assuming thatSTUN server, the STUN request will create a binding on each NAT, but only the outermost server reflexive candidateis verified as functionalwill be discovered by theICE connectivity checks described below, thisagent. If the agent is not behind a NAT, then theactual IP address and port to which mediabase candidate will besent. The candidate selected for inclusion in the m/c- line of an offer (or an answer) is calledtheoperating candidate, since it issame as theone that isserver reflexive candidate and thein-use destination for receiptserver reflexive candidate can be ignored. The final type ofmedia traffic. Once the operatingcandidate ischosen, the agenta RELAYED candidate. The STUN Relay Usage [12] allows a STUN server to act as a media relay, forwarding traffic between L and R. In order to send traffic to L, R sends traffic to theoffer. Throughmedia relay which forwards it to L and vice versa. The same thing happens in thewonders or SIP orothersignaling protocols, this offer is delivereddirection. Traffic from L tothe peer, which must now selectR has itsanswer. To create the answer,addresses rewritten twice: first by theagent startsNAT and second bygathering addresses, in exactlythesame waySTUN relay server. Thus, theoffered did. It includes those as candidates in its answer,address that R knows about andselectsthe oneasthat it wants to send to is theoperating candidate, just likeone on theoffered did. It then sends the answer. Each agent then pairs up each of its candidates with the candidates of its peer. FromSTUN relay server. This address is theperspectivefinal kind ofthe offerer, the setcandidate, which we call a RELAYED CANDIDATE. 2.2. Connectivity Checks Once L has gathered all ofcandidates it sent initsoffer are called its nativecandidates, it orders them highest to lowest priority and sends them to R over theones received in the answersignalling channel. The candidates are carried in attributes in theremote candidates. Similarly, from the perspective of the answerer,SDP offer. When R receives theset of candidatesoffer, itsent in its answer areperforms thenative candidates,same gathering process andthe ones received in the offer are the remote candidates. Both agents pair up each of their native candidatesresponds witheachits own list of candidates. At theremote candidates, producingend of this process, each agent has asetcomplete list ofcandidate pairs. If there were N nativeboth its candidates andM remote candidates, there will be N*M candidate pairs. Within each candidate pair, the transport addresses themselves are paired up one for one, resulting in transport address pairs as well. The transport addresses are pairedits peer's candidates and is ready to perform connectivity checks by pairing upsuch that they have identical component IDs. Each transport address pair has an ID, calledthetransport addresscandidates to see which pairID, formed by concatenating the transport address IDsworks. The basic principle ofits two transport addresses. Oncethepairingconnectivity checks isdone,simple: 1. Sort thetransport addresscandidate pairsare orderedinsuch a way that both the offerer and answerer will end up with the samepriority order.This ordering is done by using the q-values2. Send checks on eachside provided, along with thecandidateIDs to help break ties. Then, each side begins a process known as connectivity checks. Connectivitypair in priority order. 3. Acknowledge checksare STUN transactions, usingreceived from the other agent. A complete connectivity checkusage of STUN, sent from the native transport address to the remote transport address offor aparticular transport address pair. If an agent sendssingle candidate pair is a simple 4-message handshake: A B - - STUN requestand-> \ A's <- STUN response / check <- STUN request \ B's STUN response -> / check Figure 3 As an optimization, as soon as B getsa successful response, the transport address pair is saidA's check message he immediately sends his own check message tobe Receive Valid, or Recv Valid for short, sinceA on theagent knows that its peer was able to receive a packet. If an agent receives a request and sends a response,same candidate pair. This accelerates thetransport address pair is said to be Send Valid, sinceprocess of finding a valid candidate. At theagent knowsend of this handshake, both A and B know thatits peer was able tothey can sendit a packet. When transactions(and receive) messages end-to-end in bothdirections complete, the transport address pair is said to be Valid. The idea behind ICE isdirections. Note thatif a transport address pair is valid,as soon as B receives A's STUN response itmeansknows thatagents were able to succesfully exchange IP packets in both directions. Consequently, any media packets, which are sent to and from exactlythesame IP addressesB->A path works andports, should also work, since they don't differ in their IP addresses or ports. It's important to point out that, when used with ICE, an agent will always send and receiveit can start sending media onthe same transport address. That is, if an agent includes a transport address of 192.0.2.1:2444 (meaning an IP address of 192.0.2.1 and port of 2444) in its SDPthat path right away, as shown below. This allows forreceiving RTP packets (and also'early media' to flow as fast as possible: A B - - STUNconnectivity check), it will not only receiverequest -> \ A's <- STUNrequests and RTP packets on this transport address, it will also sendresponse / check <- STUNrequests andrequest \ B's STUN response -> / check <- RTPpackets from this transport address. This property, known as symmetric RTP, is essentialData Figure 4 Once any connectivity check forproper operation of ICE. Peer reflexive transport addresses, discussed further below, will generally only work when symmetric RTP is used. Symmetric RTP is also keya candidate forkeeping NAT bindings alive. Since there can be quiteafew transport address pairs to check, performinggiven media component succeeds, ICE uses that candidate and immediately abandons allof theother connectivity checksin parallel can cause substantial load onfor that component. Note that due to race conditions and packet loss, this may mean that thenetwork. Instead, each agent will start at"best" candidate isn't selected, but it does guarantee thetopselection ofthe ordered list they each created, and every 50ms, beginanew connectivity check. In order to succesfullycandidate that works, and because of the sorting processa STUN connectivity check, an agent mustit will generally beable to correlateone of theSTUN request or response withmost preferred ones. 2.3. Sorting Candidates Because thetransport addressalgorithm above searches all candidate pairs, if a working pairwhose connectivity the STUN message is meant to validate. To perform this correlation,exists it will eventually find it no matter what order theSTUN connectivity checks contain a USERNAME attribute formed in a special way.candidates are tried in. Inparticular, the USERNAME contains the actual transport address pair ID, which, as described above, is formed by concatenating the transport address IDs of each oforder to produce faster (and better) results, thecandidates. The USERNAME is usedcandidates are sorted inconjunction with an authentication and message integrity operation on the STUN message that requiresapassword. This passwordspecified order. The algorithm isconveyeddescribed inthe offer/answer exchange, and is a random number valid only for the duration of the media session. This ensures that, if the signaling channel carrying the offer/answer exchange is secure, theSection 4.2 but follows two general principles: o Each agentcan be certain thatgives itsSTUN connectivity checks are taking placecandidates a numeric priority which is sent along with theagent which respondedcandidate to thesignaling. Because each agent is receiving STUN requests on the same IP addresspeer o The local andportremote priorities are combined so thatmedia will later be sent to,each agentis effectively acting as its own mini STUN server, implementinghas theconnectivity check usage described in [12]. Like all STUN servers, whensame ordering for theagent sends a STUN responsecandidate pairs. The second property is important for getting ICE to work when there are NATs in front of A and B. Frequently, NATs will not allow packets in from arequest, the response includeshost until theXOR- MAPPED-ADDRESS attribute that containsagent behind thesource IP address and portNAT has sent a packet towards thatthe request came from. In certain deployment scenarios, andhost. Consequently, ICE checks inparticular where one ofeach direction will not succeed until both sides have sent a check through their respective NATs. In general theagentspriority algorithm isbehind a NAT whose addressdesigned so that candidates of similar type get similar priorities andport mapping propertiesso that more direct routes areaddress and port dependent [32], this source IP address and port may differ from the server reflexive ones allocated by the peer during the address gathering phase. This source IP address and port, conveyed in the XOR-MAPPED-ADDRESS attribute of the STUN response, therefore constitutes a new transport address, calledfavored over indirect ones. Within those guidelines, however, agents have apeer reflexive transport address, which can be used for communications. +-------+ | STUN | | Srvr | | | +-------+ ^ | | | | +--------------------------+ | | NAT-2| |NAT-1 | +-----------+ | | APD NAT | | +-----------+ | | | | \ | VL1 \|R1 +-------+ +-------+ | Agent | | Agent | | L | | R | | | | | +-------+ +-------+ Figure 4 Consider the examplefair amount ofFigure 4.discretion about how to tune their algorithms. 2.4. Frozen Candidates Theagent onprevious description only addresses theleft, agent L, hascase where the agents wish to establish a singleinterface and is not behind a NAT. Consequently, it ends up withmedia component--i.e., a singlecandidateflow with a singletransport address (normally twohost-port quartet. However, in many cases (in particular RTP and RTCP) the agents actually need to establish connectivity forRTP, but we'll consider justmore than onefor ease of explanation), transport address L1. It sends an offerflow. The naive way toagent R, whichattack this problem would be to simply do independent ICE exchanges for each media component. This isbehind one of these Address and Port Dependent (APD) mapping NATs. Agent R has a local transport address R1,obviously inefficient because the network properties are likely to be very similar for each component (especially because RTP andobtains a server reflexive transport address from its STUN server, transport address NAT-1. Now, when agent R sends a connectivity checkRTCP are typically run on adjacent ports). Thus, it should be possible to leverage information fromits local transport address (R1)one media component in order toL's local transport address (L1),determine the best candidates for another. ICE does thischeck will traversewith a mechanism called "frozen candidates." The basic principle behind frozen candidates is that initially only theNAT.candidates for a single media component are tested. The other media components are marked "frozen". When the connectivitycheck itself will create a new mapping inchecks for theNAT and be allocated a new binding onfirst component succeed, theNAT - NAT-2. This STUN request arrives at L, which generates a STUN response containing transport address NAT-2. Agent R, noticing that this is notcorresponding candidates for thesame as itsothertwo transport addresses, treats thiscomponents are unfrozen and checked immediately. This avoids repeated checking of components which are superficially more attractive but in fact are likely to fail. While we've described "frozen" here as anew peer reflexive transport address. This new peer reflexive transport addressseparate mechanism for expository purposes, in fact it ispaired up withan integral part of ICE and theremote transport address containingtheSTUN server from whichICE prioritization algorithm automatically ensures thattransport address was learned (transport address L1 intheexample above). This becomes a new transport address pair, and connectivity checksright candidates arerun on it as well. Once all of the transport address pairsunfrozen and checked ina candidate pair have been validated, that candidate pairthe right order. 2.5. Security for Checks Because ICE isreadyused to discover which addresses can beused. Media starts being sent onused to send media between two agents, itimmediately, andis important to ensure that theofferer willprocess cannot be hijacked to sendan updated offer, now containing the agents half of the validated candidate pair inmedia to them/c-line. Thiswrong location. Each STUN connectivity check iscalled "promotingcovered by acandidate to operating". The updated offer only containsmessage authentication code (MAC) computed using asingle candidate attribute - the one forkey exchanged in theoperating candidate. It also containssignalling channel. This MAC provides message integrity and data origin authentication, thus stopping anattribute, called the remote-candidate attribute, which tells the answerer the remote candidateattacker from forging or modifying connectivity check messages. The MAC also aids inthe validated candidate pair.disambiguating ICE exchanges from forked calls. 3. Terminology Theanswerer useskey words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in thisattribute, along with its own view on the states of the candidate pairs,document are toplace a candidatebe interpreted as described in RFC 2119 [1]. This specification makes use of them/c-line and populate the candidate attributes in its answer. It is important to understand that, when ICE isfollowing terminology: Agent: As defined inuse, mediaRFC 3264, an agent isnot sent to a candidate without validation, even if that candidate appearsthe protocol implementation involved in them/c-line. This isoffer/answer exchange. There are two agents involved inorder to avoid denial-of-service attacks. In particular, without ICE, an offerer can sendanoffer to another agent, and listoffer/answer exchange. Peer: From theIP address and portperspective of one ofa target in the offer. Iftheagent is an automata that answers a call automatically, it will do so and then proceed to send media to the target. This provides substantial packet amplifications. ICE fixes this by requiring that an agent never send media packets unless it has sent a STUN message towards the target of the RTP packets, and received a reply from that target. See Section 7.13 for details. A summary of this overall behavior is shown in the basic call flow in Figure 5. Agent A STUN Servers Agent B |(1) Gather Addresses | | |-------------------->| | |(2) Offer | | |------------------------------------------>| | |(3) Gather Addresses | | |<--------------------| |(4) Answer | | |<------------------------------------------| |(5) STUN Check | | |<------------------------------------------| |(6) STUN Check | | |------------------------------------------>| |(7) Media | | |<------------------------------------------| |(8) Media | | |------------------------------------------>| |(9) Offer | | |------------------------------------------>| |(10) Answer | | |<------------------------------------------| Figure 5 3. Terminology Several new terms are introduced in this specification: Agent: As defined in RFC 3264, an agent is the protocol implementation involved in the offer/answer exchange. There are two agents involved in an offer/answer exchange. Peer: From the perspective of one of the agents inagents in a session, its peer is the other agent. Specifically, from the perspective of the offerer, the peer is the answerer. From the perspective of the answerer, the peer is the offerer. Transport Address: The combination of an IP address and port.Local Transport Address:Candidate: Alocaltransport address that is to be tested by ICE procedures in order to determine its suitability for usage for receipt of media. Host Candidate: A candidate obtained by binding to atransport address that has been allocatedspecific port fromthe operating systeman interface on the host. This includestransport addressesboth physical interfaces and logical ones, such as ones obtained through Virtual Private Networks (VPNs) andtransport addresses obtained throughRealm Specific IP (RSIP)[18][17] (which lives at the operating system level).Transport addresses are typicallyServer Reflexive Candidate: A candidate obtained bybindingsending a STUN request from a host candidate toan interface. m/c line: The media and connection lines in the SDP, which together holda STUN server, distinct from thetransportpeer, whose addressused foris configured or learned by thereceipt of media. Derived Transport Address:client prior to an offer/answer exchange. Peer Reflexive Candidate: Aderived transport address is a transport address which iscandidate obtained by sending a STUN request from alocal transport address. The derived transport address is related to the associated local transport address in that packets senthost candidate to thederived transport address are receivedSTUN server running onthe socket bound to its associated local transport address. Derived addresses are obtained using protocols like STUN, and more generally, any UNSAF protocol [20]. Reflexive Transport Address: As defined in [12],aderived transport address learnedpeer's candidate. Relayed Candidate: A candidate obtained by sending a STUN Allocate request from aclient which identifies that client as seen by anotherhoston an IP network, typicallycandidate to a STUN server.When thereThe relayed candidate isan intervening NAT betweenresident on theclientSTUN server, and theother host,STUN server relays packets back towards the agent. Translation: The translation of a relayed candidate is thereflexivetransport addressrepresentsthat thebinding allocated torelay will forward a packet to, when one is received at theclient onrelayed candidate. For relayed candidates learned through thepublic sideSTUN Allocate request, the translation of theNAT. Reflexive transport addresses are learned fromrelayed candidate is the server reflexive candidate returned by theXOR-MAPPED-ADDRESS attribute in STUN Binding Responses andAllocateResponses [13]. Server Reflexive Transport Address: Aresponse. Base: The base of a server reflexivetransport addresscandidate is the host candidate from which it was derived. A host candidate is also said to have areflexive addressbase, equal to thatis reflected offcandidate itself. Similarly, the base of aserver, distinct from the peer, whose address is configured or learned by the client prior to an offer/answer exchange. Peer Reflexive Transport Address: A peer reflexive transport addressrelayed candidate is that candidate itself. Foundation: Each candidate has areflexive addressfoundation, which is an identifier that isreflected offdistinct for two candidates that have different types, different interface IP addresses for their base, and different IP addresses for their STUN servers. Two candidates have the same foundation when they are of thepeer. Peersame type, their bases have the same IP address, and, for server reflexivetransport addressesor relayed candidates, they come from the same STUN server. Foundations arelearned by connectivity checks. Relayed Transport Address: A derived transport addressused to correlate candidates, so thatterminates on a server, andwhen one candidate isforwarded towards the client. The STUN Allocate Request, defined as part offound to be valid, candidates sharing theSTUN relay usage [13]same foundation can beusedtested next, as they are likely toobtain a relayed transport address, for example. Associatedalso be valid. LocalTransport Address: When a peer sendsCandidate: A candidate that an agent has obtained and included in an offer or answer it sent. Remote Candidate: A candidate that an agent received in an offer or answer from its peer. In-Use Candidate: A candidate is in-use when it appears in the m/c- line of an active media stream. Candidate Pair: A pairing containing apacket tolocal candidate and atransport address,remote candidate. Check: A candidate pair where theassociatedlocaltransport addresscandidate isthe locala transport addressatfrom whichthose packets will actually arrive. Foran agent can send alocal transport address, its associated local transport address is the same as the local transport address itself. For reflexive and relayed transport addresses, however, they are not the same. The associated local transport address is the one from which the reflexive or relayed transport was derived. Candidate: A sequenceSTUN connectivity check. Check List: An ordered set oftransport addressesSTUN checks thatformanatomic set for usage withagent is to generate towards aparticular media session. Here, atomic meanspeer. Periodic Check: A connectivity check generated by an agent as a consequence of a timer thatallfires periodically, instructing it to send a check. Triggered Check: A connectivity check generated as a consequence oftransport addresses inthecandidate need to work beforereceipt of a connectivity check from the peer. Valid List: An ordered set of candidatewill be used for actual media transport. Inpairs that have been validated by a successful STUN transaction. 4. Sending thecase of RTP, there can be one or more transport addresses per candidate.Initial Offer In order to send themost common case, there are two - oneinitial offer in an offer/answer exchange, an agent must gather candidates, priorize them, choose ones forRTP,inclusion in the m/c-line, andanother for RTCP. Ifthen formulate and send theagent doesn't use RTCP, there would be just one. If Generic Forward Error Correction (FEC) [16]SDP. Each of these steps is described inuse, there may be more than two. The transport addressesthe subsections below. 4.1. Gathering Candidates An agent gathers candidates when it believes thatcomposecommunications is imminent. An offerer can do this based on acandidate are all of the same type - local, server reflexive, peer reflexiveuser interface cue, orrelayed. Local Candidate: A candidate whose transport addresses are local transport addresses. Server Reflexive Candidate: Abased on an explicit request to initiate a session. Every candidatewhoseis an IP address and port (also known as a transportaddressesaddress). It also has a type and a base. Three types are defined and gathered by this specification - host candidates, server reflexivetransport addresses. Peer Reflexive Candidate: A candidate whose transport addresses are peer reflexive transport addresses. Relayed Candidate: A candidate whose transport addresses arecandidates, and relayedtransport addresses. Generating Candidate:candidates. Thecandidate from whichbase of apeer reflexivecandidate isderived. Operating Candidate: Thecandidate thatis in use for exchange of media. This is the one thatan agentplaces in the m/c line ofmust send from when using that candidate. The first step is to gather host candidates. Host candidates are obtained by binding to ports (typically ephemeral) on anofferinterface (physical oranswer. Candidate ID: An identifiervirtual, including VPN interfaces) on the host. The process fora candidate. Component: When agathering host candidates depends on the transport protocol. Procedures are specified here for UDP. For each UDP mediastream, and as a consequence, its candidate, require several IP addresses and portsstream the agent wishes towork atomically, each ofuse, theconstituent IP addresses and ports representsagent SHOULD obtain a candidate for each component ofthat media stream. For example, RTP-basedthe mediastreams typically have two components - one for RTP, and one for RTCP. Component ID: An integer, starting with one withinstream on each interface that the host has. It obtains each candidateand incrementingbyone for each component, which identifies the component. Transport Address ID (tid): An identifier forbinding to atransport address, formed by concatenatingUDP port on the specific interface. A host candidateID with the component ID, separated by a "colon". Candidate Pair: The combination of a candidate from one agent along(and indeed every candidate) is always associated with acandidate from its peer. Native Candidate: From the perspective of each agent, the candidate in a candidate pairspecific component for whichrepresentsit is aset of addresses obtained by that agent. Remote Candidate: Fromcandidate. Each component has an ID assigned to it, called theperspective of each agent,component ID. For RTP-based media streams, thecandidate inRTP itself has acandidate pair which represents the set of addresses obtained by that agents peer. Transport Address Pair: The combination of the transport address for onecomponent ID of 1, and RTCP acandidate with the transport address of the samecomponentfor the matching candidate in a candidate pair. Transport Address Pair ID: An identifier for a transport address pair. Formed by concatenating the native transport addressIDwith the remote transport address ID, separated by a "colon". Matching Transport Address Pair: When a STUN Binding Request is received on a local transport address, the matching transport address pair is the transport address pair whose connectivity is being checked by that Binding Request. Candidate Pair Priority Ordering: An orderingofcandidate pairs based on2. If an agent is using RTCP it MUST obtain acombination of the qvalues of eachcandidate for it. If an agent is using both RTP andthe candidate IDs ofRTCP, it would end up with 2*K host candidates if an agent has K interfaces. The base for eachcandidate. Candidate Pair Check Ordering: An ordering ofhost candidatepairs thatissimilarset to the candidatepair priority ordering, except thatitself. Once theoperating candidate appears at the top ofagent has obtained host candidates, it obtains server reflexive and relayed candidates. The process for gathering server reflexive and relayed candidates depends on thelist, regardless of its priority. Transport Address Pair Check Ordering: An ordering oftransportaddress pairs that determines the sequence of connectivity checks performedprotocol. Procedures are specified here forthe pairs. Transport Address Pair Count:UDP. Agents which serve end users directly, such softphones, hardphones, terminal adapters and so on, SHOULD obtain relayed candidates and MUST obtain server reflexive candidates. Thenumber of transport address pairs in a candidate pair. Thisrequirement to obtain relayed candidates isequalat SHOULD strength tothe minimum of the number of transport addresses in the native candidateallow for provider variation. If they are not used, it is RECOMMENDED that it be implemented andthe number of transport addressesjust disabled through configuration, so that it can re-enabled through configuration if conditions change in theremote candidate. 4. Sendingfuture. Agents which represent network servers under theInitial Offer When an agent wishes to begincontrol of asession by sending an initial offer, it starts by gathering transport addresses,service provider, such asdescribedgateways to the telephone network, media servers, or conferencing servers that are targeted at deployment only inSection 7.1. This will produce a set of candidates, including local ones,networks with public IP addresses MAY skip obtaining server reflexiveones,and relayedones.candidates. The agent next pairs each host candidate with the STUN server with which it is configured or has discovered by some means. Thisprocessspecification only considers usage ofgathering candidates can actually happen at any time before sendinga single STUN server. Every Ta seconds, theinitial offer. Aagentcan pre-gather transport addresses, usingchooses another such pair (the order is inconsequential), and sends auser interface cue (such as picking upSTUN request to thephone, or entry into an address book) as a hintserver from thatcommunicationshost candidate. If the agent isimminent. Doing so eliminates any additional perceivable call setup delays due to address gathering. When it comes time to offer communications,using both relayed and server reflexive candidates, this request MUST be a STUN Allocate request from the relay usage [12]. If the agentdeterminesis using only server reflexive candidates, the request MUST be apriority for each candidate and identifiesSTUN Binding request using theoperating candidate that willbinding discovery usage [11]. The value of Ta SHOULD beused for receiptconfigurable, and SHOULD have a default ofmedia, as described in Section 7.2. The next step is50ms. Note that this pacing applies only toconstruct the offer message. For each media stream, it places its candidates into a=candidate attributes in the offerstarting STUN transactions with source andputs its operating candidate intodestination transport addresses (i.e., them/c line. The processhost candidate and STUN server respectively) fordoing this is described in Section 7.3. The offer is thenwhich a STUN transaction has not previously been sent.5. ReceiptConsequently, retransmissions of a STUN request are governed entirely by theOfferretransmission rules defined in [11]. Similarly, retries of a request due to recoverable errors (such as an authentication challenge) happen immediately andGenerationare not paced by timer Ta. Because ofthe Answer Upon receiptthis pacing, it will take a certain amount of time to obtain all of theoffer message, the agent checks if the offer contains any a=candidate attributes. If the offer does, the offerer supports ICE. In that case, it starts gathering candidates, as described in Section 7.1,server reflexive andprioritizes them as described in Section 7.2. This processing is done immediately on receiptrelayed candidates. Implementations should be aware of theoffer,time required toprepare for the case wheredo this, and if theuser should acceptapplication requires a time budget, limit thecall, or early media needs to be generated. By gatheringamount of candidates(and performing connectivity checks) while the user is being alerted to the request for communications, session establishment delayswhich arereduced. The agent then constructs its answer, encoding its candidates into a=candidate attributes and includinggathered. An Allocate Response will provide theoperating one inclient with a server reflexive candidate (obtained from them/c- line, as described in Section 7.3. The agent then formsmapped address) and a relayed candidatepairs as described in Section 7.4. These are ordered as described in Section 7.5. The agent then begins connectivity checks, as describedinSection 7.6. It followsthelogic in Section 7.10 on receipt ofRELAY-ADDRESS attribute. A BindingRequests and responses to learn new candidatesResponse will provide the client with a only server reflexive candidate (also obtained from thechecks themselves. Transmissionmapped address). The base ofmediathe server reflexive candidate isperformed according totheprocedures in Section 7.13. 6. Processinghost candidate from which theAnswer There are two possible cases for processingAllocate or Binding request was sent. The base of a relayed candidate is that candidate itself. A server reflexive candidate obtained from an Allocate response is theanswer. Ifcalled theanswerer did not support ICE,"translation" of theanswer will not contain any a=candidate attributes. As a result, the offerer knows that it cannot perform its connectivity checks. In this case, it proceeds with normal media processing as if ICE was not in use. However, it SHOULD send media with the symmetric property described in Section 7.13, and follow the keepalive procedures in Section 7.12. If the answer contains candidates, it implies that the answerer supports ICE. The offerer then formsrelayed candidatepairs as described in Section 7.4. These are ordered as described in Section 7.5.obtained from the same response. The agentthen begins connectivity checks, as described in Section 7.6. It follows the logic in Section 7.10 on receipt of Binding Requests and responseswill need tolearn new candidates fromremember thechecks themselves. Transmission of media is performed according totranslation for theprocedures in Section 7.13. 7. Common Procedures This section discusses procedures that are common between offerer and answerer. 7.1. Gathering Candidates An agent gathers candidates whenrelayed candidate, since itbelieves that communicationsisimminent. For offerers, this occurs before sending an offer (Section 4). For answerers, it occurs before sending an answer (Section 5). Eachplaced into the SDP. If a relayed candidatehas one or more components, each of whichisassociated withidentical to asequence number, starting at 1 forhost candidate (which can happen in rare cases), thefirst componentrelayed candidate MUST be discarded. Proper operation of ICE depends on each base being unique. Next, redundant candidates are eliminated. A candidate is redundant if its transport address equals another candidate, andincrementing by 1 for each additional component withinits base equals the base of that other candidate.These components represent a set ofNote that two candidates can have the same transportaddresses for which connectivity mustaddress yet have different bases, and these would not bevalidated. Forconsidered redundant. Finally, each candidate is assigned aparticular media stream, all of thefoundation. The foundation is an identifier, scoped within a session. Two candidatesSHOULDMUST have the samenumber of components. The number of components that are neededfoundation ID when they area function of the type of media stream. All of the components in a candidate MUST beof the same type- server reflexive,(host, relayed,or local, and obtained from the same server in the case ofserver reflexive, peer reflexive orrelayed candidates. For local candidates, each component MUST be obtained fromrelayed), their bases have the sameinterface. For serverIP address (the ports can be different), and, for reflexive and relayed candidates,each component MUST be derived from a component withthe STUN servers used to obtain them have the samecomponent ID, allIP address. Similarly, two candidates MUST have different foundations if their types are different, their bases have different IP addresses, or the STUN servers used to obtain them have different IP addresses. 4.2. Prioritizing Candidates The prioritization process results in the assignment ofwhich come fromasingle localpriority to each candidate.For traditional RTP-based media streams, it is RECOMMENDED that there be two components per candidate - oneAn agent does this by determining a preference forRTPeach type of candidate (server reflexive, per reflexive, relayed andonehost), and, when the agent is multihomed, choosing a preference forRTCP. The component withits interfaces. These two preferences are then combined to compute thecomponent ID of 1priority for a candidate. That priority MUST beRTP, andcomputed using theone withfollowing formula: priority = 1000*(type preference) + 100*(local preference) + 10*(stream ID) + 1*(10 - componentID of 2ID) The type preference MUST beRTCP. Ifanagent doesn't implement RTCP, it SHOULD have a single componentinteger from 0 to 9 inclusive, and represents the preference for theRTP stream (which will have a component ID of 1 by definition). Each componenttype ofathe candidatehas(where the types are local, server reflexive, peer reflexive and relayed). A 9 is the highest preference, and asingle transport address. The first step0 is the lowest. Setting the value togather local candidates. Locala 0 means that candidatesare obtained by binding to ports (typically ephemeral) on an interface (physical or virtual, including VPN interfaces) on the host.of this type will only be used as a last resort. Theprocesstype preference MUST be identical forgathering localall candidatesdepends onof thetransport protocol. Procedures are specified heresame type and MUST be different forUDP. Extensions to ICE that define procedurescandidates of different types. The type preference forother transport protocolspeer reflexive candidates MUSTspecify how local transport addresses are gathered. For each UDP media stream the agent wishes to use,be lower than that of server reflexive candidates. Note that candidates gathered based on theagent SHOULD obtain a setprocedures of Section 4.1 will never be peer reflexive candidates; candidates(one for each interface)of these type are learned from the STUN connectivity checks performed bybinding to N UDP ports on each interface, where NICE. The component ID is thenumber of components neededcomponent ID for thecandidate. For RTP, Ncandidate, and MUST be between 1 and 10 inclusive. The stream ID istypically two. If a host has K local interfaces, this will result in K candidatesan integer, starting at 9, that decrements by one for eachUDP stream, requiring K*N local transport addresses. Oncemedia stream in theagent has obtained local candidates, it obtains candidates with derived transport addresses. The process for gathering derived candidates depends onsession. When signaled in thetransport protocol. Procedures are specified here for UDP. Extensions to ICE that define procedures for other transport protocols MUST specify how derived transport addresses are gathered. Agents which serve end users directly, such as softphones, hardphones, terminal adaptersSDP, the first m-line is the one with stream ID 9, the next with stream ID 8, the next with stream ID 7, and soon, MUST implementon. In essence, theSTUN Binding Discovery usage and SHOULD use it to obtain server reflexive candidates. These devices SHOULD implementstream ID indicates theSTUN Relay usage, and SHOULD use its Allocate request to obtain both server reflexive and relayed candidates. They MAY implement and MAY use other protocolsposition of thatprovide derived transport addresses, such as TEREDO [29].media stream in the SDP itself. Therequirementstream ID MUST be less than or equal touse the relay Usage is at SHOULD strength9, and therefore ICE only works with multimedia sessions with 10 or fewer media streams. The local preference MUST be an integer from 0 toallow9 inclusive. It represents a preference forprovider variation. If it is not to be used, it is RECOMMENDED that it be implemented and just disabled through configuration, so that it can re-enabled through configuration if conditions change inthefuture. Agentsparticular interface from whichrepresent network servers underthecontrol ofcandidate was obtained, in cases where an agent is multihomed. A nine represents the highest preference, and aservice provider, such as gateways tozero, thetelephone network, media servers, or conferencing servers that are targeted at deploymentlowest. When there is onlyin networks with public IP addresses MAY use the STUN Binding Discovery usage and relay usage, or other similar protocols to obtain candidates. Why would these types of endpoints even bothera single interface, this value SHOULD be set toimplement ICE? The answer is that such an implementation greatly facilitates NAT traversalnine. Generally speaking, if there are multiple candidates forclients that connect to it. Consider a PC softphone behind a NAT whose mapping policy is address and port dependent. The softphone initiatesacall throughparticular component for agateway that implements ICE. The gateway doesn't obtain any server reflexive or relayed transport addresses, but it implements ICE, and consequently, is prepared to receive STUN connectivity checks on itsparticular media stream which have the same type, the localtransport addresses. The softphone will send a STUN connectivity to check topreference MUST be unique for each one. In this specification, this only happens for multi-homed hosts. These rules guarantee thatlocal transport address, causing the NAT to allocatethere is anew bindingunique priority forthe softphone. The connectivity checkeach candidate. This priority willinform the softphone of this address, allowing it tobe used by ICE to determine thegateway as a peer reflexive remote candidate. This allows direct media transmission between the gateway and softphone, without the need for relays. Furthermore, implementationorder of theSTUNconnectivity checksallows for NAT bindings alongand theway to be kept open. ICE also provides numerous security properties thatrelative preference for candidates. Consequently, what follows areindependent of NAT traversal, and would benefit any multimedia endpoint. See Section 13some guidelines fora discussion onselection of thesebenefits. Obtaining derived candidates requires transmissionvalues. One criteria for selection ofpackets which havetheeffect of creating bindings on NAT devices between the clienttype and local preference values is theSTUN servers. Experience has shown that many NAT devices have upper limits on the rate at which they will create new bindings. Furthermore, transmission of these packets on the network makesuse ofbandwidth and needsan intermediary. That is, if media is sent tobe rate limited by the agent. As a consequence, a client SHOULD pace its STUN transactions, suchthat candidate, will thestart of each new transaction occurs at least Ta seconds after the startmedia first transit an intermediate server before being received. Relayed candidates are clearly one type of candidates that involve an intermediary. Another are host candidates obtained from a VPN interface. When media is transited through an intermediary, it can increase theprevious transaction. The value of Ta SHOULD be configurable,latency between transmission andSHOULD have a defaultreception. It can increase the packet losses, because of50ms. Notethe additional router hops thatthis pacing applies only tomay be taken. It may increase thestartcost ofa new transaction; pacingproviding service, since media will be routed in and right back out ofretransmissions within a STUN transaction is governedan intermediary run by theretransmission rules defined by STUN. Derivedprovider. If these concerns are important, the type preference for relayed candidates can beobtained from the STUN Binding Discovery usage orset lower than theSTUN Relay usage. The latter is preferred sincetype preference for reflexive and host candidates. Indeed, itwill provide the client with bothis RECOMMENDED that in this case, host candidates have a type preference of nine, server reflexiveandcandidates have arelayed transport address with a single transaction. It is possible that some STUN servers will only support the Relay usage or only the Binding Discovery usage, in which case a client might be configured with different servers depending on the usage. To obtain both servertype preference of 5, peer reflexive have a type prefence of 6, and relayed candidatesusing the STUN Relay Usage, the client takeshave alocal UDP candidate,type preference of zero. Furthermore, if an agent is multi-homed and has multiple interfaces, the local preference foreach configured STUN server, produces both candidates. It is anticipated that clients mayhost candidates from a VPN interface SHOULD have amultiplicitypriority ofSTUN servers configured or discovered in network environments where there are multiple layers0. Another criteria for selection ofNAT,preferences is IP address family. ICE works with both IPv4 andwhereIPv6. It therefore provides a transition mechanism thatlayering is knownallows dual-stack hosts to prefer connectivity over IPv6, but to fall back to IPv4 in case theprovider of the client. To obtain these candidates,v6 networks are disconnected (due, foreach configured STUN server, the client initiates an Allocate Request transaction using the procedures of Section 8.1.2 of [13] from each transportexample, to a failure in a 6to4 relay) [22]. It can also help with hosts that have both a native IPv6 addressofand aparticular6to4 address. In such a case, lower localcandidate. The Allocate Response will providepreferences could be assigned to theclient with its server reflexive transport address (obtained fromv6 interface, followed by theXOR-MAPPED- ADDRESS attribute) and its relayed transport address in6to4 interfaces, followed by theRELAY- ADDRESS attribute. Indeed, these two transport addresses are relatedv4 interfaces. This allows a site toeach other. The relay will forward packets received on the relayed transport address towardsobtain and begin using native v6 addresses immediately, yet still fallback to 6to4 addresses when communicating with agents in other sites thatserver reflexive transport address. As such, the server reflexive transport addressdo not yet have native v6 connectivity. Another criteria for selecting preferences issaidsecurity. If a user is a telecommuter, and therefore connected to their corporate network and a local home network, they may prefer their voice traffic to be routed over theassociated server reflexive transport address for that relayed address. OnceVPN in order to keep it on theAllocate requests have givencorporate network when communicating within the enterprise, but use the local network when communicating with users outside of the enterprise. In such aclientcase, arelayed transport address for all transport addresses inVPN interface would have arelayed candidate, therehigher local preference than any other interfaces. Another criteria for selecting preferences isno reasontopological awareness. This is most useful fora client to obtain further relayedcandidatesthrough the same STUN server. Thus,that make use of relays. In those cases, ifthere are other local candidates from which the clientan agent hasnot yet obtained relayed transport address,preconfigured or dynamically discovered knowledge of theclient SHOULD NOT bothertopological proximity of the relays toobtain them. Instead,itself, itSHOULDcan usethe STUN Binding Discovery usage and obtain just server reflexive addresses fromthatSTUN server. The order in whichto assign higher localcandidates are tried against the STUN serverpreferences toobtain relayedcandidatesis a matter of local policy. To obtain server reflexiveobtained from closer relays. There may be transport-specific reasons for assigning preferences to candidates. In such a case, specifications defining usage of ICE with other transport protocols SHOULD document such considerations. 4.3. Choosing In-Use Candidates A candidate is said to be "in-use" if it appears in the m/c-line of an offer or answer. When communicating with an ICE peer, being in- use implies that, should these candidatesusingbe selected by theSTUN Binding Discovery usage,ICE algorithm, bidirectional media can flow and theclient takescandidates can be used. If alocal UDP candidate,candidate is selected by ICE but is not in-use, only unidirectional media can flow and only foreach configured STUN server, producesaserver reflexive candidate. To producebrief time; theserver reflexivecandidatefrom the local candidate, it follows the procedures of Section 12.2 of [12] for each local transport address in the local candidate. The Binding Response will provide the clientmust be made in-use through an updated offer/answer exchange. When communicating withits server reflexive transport address. Ifa peer that is not ICE-aware, theclient had K local candidates, thisin-use candidates willproduce S*K server reflexive candidates, where S isbe used exclusively for thenumberexchange ofSTUN servers. Since a client will pace its STUN transactions (both Binding and Allocate requests) atmedia, as defined in normal offer/answer procedures. An agent MUST choose atotal rateset of candidates, onenew transaction every Ta seconds, it will take a certain amountfor each component oftimeeach active media stream, tocompletebe in-use. A media stream is active if it does not contain theaddress gathering phase.a=inactive SDP attribute. It is RECOMMENDED thatimplementations have a configurable upper bound on the total amount of time allotted to address gathering. Any transactions not completed at that point SHOULD be abandoned, but MAY continue and be used in an updated offer once they complete. A default value of 5s is RECOMMENDED. Since the total number of allocations that couldin-use candidates bedone (basedchosen based on thenumberlikelihood ofSTUN servers and local interfaces) might exceed this value, clients SHOULD prioritize their local candidates and STUN servers, performing transactions from the highest priority localthose candidates tothe highest priority STUN servers first. A STUN server would typically be higher priority if it supports the STUN Relay Usage, since such a server provides two transport addresseswork withone transaction. Oncetheallocations are complete, any redundant candidates are discarded. Candidate Apeer that isredundant with candidate B if the transport addresses of each component match, and each component of their associated localbeing contacted. Unfortunately, it is difficult to ascertain which candidatesmatch. Forthat might be. As an example, consider aset ofuser within an enterprise. To reach non-ICE capable agents within the enterprise, host candidateswithhave to be used, since the enterprise policies may prevent communication between elements using asingle component. One candidate isrelay on the public network. However, when communicating to peers outside of the enterprise, relayed candidates from alocal candidate, and itspublically accessible STUN server are needed. Indeed, the difficulty in picking just onecomponent has a transport address of 10.0.1.1: 4458. A reflexivetransport address that will work isderived fromthe whole problem that motivated the development of thislocal transport address, producing a 10.0.1.1:4458. These two candidates are identical, and also have identical associated local transport addresses, so they are redundant. +----------+ | STUN Srvr| +----------+ | | ----- // \\ | | | B:net10 | | | \\ // ----- | | +----------+ | NAT | +----------+ | | ----- // \\ | A | |192.168/16 | | | \\ // ----- | | |192.168.1.1 ----- +----------+ // \\ +----------+ | | | | | | | Offerer |---------| C:net10 |---------| Answerer | | |10.0.1.1 | | 10.0.1.2 | | +----------+ \\ // +----------+ ----- Figure 6 Consider the more complicated case of Figure 6. In this case,specification in theoffererfirst place. As such, it ismulti-homed. It has one interface, 10.0.1.1, on network C, whichRECOMMENDED that relayed candidates be selected to be in-use. Furthermore, ICE isa net 10 private network. The Answereronly truly effective when it is supported onthis same network. The offerer is also connected to network A, which is 192.168/16. The offerer has an interfaceboth sides of192.168.1.1 on this network. Therethe session. It is therefore most prudent to deploy it to close-knit communities as aNAT onwhole, rather than piecemeal. In the example above, thisnetwork, natting into network B, which is another net10 private network, but not connectedwould mean that ICE would ideally be deployed completely within the enterprise, rather than just tonetwork C.parts of it. Thereismay be transport-specific reasons for selection of an in-use candidate. In such aSTUN server on network B. The offerer obtains localcase, specifications defining usage of ICE with other transportaddress on its interface on network C (10.0.1.1:2498) and a local transport address on its interface on network A (192.168.1.1:3344). It performs a STUN query to its configured STUN server from 192.168.1.1:3344. This query passes through the NAT, which happens to assignprotocols SHOULD document such considerations. 4.4. Encoding thebinding 10.0.1.1: 2498.SDP TheSTUN server reflects thisagent includes a single a=candidate media level attribute in theSTUN Binding Response. Now, the offerer has obtained aSDP for each candidatewith afor that media stream. The a=candidate attribute contains the IP address, port and transportaddress it already has (10.0.1.1:2498), but fromprotocol for that candidate. A Fully Qualified Domain Name (FQDN) for anew interface. It therefore keeps it. When it performs its connectivity checks,host MAY be used in place of a unicast address. In that case, when receiving an offer or answer containing an FQDN in an a=candidate attribute, theofferer will endFQDN is looked upsending packets from both interfaces, and those sent from its interface on network C will succeed. 7.2. Prioritizingin theCandidates and ChoosingDNS using anOperating One The prioritization process takesA or AAAA record, and theset of candidatesresulting IP address is used fora particular media stream and associates each with a priority. This priority reflectsthedesireremainder of ICE processing. The candidate attribute also includes the component ID for that candidate. For media streams based on RTP, candidates for theagent has to receiveactual RTP mediaat that candidate, and is assigned as a value from 0 to 1 (1 being most preferred). Priorities areMUST have apropertycomponent ID ofa candidate,1, andthus shared across all components ofcandidates for RTCP MUST have acandidate. Priorities are ordinal, so that their significance is only meaningful relativecomponent ID of 2. Other types of media streams which require multiple components MUST develop specifications which define the mapping of components toother candidates from that agentcomponent IDs. The candidate attribute also includes the priority, which is the value determined fora particular media stream. Candidates MAY havethesame priority. However, itcandidate as described in Section 4.2, and the foundation, which isRECOMMENDED that eachthe value determined for the candidatehaveas described in Section 4.1. The agent SHOULD include adistinct priority. Doing so improvestype for each candidate by populating theefficiency of ICE. This specification makes no normative statements on howcandidate-types production with theprioritization is done. However, some useful guidelines are suggested on how suchappropriate value - "host" for host candidates, "srflx" for server reflexive candidates, "prflx" for peer reflexive candidates (though these never appear in an initial offer/answer exchange), and "relay" for relayed candidates. The related address MUST NOT be included if aprioritization cantype was not included. If a type was included, the related address SHOULD bedetermined. One criteriapresent forchoosing oneserver reflexive, peer reflexive and relayed candidates. If a candidateover anotheriswhetherserver ornotpeer reflexive, the related address is equal to the base for that server or peer reflexive candidate. If the candidateinvolvesis relayed, theuse of an intermediary. That is, if mediarelated address issentequal tothat candidate, willthemedia first transit an intermediate server before being received. Relayed candidates are clearly one typetranslation ofcandidates that involve an intermediary. Another are local candidates associated withthe relayed address. If the candidiate is aVPN server. When mediahost candidate, there istransited through an intermediary, it can increase the latency between transmissionno related address andreception. It can increase the packet losses, because oftheadditional router hops that mayrel-addr production MUST betaken. It may increase the costomitted. STUN connectivity checks between agents make use ofproviding service, since media will be routeda short term credential that is exchanged inand right back out of an intermediary run bytheprovider. If these concerns are important, candidates withoffer/answer process. The username part of thisproperty can be listed with lower priority. Another criteria for choosing one candidate over anothercredential isIP address family. ICE works with both IPv4 and IPv6. It thereforeformed by concatenating a username fragment from each agent, separated by a colon. Each agent also provides atransition mechanism that allows dual-stack hosts to prefer connectivity over IPv6, but to fall backpassword, used toIPv4 in casecompute thev6 networks are disconnected (due,message integrity forexample, to a failure in a 6to4 relay) [23]. Itrequests it receives. As such, an SDP MUST contain the ice-ufrag and ice-pwd attributes, containing the username fragment and password respectively. These canalso help with hosts that have both a native IPv6 addressbe either session or media level attributes, and thus common across all candidates for all media streams, or all candidates for a6to4 address. In such a case, higher priority couldparticular media stream, respectively. However, if two media streams have identical ice-ufrag's, they MUST have identical ice-pwd's. The ice-ufrag and ice-pwd attributes MUST beafforded to the native v6 address, followed bychosen randomly at the6to4 address, followed by a native v4 address.beginning of a session. The ice-ufrag attribute MUST contain at least 24 bits of randomness, and the ice- pwd attribute MUST contain at least 128 bits of randomness. This means that the ice-ufrag attribute will be at least 4 characters long, and the ice-pwd at least 22 characters long, since the grammar for these attributes allowsa site to obtainfor 6 bits of randomness per character. The attributes MAY be longer than 4 andbegin using native v6 addresses immediately, yet still fallback to 6to4 addresses when communicating21 characters respectively, of course. The m/c-line is populated withagents in other sitesthe candidates thatdo not yet have native v6 connectivity. Another criteria for choosing oneare in-use. For streams based on RTP, this is done by placing the RTP candidateover anotherinto the m and c lines respectively. If the agent issecurity.utilizing RTCP, it MUST encode the RTCP candidate into the m/c-line using the a=rtcp attribute as defined in RFC 3605 [2]. Ifa userRTCP isa telecommuter, and therefore connected to their corporate networknot in use, the agent MUST signal that using b=RS:0 anda local home network, they may prefer their voice traffic tob=RR:0 as defined in RFC 3556 [5]. There MUST berouted overa candidate attribute for each component of theVPNmedia stream inorderthe m/c-line. Once an offer or answer are sent, an agent MUST be prepared tokeep itreceive both STUN and media packets on each candidate. As discussed in Section 11.1, media packets can be sent to a candidate prior to its appearence in thecorporate network when communicating within the enterprise, but usem/c-line. 5. Receiving thelocal network when communicating with users outside ofInitial Offer When an agent receives an initial offer, it will check if theenterprise. Another criteria for choosingofferor supports ICE, gather candidates, prioritize them, choose oneaddress over another is topological awareness. This is most usefulforcandidates that make use of relays. In those cases, ifin- use, encode and send an answer, and then form a check list and begin connectivity checks. 5.1. Verifying ICE Support The agenthas preconfigured or dynamically discovered knowledge ofwill proceed with thetopological proximity ofICE procedures defined in this specification if therelays to itself, it can use that to select closer relays with higher priority.following are both true: o Theremay be transport-specific reasonsis at least one a=candidate attribute forpreferringeach media stream in the SDP it just received. o For each media stream, at least onecandidate over another. In such a case, specifications defining usageofICE with other transport protocols SHOULD document such considerations. Oncethe candidateshave been prioritized, one may be selected as the operating one. Thisisthe candidate that will be useda match foractual exchange of media if and whenitsvalidated, until a higher priority candidate is validated. The operating candidate will also be used to receive media from ICE-unaware peers. As such, it is RECOMMENDED that one be chosenrespective in-use component in the m/c-line. If both of these conditions are not met, the agent MUST process the SDP based on normal RFC 3264 procedures, without using any of thelikelihoodICE mechanisms described in the remainder ofthat candidate to workthis specification, with thepeer that is being contacted. Unfortunately, itexception of Section 10, which describes keepalive procedures. 5.2. Gathering Candidates The process for gathering candidates at the answerer isdifficultidentical toascertain which candidatethe process for the offerer as described in Section 4.1. It is RECOMMENDED thatmight be. As an example, consider athis process begin immediately on receipt of the offer, prior to userwithinacceptance of a session. Such gathering MAY even be done pre-emptively when anenterprise. To reach non-ICE capable agents withinagent starts. 5.3. Prioritizing Candidates The process for prioritizing candidates at theenterprise, a local candidate hasanswerer is identical tobe used, sincetheenterprise policies may prevent communication between elements using a relay onprocess followed by thepublic network. However, when communicating to peers outside ofofferer, as described in Section 4.2. 5.4. Choosing In Use Candidates The process for selecting in-use candidates at theenterprise, a relayed candidate from a publically accessible STUN serveranswerer isneeded. Indeed,identical to thedifficultyprocess followed by the offerer, as described inpicking just one address that will workSection 4.3. 5.5. Encoding the SDP The process for encoding the SDP at the answerer is identical to thewhole problem that motivatedprocess followed by thedevelopment of this specificationofferer, as described in Section 4.4. 5.6. Forming thefirst place. As such, itCheck List Next, the agent forms the check list. The check list isRECOMMENDEDa sequence of STUN connectivity checks that are performed by theoperatingagent. To form the check list, the agent forms candidatebepairs, computes arelayedcandidatefrom a STUN server providing public IP addressespair priority, orders the pairs by priority, prunes them, and sets their states. These steps are described inresponse to an Allocate request. Furthermore, ICE is only truly effective when it is supported on both sides of the session. It is therefore most prudent to deploy it to close-knit communities as a whole, rather than piecemeal. In the example above,thiswould mean that ICE would ideally be deployed completely withinsection. First, theenterprise, rather than just to parts of it. An additional consideration for selectionagent takes each of its candidates (called local candidates) and pairs them with theoperatingcandidates it received from its peer (called remote candidates). A local candidate is paired with a remote candidate if and only if theswitching oftwo candidates are for the same mediastream destinations betweenstream, have theinitial offersame component ID, and have thesubsequent offer. The operating candidate pair in the initial offersame IP address version. It isvalidated first, and ifpossible thatvalidation succeeds, media will immediately begin to flow between the pair. Whensome of theICE checks complete and yieldlocal candidates don't get paired with ahigher priority candidate pair, media will begin to flow to it (there will also be an updated offer/answer exchange that changesremote candidate, and some of theoperating candidate).remote candidates don't get paired with local candidates. Thiswill result in a change incan happen if one agent didn't include candidates for thedestinationall of themedia packets. This may also cause a different pathcomponents forthea mediapackets. That path might have different delaystream. In the case of RTP, for example, this would happen when one agent provided candidates for RTCP, andjitter characteristics. As a consequence,thejitter buffers may see a glitch, causing possible media artifacts.other did not. Ifthese issues are a concern,this happens, theinitial offer MAY omit an operating candidate. Thisnumber of components for that media stream isdone by including an m/c-line with an a=inactive attribute. In such a case, an updated offer will needeffectively reduced, and considered to besent immediately when communicating with an ICE-unaware agent, setting an operating candidate. There may be transport-specific reasons for selection of an operating candidate. In such a case, specifications defining usageequal to the minimum across both agents ofICE with other transport protocols SHOULD document such considerations. 7.3. Encoding Candidates into SDP For each candidate for a media stream,the maximum component ID provided by each agentincludes a series of a=candidate attributes as media-level attributes, oneacross all components foreach component inthecandidate. Each candidate has a unique identifier, calledmedia stream. Once thecandidate ID. The candidate ID MUST be chosen randomly and contain at least 24 bits of randomness. This means thatpairs are formed, a candidateID mustpair priority is computed. Let O-P beat least 4 characters long, since each character inthebase64 alphabet usedpriority for the candidateIDs contains at most 6 bits of randomness. A candidate ID MAYprovided by the offerer. Let A-P belonger than 4 characters, and different candidate IDs MAY have different lengths. It is chosen only whenthe priority for the candidateis placed intoprovided by theSDP foranswerer. Let O-IP be thefirst time; subsequent offers or answers withinIP address (without thesame session containing that same candidate MUST useport) of thesamecandidateID used previously. 24 bits is sufficient becauseprovided by thecandidate ID is not providing security (the much more random password is). Its sole purpose isofferer. Let SZ be two tomake it highly unlikely that boththeoffererpower of 32 for IPv4 candidates, andanswerer selecttwo to thesame valuepower of 128 fora candidateIPv6 candidates. The priority forthe same media stream. Different valuesa pair is computed as: pair priority = 10000*MIN(O-P,A-P) + MAX(O-P,A-P) + O-IP/SZ OPEN ISSUE: This can be larger than 32 bits. Should consider ways of reducing that. This formula ensures a unique priority forthe candidate ID are required to break tieseach pair in most cases. One theprocedure thatpriority isused to orderassigned, the agent sorts the candidatepairs. Each componentpairs in decreasing order of priority. If two pairs have identical priority, the ordering amongst them is arbitrary. This sorted list of candidatehas an identifier, called the component ID. The component IDpairs is used to determine a sequencenumber. For each candidate, it starts at one, and increments by one for each component. As discussed below, ICE will performof connectivity checkssuch that, betweenthat will be performed. Each check involves sending apair of candidates, checksrequest from a local candidate to a remote candidate. Since an agent cannot send requests directly from a reflexive candidate, but onlyoccur between transport addresses withfrom its base, thesame component ID. As a consequence, if oneagent next goes through the sorted list of candidate pairs. For each pair where the local candidatehas three components, and itispaired with aserver reflexive, the server reflexive candidatethat has two, there will onlyMUST betwo transport address pairsreplaced by its base. Once this has been done, the agent MUST remove redundant pairs. A pair is redundant if its local andtwo connectivity checks. ICE will work without a standardized mapping betweenremote candidates are identical to thecomponentslocal and remote candidates of amedia streampair higher up on the priority list. The result is called the check list, and each candidate pair on it is called a check. Each check is also said to have a foundation, which is merely thenumerical valuecombination of thecomponent ID. This allows ICE to be used with media streams with multiple components without developmentfoundations ofstandards around such a mapping. However,the local and remote candidates in the check. Finally, each check in the check list is associated with aspecific mappingstate. There are five potential values that the state can have: Waiting: This check has not beendefined in this specificationperformed, and can be performed as soon as it is the highest priority Waiting check on the check list. In-Progress: A request has been sent forRTP - component ID 1 corresponds to RTP,this check, but the transaction is in progress. Succeeded: This check was already done andcomponent ID of 2 correspondsproduced a successful result. Failed: This check was already done and failed, either never producing any response or producing an unrecoverable failure response. Frozen: This check hasn't been performed, and it can't yet be performed until some other check succeeds, allowing it toRTCP. Likemove into thecandidate ID,Waiting state. First, thecomponent ID is assigned atagent sets all of thetimechecks to the Frozen state. Then, it sets thecandidate isfirstplaced intocheck in theSDP; subsequent offers or answers withincheck list to Waiting. It then finds all of the other checks for the samesession containing that same candidate MUST usemedia stream and with the same componentID used previously. The transport, addrID, but different foundations, andportsets all ofthe a=candidate attribute (all defined in Section 12) are settheir states tothe transport protocol, unicast address and portWaiting. 5.7. Performing Periodic Checks An agent performs two types of checks. The first type are periodic checks. These checks occur periodically, and involve choosing thetranport address. A Fully Qualified Domain Name (FQDN) for a host MAY be usedhighest priority check inplacethe Waiting state from the check list, and performing it. The other type of check is called aunicast address. Intriggered check. This is a check thatcase, when receiving an offer or answer containing an FQDN in an a=candidate attribute, the FQDNislooked up inperformed on receipt of a connectivity check from theDNS using an A or AAAA record, andpeer. This section describes how periodic checks are performed. Once theresulting IP address is used foragent has computed theremainder of ICE processing. The qvaluecheck list as described in Section 5.6, it sets a timer that fires every Ta seconds. This isset to the priority of the candidate, and MUST bethe samefor all components ofvalue used to pace thecandidate.gathering of candidates, as described in Section 4.1. The first timer fires immediately, so that the agentMUST includeperforms atype forconnectivity check thetransport address by populatingmoment thecandidate-types production withoffer/answer exchange has been done, followed by theappropriate value - "local" for local transport addresses, "srflx" for server reflexive candidates, and "relay" for relayed candidates. Ifnext periodic check Ta seconds later. When thetransport address is server reflexive,timer fires, the agent MUSTincludefind therel-addr and rel-port productions containinghighest priority check in theassociated local transport address forcheck list thatserver reflexive transport address. There are environmentsis inwhichthepolicy of an agent is such that it never providesWaiting state. The agent then sends a STUN check from the localtransport addresses in its offers or answers, for fearcandidate ofrevealing internal topologythat check toexternal hosts. In such cases, an agent MAY include a random transport address instead, as long as it isthesame transport addressremote candidate of that check. The procedures forall server reflexive candidates derived fromforming thesame actual local transport address. This is becauseSTUN request for this purpose are described in Section 7.7.1. If none of thetransport addresschecks in therel-addr and rel-port productioncheck list areused byin theICE algorithm itself for correlation purposes. IfWaiting state, but there are checks in thetranport address is relayed,Frozen state, theagent SHOULD includehighest priority check in therel- addr and rel-port productions, containingFrozen state is moved into theassociated server reflexive transport address.Waiting state, and that check is performed. When arelayed addresscheck isobtained from a STUN relay, the associated server reflexive transport addressperformed, its state isthe value from the XOR-MAPPED-ADDRESS that was returnedset to In-Progress. If there are no checks in either thesame STUN response which providedWaiting or Frozen state, then timer Ta is stopped. Performing therelayed address toconnectivity check requires theagent. Though not used directly with ICE,agent to know therel-addr and rel-port attributes are essentialusername fragment forproper functioning of QoS mechanisms, such as those defined by 3gppthe local andPacketcable. The rel-addrremote candidates, andrel-port production MUST NOT be present for a local transport address. All ofthecandidates for a media stream share apasswordthat is usedforsecuringtheSTUN connectivity checks. The password will be used to processremote candidate. For periodic checks, theMESSAGE-INTEGRITY attribute for STUN requestsremote username fragment and password are learned directly from the SDP receivedbyfrom theagent. The password for candidates for different media streams MAY bepeer, and thesame, or MAY be different. This password MUST be chosen randomly with 128 bits of randomness (though it can be longer than 128 bits). This passwordlocal username fragment iscontained inknown by thea=ice-pwd attribute, present as a session or media level attribute. Since each characteragent. 6. Receipt of theice-pwd attribute can represent six bits of randomness,Initial Answer This section describes theice-pwd attribute will always be at least 22 characters long. New passwords MUST be selected for each new session, even ifprocedures that an agent follows when it receives thetransport addressanswer froma previous session is being recycled. The combination of candidate ID and component ID uniquely identify each transport address. As a consequence, each transport address has a unique identifier, called the transport address ID. The transport address ID is formed by concatenating the candidate ID withthecomponent ID, separated bypeer. It verifies that its peer supports ICE, forms thecolon (":").check list and begins performing periodic checks. 6.1. Verifying ICE Support Thetransport address ID is not explicitly encoded inofferer follows theSDP; it is derived fromsame procedures described for thecandidate ID and component ID, which are presentanswerer in Section 5.1. 6.2. Forming theSDP.Check List Theusage of the colon as a separator allowsofferer follows thecandidate ID and component ID to be extracted fromsame procedures described for thetransport address ID, sinceanswerer in Section 5.6. 6.3. Performing Periodic Checks The offerer follows thecolon is not a valid charactersame procedures described for thecandidate ID. The transport address ID gets combined, through further concatenation, with the transport address ID ofanswerer in Section 5.7. 7. Connectivity Checks This section describes how connectivity checks are performed. Connectivity checks are atransport address fromSTUN usage, and theremote candidate (separated again by another colon) to formbehaviors described here meet theusername that is placedguidelines for definitions of new usages as outlined in [11] Note that all ICE implementations are required to be compliant to [11], as opposed to the older [13]. 7.1. Applicability This STUNchecksusage provides a connectivity check betweenthe peers.two peers participating in an offer/answer exchange. Thisallows the STUN messagecheck serves touniquely identify the pairing whosevalidate a pair of candidates for usage of exchange of media. Connectivity checks also allow agents to discover reflexive candidates towards their peers, called peer reflexive candidates. Finally, connectivity checks serve to keep NAT bindings alive. It is fundamental to this STUN usage that the addresses and ports used for media are the same ones used for the Binding Requests and responses. Consequently, it will be necessary to demultiplex STUN traffic from whatever the media traffic is. This demultiplexing ischecking.done using the techniques described in [11]. 7.2. Client Discovery of Server Thetransport address IDclient does not follow the DNS-based procedures defined in [11]. Rather, the remote candidate of the check to be performed isneededused asa unique identifier becausethe IP addresswithinand port of thecandidate fails to provideSTUN server. Note thatuniqueness asthe STUN server is aconsequence of NAT. Consider agents A, B, and C. Alogical entity, andB are within private enterprise 1, which is using 10.0.0.0/8. C is within private enterprise 2, whichisalso using 10.0.0.0/8. As it turns out, B and C both have IP address 10.0.1.1. A sends an offer to C. C,not a physically distinct server inits answer, provides A with its transport addresses. Inthiscase, thatusage. 7.3. Server Determination of Usage The server is10.0.1.1:8866 and 10.0.1.1:8877. Asaware of this usage because itturns out, B is insignaled this port through the offer/answer exchange. Any STUN packets received on this port will be for the connectivity check usage. 7.4. New Requests or Indications This usage does not define any new message types. 7.5. New Attributes This usage defines asession at that same time, and is also using 10.0.1.1:8866 and 10.0.1.1:8877.new attribute, PRIORITY. Thismeansattribute indicates the priority thatBispreparedtoaccept STUN messages on those ports, just as C is. A will sendbe associated with aSTUN request to 10.0.1.1:8866 andpeer reflexive candidate, should one be discovered by this check. It is a 32 bit unsigned integer, andanother to 10.0.1.1:8877. However, these do not go to C as expected. Instead, they go to B. If B just replied to them, A would believe ithasconnectivity to C, whenan attribute type of 0x0024. 7.6. New Error Response Codes This usage does not define any new error response codes. 7.7. Client Procedures This section defines additional procedures for the Binding Request transaction, beyond those described infact it has connectivity to a completely different user, B. To fix this,[11]. 7.7.1. Sending thetransport address ID takes onRequest The agent acting as therole ofclient generates aunique identifier. C provides A with an identifier for its transport address, and A provides oneconnectivity check either periodically, or triggered. In either case, the check is generated by sending a Binding Request from a local candidate, toC. A concatenates these two identifiers (withacolon between)remote candidate. The agent must know the username fragment for both candidates andusestheresultpassword for the remote candidate. A Binding Request serving as a connectivity check MUST utilize a STUN short term credential. Rather than being learned from a Shared Secret request, theusernameshort term credential is exchanged inits STUN query to 10.0.1.1:8866. This STUN query arrives at B. However,the offer/ answer procedures. In particular, the username isunknown to B, and soformed by concatenating therequest is rejected. A treatsusername fragment provided by therejected STUN request as if there were no connectivity to C (which is actually true). Therefore,peer with theerror is avoided. An unfortunate consequenceusername fragment of thenon-uniqueness of IP addressesagent sending the request, separated by a colon (":"). The password isthat, inequal to theabove example, B might not even be an ICE agent. It could be any host, andpassword provided by theport to whichpeer. For example, consider theSTUN packet is directed could be any ephemeral port on that host. If therecase where agent A isan application listening on this socket for packets,the offerer, andit is not prepared to handle malformed packets for whatever protocolagent B isin use,theoperationanswerer. Agent A included a username fragment ofthat application could be affected. Fortunately, since the ports exchanged in SDP are ephemeralAFRAG for its candidates, andusually drawna password of APASS. Agent B provided a username fragment of BFRAG and a password of BPASS. A connectivity check from A to B (and its response of course) utilize thedynamic or registered range, the odds are good that the port is not used to runusername BFRAG:AFRAG and aserver on host B, but rather is the agent side of some protocol. This decreases the probabilitypassword ofhitting a port in-use, dueBPASS. A connectivity check from B to A (and its response) utilize thetransient natureusername AFRAG:BFRAG and a password ofportAPASS. All Binding Requests for the connectivity check usagein this range. However,MUST contain thepossibility of a problem does exist, and network deployers shouldPRIORITY attribute. This MUST beprepared for it. Note that this is not a problem specificset equal toICE; stray packets can arrive at a port at any time for any type of protocol, especially onesthe priority that would be assigned, based on thepublic Internet. As such,algorithm in Section 4.2, to a peer reflexive candidate learned from thisrequirement is just restatingcheck. Such ageneral design guideline for Internet applications - be prepared for unknown packets on any port. The operating candidate, if there is one, is placed intopeer reflexive candidate has a stream ID, component ID and local preference that are equal to them/c lines ofhost candidate from which theSDP. For RTP streams, thischeck isdone by placingbeing sent, but a type preference equal to theRTP address and port intovalue associated with peer reflexive candidates. The Binding Request by an agent MUST include thecUSERNAME andm linesMESSAGE-INTEGRITY attributes. That is, an agent MUST NOT wait to be challenged for short term credentials. Rather, it MUST provide them in theSDP respectively.Binding Request right away. 7.7.2. Processing the Response If theagent is utilizing RTCP, itSTUN transaction generates an unrecoverable failure response or times out, the agent sets the state of the check to Failed. The remainder of this section applies to processing of successful responses (any response from 200 to 299). The agent MUSTencode itscheck that the source IP address and portusingof thea=rtcp attribute as defined in RFC 3605 [1]. If RTCP is not in use,response equals theagent MUST signaldestination IP address and port thatusing b=RS:0the Binding Request was sent to, andb=RR:0 as defined in RFC 3556 [6]. If there is no operating candidate,that theagent MUST include an a=inactive attribute. The mediasource IP address and port of the request match the destination IP address and port that the Binding Response was received on. If these do not match, the agent sets the state of the check to Failed. The processing described in them/c-line is inconsequential, since it won't be used. Encodingremainder ofcandidates may involvethis section MUST NOT be performed. Otherwise, the source transportprotocol specific considerations. There are none for UDP. However, extensions that define usageaddress ofICE with otherthe response matched the destination transportprotocols SHOULD specify any special encoding considerations. Once an offer or answer are sent, anaddress of the request. The agentMUST be preparedchanges the state for this check toreceive both STUN and media packets on each candidate. As discussed in Section 7.13, media packetsSucceeded. Next, the agent sees if the success of this check canbe sentcause other checks to be unfrozen. If the check had acandidate prior to its promotion to operating. 7.4. Forming Candidate Pairs Oncecomponent ID of one, theoffer/answer exchange has completed, both agents will have a set of candidates for each media stream. Eachagentforms a set of candidate pairsMUST change the states foreachall other Frozen checks for the same media streamby combining each of its candidates with each of the candidates of its peer. Candidates can be paired up only if their transport protocols are identical. Each candidate has a number of components, each of which has a transport address. Within a candidate pair, the components themselves are paired up such that transport addresses with theand same foundation, but different componentID are combinedIDs, toform a transport address pair.Waiting. Ifone candidate has more components thantheother, those extra components will not be part of a transport address pair, won't be validated, and will effectively be treated as if they weren't included incomponent ID for thecandidate pair incheck was equal to thefirst place. For example, if an offer/answer exchange took place for a session comprisednumber ofan audio and a video stream, and each agent had two candidates percomponents for the media stream,there would be 8 candidate pairs, 4the agent MUST change the state foraudio and 4all other Frozen checks forvideo. For eachthe first component of different media streams but the8 candidate pairs, there would be two transportsame foundation, to Waiting. Next, the agent checks the mapped addresspairs - one for RTP, and one for RTCP. The relationship between a candidate, candidate pair, transport address,from the STUN response. If the transport addresspair and component are shown in Figure 7. This figure showsdoes not match any of therelationships as seen bylocal candidates that the agentthat ownsknows about, the mapped address representes a new peer reflexive candidate. Its type is equal to peer reflexive. Its base is set equal to the candidatewith candidate ID "L". This candidate has two components with transport addresses Afrom which the STUN check was sent. Its username fragment andB respectively. Thispassword are identical to the candidateis calledfrom which thenative candidate, since itcheck was sent. It is assigned theone owned by the agentpriority value that was placed inquestion. The candidate owned by its peer is calledtheremote candidate. AsPRIORITY attribute of thefigure shows, thererequest. Its foundation isa single candidate pair, and two componentsselected as described ineach candidate.Section 4.1. Thenative candidate has apeer reflexive candidateIDis then added to the list of"L", andlocal candidates known by the agent (though it is not paired with other remotecandidate has a candidate ID of "R". Sincecandidates at this time). In addition, thetwo component IDs are 1 and 2,agent creates a candidate"L" has two transport addresses with transport address IDs of "L:1" and "L:2" respectively. Similarly,pair whose local candidate"R" has two transport addresses with transportequals the mapped addressIDsof"R:1"the response, and"R:2" respectively. Note that thesewhose remote candidateIDs are not actually legal since they are not sufficiently random. However, we use "L" and "R" to keepequals thefigures readable. Furthermore, each transportdestination addresspair is associated with an ID,to which thetransport address pair ID.request was sent. ThisIDisequal tocalled a validated pair, since it has been validated by a STUN connectivity check. The agent will know, either from theconcatenation ofSDP or through thetransport address IDPRIORITY attribute that was present in a STUN request, the priorities of thenative transport address with the transport address IDlocal and remote candidates of theremote transport address, separated byvalidated pair. Based on these priorities, acolon. This means that the identifiers are seen differenlypriority foreach agent. Forthe validated pair itself is computed if it was not already known, using the algorithm in Section 5.6, and the pair is added to the valid list. 7.8. Server Procedures An agentthat ownsMUST be prepared to receive a Binding Request on the base of each candidate"L", there are two transport address pairs. One contains transportit included in its most recent offer or answer. Receipt of a Binding Request on an IP address"L:1"and"R:1", withport that the agent had included in atransport address pair ID of "L:1:R:1".candidate attribute is an indication that the connectivity check usage applies to the request. Theother contains transport address "L:2"agent MUST use a short term credential to authenticate the request and"R:2", withperform atransport address pair IDmessage integrity check. The agent MUST accept a credential if the username consists of"L:2:R:2". Fortwo values separated by a colon, where the first value is equal to the username fragment generated by the agentthat owns candidate "R",in an offer or answer for a session in- progress, and the password is equal to theidentifierspassword forthese two transport address pairs are reversed; it wouldthat username fragment. It is possible (and in fact very likely) that an offeror will receive a Binding Request prior to receiving the answer from its peer. However, the request can be"R:1:L:1" forprocessed without receiving this answer, and a response generated. For requests being received on a relayed candidate, thefirst onesource IP address and"R:2:L:2"port used for STUN processing (namely, generation of thesecond. ............................................... . . . . . ............. ............. . . . tid=L:1 . . tid=R:1 . . . . -- . . -- . . component component. . | A|------------------------| C| . . id=1 id=1 . . -- . Transport . -- . . . . . Address . . . . . . Pair . . . . . . id=L:1:R:1 . . . . . . . . . . . . . . . . . tid=L:2 . . tid=R:2 . . component . . -- . . -- . . id=2 . . | B|------------------------| D| component . . -- . Transport . -- . . id=2 . . . Address . . . . . . Pair . . . . . . id=L:2:R:2 . . . . . . . . . . ............. ............. . . Native Remote . . Candidate Candidate . . id=L id=R . . . . . ............................................... Candidate Pair Figure 7 If a candidate pair was created as a consequence of an offer generated by an agent, then that agentXOR-MAPPED-ADDRESS attribute) issaid to betheofferer of that candidate pairIP address andall of itsport as seen by the relay. That source transport addresspairs. Similarly, the other agent is said towill be present in theanswerer of that candidate pair and allREMOTE-ADDRESS attribute ofits transport address pairs. Asaconsequence, each agent hasSTUN Data Indication message, if the Binding Request was delivered through aparticular role, either offerer or answerer, for each transport address pair. This role is important; whenData Indication. If the Binding Request was not encapsulated in acandidate pairData Indication, that source address is equal tobe promoted to operating,theofferer iscurrent active destination for theoneSTUN relay session. When the agent receives a STUN Binding Request for whichperformsit generates a successful response, theupdated offer. 7.5. Orderingagent checks theCandidate Pairs Recall that when each candidate is encoded into SDP,source transport address of the request. If this transport address does not match any existing remote candidates, itcontainsrepresents aqvalue between 1 and 0, with 1 being the highest priority. Peernew peer reflexivecandidates, learned through the procedures described in Section 7.10 also haveremote candidate. This candidate is given a prioritybetween 0 and 1. For each media stream, the native candidates are ordered based on their qvalues, with higher q-values coming first. Amongst candidates with the same qvalue, they are ordered based on candidate ID, using reverse ASCII sort order. For example,equal to thecandidate with candidate ID "lagDx" sorts beforePRIORITY attribute from thecandidate with ID "bad79", and bothrequest. The type ofthose followthe candidatewith ID "m8zz". The usage of a reverse ASCII sort orderisimportant; as discussed in Section 13, it allows peer-derived candidatesequal tobe preferred over native ones. The result of these ordering rules will be an ordered list of candidates. The first candidate in this listpeer reflexive. Its foundation isgiven a sequence number of 1,set to an arbitrary value, different from thenext is given a sequence number of 2, and so on. This same procedure is donefoundation fortheall other remote candidates. Theresult is that each candidate pair has two sequence numbers, oneusername fragment for this candidate is equal to thenative candidate, and one forbottom half (the part after theremote candidate. First, allcolon) of thecandidate pairs for whom the smaller ofusername in thetwo sequence numbers equals 1 are taken first. Then, all of thoseBinding Request that was just received. The password forwhom the smaller of the two sequence numbers equals 2 arethis username fragment is takennext, and so on. Amongst those pairs that sharefrom thesame valueSDP from the peer. If agent has not yet received this SDP (a likely case fortheir smaller sequence number, they are ordered bythelarger of their two sequence numbers (smallest first). Amongst those pairs that shareofferer in thesame valueinitial offer/answer exchange), it MUST wait fortheir smaller sequence numberthe SDP to be received, and then proceed with rest of thesame value for their larger sequence number,processing described in thelargerremainder ofthe twothis section. This candidateIDs in each pair are selected, andis then added to thepairs are ordered in reverse ASCII orderlist of remote candidates. However, it is not paired with any local candidates. Next, thecandidate ID, largest first.agent MUST generate a triggered check in the reverse directon if it has not already sent such a check. Theresulting ordering oftriggered check has a local candidatepairs is calledequal to the candidatepair priority ordered list. As an example, consider two agents, Aon which the STUN request was received, andB. One offers two candidates foramedia stream withremote candidateIDs of "g9g9" and "8888", with q-values of 1.0 and 0.8 respectively.equal to the source transport address where the request came from (which may be a newly formed peer reflexive candidate). Theother answers with three candidates with candidate IDs of "h8h8", "6565"agent knows the priorities for the local and"klkl", with q-valuesremote candidates of0.3, 0.2this check, and0.1 respectively. The following table showsso can compute therank ordering ofpriority for thesix candidate pairs. The column labeled "Max SN"check itself. If there is already a check on thelarger of the two sequence numbers in the candidate pair,check list with this same local and"Min SN"remote candidates, and the state of that check is Waiting or Frozen, its state is changed to In- Progress and theminimum. The column labeled "Max Cand. ID"check is performed. If there was already a check on thevalue ofcheck list with this same local and remote candidates, and its state was In-Progress, thelargeragent SHOULD generate an immediate retransmit of thetwo candidate IDsBinding Request. This is to facilitate rapid completion of ICE when both agents are behind NAT. If there was a check in thecandidate pair. Order A A A B B B Max Cand. Cand. Cand. Cand. Cand. Cand. Max Min Cand. ID q-value SN ID q-value SN SN SN ID --------------------------------------------------------------------- 1 g9g9 1.0 1 h8h8 0.3 1 1 1 h8h8 2 8888 0.8 2 h8h8 0.3 1 2 1 h8h8 3 g9g9 1.0 1 6565 0.2 2 2 1 g9g9 4 g9g9 1.0 1 klkl 0.1 3 3 1 klkl 5 8888 0.8 2 6565 0.2 2 2 2 8888 6 8888 0.8 2 klkl 0.1 3 3 2 klkl The candidate pair priority orderedlist already and its state was Succeeded or Failed, nothing further isthen used to obtain an ordered list of transport address pairs,done. If there was no matching check onwhichtheagent will, in order, attempt to send STUN connectivity checks. Thischeck list,calledit is inserted into thetransport address paircheckordered list,list based on its priority, its state isvery similarset to In-Progress, and thecandidate pair priority ordered list, but differs in two important respects. Firstly,check is performed. 7.9. Security Considerations for Connectivity Check Security considerations for thecandidate pairs matchingconnectivity check are discussed in Section 15. 8. Completing theoperating candidateICE Checks When a pair(there can actually be more than one) get promotedis added to thetop ofvalid list, and thelist. This allowsagent was theoperating candidate pairofferor in the most recent offer/answer exchange, the agent MUST check tobesee if there is a pair on the validatedfirst. Secondly, manylist for each component of each media stream. If there is, thechecks would be redundant,offeror MUST stop timer Ta, anda filtering algorithm is used to eliminate these redundant checks. Ordering of candidates may involve transport protocol specific considerations. There are noneMUST cease retransmitting any Binding Requests forUDP. However, extensions that define usage of ICE with other transport protocols SHOULD specifytransactions in progress. It MUST ignore anyspecial ordering considerations. To formresponses which may subsequently arrive to transactions previously in progress. The offeror MUST generate an updated offer as described in Section 9. It does this regardless of whether thetransport address pair check ordered list,highest priority pairs in thecandidatecheck listis first modified by takingmatch the current in-use candidatepairs correspondingpairs. When a pair is aded to theoperating candidate pair,valid list, andpromoting them to the top of the list. A candidate pair matchestheoperating candidate pair when its native and remote transport address matchagent was thenative and remote transport addressesanswerer in them/c-line, respectively. In unusual circumstances, there may be more than one such candidate pair. In such a case, they should be promoted such thatmost recent offer/answer exchange, thehigher priorityagent MAY begin sending media using that candidatepairs appear first.pair, as described in Section 11.1. In addition,itif there ispossible that none of thea candidatepairs matchpair on theoperating candidate pair. In that case, no candidate pairs are promoted. Withinvalid list for eachcandidate pair there will be a set of transport address pairs, one for each component ID. Those pairs are ordered bycomponentID. The result is an absolute orderingofall transport address pairs for aeach media stream,sorted first bytheorder of their candidate pairs (withanswerer MUST stop timer Ta, and MUST cease retransmitting any Binding Requests for transactions in progress. It MUST ignore any responses which may subsequently arrive to transactions previously in progress. Note that only agent that was theexception ofanswerer in theoperating candidate), followed bymost recent offer/ answer exchange gets to send media right away. The offeror must wait for a subsequent offer/answer exchange if theordervalid candidates don't match those in the m/c-line. OPEN ISSUE: It is possible that higher priority checks may still succeed, if we allowed things to continue. This can happen for several reasons. First, an in-progress check oftheir component IDs.higher priority had some packet loss and thus hasn't completed. Timer Tws was meant to handle this (I removed this timer from -10 to simplify). More interestingly, higher priority checks may have not been done because a triggered check of lower priority succeeded. Thisordering is used ashappens in cases where thestartnumber ofthe transport address pair check ordering. The next stepchecks at each agent are assymetric. It is possible toremove redundant transport addresses. Starting atfix both of these problems by delaying thetopcompletion of thelist,ICE procedures for a bit more time. This adds complexity and latency. The basic algorithm would be this. You take theagent moves down from one transport addresslowest priority pairtoin the valid list. You keep doing checks as long as there are higher priority checks on thenext.list in the Waiting state. If there are none, you wait atransport address pair under consideration has the same remote transport address as a previous pair, based on transport address pair ID comparisons, and the native transport address from that previous pair has the same origination transport address as the one under consideration (based on IP addressbrief time (say 50ms) andport comparison), the one under consideration is removed from the list. The origination transport address is the address that thethen consider ICE finished. 9. Subsequent Offer/Answer Exchanges An agentwould send from in order to emit a packet with that native transport address as a source transport address. ForMAY generate alocal transport address,subsequent offer at any time. However, theorigination transport address is equal to that local transport address. For a server reflexive transport address,rules in Section 7.7.2 will cause theorigination transport address is equalofferer to generate an updated offer when thelocal transport address from which it was derived. For relayed addresses, packets are emitted by explicitly sending them through the relay. Consequently, the origination transport address is equal tocandidates in therelayed address. Aftervalid list are not all in-use. 9.1. Generating the Offer When an agenthas gone throughgenerates an updated offer, theentire list,set of candidate attributes to include depend on theresultstate of ICE processing. If ICE is "done", which occurs when thetransport addressvalid list includes a candidate paircheck ordered list. The pairs that get removed are redundant sincefor each component of each media stream, the agentwould send a STUN connectivity check using the same source and destination addresses asMUST include aprevious check. Consequently, the connectivity check will provide no information to the remote agent exceptcandidate attribute for each local candidate amongst thetransport address pair ID its associated with. These turn outpairs in the valid list (including peer reflexive candidates), and SHOULD NOT include any others. This will cause STUN keepalives to beunnecesary due tosent for theSTUN processing rules outlined below. 7.6. Performingin-use candidates, and thats it. If, however, theConnectivity Checks Connectivity checks are a STUN usage defined in [12]. They are performed by sending peer-to-peer STUN Binding Requests. These checks result invalid list does not yet include atransport addresscandidate pairprogressing through a state machine that captures the progress of the connectivity checks. The specific state machine and the proceduresfor each component of each media stream, theconnectivity checks are specific toagent SHOULD include all current candidates, including any peer reflexive candidates it has learned since thetransport protocol.last offer or answer it sent. Thisspecification defines rules for UDP. The state machine processing describedMAY include candidates it did not offer previously, but which it has gathered since the last offer/answer exchange. If a candidate was sent inthis section MUSTa previous offer/answer exchange, it SHOULD have the same priority. For a peer reflexive candidate, the priority SHOULD befollowedthe same as determined byagents. Extensions to ICE that describe other transport protocolsthe processing in Section 7.7.2. The foundation SHOULDdescribebe thestate machinesame. The username fragments andthe procedurespasswords forconnectivity checks. The seta media stream SHOULD remain the same as the previous offer or answer. Population ofstatesthe m/c-lines also depends on the state of ICE processing. If, for atransport address pair visited byparticular media stream, theofferer and answerer are depicted graphically in Figure 9. Note that this state machine existsvalid list has candidate pairs for alltransport address pairs, including ones pruned fromof thetransport address pair check ordered list. | |Start | | V +------------+ +-----------------| | | | | | +----| Waiting |----------------+ | | | | | | | | | | | Miss | +------------+ | | ---- | | | Match Res| - | | Selected | Match Req ---------| | | --------. | ------- - | | | Send Req Match Req | Send Req | | V --------- | | Match Res | +------------+ Re-Xmit | | --------- | | | Req | | - | | | | | +------c----| Testing |-----------+ | | | | | | | | | | | | | | | | | | +------------+ | | | | | | | | | | | | Error or | | | | | | Miss | | Timer Tr | | | | ----- | | -------- V V | V - V V Send Req +------------+ | +------------+ +------------+ +-----| | +--->| | | | | | Recv- | | | | Send- | | | Valid |------->| Invalid |<-------| Valid | | | | | | | | +---->| | Error, | | Error, | | +------------+ Miss +------------+ Miss +------------+ | ----- ^ ----- | | - | Error, - | | | Miss | | | ----- | | | - | | +------------+ | | | | | | | | | +-------------->| Valid |<-------------+ Match Req | | Match Res --------- | | --------- - +------------+ - | ^ | | | | +-------+ Timer Tr -------- Send Req Figure 9 The state machine has six states - Waiting, Testing, Recv-Valid, Send-Valid, Valid and Invalid.components of that media stream, those pairs are used. In particular, theWaiting state,m/c-line would be constructed by from theagent is waiting to send or receive a connectivity check for the pair. In the Testing state, the agent has sent a connectivity check and is awaiting a response. In the Recv-Valid state, the agent knows that its peer can receive packetslocal candidate fromit on this transport address pair.each of those candidate pairs. Inthe Send-Valid state,addition, the agentknows that its peer can send packets to it. In the Valid state,MUST include theagent knowsa=remote-candidates attribute for thatits peer can both sendmedia stream, andreceive packets from it. Initially, all transport address pairs startinclude in it theWaiting state. In this state, the agent waits for one of three events - a chance to send a Binding Request, receipt of a Binding Request, or receipt of a Binding Response. Since there is an instance of the state machineremote candidates for eachtransport address pair, Binding Requests and responses need to be matched toof thespecific state machine for which theypairs that weremeant to apply. As described below,used. If, for a particular media stream, theBinding Request mayvalid list does notbe a matchhave pairs for all of thetransport address pair it was meant to validate. To find the transport address pair it was meant to validate, calledcomponents of thetarget transport address pair,stream, the agentexamininesSHOULD populate theUSERNAME ofm/c-line for that media stream based on theincoming Binding Request.considerations in Section 4.3. TheUSERNAME directly containsagent MUST use thetransport address pair IDsame ice-pwd and ice-ufrag forthe paira media stream as its previous offer or answer. Note that itwas meantis permissible tovalidate. Binding Responses are matcheduse a session-level attribute in one offer, but totheir requests usingprovide theSTUN transaction ID,same password as a media-level attribute in a subsequent offer. This is not a change in password, just a change in its representation. 9.2. Receiving the Offer andthen mappedGenerating an Answer When the answerer generates its answer, it must decide what candidates to include in the answer, and how to populate thetransport address pair from that.m/c- line. For each mediastream, the agent starts a new connectivity check for a transport address pair every Tb*RND seconds. Tb SHOULD scale linearly withstream in thenumber of media streams, so thatoffer, thepace of connectivityagent checksoverall is invariantto see if thenumber of media streams. Consequently,stream contained the remote-candidates attribute. If itis RECOMMENDEDdid, it means thatTb have a default value of N*50ms, where N isthenumber ofofferer believed that ICE processing has completed for that mediastreams. RNDstream. In this case, the remote-candidates attribute contains the candidates that the answerer isa random number chosen uniformly between 0.7 and 1.3, and it helpssupposed toavoid synchronization betweenuse. It is possible that thetransmissionagent doesn't even know ofconnectivity checks for different media streams. On average, if there are N media streams, the checks across all media streamsthese candidates yet; they will bepaced out atdiscovered shortly through atotal of N/Tb checks per second.response to an in-progress check. Thecheck is started foragent MUST populate thefirst transport address pair inm/c-line with thetransport address pair check ordered list that iscandidates from the a=remote-candidates attribute. In addition, it MUST include an a=candidate attribute in its answer for each candidate in theWaiting state. The "Selected" eventa=remote-candidates attribute. If the agent ispassed tonot aware of thestate machine for this transport address pair, causingcandidate yet, it will need tobe moved to the Testing state. The agent then sends a connectivity check usinggenerate aSTUN Binding Request, as outlinedpriority value for it. The type preference inSection 7.7. Once a STUN connectivity check begins, the processing ofthecheck followscomputation is peer-reflexive, and therules for STUN. Specifically, retransmits of STUN requests are done as specified in [12],stream ID andfurthermore,component ID are known from the offer. The agent chooses an arbitrary local preference value if it is multi-homed, since it won't yet know the interface associated with this candidate. If atransaction fails and needs to be retried,media stream does not yet contain the a=remote-candidates attribute, it means thatretry can happen rapidly, as described below. It doesn't "count" againsttheaverage rate limit of 1/Tbofferer believes that ICE checksper second perare still in progress for that media stream. Inaddition,this case, thekeepalives that are generatedanswerer SHOULD include an a=candidate attribute fora valid pair do not count against the rate limit either. The rate limit applies strictly to the startall ofconnectivity checksthe candidates fora transport address pairthathas been newly signaled through an offer/answer exchange. When an agent receives a Binding Request, which permedia stream it knows about (including peer-reflexive candidates). The m/c-line is populated based on theprocessing rules ofconsiderations in Section7.8 produces a succesful response, the agent examines the source transport address4.3. Construction of therequest. Ifice-pwd and ice-ufrag are identical to thenative transport address was relayed, this would beprocedures followed by thesourceofferer, asseen bydescribed in Section 9.1. Note that therelay. Fora=remote-candidates attribute SHOULD NOT be included in theSTUN relay usage, that source transport addressanswer, and if included, will just bepresent inignored by theREMOTE-ADDRESS attribute of a STUN Data Indication message, if the Binding Request was delivered through a Data Indication. If the Binding Request wasofferer, since it is notencapsulatedused ina Data Indication, that source transport address is equal to the current active destination for the STUN relay session. If the source transport address matches the remote transport addressany processing of thetarget transport address pair,answer. 9.3. Updating theBinding Request is considered to be a match forCheck and Valid Lists Once thetarget transport address pair. Consequently, a Match Req event is passedsubsequent offer/answer exchange has completed, each agent needs to compute thestate machine fornew check list resulting from this exchange, and then remove any pairs from thetarget transport address pair.valid list which are no longer usable. Once these adjustments are made, ICE processing continues using these new lists. Each agent recomputes the check list using the procedures described in Section 5.6. If a check on this new check list was also on the previous check list, and its statemachinewasin the WaitingWaiting, In-Progress, Succeeded orTesting state, theFailed, its statemachine moves into the Send-Valid state.is copied over. Ifit was previously in the Waiting state,a check on theagent sendsnew check list does not have aconnectivitystate (because its a new checkofor itsown for the target transport address pair, as outlined in Section 7.7. If itstate wasin the Testing state,not copied over), and itretransmits a Binding Request for the transaction in progress. This retransmissionisone that would not normally occur based onfor theprocedures in [12]. ICE "prods"component with component ID 1 and for theSTUN transactionmedia stream with stream ID 9, its statemachine to send an extra retransmit, in addition to the one whichisscheduledset tobe sent next. This helps speed up bidirectional connectivity verification when one agent is behindWaiting. All other pairs without aNAT with an address and port dependent filtering behavior [32]. If the source transport addresses instate have their state set to Frozen. Next, theBinding Request was not a match foragent goes through theremote transport address,check list, starting with theBinding Request is considered to behighest priority check. If amiss for the target transport address pair. Consequently,check has aMiss event is passed to thestatemachineofthe target transport address pair,Succeeded, and itimmediately moves into the Invalid state. Typically, the source transport address won't match when there washas aNAT betweencomponent ID of 1, then all Frozen checks for thesender and receiver with an addresssame media stream andport dependent mapping property, though theresame foundation whose component IDs areother cases in which this can happen. Though it was a miss for the target transport address pair, the connectivity check maynot one, havebeen a matchtheir state set to Waiting. If, for adifferent transport address pair. To determine this, the agentparticular media stream, there are checksthe source transport addressfor each component of that media stream in theBinding Request against all ofSucceeded state, theother remote transport addressesagent moves the state oftransport address pairsall Frozen checks for thesamefirst component of all other mediastream that usestreams with the sametransport protocol and sharefoundation to Waiting. If a check was on thesame native transport address (basedold check list, but was not ontransport address ID comparison)the new check list, and had a state of In-Progress, thetarget. Of those that match (assuming at least one matches), it refines the setcorresponding STUN transaction is abandoned. No furtherby selecting only thoseretransmits will be sent forwhomtheorigination transport address ofSTUN request, and any response that might be received is ignored. Next, theremote transport address matchesagent prunes theorigination transport address ofvalid list. For each pair on theremote transport addressvalid list, the agent examines each candidate in thetarget transport addresspair.The origination transport address for a remote transport address is obtained from information signaled inIf theSDP,candidate was not peer reflexive, anddepends on the type. For a local transport address,was not present in theorigination address equals that local transport address. For a server reflexive transport address,most recent offer/answer exchange, theorigination addresscandidate pair isobtainedremoved from therelated address information provided in the SDP. For a relayed transport address, the origination transport address qualsvalid list. OPEN ISSUE: This means thatrelayed transport address. For these three types, the type is signaled in the SDP. Foryou cannot forcefully remove a peerderived transport address, the origination address is the same as the origination address of the generating transport address. If therereflexive candidate. This feature wasa match (there can only be either one or zero matches), this match is called the alternate. In many cases, the alternate transport address pair will not bepossible, at much complexity, inthe transport address pair check ordered list; it will have been oneprevious versions of theones pruned. Indeed, thisspec. An alternative iswhy it was pruned - a check on the remaining transport address pairs can servetovalidate it. The state machine for the alternate is passed the Match Req event. Ifremove a peer reflexive candidate if it was not present in theWaiting state, this causes it to move into the Send-Valid state,offer/answer, andawas discovered more than 500ms ago. 10. Keepalives STUN connectivitycheckchecks are also used to keep NAT bindings open once a session isgenerated for the alternate transport address pair. It may have been inunderway. This is accomplished by periodically re- starting theTesting state,check process, as described inwhich case it moves move intothis section. Once theSend-Valid state, andinitial offer/answer exchange has taken place, the agentrestransmits the Binding Request for the transaction in progress. If it was the in the Recv-Valid state, this causes itsets a timer tomove into the Valid state. If no alternate couldfire in Tr seconds. Tr SHOULD befound, it means that a new remote transport addressconfigurable andcorresponding origination transport addressSHOULD havebeen discovered. In this case,a default of 15 seconds. When Tr fires, the agentfollowsMUST reset theprocedures of Section 7.10.1 to create a new transport address pair and state machinestates forit. If the Binding Request didn't generate a success response, an Error event is passed to the state machineall of thetarget, causing it to move into the Invalid state. If the agent receives a successful response to its STUN request, it agent examines the transport addresschecks in theXOR-MAPPED-ADDRESS attribute of the response. This will be a peer reflexive transport address. Ifcheck list using thepeer reflexive transport address matches (based on IP addressprocedures defined in Section 5.6 andport comparison) the native transport address ofthen begin performing periodic checks as described in Section 5.7. By thetarget transport address pair, a Match Res event is passed totime thestate machine oftimer fires for thetarget. Iffirst time, thestate machine was incheck list will include only theTesting state, the state machine moves into the Recv-Valid state. If it was in the Send-Valid state, it moves into the Valid state. If, however, the transport addresses didn't match,in-use candidates. Reperforming these checks will therefore performing aMiss event is passed to the state machineperiod keepalive. OPEN ISSUE: ICE isn't saying anything about what happens if these periodic keepalives should fail. It they do, something really bad has happened, like a NAT reboot or failure. I think we should keep that out ofthe target, and it immediately moves into the Invalid state. Thescope. When an ICE agentchecks the peer reflexive transport address against all of the other native transport addressesis communicating with an agent that is not ICE- aware, keepalives still need to be utilized. Indeed, these keepalives are essential even if neither endpoint implements ICE. As such, this specification defines keepalive behavior generally, fortransport address pairsendpoints that support ICE, and those that do not. All endpoints MUST send keepalives for each media session. These keepalives MUST be sent regardless of whether thesamemedia streamwith the same transport protocolis currently inactive, sendonly, recvonly or sendrecv. The keepalive SHOULD be sent using a format which is supported by its peer. ICE endpoints allow for STUN-based keepalives for UDP streams, andthe same remote transport address (based on comparison of transport address ID)asthe target. Of thosesuch, STUN keepalives MUST be used when an agent is communicating with a peer thatmatch (assuming at least one matches), it refines the set furthersupports ICE. An agent can determine that its peer supports ICE byselecting only those for whomtheorigination transport addresspresence of thenative transport address matches the origination address ofa=candidate attributes for each media session. If thenative transport address in the target transport address pair. The resulting transport address pair (there can be only zero or one) is called the alternate. In many cases, the alternate transport address pair willpeer does notbe insupport ICE, thetransport address pair check ordered list; it will have been onechoice ofthe ones pruned. The state machinea packet format forthe alternatekeepalives ispassed the Match Res event. If it was in the Waiting state, this causes ita matter of local implementation. A format which allows packets tomove into the Recv-Valid state. It may have beeneasily be sent in theTesting state, inabsence of actual media content is RECOMMENDED. Examples of formats whichcase it moves move into the Recv- Valid state. If it was the in the Send-Valid state,readily meet thiscauses it to move into the Valid state.goal are RTP No-Op [27] and RTP comfort noise [23]. Ifno alternate could be found,theBinding Response will create a newpeerreflexive transport address, and the procedures of Section 7.10.2doesn't support any formats that arefollowed to create a new transport address pair and state machineparticularly well suited forit. In any state, if the STUN transaction results in an error, the state machine moves into the Invalid state. A STUN transaction produceskeepalives, an"error" based on the processing in Section 7.7, which indicates which STUN response codes constituteagent SHOULD send RTP packets with an incorrect version number, or some other form of error which would cause them to be discarded by the peer. STUN-based keepalives will be sent periodically every Tr seconds asfar as ICE processing is concerned.described above. Ifa transport address pair is in the Recv-Valid or Valid state,STUN keepalives are not in use (because the peer does not support ICE), an agentMUST generateSHOULD ensure that anew STUN Binding Request transactionmedia packet is sent every Tr seconds.This transaction ensures that NAT bindings for the transport address pair remain open while the candidate is under consideration. The transactionIf one isperformednot sent asoutlined in Section 7.7. These transactions can also be used to keep the NAT bindings alive whena consequence of normal media communications, a keepalive packet using one of thecandidate is promoted to operating, as described in Section 7.12. Trformats discussed above SHOULD beconfigurable, and SHOULD defaultsent. 11. Media Handling 11.1. Sending Media Agents always send media using a candidate pair. An agent will send media to15 seconds. These STUN transactions are processed inthesame way as any other, and can resultremote candidate innew peer derived transport addresses, or can fail and causethetransport addresspairto be invalidated. The candidate pair itself has a state, which is derived from(setting thestates of its transportdestination addresspairs. If at least oneand port of thetransport address pairs in a candidate pair is in the invalid state,packet equal to that remote candidate), and will send it from thestate oflocal candidate. When the local candidatepairisconsidered to be invalid. Ifserver or peer reflexive, media is originated from the base. Media sent from a relayed candidatepair enters this state,is sent through that relay, using procedures defined in [12]. If an agentmoves the state machines for all ofwas theother transport address pairsofferer inthis candidate pair intotheinvalid state as well. This will ensure that connectivity checks never start for those transport address pairs. Furthermore, if checks are alreadymost recent offer/answer exchange, when it sends media, it MUST use the candidates inprogressthe m/c-line forone ofeach media stream. However, it MUST only send media once thosetransport address pairs,candidates also appear in theagent ceases them.valid list. Ifall ofthetransport address pairs making upcandidates in thecandidate pairm/c-line areValid, the candidate pair is considered valid. If all of the transport address pairs making upnot thecandidate pairones that areeither Valid or Recv-Valid, and at least one is Recv-Valid,ultimately selected by ICE, this implies that thecandidate pair is consideredofferer will need tobe Recv-Valid.wait for the subsequent offer/ answer exchange to complete before it can send media. Ifall ofan agent was thetransport address pairs making upanswerer in thecandidate pairmost recent offer/answer exchange, the rules areeither Valid or Send-Valid, and at least one is Send-Valid,different. When thecandidate pair is consideredagent wishes tobe Send- Valid. If all ofsend media, and thetransport address pairs in acandidatepair arepairs in theWaiting state,m/c-lines are also thecandidate pair ishighest priority ones in thewaiting state. If all ofvalid list for each media stream, it uses those candidate pairs. If, however, thetransport addresshighest priority pairs in thecandidate pairvalid list for a media stream areeither innot theWaiting or Testing states, and at least one issame as the ones in theTesting state,m/c-lines, thestate ofagent MUST use thecandidate pair is Testing. Otherwise,highest priority pairs in thestate ofvalid list. However, the agent MUST discontinue using those candidatepair is considered Indeterminate. A candidate itself also has a state. Ifpairs Tlo seconds after the next opportunity its peer would have to send an updated offer. In the case of an answer delivered in acandidate is present200 OK to an offer inat least one valid candidate pair,a SIP INVITE (regardless of whether thatcandidate is said tosame answer appeared in an earlier unreliable provisional response), this would bevalid. If allTlo seconds after receipt of thecandidate pairs containing that candidate are invalid, the candidate itself is invalid. Otherwise, the candidate's state is Indeterminate. 7.7. Sending a Binding Request for Connectivity Checks An agent performs a connectivity check on a transport address pair by sending a STUN Binding Request from its native transport address,ACK. Tlo SHOULD be configurable andsending it toSHOULD have a default of 5 seconds. This time represents theremote transport address. Sending from its native transport address is done by sendingamount of time itfromshould take the offerer to perform itsorigination transport address. As mentioned above,connectivity checks, arrive at theorigination transport address depends onsame conclusion about thetype of transport protocolcandidate pair, and then generate an updated offer. If, after Tlo seconds, no updated offer arrives, thetype of transport address (local, reflexive, or relayed). This specification defines the meaning for UDP. Specifications defining other transport protocols must define what this means for them. For UDP-based local transport addresses,answerer MUST cease sendingfrommedia, and will need to wait for thelocal transport address hasupdated offer. OPEN ISSUE: In previous versions of ICE, once this timer fired, you just sent media to themeaningonewould expect -in therequest is sent such thatm/c-line. This causes thesource IP addressmedia streams to flip back andport equal that of the local transport address. For reflexive transportforth between addresses,it is sent by sending from the associated local transport address usedwhich I am trying toderive that reflexive address. For relayed transport addresses, it is sent byavoid. Since this timer should never go off anyway, I removed this feature. ICE has interactions with jitter buffer adaptation mechanisms. An RTP stream can begin usingSTUN mechanismsone candidate, and switch tosend the request through the STUN relay (using the Send request). Sending the requestanother one, though this happens rarely with ICE. The newer candidate may result in RTP packets taking a different path through theSTUN relay server necesarily requires that the request be sent from the client, using the local transport address usednetwork - one with different delay characteristics. As discussed below, agents are encouraged toderive the relayed transportre-adjust jitter buffers when there are changes in source or destination address.The Binding Request sent by the agent MUST containFurthermore, many audio codecs use theUSERNAME attribute. This attribute MUST be setmarker bit to signal thetransport address pair IDbeginning ofthe corresponding transport address pair as seen by its peer. Thus,a talkspurt, for thefirst transport address pair in Figure 7, ifpurposes of jitter buffer adaptation. For such codecs, it is RECOMMENDED that the sender change the marker bit when an agent switches transmission of media from one candidate pair to another. 11.2. Receiving Media ICE implementations MUST be prepared to receive media on any candidates provided in theleft sends the STUN Binding Request, the USERNAME will have the value R:1:L:1. If themost recent offer/answer exchange. In order to avoid attacks described in Section 15, when an agenton the right sends the STUN Binding Request, the USERNAME will have the value L:1:R:1. To be clear, the USERNAME that is used is NOT the one seen locally, but rather the one as seen byreceives a media packet, and it knows itspeer. The request SHOULD containpeer supports ICE, it MUST verify that it has received a check (for which a successful response was generated) on theMESSAGE- INTEGRITY attribute, computed according to [12]. The key usedsame 5-tuple asinput to the HMAC isthepassword provided byreceived media packet (that is, thepeer for this remote transport address. This password will be identical for all remotesource and destination transport addressesforof thesamemediastream. Note that all ICE implementations are required to be compliant to [12], as opposed to the older [14]. Consequently, all connectivity checks will contain the magic cookie in the STUN header, and causepacket match those of theSTUN server embedded in each ICE implementation to include XOR- MAPPED-ADDRESS attributes incheck). If no such check has succeeded, theresponse, rather than MAPPED- ADDRESS. Once created,agent MUST silently discard theSTUN transactionmedia packet. It islinked to the transport address pair soRECOMMENDED that, whenthe response is received, the state machine on the linked transport address pair can be updated. The STUN transaction will generate eitheran agent receives an RTP packet with atimeout,new source or destination IP address for aresponse. If the response is a 420, 500, or 401,particular media stream, that the agentshould try again as describedre-adjust its jitter buffers. RFC 3550 [20] describes an algorithm in[12] (as mentioned above, it need not wait the roughly Tb seconds to try again). Either initially, or after such a retry, the STUN transaction might produce a non-recoverable failure response or a failure result inapplicable to this usage of STUNSection 8.2 for detecting SSRC collisions andthus unrecoverable. If this happens, an error event is generated into the state machine,loops. These algorithms are based, in part, on seeing different source IP addresses and ports with thetransport address pair enters the invalid state. If the STUN transaction times out,same SSRC. However, when ICE is used, such changes will sometimes occur as theclient SHOULD NOT retry. The only reasonmedia streams switch between candidates. An agent will be able to determine that aretry might succeedmedia stream isif there was severe packet loss duringfrom thedurationsame peer as a consequence of thecheck, or the answer was significantly delayed, also due to packet loss. However,STUNBinding Request transactions run for 9.5 seconds, whichexchange that proceeds media transmission. Thus, if there iswell beyond the typical tolerance forasession establishment. The retrieschange in source IP address and port, but the media packets come from the same peer agent, this SHOULD NOT be treated as an SSRC collision. 12. Usage with SIP 12.1. Latency Guidelines ICE requires apenaltyseries ofadditional traffic, which can be used to launch DoS attacks (see Section 13.4.2). The only reasonSTUN-based connectivity checks tonot followtake place between endpoints. These checks start from theSHOULD NOT is ifanswerer on generation of its answer, and start from theagent has adjustedofferer when it receives theSTUN transaction timersanswer. These checks can take time tobe more aggressive. If the Binding Response is a 200,complete, and as such, theagent SHOULD check forselection of messages to use with offers and answers can effect perceived user latency. Two latency figures are of particular interest. These are theMESSAGE-INTEGRITY attributepost-pickup delay andverify it, as discussed in [12]. Indeed, this check SHOULD be done for all responses. This will result intheresponse being discarded (eventually leadingpost-dial delay. The post-pickup delay refers to the time between when atimeout), ifuser "answers theintegrity check fails. 7.8. Receivingphone" and when any speech they utter can be delivered to the caller. The post-dial delay refers to the time between when aBinding Requestuser enters the destination address forConnectivity Checks Asthe user, and ringback begins as aresultconsequence ofproviding a listhaving succesfully started ringing the phone of the called party. To reduce post-dial delays, it is RECOMMENDED that the caller begin gathering candidatesinprior to actually sending itsoffer or answer, an agent will receive STUN Binding Request messages. An agent MUSTinitial INVITE. This can beprepared to receive STUN Binding Requests on each local transport address from the moment it sends an offer or answerstarted upon user interface cues thatcontainsacandidate with that local transport address. Similarly, it MUST be prepared to receive STUN Binding Requestscall is pending, such as activity on alocal transport addresskeypad or themoment it sendsphone going offhook. If an offeroris received in an INVITE request, the callee SHOULD immediately gather its candidates and then generate an answerthat containsin aderived candidate derived from that local transport address. It can cease listening for STUN messages on that local transport address after sending an updated offer or answer which does not include any candidates with transport addresses thatprovisional response. When reliable provisional responses areequal to or derived from that local transport address. As discussed in [12], sincenot used, theusername and password for STUN requests are exchanged through another mechanism - here, ICE -SDP in theShared Secret Request mechanismprovisional response isnot neededthe answer, andneed not be implemented by agentsthatprovideexact same answer reappears in theconnectivity check usage. One200 OK. To deal with possible losses of thecandidates mayprovisional response, it SHOULD bein use as the operating candidate, or may become promoted to the operating candidate in the next offer/ answer exchange as a consequenceretransmitted until some indication ofa successful validation. Inreceipt. This indication can eithercase, both media and STUN packets will be sent to the transport addresses comprising that candidate, causing both to receive on their associated local transport addresses. The agent MUSTbeable to disambiguate them. This is done trivially by looking for the STUN magic cookie asthrough PRACK [9], or through thevaluereceipt ofthe second 32-bit word in the packet. If present, it identifiesa successful STUNpacket. Processing of theBindingRequest proceeds in two steps. The first is generation of the response, and the secondRequest. Even if PRACK isICE-specific processing. Generation ofnot used, the provisional responsefollows the general procedures of [12], and is independent ofSHOULD be retransmitted using thestate machineryexponential backoff described inSection 7.6. The USERNAME is considered valid if one of[9]. Furthermore, once thecandidate IDs sent in an offer oransweris a prefix of the USERNAME (this will always behas been sent, thecase, even for peer reflexive candidates), andagent SHOULD begin its connectivity checks. Once candidate pairs fortheeach componentindicated in the USERNAME,of a media stream enter theassociated local transport address matchesvalid list, thelocal transport addresscallee can begin sending media onwhich the request was received. The password associated withthatcandidate ID, which was provided by the agentmedia stream. However, prior toits peer, is usedthis point, any media that needs toverify the MESSAGE-INTEGRITY attribute, if one was present in the request. If the USERNAME is not valid, the agent generates a 430. Otherwise, the success response will includebe sent towards theXOR-MAPPED- ADDRESS attribute, which is used for learning new candidates,caller (such asdescribed in Section 7.10. The XOR-MAPPED-ADDRESS attribute is constructed using the source IP address and port of the Binding Request.SIP early media [25] cannot be transmitted. ForBinding Requests received over relayed transport addresses,thisMUST bereason, implementations SHOULD delay alerting thesource IP address and portcalled party until candidates for each component of each media stream have entered theBinding Request when it arrived at the relay, prior to forwarding towards the agent. That source transport address will be present invalid list. In theREMOTE- ADDRESS attributecase of aSTUN Data Indication message, if the Binding Request was delivered through a Data Indication. If the Binding Request was not encapsulated in a Data Indication, that source address is equal to the current active destination for the STUN relay session. The ICE processing involves changes to the state machine for a transport address pair. This processing cannot be done until the initial offer/answer exchange has completed. As a consequence, if the offerer received a Binding Request that generated a success response, but had not yet received the answer to its offer, it waits for the answer, and when it arrives, then performs the ICE processing. The agent takes the entire contents of the USERNAME, and compares them against the transport address pair identifiers as seen by that agent for each transport address pair. If there is no match, nothing is done - this should never happen for compliant implementations. If there is a match, the resulting transport address pair is called the matching transport address pair. The state machine for the matching transport address pair is then updated based on the receipt of a STUN Binding Request, and the resulting actions described in Section 7.6 are undertaken. An agent will continue to receive periodic STUN connectivity checks on a local transport address as long as it had listed that transport address, or one derived from it, in an a=candidate attribute in its most recent offer or answer and the transport address is for UDP. Whether STUN keepalives are used for other transport protocols is defined by the specifications for that transport protocol. The agent processes any such transactions according to this section. It is possible that a transport address pair that was previously valid may become invalidated as a result of a subsequent failed STUN transaction. 7.9. Promoting a Candidate to Operating As a consequence of the connectivity checks, each agent will change the states for each transport address pair, and consequently, for the candidate pairs. When a candidate pair enters the valid state, and the agent is in the role of offerer for that candidate pair, the agent follows the logic in this section. The rules only apply to the offerer of a candidate pair in order to eliminate the possibility of both agents simultaneously offering an update to promote a candidate to operating. The agent locates the candidate pair in the candidate pair priority ordered list. If it is the highest priority candidate pair, the agent SHOULD send an updated offer immediately as described in Section 7.11.1. If it is not the highest priority candidate pair, and the states of all lower priority candidate pairs are Invalid, the agent SHOULD send an updated offer immediately. If it is not the highest priority candidate pair, and the state of at least one of the lower priority candidate pairs is Indeterminate, the agent does nothing. Tests have yet to begin for higher priority candidate pairs. If it is not the highest priority candidate pair, and none of the lower priority candidate pairs have a state of Indeterminate, the agents starts a timer, called the wait-state timer, but only if this timer is not already running. The timer is set to fire in Tws seconds. Tws SHOULD be configurable, and SHOULD have a default of Tws = max(0, 200ms - N*Tb), where N is the number of components for the candidates for this media stream. The 200ms allows for a single STUN retransmission (which takes 100ms) and an RTT of 100ms. This timer allows for a higher priority connectivity check to complete, in the event its STUN Binding Request was lost or delayed in the network. Note that the timer goes to zero as the number of components increases. If, prior to the wait-state timer firing, another connectivity check completes and a candidate pair is validated, there is no need to reset or cancel the timer. Once the timer fires, the agent SHOULD issue an updated offer as described in Section 7.11.1. This updated offer will use the highest priority candidate pair in Valid state when the timer fires. 7.10. Learning New Candidates from Connectivity Checks ICE makes use of reflexive addresses, which are addresses that inform an agent of its transport address as seen by another host. An initial offer or answer generated by an agent includes server reflexive addresses, which are learned from a configured or discovered STUN server in the network. However, the connectivity checks themselves can inform an agent of reflexive addresses, and in particular, ones that are reflexive towards its peer. These are called peer reflexive candidates. A new peer reflexive candidate is typically observed when two agents are separated by a NAT with the address-dependent or address and port dependent mapping properties [32]. However, in unusual topologies, peer reflexive candidates can be observed even when there are only NATs with the endpoint independent mapping property. Because STUN and the media packets are sent on the same port, regardless of the filtering properties of the NAT (whether endpoint independent, address dependent, or address and port dependent), this reflexive address can be used by the peer for sending STUN and media packets back towards the agent. To obtain and use these peer reflexive transport addresses, ICE agents MUST perform the additional processing on the receipt of STUN Binding Requests and responses described in the following two subsections. These procedures are not just applied in the (hopefully increasingly rare) case of address and port dependent mapping NATs. They are also needed for behave-compliant NATs [32]. 7.10.1. On Receipt of a Binding Request The procedures in this section are followed when an agent receives a STUN Binding Request matched to a target transport address pair whose source transport address (where the source is the one seen by the relay for requests received on a relayed transport address) doesn't match any of the existing remote transport addresses, or where the source matches, but the origination transport address does not. This source address and its associated origination transport address become a new remote transport address. To use it, that source transport address needs to be associated with a candidate (called a peer-derived candidate). In this case, however, the candidate isn't signaled through an offer/answer exchange; it is constructed dynamically from information in the STUN request. Like all other candidates, the peer-derived candidate has a candidate ID. The candidate ID is derived from the candidate IDs of the target candidate pair. In particular, the candidate ID is constructed by concatenating the remote candidate ID with the native candidate ID (without the colon). The password for the new candidate equals that of the remote candidate ID in the target candidate pair (note that, this password would be the same for all remote candidates for the same media line). When the STUN Binding Request is received, the agent constructs the candidate ID for the peer reflexive candidate, and checks to see if that candidate exists. It may already exist if it had been constructed as a consequence of a previous application of this logic on receipt of a Binding Request from a different remote transport address of the same new peer reflexive candidate. If there is not yet a peer reflexive candidate with that candidate ID, the agent creates it, and assigns it the newly computed candidate ID. The priority of the peer-derived candidate is set to the priority of its generating candidate. The generating candidate is the one that the new peer derived candidate comes from - the remote candidate in the target candidate. Note that, at this time, the peer derived candidate has no transport addresses in it. The remote candidate is then paired up with a native candidate. However, unlike the procedures of Section 7.5, which pair up each remote candidate with each native candidate, this peer reflexive candidate is only paired up with a the native candidate from the candidate pair from which it was derived. This creates a new candidate pair. This new candidate pair is inserted into the candidate pair priority ordered list based on the ordering rules defined in Section 7.5. Note that no entries are added to the transport address pair check ordered list. Recall that, for each candidate pair, one agent plays the role of offerer, and the other of answerer. For a peer-reflexive candidate, the role is identical to that of its generating candidate. Newly created or not, the agent extracts the component ID from the matching transport address pair, and sees if a transport address with that same component ID exists in the peer reflexive candidate. If it does, the agent does nothing further. This can happen in unusual cases when there is a NAT reboot in the middle of a STUN transaction, causing two requests in the same transaction two produce two different transport addresses. If there is no transport address with the same component ID in the peer reflexive candidate, the agent adds a transport address to the peer reflexive candidate. This transport address is equal to the source IP address and port from the incoming STUN Binding Request (and in the case of Binding Request received on a relayed transport address, the one seen by the relay), and has a transport protocol equal to that of the incoming STUN request. It is assigned the component ID equal to the component ID in the target transport address pair. This new transport address will have a transport address ID, equal to the concatenation of the candidate ID for this new candidate, and the component ID, separated by a colon. The type of the transport address is considered to be peer reflexive, though this is never signaled through SDP and so there is no candidate-types value defined for it. Recall that each transport address is associated with an origination transport address. For server reflexive candidates, the origination transport address is signaled through SDP. For peer reflexive transport addresses, it is inherited from the origination transport address of the generating transport address. If the generating transport address was a local transport address, then the origination transport address is that transport address. If the generating transport address was server reflexive, the origination transport address is the related transport address that was signaled for that server reflexive candidate. If the generating transport address was relayed, the origination transport address is the relayed transport address itself. Whether and how other candidate attributes defined by extensions are inherited depends on the extension. The newly added transport address is paired up with the native transport address with the same component ID. Initially, the peer reflexive candidate will start with a single transport address a transport address pair. More are added as the connectivity checks for the original candidate pair take place. Figure 10 provides a pictorial representation of the peer reflexive candidate (the one with id=RL) and its pairing with the native candidate with ID L. The candidate with ID R is the generating candidate. The peer reflexive candidate is effectively an alternate for that generating candidate, but is only paired with a specific native candidate. Note that, for a particular generating candidate, there can be many peer derived candidates, up to one for each native candidate. Also note that candidate IDs with values "L" and "R" and "RL" are not actually permitted, since all candidate IDs must be at least four characters long. These shortened candidate IDs are used to keep the figure readable. ............. ............. . tid=L:1 . . tid=R:1 . component. -- . id=L:1:R:1 . -- .component id=1 . | A|-------------------------| C| . id=1 . -- -------+ . -- . . . | . . Generating . . | . . Candidate . tid=L:2 . | . tid=R:2 . component. -- . | id=L:2:R:2 . -- .component id=2 . | B|-------C-----------------| D| . id=2 . -- -----+ | . -- . .............| | ............. Native | | Remote Candidate | | Candidate id=L | | id=R | | | | ............. | | . tid=RL:1 . | | id=L:1:RL:1 . -- .component | +-----------------| C| . id=1 | . -- . | . . Peer Derived | . . Candidate | . tid=RL:2 . | id=L:2:RL:2 . -- .component +-------------------| D| . id=2 . -- . ............. Remote Candidate id=RL Figure 10 The new transport address pair has a state machine associated with it. The state that is entered, and actions to take as a consequence, are specific to the transport protocol. For UDP, the procedures are defined here. Extensions that define processing for other transport protocols SHOULD describe the behavior. For UDP, the state machine enters the Send-Valid state. Effectively, the Binding Request just received "counts" as a validation in this direction, even though it was formally done for a different transport address pair. In addition, the agent generates a Binding Request for the new transport address pair, as described in Section 7.7. Processing of the response follows the logic described in Section 7.6. As with all candidate pairs, the state of this new candidate pair is derived from the states of its transport address pairs. Until the number of transport address pairs in the candidate pair equals the transport address pair count of the candidate pair from which it is derived, the state of the candidate pair is Indeterminate. Once they are equal, the state is derived just like any other candidate pair. 7.10.2. On Receipt of a Binding Response The procedures on receipt of a Binding Response are nearly identical to those for receipt of a Binding Request as described above. The procedures in this section are followed when an agent receives a STUN Binding Response matched to a transport address pair whose XOR- MAPPED-ADDRESS doesn't match any of the existing native transport addresses. The XOR-MAPPED-ADDRESS becomes a new native transport address. To use it, the XOR-MAPPED-ADDRESS needs to be associated with a candidate (called a peer-derived candidate). In this case, however, the candidate isn't signaled through an offer/answer exchange; it is constructed dynamically from information in the STUN response. Like all other candidates, the peer-derived candidate has a candidate ID. The candidate ID is derived from the candidate IDs of the target candidate pair. In particular, the candidate ID is constructed by concatenating the native candidate ID with the remote candidate ID (without the colon). The password for the new candidate equals that of the native candidate ID in the matching candidate pair (note that, this password would be the same for all native candidates for the same media line). When the Binding Response is received, the agent constructs the candidate ID that represents the peer reflexive candidate, and checks to see if that candidate exists. It may already exist if it had been constructed as a consequence of a previous application of this logic on receipt of a Binding Response for a different transport address pair of the same candidate pair. If there is not yet a peer reflexive candidate with that candidate ID, the agent creates it, and assigns it the newly computed candidate ID. The priority of the peer-derived candidate is set to the priority of its generating candidate - the native candidate in the target transport address pair. Note that, at this time, the peer derived candidate has no transport addresses in it. The native candidate is then paired up with a remote candidate. However, unlike the procedures of Section 7.5, which pair up each native candidate with each remote candidate, this peer reflexive candidate is only paired up with the remote candidate from the target candidate pair. This creates a new candidate pair. This new candidate pair is inserted into the candidate pair priority ordered list based on the ordering rules defined in Section 7.5. Note that no entries are added to the transport address pair check ordered list. Recall that, for each candidate pair, one agent plays the role of offerer, and the other of answerer. For a peer-reflexive candidate, the role is identical to that of its generating candidate. Newly created or not, the agent extracts the component ID from the target transport address pair, and sees if a transport address with that same component ID exists in the peer reflexive candidate. If it does, the agent does nothing further. This can happen in unusual cases when there is a NAT reboot in the middle of a STUN transaction, causing two requests in the same transaction two produce two different transport addresses. If there is no transport address with the same component ID in the peer reflexive candidate, the agent adds a transport address to the peer reflexive candidate. This transport address is equal to the XOR-MAPPED-ADDRESS from the incoming STUN Binding Response, and has a transport protocol equal to the one used for the Binding Response. It is assigned the component ID equal to the component ID in the matching transport address pair. This transport address will have a transport address ID, equal to the concatenation of the candidate ID for this new candidate, and the component ID, separated by a colon. The type of the transport address is considered to be peer reflexive, though this is never signaled through SDP and so there is no candidate-types value defined for it. Recall that each transport address is associated with an origination transport address. For server reflexive candidates, the origination transport address is signaled through SDP. For peer reflexive transport addresses, it is inherited from the origination transport address of the generating transport address. If the generating transport address was a local transport address, then the origination transport address is that transport address. If the generating transport address was server reflexive, the origination transport address is the related transport address that was signaled for that server reflexive candidate. If the generating transport address was relayed, the origination transport address is the relayed transport address itself. Whether and how other candidate attributes defined by extensions are inherited depends on the extension. The newly added transport address is paired up with the remote transport address with the same component ID. Initially, the peer reflexive candidate will start with a single transport address a transport address pair. More are added as the connectivity checks for the original candidate pair take place. The new transport address pair has a state machine associated with it. The state that is entered, and actions to take as a consequence, are specific to the transport protocol. For UDP, the procedures are defined here. Extensions that define processing for other transport protocols SHOULD describe the behavior. For UDP, the state machine enters the Recv-Valid state. Effectively, the Binding Response just received "counts" as a validation in this direction, even though it was formally done for a different candidate pair. The peer will likely generate a Binding Request for this candidate pair; processing of the request follows the logic described in Section 7.6. As with all candidate pairs, the state of this new candidate pair is derived from the states of its transport address pairs. Until the number of transport address pairs in the candidate pair equals the transport address pair count of the candidate pair from which it is derived, the state of the candidate pair is Indeterminate. Once they are equal, the state is derived just like any other candidate pair. 7.11. Subsequent Offer/Answer Exchanges An agent MAY issue an updated offer at any time. This updated offer may be sent for reasons having nothing to do with ICE processing (for example, the addition of a video stream in a multimedia session), or it may be due to a change in ICE-related parameters. For example, if an agent acquires a new candidate after the initial offer/answer exchange, it may seek to add it. However, agents SHOULD follow the logic described in Section 7.9 to determine when to send an updated offer as a consequence of promoting a candidate to operating. If there are any aspects of this processing that are specific to the transport protocol, those SHOULD be called out in ICE extensions that define operation with other transport protocols. There are no additional considerations for UDP. 7.11.1. Sending of a Subsequent Offer The offer MAY contain a new operating candidate in the m/c line. This candidate SHOULD be the native candidate from the highest priority candidate pair in the candidate pair priority ordered list whose state is Valid. If there are no candidate pairs in this state, the highest one whose state is Send-Valid or Recv-Valid SHOULD be used. If there are no candidate pairs in these states, the candidate pair that is most likely to work with this peer, as described in Section 7.2, SHOULD be used. The candidate is encoded into the m/c line in an updated offer as described in Section 7.3. Note that, while peer-derived candidates never appear in a=candidate attributes (only their generating candidates appear there), a peer-derived candidate can appear in the m/c line if it has been selected for usage for media. If the candidate pair whose native candidate was encoded into the m/c-line was Valid, Send-Valid or Recv-Valid, the agent MUST include an a=remote-candidate attribute into the offer. This attribute MUST contain the candidate ID of the remote candidate in the candidate pair. It is used by the recipient of the offer in selecting its candidate for the answer. Because the native candidate in the m/c- line will typically be Valid, Send-Valid or Recv-Valid in every offer after the initial one, the a=remote-candidate attribute will typically be used in all subsequent offers. The meaning of a=candidate attributes within a subsequent offer have the same meaning as they do in an initial offer. They are a request for the peer to attempt (or continue to attempt if the candidate was provided previously) a connectivity check using STUN from each of its own candidates. When an updated offer is sent, there are several dispositions regarding the candidates: retained: A candidate is retained if the candidate ID for the candidate is included in the new offer, and matches the candidate ID for a candidate in the previous offer or answer from the agent. In this case, all of the information about the candidate - its qvalue and components, and the IP addresses, ports, and transport protocols of its components, MUST be the same as the previous offer or answer from the agent. If the agent wants to change them, this is accomplished by changing the candidate ID as well. That will have the effect of removing the old candidate and adding a new one with the updated information. removed: A candidate is removed if its candidate ID appeared in a previous offer or answer, and that candidate ID is not present in the new offer. added: A candidate is added if its candidate ID appeared in the new offer, but was not present in a previous offer or answer from that agent. The following rules are used to determine the disposition of the each of the current native candidates in the new offer: o If a candidate is invalid, and all peer reflexive candidates generated from it are invalid as well, it SHOULD be removed. o If the candidate in the m/c-line is valid, all other lower priority candidates SHOULD be removed. This has the effect of stopping connectivity checks of other candidates. This SHOULD would not be followed if an agent wanted to keep a candidate ready for usage if, for some reason, the operating candidate later become invalid. o If the candidate in the m/c-line is valid, and it is not peer reflexive, that candidate MUST be retained. If the candidate in the m/c-line is peer reflexive, its generating candidate MUST be retained, even if it is itself invalid. o If the candidate in the m/c-line has not been validated, all other candidates that are not invalid, or candidates for whom their derived candidates are not invalid, SHOULD be retained. o Peer reflexive candidates MUST NOT be added; they continue to be used as long as their generating candidate was retained. Peer derived candidates are learned exclusively through the STUN connectivity checks. A new candidate MAY be added. This can happen when the candidate is a new one, learned since the previous offer/answer exchange, and it has a higher priority than the currently operating candidate. It can also occur when an agent wishes to restart checks for a transport address it had tried previously. Effectively, changing the candidate ID value in an updated offer will "restart" connectivity checks for that candidate. If a candidate is removed, the agent takes the following steps once the offer is sent: 1. The agent eliminates any candidate pairs whose native candidate equalled the candidate that was removed. Equality is based on comparison of candidate IDs. 2. The agent eliminates any candidate pairs that had a native candidate that is a peer reflexive candidate generated from the candidate that was removed. 3. The candidate pairs that are eliminated are removed from the candidate pair priority ordered list. Their corresponding transport address pairs are removed from the transport address pair check ordered list. As a consequence of this, if connectivity checks had not yet begun for the candidate pair, they won't. If a transport address pair had been pruned from the transport address pair check ordered list because it was redundant with one of the transport address pairs which was just removed, that transport address pair is added back to the list. 4. If connectivity checks were already in progress for transport addresses in a candidate pair that was removed, the agent SHOULD immediately terminate them. No further retransmissions take place, and no further transactions from that candidate will be made. 5. If the removed candidate was a relayed candidate, the agent SHOULD de-allocate its transport addresses from the STUN relay if it is not using those resources elswhere. If a local candidate was removed, and all of its derived candidates were also removed (including any peer reflexive candidates), local operating system resources for each of the transport addresses in the local candidate SHOULD be de-allocated, as long as it is not using those resources elsewhere. The resources may be in use elsewhere if they were included in an initial offer which generated multiple answers (as can happen with SIP forking). In such a case, a subsequent offer which removes the candidate will not imply its removal with the other branches; each becomes a separate offer/answer relationship. Subsequent offers MUST contain a=ice-pwd attributes that specify the password for the candidates for each media stream. If any of the candidates for a particular m-line are the same as the previous offer, the ICE password for that m-line MUST be the same. If all of the candidates for a particular m-line are different from the previous offer, the ICE password for that m-line MAY be different. Note that it is permissible to use a session-level attribute in one offer, but to provide the same password as a media-level attribute in a subsequent offer. This is not a change in password, just a change in its representation. 7.11.2. Receiving the Offer and Sending an Answer To generate the answer, the answerer has to decide which transport addresses to include in the m/c line, and which to include in candidate attributes. The first step in the process is to look for the a=remote-candidate attribute in the offer. The a=remote-candidate exists to eliminate a race condition between the updated offer and the response to the STUN Binding Request that moved a candidate into the Valid state. This race condition is shown in Figure 11. On receipt of message 5, agent A can move its transport address pair state machine into the Valid state. It sends a STUN response to the request (message 6), but this is lost. Agent A proceeds with an updated offer (message 7), which is received at agent B. As far as agent B is concerned, the transport address pair is still in the Send-Valid state. It will move into the Valid state only on receipt of the STUN response in message 10. Thus, upon receipt of the offer, agent B cannot determine which candidate to include in its answer. To eliminate this condition, the identity of the validated candidate is included in the offer itself. Note, however, that the answerer will not send media until it has received this STUN response. Agent A Network Agent B |(1) Offer | | |------------------------------------------>| |(2) Answer | | |<------------------------------------------| |(3) STUN Req. | | |------------------------------------------>| |(4) STUN Res. | | |<------------------------------------------| |(5) STUN Req. | | |<------------------------------------------| |(6) STUN Res. | | |-------------------->| | | |Lost | |(7) Offer | | |------------------------------------------>| |(8) Answer | | |<------------------------------------------| |(9) STUN Req. | | |<------------------------------------------| |(10) STUN Res. | | |------------------------------------------>| Figure 11 If the a=remote-candidate attribute is present, the agent examines the transport addresses in the m/c-line of the offer. It compares these with the transport addresses in the remote candidates of all candidate pairs. If there is no match, no further processing of the a=remote-candidate attribute is done. If there is at least one match, the agent compares the native candidate ID of each matching pair with the value of the a=remote-candidate attribute. If there is a match, that candidate pair is selected. For each transport address pair in that candidate pair, if the state of the transport address pair is Send-Valid, the agent considers the state to be Valid just for the purpose of constructing the answer. In particular, it will impact selection of the candidate for the m/c-line and the set of additional candidates to include or exclude from the answer. However, the actual state MUST remain Send-Valid. This state will be used to determine when it is safe to send media. Keeping it at Send- Valid is necessary to prevent against DoS attacks. Note that the a=remote-candidate attribute SHOULD NOT be included in the answer, and if included, will just be ignored by the offerer, since it is not used in any processing of the answer. Rules for choosing transport addresses for the m/c-line are as follows. The agent examines the transport addresses in the m/c-line of the offer. It compares these with the transport addresses in the remote candidates of candidate pairs whose states are Valid. If there is a matching candidate pair in that state, the pair with the highest priority MUST be chosen, and the native candidate from that pair used as the operating candidate. If there were no matching candidate pairs in the Valid state (possibly because the transport addresses in the m/c-line in the offer didn't match any of the remote candiadtes), the candidate that is most likely to work with this peer, as described in Section 7.2, SHOULD be used. Note that this candidate may be Valid as a consequence of being temporarily changed to such by the a=remote-candidate attribute. Like the offerer, the answerer can decide, for each of its candidates, whether they are retained or removed. The same rules defined in Section 7.11.1 for determining their disposition apply to the answerer. Similarly, if a candidate is removed, the same rules in Section 7.11.1 regarding removal of canididate pairs and freeing of resources apply. As with selection of the candidate for the m/c- line, the state of one of the candidates may be Valid as a consequence of being temporarily changed to such by the a=remote- candidate attribute. Once the answer is sent, the answerer will have the set of native and remote candidates before this offer/answer exchange, and the set of native and remote candidates afterwards. A peer derived candidate continues to be used as long as its generating parent continues to be used. The agent then pairs up the native and remote candidates which were added or retained. This leads to a set of current candidate pairs. If a candidate pair existed previously, but as a consequence of the offer/answer exchange, it no longer exists, the agent takes the following steps: 1. The candidate pair is removed from the candidate pair priority ordered list. Their corresponding transport address pairs are removed from the transport address pair check ordered list. As a consequence of this, if connectivity checks had not yet begun for the candidate pair, they won't. If a transport address pair had been pruned from the transport address pair check ordered list because it was redundant with one of the transport address pairs which was just removed, that transport address pair is added back to the list. 2. If connectivity checks were already in progress for that candidate pair, the agent SHOULD immediately terminate any STUN transactions in progress from that candidate. No further retransmissions take place, and no further transactions from that candidate will be made. 3. If the agent receives a STUN Binding Request for that candidate pair, however, processing occurs as defined in Section 7.8. If a candidate pair existed previously, and continues to exist, no changes are made; any STUN transactions in progress for that candidate pair continue, it remains on the candidate pair priority ordered list, and its transport address pairs remain on the transport address pair check ordered list. If a candidate pair is new (because either its native candidate is new, or its remote candidate is new, or both), the agent takes the role of answerer for this candidate pair. The new candidate pair is inserted into the candidate pair priority ordered list, and the transport address pair check ordered list is rederived. STUN connectivity checks will start for them based on the logic described in Section 7.6. 7.11.3. Receiving the Answer Once the answer is received, the answerer will have the set of native and remote candidates before this offer/answer exchange, and the set of native and remote candidates afterwards. It then follows the same logic described in Section 7.11.2, pairing up the candidate pairs, removing ones that are no longer in use, and beginning of processing for ones that are new. 7.12. Binding Keepalives Once a candidate is promoted to operating, and media begins flowing, it is still necessary to keep the bindings alive at intermediate NATs for the duration of the session. Normally, the media stream packets themselves (e.g., RTP) meet this objective. However, several cases merit further discussion. Firstly, in some RTP usages, such as SIP, the media streams can be "put on hold". This is accomplished by using the SDP "sendonly" or "inactive" attributes, as defined in RFC 3264 [4]. RFC 3264 directs implementations to cease transmission of media in these cases. However, doing so may cause NAT bindings to timeout, and media won't be able to come off hold. Secondly, some RTP payload formats, such as the payload format for text conversation [31], may send packets so infrequently that the interval exceeds the NAT binding timeouts. Thirdly, if silence suppression is in use, long periods of silence may cause media transmission to cease sufficiently long for NAT bindings to time out. To prevent these problems, ICE implementations MUST continue to list their operating candidate in a=candidate lines for UDP-based media streams. As a consequence of this, STUN packets will be transmitted periodically independently of the transmission (or lack thereof) of media packets. These will be received on the same IP address and port as the media streams. The agent determines whether the packet is media or STUN by looking for the magic cookie in bits 32-63 of the data. If present, it indicates that the packet is STUN, and if not, indicates that it is media. This provides a media independent, RTP independent, and codec independent solution for keeping the NAT bindings alive. However, an ICE implementation MUST be prepared for the transport address received in an m/c-line to not correspond to any a=candidate attributes. If an ICE implementation is communciating with one that does not support ICE, keepalives MUST still be sent. Indeed, these keepalives are essential even if neither endpoint implements ICE. As such, this specification defines keepalive behavior generally, for endpoints that support ICE, and those that do not. All endpoints MUST send keepalives for each media session. These keepalives MUST be sent regardless of whether the media stream is currently inactive, sendonly, recvonly or sendrecv. The keepalive SHOULD be sent using a format which is supported by its peer. ICE endpoints allow for STUN-based keepalives for UDP streams, and as such, STUN keepalives MUST be used when an agent is communicating with a peer that supports ICE. An agent can determine that its peer supports ICE by the presence of the a=candidate attributes for each media session. If the peer does not support ICE, the choice of a packet format for keepalives is a matter of local implementation. A format which allows packets to easily be sent in the absence of actual media content is RECOMMENDED. Examples of formats which readily meet this goal are RTP No-Op [28] and RTP comfort noise [24]. If the peer doesn't support any formats that are particularly well suited for keepalives, an agent SHOULD send RTP packets with an incorrect version number, or some other form of error which would cause them to be discarded by the peer. STUN-based keepalives will be sent periodically every Tr seconds as a consequence of the rules in in Section 7.7. If STUN keepalives are not in use (because the peer does not support ICE), an agent SHOULD ensure that a media packet is sent every Tr seconds. If one is not sent as a consequence of normal media communications, a keepalive packet using one of the formats discussed above SHOULD be sent. 7.13. Sending Media When an agent receives an offer and sends an answer, or when it receives an answer to an offer it sent, it begins connectivity checks. If there is a candidate that corresponds to the m/c-line, these checks will include validation of the operating candidate pair. In that case, an agent SHOULD NOT send media on the operating candidate pair until that candidate pair has reached the Valid or Recv-Valid state. This is to help prevent a denial-of-service attack, described in Section 13. Once the operating candidate pair reaches the Valid or Recv-Valid state, an agent MAY start sending media to that candidate pair. If there is no candidate that corresponds to the m/c-line, the m/c-line cannot be validated, and media is sent to it as described in RFC 3264 [4]. Under normal conditions, there will be a candidate for the m/c-line. Indeed - ICE itself requires that an agent include one. However, actual SIP deployments have seen usage of network intermediaries which manipulate the m/c-line of offers and answers. Should such elements ignore the candidate attributes, it would manifest itself like an agent which did not include a candidate for the m/c-line. For this reason, this use case is explicitly supported by ICE. Offer/answer exchanges are used with protocols, like SIP, which require media to be sent "early", from the answerer to the offer, prior to completion of the initial offer/answer exchange. It is highly desirable (and sometimes necessary) for this early media to use the candidate pair ultimately selected by ICE connectivity checks. For this reason, ICE provides an early media mechanism that allows for a candidate pair to be used in one direction prior to its promotion to operating in a subsequent offer/answer exchange. Note that, with ICE, early media pertains to media sent to a candidate pair until its promotion to operating in a subsequent offer/answer exchange. This is a broader definition than is used in [26], which defines early media as media sent prior to acceptance of a call. As a consequence of the connectivity checks, an agent will change the states for each transport address pair, and consequently, for the candidate pairs. When a candidate pair becomes Valid or Recv-Valid, and there is a candidate pair for the m/c-line, and the candidate pair is not equal to the operating candidate pair, and the agent is in the role of answerer for that candidate pair, the agent checks the position of that pair in the candidate pair priority ordered list. If it is the first, the agent selects this candidate pair for early media. If this candidate pair is not the first on the candidate pair priority ordered list, but is higher priority than the operating candidate pair, and the early media wait-state timer has not yet been set, the agent sets this timer to Tws seconds. Though the early media wait state timer has the same value as the wait state timer described in Section 7.9, these are different timers and indeed are set by different entites. The early media wait state timer allows for a higher priority connectivity check to complete, in the event its STUN Binding Request or Response was lost or delayed in the network. If, prior to the early media wait-state timer firing, another connectivity check completes and a candidate pair enters the Valid or Recv-Valid states, there is no need to reset or cancel the timer. Once the timer fires, the agent SHOULD select the highest priority candidate pair in the Valid or Recv-Valid state for which the agent has the role of answerer, and use that candidate pair for early media. ICE processing will ensure that, under almost all circumstances, the candidate pair selected by the answerer for early media will also be the one selected by the offerer for eventual promotion to operating. The early media state implies that the answerer knows that this candidate pair is to be used, but the offerer doesn't know yet that it will eventually be validated. It is for this reason that the candidate pair can be used for early media. If a candidate pair is selected for early media, an agent MAY send media on that candidate pair, even if it is not the same as the operating candidate pair. However, to deal with cases in which the offerer and answerer do not agree on the eventual selection of this candidate for promotion to operating (a rare but possible case), the agent MUST discontinue using the candidate pair for sending media Tlo seconds after the next opportunity its peer would have to send an updated offer. In the case of an answer delivered in a 200 OK to an offer in a SIP INVITE (regardless of whether that same answer appeared in an earlier unreliable provisional response), this would be Tlo seconds after receipt of the ACK. Tlo SHOULD be configurable and SHOULD have a default of 5 seconds. This time represents the amount of time it should take the offerer to perform its connectivity checks, arrive at the same conclusion about the viability of the early candidate, and then generate an updated offer promoting it to operating. If, after Tlo seconds, no updated offer arrives, the answerer MUST cease using the early candidate. Media MAY be sent to the operating candidate pair if it is in the Valid or Recv-Valid state. If an updated offer does arrive prior to the expiration of the timer, the agent MUST execute the procedures in Section 7.11.2, which will result in the selection of a candidate for the m/c-line in the answer. At that point, the procedures of this section SHOULD be restarted by the answerer. This implies that the operating candidate pair, if Valid or Recv-Valid, will be used. If a higher priority candidate pair subsequently enters the Valid or Recv-Valid state, it may end up being used as an early candidate. To use a candidate pair, whether it is early or operating, media is sent to the IP addresses and ports of the components in the remote candidate, and sends that media from the IP addresses and ports of the components in the native candidate. Transport addresses are paired up based on component ID. For example, if a remote candidate has two components R1 and R2, and the native candidate has two components L1 and L2, media packets are sent from L1 to R1 and from L2 to R2. This provides a property known as symmetry. This symmetric behavior MUST be followed by an agent even if its peer in the session doesn't support ICE. The definition of sending media "from" a particular transport address depends on the type of transport address. In the case of a server reflexive transport address, this means that the RTP packets are sent from the local transport address used to obtain the STUN address. In the case of a relayed transport address, this means that media packets are sent through the relay server (for STUN relays, this would be using the Send request). For local transport addresses, media is sent from that local transport address. For peer reflexive transport addresses, media is sent from the local transport address used to obtain the reflexive address. ICE has interactions with jitter buffer adaptation mechanisms. An RTP stream can begin using one candidate, and switch to another one. The newer candidate may result in RTP packets taking a different path through the network - one with different delay characteristics. As discussed below, agents are encouraged to re-adjust jitter buffers when there are changes in source or destination address. Furthermore, many audio codecs use the marker bit to signal the beginning of a talkspurt, for the purposes of jitter buffer adaptation. For such codecs, it is RECOMMENDED that the sender change the marker bit when an agent switches transmission of media from one candidate pair to another. 7.14. Receiving Media ICE implementations MUST be prepared to receive media on a candidate pair if it is in the role of offerer for that candidate pair, even if that candidate pair is not currently operating. This is a consequence of the early media mechanism described in the previous section. If an agent determines that its peer supports ICE (an offerer knows this when the answer contains a=candidate attributes), it SHOULD discard any media packets received on a candidate pair prior to the candidate pair entering the Send Valid state. This helps eliminate certain attacks, as discussed in Section 13. Note that, in cases of forking, an agent may get multiple answers to its offer, each for a different peer. Consequently, if would only discard media packets received on a candidate pair once it has determined that all forked targets support ICE. It is RECOMMENDED that, when an agent receives an RTP packet with a new source or destination IP address for a particular media stream, that the agent re-adjust its jitter buffers. RFC 3550 [21] describes an algorithm in Section 8.2 for detecting SSRC collisions and loops. These algorithms are based, in part, on seeing different source IP addresses and ports with the same SSRC. However, when ICE is used, such changes will naturally occur as the media streams switch between candidates. An agent will be able to determine that a media stream is from the same peer as a consequence of the STUN exchange that proceeds media transmission. Thus, if there is a change in source IP address and port, but the media packets come from the same peer agent, this SHOULD NOT be treated as an SSRC collision. 8. Guidelines for Usage with SIP SIP [2] makes use of the offer/answer model, and is one of the primary targets for usage of ICE. SIP allows for offer/answer exchanges to occur in many different combinations of messages, including INVITE/200 OK and 200 OK/ACK. When support for reliable provisional responses (RFC 3262 [11]) and UPDATE (RFC 3311 [25]) are added, additional combinations of messages that can be used for offer/answer exchanges are added. As such, this section provides some guidance on good ways to make use of SIP with ICE. ICE requires a series of STUN-based connectivity checks to take place between endpoints. These checks start from the answerer on generation of its answer, and start from the offerer when it receives the answer. These checks can take time to complete, and as such, the selection of messages to use with offers and answers can effect perceived user latency. Two latency figures are of particular interest. These are the post-pickup delay and the post-dial delay. The post-pickup delay refers to the time between when a user "answers the phone" and when any speech they utter can be delivered to the caller. The post-dial delay refers to the time between when a user enters the destination address for the user, and ringback begins as a consequence of having succesfully started ringing the phone of the called party. To reduce post-dial delays, it is RECOMMENDED that the caller begin gathering candidates prior to actually sending its initial INVITE. This can be started upon user interface cues that a call is pending, such as activity on a keypad or the phone going offhook. To reduce post-pickup delays, ICE allows for media to be sent from the answerer to the offerer on a candidate pair, prior to its promotion to operating. However, this requires the answerer to have generated its answer and sent it. In most cases, it will require this answer to be received by the offerer. The reason is that connectivity checks or RTP packets from the answerer to the offerer will not be forwarded by NATs towards the offerer until the offerer has established a permission in the NAT by generating a packet towards the answerer. For this reason, if an offer is received in an INVITE request, the UAS SHOULD immediately gather its candidates and then generate an answer in a provisional response. When reliable provisional responses are not used, the SDP in the provisional response is the answer, and that exact same answer reappears in the 200 OK. To deal with possible losses of the provisional response, it SHOULD be retransmitted until some indication of receipt. This indication can either be through PRACK [11], or through the receipt of a STUN Binding Request with a correct username and password. Even if PRACK is not used, the provisional response SHOULD be retransmitted using the exponential backoff described in [11]. Furthermore, once the answer has been sent, the agent SHOULD begin its connectivity checks. Once a candidate reaches the Valid or Recv-Valid state, the UAS has a known-valid path for media packets towards the UAC. This point is called the connected point in ICE. Once the UAS reaches the connected point, media can be sent from the UAS towards the UAC without any additional delays. However, between the receipt of the INVITE and the connected point, any media that needs to be sent towards the caller (such as SIP early media [26] cannot be transmitted. For this reason, implementations MAY choose to delay alerting the called party until the connected point is reached. In the case of a PSTN gateway, this would mean that the setup message into the PSTN is delayed until the connected point. Doing this increases the post-dial delay, but has the effect of eliminating 'ghost rings'. Ghost rings are cases where the called party hears the phone ring, picks up, but hears nothing and cannot be heard. This technique works without requiring support for, or usage of, preconditions [7], since its a localized decision. It also has the benefit of guaranteeing that not a single packet of early media will get clipped. If an agent chooses to delay local alerting in this way, it SHOULD generate a 180 response once alerting begins. A slight variation of this approach is to wait for a connectivity check to succeed to a higher priority candidate pair than the operating one. This allows for the agent to only ever send media, early or otherwise, to a single candidate, which will work better with jitter buffers, at the expense of even greater post-dial delays. Note that, prior to the promotion of a candidate pair to operating, the offerer will not be able to send using the candidate pair. When used with SIP, if the initial offer is sent in the INVITE, and the answer is sent in both the provisional and final 200 OK response, the offerer will not be able to send media until it sends a re-INVITE and receives the 200 OK response to that re-INVITE. This can take several hundred milliseconds. If this latency is an issue (it is generally not considered an issue for voice systems), reliable provisional responses [11] MAY be used, in which case an UPDATE [25] can be used to send an updated offer prior to the call being answered. As discussed in Section 13, offer/answer exchanges SHOULD be secured against eavesdropping and man-in-the-middle attacks. To do that, the usage of SIPS [2] is RECOMMENDED when used in concert with ICE. 9. Interactions with Forking SIP allows INVITE requests carrying offers to fork, which means that they are delivered to multiple user agents. Each of those user agents then provides an answer to the offer in the INVITE. The result is that a single offer generated by the UAC produces multiple answers. ICE interacts very well with forking. Indeed, ICE fixes some of the problems associated with forking. Once the offer/answer exchange has completed, the UAC will have an answer from each UAS that received the INVITE. The ICE connectivity checks that ensue will carry transport address pair IDs that correlate each of those checks (and thus their corresponding IP addresses and ports) with a specific remote user agent. As these checks happen before any media is transmitted, ICE allows a UAC to disambiguate subsequent media traffic by looking at the source IP address and port, and then correlate that traffic with a particular remote UA. When SIP is used without ICE, the incoming media traffic cannot be disambiguated without an additional offer/answer exchange. 10. Interactions with Preconditions Because ICE involves multiple addresses and pre-session activities, its interactions with preconditions merits further discussion. Quality of Service (QoS) preconditions, which are defined in RFC 3312 [7] and RFC 4032 [8], apply only to the IP addresses and ports listed in the m/c lines in an offer/answer. If ICE changes the address and port where media is received, this change is reflected in the m/c lines of a new offer/answer. As such, it appears like any other re- INVITE would, and is fully treated in RFC 3312 and 4032, which applies without regard to the fact that the m/c lines are changing due to ICE negotiations ocurring "in the background". However, usage of early candidates with QoS preconditions is NOT RECOMMENDED, since QoS will only be reserved for the candidate pair in the m/c-line. An agent SHOULD only send to the operating candidate (once it enters the Valid or Recv-Valid states) if QoS preconditions are used for a media session. ICE also has (purposeful) interactions with connectivity preconditions [27]. Those interactions are described there. 11. Examples This section provides two examples. One is a very basic example, and the other is more elaborate. A common configuration and setup is used in both cases. Two agents, L and R, are using ICE. Both agents have a single IPv4 interface. For agent L, it is 10.0.1.1, and for agent R, 192.0.2.1. Both are configured with a single STUN server each (indeed, the same one for each), which is listening for STUN requests at an IP address of 192.0.2.2 and port 3478. This STUN server supports both the Binding Discovery usage and the Relay usage. Agent L is behind a NAT, and agent R is on the public Internet. The public side of the NAT has an IP address of 192.0.2.3. To facilitate understanding, transport addresses are listed using variables that have mnemonic names. This format of the anem is entity-type-seqno, where entity refers to the entity whose interface the transport address is on, and is one of "L", "R", "STUN", or "NAT". The type is either "PUB" for transport addresses that are public, and "PRIV" for transport addresses that are private. Finally, seq-no is a sequence number that is different for each transport address of the same type on a particular entity. Each variable has an IP address and port, denoted by varname.IP and varname.PORT, respectively, where varname is the name of the variable. In addition, candidate IDs are also listed using variables that have mnemonic names. Agent L uses candidate ID L1 for its local candidate, L2 for its server reflexive candidate, and L3 for its relayed candidate. Agent R uses R1 for its local candidate and R2 for its relayed candidate. The password is LPASS for each candidate from agent L, and RPASS for each candidate from agent R. The STUN server has advertised transport address STUN-PUB-1 (which is 192.0.2.2:3478) for both the binding discovery usage and the relay usage. In the call flow itself, STUN messages are annotated with several attributes. The "S=" attribute indicates the source transport address of the message. The "D=" attribute indicates the destination transport address of the message. The "MA=" attribute is used in STUN Binding Response messages, STUN Binding Response messages carried in a STUN Send Request or Data Indication, and in a Allocate Response, and refers to the reflexive transport address derived from the XOR-MAPPED-ADDRESS attribute. The "RA=" attribute is used in STUN Data Indications, and refers to the value of the REMOTE-ADDRESS attribute. The "U=" attribute is used in STUN Requests, and corresponds to the STUN USERNAME. The "DA=" attribute is used in STUN Send requests, and refers to the value of the DESTINATION- ADDRESS attribute. The "R=" attribute is used in Allocate responses, and it indicates the value of the RELAY-ADDRESS attribute. The call flow examples omit STUN authentication operations. 11.1. Basic Example In this example, the NAT has an endpoint independent mapping property and an address dependent filtering property. Neither agent is using the STUN relay usage, only the binding discovery usage. As a consequence, agent L will end up with two candidates - a local candidate and a server reflexive candidate. Agent R will have one - a local candidate (the reflexive candidate will be identical to the local one, and thus discarded). The agents are seeking to communicate using a single RTP-based voice stream. RTCP is not used. As a consequence, each candidate has one component. L NAT STUN R |RTP STUN alloc. | | |(1) STUN Req | | | |S=$L-PRIV-1 | | | |D=$STUN-PUB-1 | | | |------------->| | | | |(2) STUN Req | | | |S=$NAT-PUB-1 | | | |D=$STUN-PUB-1 | | | |------------->| | | |(3) STUN Res | | | |S=$STUN-PUB-1 | | | |D=$NAT-PUB-1 | | | |MA=$NAT-PUB-1 | | | |<-------------| | |(4) STUN Res | | | |S=$STUN-PUB-1 | | | |D=$L-PRIV-1 | | | |MA=$NAT-PUB-1 | | | |<-------------| | | |(5) Offer | | | |------------------------------------------->| | | | |RTP STUN alloc. | | |(6) STUN Req | | | |S=$R-PUB-1 | | | |D=$STUN-PUB-1 | | | |<-------------| | | |(7) STUN Res | | | |S=$STUN-PUB-1 | | | |D=$R-PUB-1 | | | |MA=$R-PUB-1 | | | |------------->| |(8) answer | | | |<-------------------------------------------| | |(9) Bind Req | | | |S=$R-PUB-1 | | | |D=$NAT-PUB-1 | | | |<----------------------------| | |Dropped | | |(10) Bind Req | | | |S=$L-PRIV-1 | | | |D=$R-PUB-1 | | | |------------->| | | | |(11) Bind Req | | | |S=$NAT-PUB-1 | | | |D=$R-PUB-1 | | | |---------------------------->| | |(12) Bind Res | | | |S=$R-PUB-1 | | | |D=$NAT-PUB-1 | | | |MA=$NAT-PUB-1 | | | |<----------------------------| |(13) Bind Res | | | |S=$R-PUB-1 | | | |D=$L-PRIV-1 | | | |MA=$NAT-PUB-1 | | | |<-------------| | | |RTP flows | | | | |(14) Bind Req | | | |S=$R-PUB-1 | | | |D=$NAT-PUB-1 | | | |<----------------------------| |(15) Bind Req | | | |S=$R-PUB-1 | | | |D=$L-PRIV-1 | | | |<-------------| | | |(16) Bind Res | | | |S=$L-PRIV-1 | | | |D=$R-PUB-1 | | | |MA=$R-PUB-1 | | | |------------->| | | | |(17) Bind Res | | | |S=$NAT-PUB-1 | | | |D=$R-PUB-1 | | | |MA=$R-PUB-1 | | | |---------------------------->| | | | |RTP flows Figure 12 First, agent L obtains a server reflexive transport address for its RTP packets (messages 1-4). Recall that the NAT has the address and port independent mapping property. Here, it creates a binding of NAT-PUB-1 for this UDP request, and this becomes the server reflexive transport address for RTP, the sole component of its server reflexive candidate. With its two candidates, agent L prioritizes them, choosing the local candidate as highest priority, followed by the server reflexive candidate. It chooses its server reflexive candidate as the operating candidate, and encodes it into the m/c-line. The resulting offer (message 5) looks like (lines folded for clarity): v=0 o=jdoe 2890844526 2890842807 IN IP4 $L-PRIV-1.IP s= c=IN IP4 $NAT-PUB-1.IP t=0 0 a=ice-pwd:$LPASS m=audio $NAT-PUB-1.PORT RTP/AVP 0 a=rtpmap:0 PCMU/8000 a=candidate:$L1 1 UDP 1.0 $L-PRIV-1.IP $L-PRIV-1.PORT typ local a=candidate:$L2 1 UDP 0.7 $NAT-PUB-1.IP $NAT-PUB-1.PORT typ srflx raddr $L-PRIV-1.IP rport $L-PRIV-1.PORT The offer, with the variables replaced with their values, will look like (lines folded for clarity): v=0 o=jdoe 2890844526 2890842807 IN IP4 10.0.1.1 s= c=IN IP4 192.0.2.3 t=0 0 a=ice-pwd:asd88fgpdd777uzjYhagZg m=audio 45664 RTP/AVP 0 a=rtpmap:0 PCMU/8000 a=candidate:8hhY 1 UDP 1.0 10.0.1.1 8998 typ local a=candidate:Bzo8 1 UDP 0.7 192.0.2.3 45664 typ srflx raddr 10.0.1.1 rport 8998 This offer is received at agent R. Agent R will gather its server reflexive transport address (messages 6-7). Since R is not behind a NAT, this address is identical to its local transport address, and was obtained from its local transport address, and thus does not represent a separate candidate. It therefore ends up with a single local candidate with a single component for RTP. Its resulting answer looks like: v=0 o=bob 2808844564 2808844564 IN IP4 $R-PUB-1.IP s= c=IN IP4 $R-PUB-1.IP t=0 0 a=ice-pwd:$RPASS m=audio $R-PUB-1.PORT RTP/AVP 0 a=rtpmap:0 PCMU/8000 a=candidate:$R1 1 UDP 1.0 $R-PUB-1.IP $R-PUB-1.PORT typ local With the variables filled in: v=0 o=bob 2808844564 2808844564 IN IP4 192.0.2.1 s= c=IN IP4 192.0.2.1 t=0 0 a=ice-pwd:YH75Fviy6338Vbrhrlp8Yh m=audio 3478 RTP/AVP 0 a=rtpmap:0 PCMU/8000 a=candidate:9uB6 1 UDP 1.0 192.0.2.1 3478 typ local Next, agents L and R form candidate pairs, the candidate pair priority ordered list and transport address pair check ordered list. The candidate pair priority ordered list will have two entries, and be identical for L and R. The highest priority one will be the one containing L2 and R1 (since its the operating candidate pair), and the second one will be L1 and R1. The transport address pair check ordered list initially starts with two entries. For agent L, this will be L2:1:R1:1 and L1:1:R1:1. However, after the trimming operation, agent L will remove the second transport address pair, since it shares the same origination transport address as the first (L-PRIV-1 for both). However, R will keep both transport address pairs. Agent R begins its connectivity check (message 9) for transport address pair L2:1:R1:1 (note that, from its perspective, the transport address pair has the ID R1:1:L2:1, and this ID would appear in the USERNAME of STUN requests it receives). Since the NAT has a filtering policy of address dependent, the connectivity check is discarded. When agent L gets the answer, it begins its connectivity check for L2:1:R1:1 (messages 10-13), which succeed, placing the transport address pair and resulting candidate pair into the Recv-Valid state. L can now send media to R. When agent R receives the connectivity check (message 11), it is a match for the transport address pair, and the state of the transport address pair moves to Send-Valid. Agent R begins its connectivity checks (messages 14-17). When the check arrives at the NAT (message 14), it is permitted to pass since a permission was created towards R-PUB-1 as a consequence of message 10. This check arrives at agent L, which generates a success response (message 16), and updates the state of the transport address pair to Valid. This response arrives at agent R, which also updates the state of the transport address pair to Valid. Now, media can flow from agent R to agent L as well. 11.2. Advanced Example In this more advanced example, The NAT has address and port dependent mapping and filtering properties. Both agents use the STUN relay usage in addition to the binding discovery usage. As a consequence, agent L will end up with three candidates - a local candidate, a relayed candidate, and a server reflexive candidate. Agent R will have two - a local candidate and a relayed candidate (the server reflexive candidate will equal the local candidate and thus not be used). The agents are seeking to communicate using a single RTP- based voice stream, but are using RTCP. As a consequence, each candidate has two components - one for RTP and one for RTCP. L NAT STUN R | | | | | | | | | | | | |RTP Alloc. | | | | | | | | | | | | | | | |(1) Alloc Req | | | |S=L-PRIV-1 | | | |D=STUN-PUB-1 | | | |------------->| | | | | | | | | | | | |(2) Alloc Req | | | |S=NAT-PUB-1 | | | |D=STUN-PUB-1 | | | |------------->| | | |(3) Alloc Res | | | |S=STUN-PUB-1 | | | |D=NAT-PUB-1 | | | |R=STUN-PUB-2 | | | |MA=NAT-PUB-1 | | | |<-------------| | |(4) Alloc Res | | | |S=STUN-PUB-1 | | | |D=L-PRIV-1 | | | |R=STUN-PUB-2 | | | |MA=NAT-PUB-1 | | | |<-------------| | | | | | | | | | | | | | | |RTCP Alloc. | | | |Ta secs. later| | | | | | | | | | | | | | | |(5) Alloc Req | | | |S=L-PRIV-2 | | | |D=STUN-PUB-1 | | | |------------->| | | | | | | | | | | | |(6) Alloc Req | | | |S=NAT-PUB-2 | | | |D=STUN-PUB-1 | | | |------------->| | | |(7) Alloc Res | | | |S=STUN-PUB-1 | | | |D=NAT-PUB-2 | | | |R=STUN-PUB-3 | | | |MA=NAT-PUB-2 | | | |<-------------| | |(8) Alloc Res | | | |S=STUN-PUB-1 | | | |D=L-PRIV-2 | | | |R=STUN-PUB-3 | | | |MA=NAT-PUB-2 | | | |<-------------| | | | | | | | | | | | | | | | | | | |(9) Offer | | | |------------------------------------------->| | | | | | | | | | | | | | | | | | | | |RTP Alloc. | | | | | | | | | | | | | | |(10) Alloc Req| | | |S=R-PUB-1 | | | |D=STUN-PUB-1 | | | |<-------------| | | |(11) Alloc Res| | | |S=STUN-PUB-1 | | | |D=R-PUB-1 | | | |R=STUN-PUB-4 | | | |MA=R-PUB-1 | | | |------------->| | | | | | | | | | | | | | | | |RTCP Alloc. | | | |Ta secs. later | | | | | | | | | | | | | | |(12) Alloc Req| | | |S=R-PUB-2 | | | |D=STUN-PUB-1 | | | |<-------------| | | |(13) Alloc Res| | | |S=STUN-PUB-1 | | | |D=R-PUB-2 | | | |R=STUN-PUB-5 | | | |MA=R-PUB-2 | | | |------------->| | | | | | | | | | | | | | | | | |(14) answer | | | |<-------------------------------------------| | | | | | | | | | | | | | | | |Validate | | | |STUN-PUB-4 to STUN-PUB-2 | | | | | | | | | | |(15) Send Ind | | | |S=R-PUB-1 | | | |D=STUN-PUB-1 | | | |DA=STUN-PUB-2 | | | |<-------------| | | | | | | |Bind Req. | | | |S=STUN-PUB-4 | | | |D=STUN-PUB-2 | | | |U=L3:1:R2:1 | | | | | | | | | | | | | | | | | | | | | | | |Discard | | | | | | | | | | | | | | | | | |Validate | | | |STUN-PUB-2 to STUN-PUB-4 | | | | | | | | | | |(16) Send Ind | | | |S=L-PRIV-1 | | | |D=STUN-PUB-1 | | | |DA=STUN-PUB-4 | | | |------------->| | | | | | | | |(17) Send Ind | | | |S=NAT-PUB-1 | | | |D=STUN-PUB-1 | | | |DA=STUN-PUB-4 | | | |------------->| | | | | | | | |Bind Req. | | | |S=STUN-PUB-2 | | | |D=STUN-PUB-4 | | | |U=R2:1:L3:1 | | | | | | | | | | | |(18) Data Ind | | | |S=STUN-PUB-1 | | | |D=R-PUB-1 | | | |RA=STUN-PUB-2 | | | |------------->| | | |(19) Send Ind | | | |S=R-PUB-1 | | | |D=STUN-PUB-1 | | | |DA=STUN-PUB-2 | | | |MA=STUN-PUB-2 | | | |<-------------| | | | | | | |Bind Res. | | | |S=STUN-PUB-4 | | | |D=STUN-PUB-2 | | | |MA=STUN-PUB-2 | | | | | | |(20) Data Ind | | | |S=STUN-PUB-1 | | | |D=NAT-PUB-1 | | | |RA=STUN-PUB-4 | | | |MA=STUN-PUB-2 | | | |<-------------| | |(21) Data Ind | | | |S=STUN-PUB-1 | | | |D=L-PRIV-1 | | | |RA=STUN-PUB-4 | | | |MA=STUN-PUB-2 | | | |<-------------| | | | | | | | | | | | | | | | | | |Validate | | | |STUN-PUB-4 to STUN-PUB-2 | | | | | | | | | | |(22) Send Ind | | | |S=R-PUB-1 | | | |D=STUN-PUB-1 | | | |DA=STUN-PUB-2 | | | |<-------------| | | | | | | |Bind Req. | | | |S=STUN-PUB-4 | | | |D=STUN-PUB-2 | | | |U=L3:1:R2:1 | | | | | | | | | | |(23) Data Ind | | | |S=STUN-PUB-1 | | | |D=NAT-PUB-1 | | | |RA=STUN-PUB-4 | | | |<-------------| | | | | | |(24) Data Ind | | | |S=STUN-PUB-1 | | | |D=L-PRIV-1 | | | |RA=STUN-PUB-4 | | | |<-------------| | | |(25) Send Ind | | | |S=L-PRIV-1 | | | |D=STUN-PUB-1 | | | |DA=STUN-PUB-4 | | | |MA=STUN-PUB-4 | | | |------------->| | | | |(26) Send Ind | | | |S=NAT-PUB-1 | | | |D=STUN-PUB-1 | | | |DA=STUN-PUB-4 | | | |MA=STUN-PUB-4 | | | |------------->| | | | | | | | |Bind Res. | | | |S=STUN-PUB-2 | | | |D=STUN-PUB-4 | | | |MA=STUN-PUB-4 | | | | | | | |(27) Data Ind | | | |S=STUN-PUB-1 | | | |D=R-PUB-1 | | | |RA=STUN-PUB-2 | | | |MA=STUN-PUB-4 | | | |------------->| | | | | | | | | | | | | | | | |Validate | | | |STUN-PUB-5 to STUN-PUB-3 | | | | | | | | | | |(28) Send Ind | | | |S=R-PUB-2 | | | |D=STUN-PUB-1 | | | |DA=STUN-PUB-3 | | | |<-------------| | | | | | | |Bind Req. | | | |S=STUN-PUB-5 | | | |D=STUN-PUB-3 | | | |U=L3:2:R2:2 | | | | | | | | | | | | | | | | | | | | | | | |Discard | | | | | | | | | | | | | | | | | |Validate | | | |STUN-PUB-3 to STUN-PUB-5 | | | | | | | | | | |(29) Send Ind | | | |S=L-PRIV-2 | | | |D=STUN-PUB-1 | | | |DA=STUN-PUB-5 | | | |------------->| | | | | | | | |(30) Send Ind | | | |S=NAT-PUB-2 | | | |D=STUN-PUB-1 | | | |DA=STUN-PUB-5 | | | |------------->| | | | | | | | |Bind Req. | | | |S=STUN-PUB-3 | | | |D=STUN-PUB-5 | | | |U=R2:2:L3:2 | | | | | | | | | | | |(31) Data Ind | | | |S=STUN-PUB-1 | | | |D=R-PUB-2 | | | |RA=STUN-PUB-3 | | | |------------->| | | |(32) Send Ind | | | |S=R-PUB-2 | | | |D=STUN-PUB-1 | | | |DA=STUN-PUB-3 | | | |MA=STUN-PUB-3 | | | |<-------------| | | | | | | |Bind Res. | | | |S=STUN-PUB-5 | | | |D=STUN-PUB-3 | | | |MA=STUN-PUB-3 | | | | | | |(33) Data Ind | | | |S=STUN-PUB-1 | | | |D=NAT-PUB-2 | | | |RA=STUN-PUB-5 | | | |MA=STUN-PUB-3 | | | |<-------------| | |(34) Data Ind | | | |S=STUN-PUB-1 | | | |D=L-PRIV-2 | | | |RA=STUN-PUB-5 | | | |MA=STUN-PUB-3 | | | |<-------------| | | | | | | | | | | | | | | | | | |Validate | | | |STUN-PUB-5 to STUN-PUB-3 | | | | | | | | | | |(35) Send Ind | | | |S=R-PUB-2 | | | |D=STUN-PUB-1 | | | |DA=STUN-PUB-3 | | | |<-------------| | | | | | | |Bind Req. | | | |S=STUN-PUB-5 | | | |D=STUN-PUB-3 | | | |U=L3:2:R2:2 | | | | | | | | | | |(36) Data Ind | | | |S=STUN-PUB-1 | | | |D=NAT-PUB-2 | | | |RA=STUN-PUB-5 | | | |<-------------| | | | | | |(37) Data Ind | | | |S=STUN-PUB-1 | | | |D=L-PRIV-2 | | | |RA=STUN-PUB-5 | | | |<-------------| | | |(38) Send Ind | | | |S=L-PRIV-2 | | | |D=STUN-PUB-1 | | | |DA=STUN-PUB-5 | | | |MA=STUN-PUB-5 | | | |------------->| | | | |(39) Send Ind | | | |S=NAT-PUB-2 | | | |D=STUN-PUB-1 | | | |DA=STUN-PUB-5 | | | |MA=STUN-PUB-5 | | | |------------->| | | | | | | | |Bind Res. | | | |S=STUN-PUB-3 | | | |D=STUN-PUB-5 | | | |MA=STUN-PUB-5 | | | | | | | |(40) Data Ind | | | |S=STUN-PUB-1 | | | |D=R-PUB-2 | | | |RA=STUN-PUB-3 | | | |MA=STUN-PUB-5 | | | |------------->| | | | | | | | | | | | | | | | | |RTP flows | | | | | | | | | | | |(41) Send Ind | | | |S=L-PRIV-1 | | | |D=STUN-PUB-1 | | | |DA=STUN-PUB-4 | | | |------------->| | | | | | | | |(42) Send Ind | | | |S=NAT-PUB-1 | | | |D=STUN-PUB-1 | | | |DA=STUN-PUB-4 | | | |------------->| | | | | | | | | | | | |RTP | | | |S=STUN-PUB-2 | | | |D=STUN-PUB-4 | | | | | | | | | | | |(43) Data Ind | | | |S=STUN-PUB-1 | | | |D=R-PUB-1 | | | |RA=STUN-PUB-2 | | | |------------->| | | | | | | | | | | | | | | | | | | | |RTP flows | | | | | | | | | | |(44) Send Ind | | | |S=R-PUB-1 | | | |D=STUN-PUB-1 | | | |DA=STUN-PUB-2 | | | |<-------------| | | | | | | | | | | |RTP | | | |S=STUN-PUB-4 | | | |D=STUN-PUB-2 | | | | | | | | | | |(45) Data Ind | | | |S=STUN-PUB-1 | | | |D=NAT-PUB-1 | | | |RA=STUN-PUB-4 | | | |<-------------| | | | | | |(46) Data Ind | | | |S=STUN-PUB-1 | | | |D=L-PRIV-1 | | | |RA=STUN-PUB-4 | | | |<-------------| | | | | | | | | | | | | | | |Validate | | | |L-PRIV-1 to R-PUB-1 | | | | | | | | | | |(47) Bind Req.| | | |S=L-PRIV-1 | | | |D=R-PUB-1 | | | |U=R1:1:L1:1 | | | |------------->| | | | | | | | |(48) Bind Req.| | | |S=NAT-PUB-3 | | | |D=R-PUB-1 | | | |U=R1:1:L1:1 | | | |---------------------------->| | | | | | |(49) Bind Res.| | | |S=R-PUB-1 | | | |D=NAT-PUB-3 | | | |MA=NAT-PUB-3 | | | |<----------------------------| | | | | |(50) Bind Res.| | | |S=R-PUB-1 | | | |D=L-PRIV-1 | | | |MA-NAT-PUB-3 | | | |<-------------| | | | | | | | | | | | | | | | | | |Validate | | | |R-PUB-1 to L-PRIV-1 | | | | | | | | | |(51) Bind Req.| | | |S=R-PUB-1 | | | |D=L-PRIV-1 | | | |U=L1:1:R1:1 | | | |<----------------------------| | | | | | | | | | | | | | | | | | |Discard | | | | | | | | | | | | | | | | | | | | | |Validate | | | |R-PUB-2 to L-PRIV-2 | | | | | | | | | |(52) Bind Req.| | | |S=R-PUB-2 | | | |D=L-PRIV-2 | | | |U=L1:2:R1:2 | | | |<----------------------------| | | | | | | | | | | | | | | | | | |Discard | | | | | | | | | | | | | | | | | | |Validate | | | |L-PRIV-2 to R-PUB-2 | | | | | | | | | | |(53) Bind Req.| | | |S=L-PRIV-2 | | | |D=R-PUB-2 | | | |U=R1:2:L1:2 | | | |------------->| | | | | | | | |(54) Bind Req.| | | |S=NAT-PUB-4 | | | |D=R-PUB-2 | | | |U=R1:2:L1:2 | | | |---------------------------->| | | | | | |(55) Bind Res.| | | |S=R-PUB-2 | | | |D=NAT-PUB-4 | | | |MA=NAT-PUB-4 | | | |<----------------------------| | | | | |(56) Bind Res.| | | |S=R-PUB-2 | | | |D=L-PRIV-2 | | | |MA=NAT-PUB-4 | | | |<-------------| | | | | | | | | | | | | | | | | | |Validate | | | |R-PUB-1 to NAT-PUB-3 | | | | | | | | | |(57) Bind Req.| | | |S=R-PUB-1 | | | |D=NAT-PUB-3 | | | |U=L1R1:1:R1:1 | | | |<----------------------------| | | | | |(58) Bind Req.| | | |S=R-PUB-1 | | | |D=L-PRIV-1 | | | |U=L1R1:1:R1:1 | | | |<-------------| | | | | | | |(59) Bind Res.| | | |S=L-PRIV-1 | | | |D=R-PUB-1 | | | |MA=R-PUB-1 | | | |------------->| | | | | | | | |(60) Bind Res.| | | |S=NAT-PUB-3 | | | |D=R-PUB-1 | | | |MA=R-PUB-1 | | | |---------------------------->| | | | | | | | | | | | | | | | |Validate | | | |R-PUB-2 to NAT-PUB-4 | | | | | | | | | |(61) Bind Req.| | | |S=R-PUB-2 | | | |D=NAT-PUB-4 | | | |U=L1R1:2:R1:2 | | | |<----------------------------| | | | | |(62) Bind Req.| | | |S=R-PUB-2 | | | |D=L-PRIV-2 | | | |U=L1R1:2:R1:2 | | | |<-------------| | | | | | | |(63) Bind Res.| | | |S=L-PRIV-2 | | | |D=R-PUB-2 | | | |MA=R-PUB-2 | | | |------------->| | | | | | | | |(64) Bind Res.| | | |S=NAT-PUB-4 | | | |D=R-PUB-2 | | | |MA=R-PUB-2 | | | |---------------------------->| | | | | | | | | | | | | | | | | |(65) Offer | | | |------------------------------------------->| | | | | | | | | | | | | | | | | |(66) Answer | | | |<-------------------------------------------| | | | | | | | | | | | | | | | | | | | | | | | | Figure 17 First, agent L obtains both server reflexivePSTN gateway, this would mean that the setup message into the PSTN is delayed until this point. Doing this increases the post-dial delay, but has the effect of eliminating 'ghost rings'. Ghost rings are cases where the called party hears the phone ring, picks up, but hears nothing andrelayed transport addresses forcannot be heard. This technique works without requiring support for, or usage of, preconditions [6], since itsRTP packets, usingaSTUN Allocate request, whichlocalized decision. It also has the benefit of guaranteeing that not a single packet of media willprovideget clipped, so that post-pickup delay is zero. If an agent chooses to delay local alerting in this way, it SHOULD generate a 180 response once alerting begins. Based on the rules in Section 11.1, the offerer will not be able to send media until the highest priority valid candidates match the m/c- line. When used with SIP, if the initial offer is sent in the INVITE, and the answer is sent in both the provisional and final 200 OK response, the offerer will generally not be able to send media until it sends a re-INVITE and receives the 200 OK response to that re-INVITE. This can take several hundred milliseconds. If this latency is an issue (it is generally not considered an issue for voice systems), reliable provisional responses [9] MAY be used, in which case an UPDATE [24] can be used to send an updated offer prior to the call being answered. As discussed in Section 15, offer/answer exchanges SHOULD be secured against eavesdropping and man-in-the-middle attacks. To do that, the usage of SIPS [3] is RECOMMENDED when used in concert withboth typesICE. 12.2. Interactions with Forking ICE interacts very well with forking. Indeed, ICE fixes some ofaddresses (messages 1-4). Recall that the NAT hastheaddress and port dependent mapping property. Here, it createsproblems associated with forking. Without ICE, when abinding of NAT-PUB-1 for this UDP request,call forks andthis becomes the server reflexive transport address for RTP. The relayed transport address is STUN-PUB-2, allocated bytheSTUN server. Agent L repeats this process for RTCP (messages 5-8) Ta seconds later, and obtains NAT-PUB-2 as its server reflexive transport address for RTCP and STUN-PUB-3 for its relayed transport address.caller receives multiple incoming media streams, it cannot determine which media stream corresponds to which callee. Withits three candidates, agent L prioritizes them, choosingICE, this problem is resolved. The connectivity checks which occur prior to transmission of media carry username fragments, which in turn are correlated to a specific callee. Subsequent media packets which arrive on thelocal candidatesame 5-tuple ashighest priority, followed bytheserver reflexive candidate, followed byconnectivity check will be associated with that same callee. Thus, therelayed candidate. It chooses its relayed candidatecaller can perform this correlation as long asthe operating candidate, and encodesitinto the m/c- line. The resulting offer (message 17) looks like: v=0 o=jdoe 2890844526 2890842807 IN IP4 $L-PRIV-1.IP s= c=IN IP4 $STUN-PUB-2.IP t=0 0 a=ice-pwd:$LPASS m=audio $STUN-PUB-2.PORT RTP/AVP 0 a=rtpmap:0 PCMU/8000 a=rtcp:$STUN-PUB-3.PORT a=candidate:$L1 1 UDP 1.0 $L-PRIV-1.IP $L-PRIV-1.PORT a=candidate:$L1 2 UDP 1.0 $L-PRIV-2.IP $L-PRIV-2.PORT a=candidate:$L2 1 UDP 0.7 $NAT-PUB-1.IP $NAT-PUB-1.PORT a=candidate:$L2 2 UDP 0.7 $NAT-PUB-2.IP $NAT-PUB-2.PORT a=candidate:$L3 1 UDP 0.3 $STUN-PUB-2.IP $STUN-PUB-2.PORT a=candidate:$L3 2 UDP 0.3 $STUN-PUB-3.IP $STUN-PUB-3.PORT This offer ishas received an answer. Section 11.2 introduces a requirement for agents receiving media; namely, that media should be discarded until a check has been receivedat agent R. Agent Rfrom that peer. Unfortunately, this mechanism doesn't work well in forking situations where a subset of the recipients are not ICE-aware. Those recipients willgather its server reflexivenot send checks, andrelayed transport addresses for RTPmedia froman Allocate request (messages 10-11). Since the server reflexive transport address matches its local transport address, no separate candidatethem will be discarded. OPEN ISSUE: Obviously this isused for it. The agent then gathers its server reflexivean issue. Need to either remove this feature of ICE or find a way to make it work better in forking situations. 12.3. Interactions with Preconditions Quality of Service (QoS) preconditions, which are defined in RFC 3312 [6] andrelayed transportRFC 4032 [7], apply only to the IP addressesfor RTCP (messages 12-13). It prioritizesand ports listed in thelocal candidate with higher priority thanm/c lines in an offer/answer. If ICE changes therelayed candidate,address andselects the relayed candidate asport where media is received, this change is reflected in theoperating candidate. Its resulting answer looks like: v=0 o=bob 2808844564 2808844564 IN IP4 $R-PUB-1.IP s= c=IN IP4 $STUN-PUB-4.IP t=0 0 a=ice-pwd:$RPASS m=audio $STUN-PUB-4.PORT RTP/AVP 0 a=rtpmap:0 PCMU/8000 a=rtcp:$STUN-PUB-5.PORT a=candidate:$R1 1 UDP 1.0 $R-PUB-1.IP $R-PUB-1.PORT a=candidate:$R1 2 UDP 1.0 $R-PUB-2.IP $R-PUB-2.PORT a=candidate:$R2 1 UDP 0.3 $STUN-PUB-4.IP $STUN-PUB-4.PORT a=candidate:$R2 2 UDP 0.3 $STUN-PUB-5.IP $STUN-PUB-5.PORT Next, agents Lm/c lines of a new offer/answer. As such, it appears like any other re- INVITE would, andR form candidate pairsis fully treated in RFC 3312 and 4032, which apply without regard to thetransport address pair check ordered list. This list will start withfact that thetwo components inm/c lines are changing due to ICE negotiations ocurring "in the background". Indeed, an agent SHOULD NOT indicate that Qos preconditions have been met until thecurrently operating candidate pair - relayed candidates. Agent R begins itsICE checks(message 15). It will check connectivity betweenhave completed and selected theoperatingcandidatepair, startingpairs to be used for media. ICE also has (purposeful) interactions with connectivity preconditions [26]. Those interactions are described there. OPEN ISSUE: Are these preconditions really needed with ICE? ICE provides a connectivity precondition on its own using thefirst component, which is STUN-PUB-4 for agent Rmechanisms described above. 12.4. Interactions with Third Party Call Control ICE works with Flows I andSTUN-PUB-2 for agent L. The state machine for that transport address pair moves toIV as described in [16]. Flow I works without theTesting state. Since this is a relayed transport address for agent R, it utilizescontroller supporting or being aware of ICE. Flow IV will work as long as theSTUN Send Indication to delivercontroller passes along theBinding Request. The DESTINATION-ADDRESS is STUN-PUB-2. The STUN serverICE attributes without alteration. Flow III may disrupt ICE processing, since it willextractdistort thecontent ofstream ID values used in theSend indication, whichcomputation of priorities. When there is but aSTUN Binding Request, and deliver it tosingle media stream, Flow III will work as long as thedestination, STUN-PUB-4. This requestcontroller passes through the ICE attributes unmodified. Flow II is fundamentally incompatible with ICE; each agent will believe itself to besent fromtherelayed address allocatedanswerer and thus never generate a re-INVITE. OPEN ISSUE: Its really too bad flow III doesn't work with multimedia; should consider ways toR, which is STUN-PUB-4. As both interfacesmake it work. There areon the STUN server, this message is sentseveral ways. The flows for continued operation, as described in Section 7 of RFC 3725, require additional behavior of ICE implementations toitself (and thussupport. In particular, if an agent receives a mid-dialog re-INVITE that contains no offer, it MUST go through thelackprocess of gathering candidates, prioritizing them and generating an offer, as if this was an initial offer for amessage number in the sequence diagram above). Notesession. Furthermore, that list of candidates SHOULD include theUSERNAME inones currently in-use. 13. Grammar This specification defines four new SDP attributes - theBinding Request"candidate", "remote-candidates", "ice-ufrag" and "ice-pwd" attributes. The candidate attribute isL3:1:R2:1, which represents thea media-level attribute only. It contains a transport addresspair ID. This message gets discarded by the STUN server since, as of yet, there are no permissions establishedforthe STUN-PUB-2 allocation. However, it did have the side effect of establishingapermission on the STUN-PUB-4 binding, allowing incoming packetscandidate that can be used for connectivity checks. The syntax of this attribute is defined using Augmented BNF as defined in RFC 4234 [8]: candidate-attribute = "candidate" ":" foundation SP component-id SP transport SP priority SP connection-address SP ;from RFC 4566 port ;port fromSTUN-PUB-2. Once L gets the offer, it will attempt to validate the firstRFC 4566 [SP cand-type] [SP rel-addr] [SP rel-port] *(SP extension-att-name SP extension-att-value) foundation = 1*ice-char component-id = 1*DIGIT transportaddress pair in= "UDP" / transport-extension transport-extension = token ; from RFC 3261 priority = 1*DIGIT cand-type = "typ" SP candidate-types candidate-types = "host" / "srflx" / "prflx" / "relay" / token rel-addr = "raddr" SP connection-address rel-port = "rport" SP port extension-att-name = byte-string ;from RFC 4566 extension-att-value = byte-string ice-char = ALPHA / DIGIT / "+" / "/" The foundation is composed of one or more ice-char. The component-id is a positive integer, which identifies thetransport address pair check ordered list,specific component for whichwill betheoperatingtransport address is a candidate. It MUST start at 1 and MUST increment by 1 for each component of a particular candidate. Thestate machineconnect-address production is taken from RFC 4566 [10], allowing forthisIPv4 addresses, IPv6 addresses and FQDNs. The port production is also taken from RFC 4566 [10]. The token production is taken from RFC 3261 [3]. The transportaddress pair moves intoproduction indicates theTesting state. Like agent R did, it will usetransport protocol for theSTUN Send Indicationcandidate. This specification only defines UDP. However, extensibility is provided tosend a STUN Binding Request from its relayedallow for future transportaddress, STUN-PUB-2,protocols toSTUN-PUB-4 (message 16). This packet traverses the NAT (message 17) and arrives atbe used with ICE, such as TCP or theSTUN server.Datagram Congestion Control Protocol (DCCP) [28]. TheSTUN server will unwrapcand-type production encodes thecontentstype of candidate. This specification defines thepacketvalues "host", "srflx", "prflx" andsend them from STUN-PUB-2 to STUN-PUB-4. It will also, as a consequence, add a permission"relay" forSTUN-PUB-4.host, server reflexive, peer reflexive and relayed candidates, respectively. Thecontentsset ofthe packet are a STUN Binding Request with USERNAME R2:1:L3:1 (note how thiscandidate types is extensible for theflipfuture. Inclusion of theUSERNAME in the Binding Request sent by agent R). Thiscandidate type isalso a packet fromoptional. The rel-addr and rel-port productions convey information theSTUN serverrelated transport addresses. Rules for inclusion of these values is described in Section 4.4. The a=candidate attribute can itself be extended. The grammar allows for new name/value pairs toitself. However, now,be added at thepacketend of the attribute. An implementation MUST ignore any name/value pairs it doesn't understand. The syntax of the "remote-candidates" attribute is defined using Augmented BNF as defined in RFC 4234 [8]. The remote-candidates attribute isnot discarded, asapermission had been installed asmedia level attribute only. remote-candidate-att = "remote-candidates" ":" remote-candidate 0*(SP remote-candidate) remote-candidate = component-ID SP connection-address SP port The attribute contains aconsequenceconnection-address and port for each component. The ordering ofthe "suicide packet" from agent R (a suicide packetcomponents is irrelevant. However, apacket that has no hopevalue MUST be present for each component oftraversingafar end NAT, but serves the purposemedia stream. The syntax ofenabling a permission in a near end NAT so that a packet fromthepeer"ice-pwd" and "ice-ufrag" attributes are defined as: ice-pwd-att = "ice-pwd" ":" password ice-ufrag-att = "ice-ufrag" ":" ufrag password = 22*ice-char ufrag = 4*ice-char The "ice-pwd" and "ice-ufrag" attributes canbe returned).appear at either the session-level or media-level. When present in both, the value in the media-level takes precedence. Thus, theSTUN server will relayvalue at thereceived STUN request towards agent R (message 18). Thissession level isdelivered aseffectively aSTUN Data Indication. Notice how the REMOTE- ADDRESS is STUN-PUB-2; this is important as it will be useddefault that applies toconstruct the STUN Binding Response. Agent R will receive the Data Indication,all media streams, unless overriden by a media-level value. 14. Example Two agents, L andunwrap its contents to find the Binding Request. The state machine for this transport address pairR, are using ICE. Both agents have a single IPv4 interface. For agent L, it iscurrently in the Testing state. It therefore moves into the Send-Valid state,10.0.1.1, andit generatesfor agent R, 192.0.2.1. Both are configured with aBinding Response. However, the XOR-MAPPED-ADDRESS insingle STUN server each (indeed, theBinding Responsesame one for each), which isconstructed using the sourcelistening for STUN requests at an IP address of 192.0.2.2 and portthat were seen by the3478. This STUN serverwhensupports both the BindingRequest arrived at STUN-PUB-4, which is the looped message between messages 17Discovery usage and18. This source addressthe Relay usage. Agent L isSTUN-PUB-2, whichbehind a NAT, and agent R is on thevaluepublic Internet. The NAT has an endpoint independent mapping property and an address dependent filtering property. The public side of theREMOTE-ADDRESS attribute in message 18. Thus, the STUN Binding Response will contain STUN-PUB-2 inNAT has an IP address of 192.0.2.3. To facilitate understanding, transport addresses are listed using variables that have mnemonic names. The format of theXOR-MAPPED-ADDRESS, andname is entity-type-seqno, where entity refers tobe sent to STUN-PUB-2. To sendtheresponse, agent R takesentity whose interface theSTUN Binding Responsetransport address is on, andencapsulates it in a STUN Send indication, setting the DESTINATION-ADDRESS to STUN-PUB-2. Thisisshown in message 19.one of "L", "R", "STUN", or "NAT". TheSTUN server will receive this Send Indication,type is either "PUB" for transport addresses that are public, andunwrap its contents to find the STUN Binding Response. It sends it to the value"PRIV" for transport addresses that are private. Finally, seq-no is a sequence number that is different for each transport address of theDESTINATION-ADDRESS attribute, and sends it from the relayed address allocated to R, which is STUN-PUB-4. This, once again, results insame type on alooped message to itself,particular entity. Each variable has an IP address andit arrives at STUN-PUB-2. Now, however, thereport, denoted by varname.IP and varname.PORT, respectively, where varname isa permission installed for STUN-PUB-4.the name of the variable. The STUN serverwill therefore forwardhas advertised transport address STUN-PUB-1 (which is 192.0.2.2:3478) for both thepacket tobinding discovery usage and the relay usage. However, neither agentL. To do so, it constructs ais using the relay usage. In the call flow itself, STUNData Indication containingmessages are annotated with several attributes. The "S=" attribute indicates thecontentssource transport address of thepacket. It sets the REMOTE-ADDRESS tomessage. The "D=" attribute indicates thesourcedestination transport address of therequest it received (STUN-PUB-4),message. The "MA=" attribute is used in STUN Binding Response messages andforwards itrefers to the mapped address. The call flow examples omit STUN authentication operations and RTCP, and focus on RTP for a single media stream. L NAT STUN R |RTP STUN alloc. | | |(1) STUN Req | | | |S=$L-PRIV-1 | | | |D=$STUN-PUB-1 | | | |------------->| | | | |(2) STUN Req | | | |S=$NAT-PUB-1 | | | |D=$STUN-PUB-1 | | | |------------->| | | |(3) STUN Res | | | |S=$STUN-PUB-1 | | | |D=$NAT-PUB-1 | | | |MA=$NAT-PUB-1 | | | |<-------------| | |(4) STUN Res | | | |S=$STUN-PUB-1 | | | |D=$L-PRIV-1 | | | |MA=$NAT-PUB-1 | | | |<-------------| | | |(5) Offer | | | |------------------------------------------->| | | | |RTP STUN alloc. | | |(6) STUN Req | | | |S=$R-PUB-1 | | | |D=$STUN-PUB-1 | | | |<-------------| | | |(7) STUN Res | | | |S=$STUN-PUB-1 | | | |D=$R-PUB-1 | | | |MA=$R-PUB-1 | | | |------------->| |(8) answer | | | |<-------------------------------------------| | |(9) Bind Req | | | |S=$R-PUB-1 | | | |D=L-PRIV-1 | | | |<----------------------------| | |Dropped | | |(10) Bind Req | | | |S=$L-PRIV-1 | | | |D=$R-PUB-1 | | | |------------->| | | | |(11) Bind Req | | | |S=$NAT-PUB-1 | | | |D=$R-PUB-1 | | | |---------------------------->| | |(12) Bind Res | | | |S=$R-PUB-1 | | | |D=$NAT-PUB-1 | | | |MA=$NAT-PUB-1 | | | |<----------------------------| |(13) Bind Res | | | |S=$R-PUB-1 | | | |D=$L-PRIV-1 | | | |MA=$NAT-PUB-1 | | | |<-------------| | | |(14) Offer | | | |------------------------------------------->| |(15) Answer | | | |<-------------------------------------------| | |(16) Bind Req | | | |S=$R-PUB-1 | | | |D=$NAT-PUB-1 | | | |<----------------------------| |(17) Bind Req | | | |S=$R-PUB-1 | | | |D=$L-PRIV-1 | | | |<-------------| | | |(18) Bind Res | | | |S=$L-PRIV-1 | | | |D=$R-PUB-1 | | | |MA=$R-PUB-1 | | | |------------->| | | | |(19) Bind Res | | | |S=$NAT-PUB-1 | | | |D=$R-PUB-1 | | | |MA=$R-PUB-1 | | | |---------------------------->| |RTP flows | | | Figure 9 First, agent L(message 20). This traverses the NAT (message 21) and arrives at agent L. As a consequence of the receipt of a Binding Response, the state machine for this transport address pair moves to the Recv-Valid state. The agent also examines the XOR-MAPPED-ADDRESS of the STUN response. It indicates STUN-PUB-2. This is the same as the native transport address of this transport address pair, and thus doesn't representobtains anew transport address that might have been learned. Because of the receipt of message 18, the transport address pair movedhost candidate fromTesting to Send-Valid, causing R to attemptits local interface (not shown), and from that, sends aretransmission of itsSTUN Binding Requestthat was lost (the contents of message 15 that were discarded byto the STUN serverduetolack of permission). This time, however,get apermissionserver reflexive candidate (messages 1-4). Recall that the NAT hasbeen installed andtheretransmission will work. So,address and port independent mapping property. Here, itsendscreates a binding of NAT-PUB-1 for this UDP request, and this becomes theBinding Request again (message 22, identical to message 15). Thisserver reflexive candidate for RTP. Agent L sets a type preference of 9 for the host candidate and 5 for the server reflexive. The local preference islooped by9. Based on this, the priority of the host candidate is 9909 and for theSTUNserverto itself again, but this time therereflexive candidate is 5909. The host candidate is assigned apermission in place when it arrives at STUN-PUB-2. As such,foundation of 1, and therequest is forwarded towards agent L this time, inserver reflexive, aSTUN Data Indicationfoundation of 2. It chooses its server reflexive candidate as the in-use candidate, and encodes it into the m/c-line. The resulting offer (message23).5) looks like (lines folded for clarity): v=0 o=jdoe 2890844526 2890842807 IN IP4 $L-PRIV-1.IP s= c=IN IP4 $NAT-PUB-1.IP t=0 0 a=ice-pwd:asd88fgpdd777uzjYhagZg a=ice-ufrag:8hhY m=audio $NAT-PUB-1.PORT RTP/AVP 0 a=rtpmap:0 PCMU/8000 a=candidate:1 1 UDP 9909 $L-PRIV-1.IP $L-PRIV-1.PORT typ local a=candidate:2 1 UDP 5909 $NAT-PUB-1.IP $NAT-PUB-1.PORT typ srflx raddr $L-PRIV-1.IP rport $L-PRIV-1.PORT The offer, with the variables replaced with their values, will look like (lines folded for clarity): v=0 o=jdoe 2890844526 2890842807 IN IP4 10.0.1.1 s= c=IN IP4 192.0.2.3 t=0 0 a=ice-pwd:asd88fgpdd777uzjYhagZg a=ice-ufrag:8hhY m=audio 45664 RTP/AVP 0 a=rtpmap:0 PCMU/8000 a=candidate:1 1 UDP 9909 10.0.1.1 8998 typ local a=candidate:2 1 UDP 5909 192.0.2.3 45664 typ srflx raddr 10.0.1.1 rport 8998 Thistraverses the NAT (message 24) and arrivesoffer is received at agentL.R. AgentL extracts the contents of the request, which areR will obtain aSTUN Binding Request. This causes the state machine to movehost candidate, and fromRecv-Valid to Valid. It generatesit, obtain aSTUN Binding Response, and sets the XOR-MAPPED-ADDRESS based on the value of the REMOTE- ADDRESS in message 24 (STUN-PUB-4). This Binding Response is sent to STUN-PUB-4, whichserver reflexive candidate (messages 6-7). Since R isaccomplished throughnot behind aSTUN Send Indication (message 25). This Send Indication traverses the NAT (message 26) andNAT, this candidate isreceived byidentical to its host candidate, and they share theSTUN server. Its contents are decapsulated,same base. It therefore discards this candidate andsent to STUN-PUB-4, which is againends up with aloop onsingle host candidate. With identical type and local preferences as L, thesame host. This packetpriority for this candidate isthen sent towards agent R in9909. It chooses aData Indication (message 27). The contentsfoundation of 1 for its single candidate. Its resulting answer looks like: v=0 o=bob 2808844564 2808844564 IN IP4 $R-PUB-1.IP s= c=IN IP4 $R-PUB-1.IP t=0 0 a=ice-pwd:YH75Fviy6338Vbrhrlp8Yh a=ice-ufrag:9uB6 m=audio $R-PUB-1.PORT RTP/AVP 0 a=rtpmap:0 PCMU/8000 a=candidate:1 1 UDP 9909 $R-PUB-1.IP $R-PUB-1.PORT typ local With theDATA Indication are extracted,variables filled in: v=0 o=bob 2808844564 2808844564 IN IP4 192.0.2.1 s= c=IN IP4 192.0.2.1 t=0 0 a=ice-pwd:YH75Fviy6338Vbrhrlp8Yh a=ice-ufrag:9uB6 m=audio 3478 RTP/AVP 0 a=rtpmap:0 PCMU/8000 a=candidate:1 1 UDP 9909 192.0.2.1 3478 typ local Agents L andthe agent sees a successful Binding Response. It therefore moves the state machine from the Send-Valid state to the Valid state. At this point, the transport addressR both pairis inup theValid state for both agents. Approximately Tb seconds after agent R sent message 15,candidates. They both initially have two. However, agentRL willstart checks forprune thenext transport addresspairincontaining itstransport addressserver reflexive candidate, resulting in just one. At agent L, this paircheck ordered list. This is the second component(the check) has a local candidate ofthe same$L_PRIV_1 and remote candidatepair, used for RTCP. This sequence, messages 28 through 40, are identical to the ones for RTP, but differ only in the specific transport addresses. Once that validation happens, the second transport address pairof $R_PUB_1, and hasbeen validated. Thea candidate pairmoves into the valid state, and both candidatespriority of 99099909.039. At agent R, there areconsidered valid.two checks. Theoperatinghighest priority has a local candidate of $R_PUB_1 and remote candidate of $L_PRIV_1 and hasnow been validated,a priority of 99099909.039, andmedia can begin to flow. It will do so through the STUN server; indeed, it is relayed "twice" throughtheSTUN server. Even though there issecond has asingle STUN server, it is logically acting as two separate STUN servers. Indeed, had Llocal candidate of $R_PUB_1 and remote candidate of $NAT_PUB_1 and priority 59099909.75. Agent Rused two separate STUN servers, media would be relayed through both STUN servers in a trapezoid configuration. The actual media flows are shown as well. It is important to note that, sincebegins its connectivity check (message 9) for theICE checks have not yet concluded onfirst pair (between the two host candidates). The host candidatethat will ultimately be used, no STUN Set Active Destinations have been sent. Asfrom agent L is private and behind aconsequence, media thatdifferent NAT, and thus this check issent throughdiscarded. When agent L gets theSTUN servers has to be sent using STUN Send indications.answer, it performs its one and only connectivity check (messages 10-13). Thisintroduces some overhead, but is a transient condition. In message 41,will succeed. This causes agent Lsends an RTP packettoagent R usingcreate aSend indication. Itnew pair, whos local candidate issent to STUN-PUB-4. This traversesfrom theNAT (message 42),mapped address in the binding response (NAT-PUB-1 from message 13) andarrives atwhose remote candidate is theSTUN server. Itdestination of the request (R-PUB-1 from message 10). This isdecapsulated, loopedadded toitself,the valid list. At this point, agent L examines the valid list andarrives at STUN-PUB-4. From there, itsees that there isencapsulated inaData Indication and sent to agent R (message 43). Incandidate there for each component of each media stream (which is just RTP for thereverse direction, agent R will sendsingle audio stream). It therefore considers ICE checks complete and sends anRTP packet using a STUN Send indicationupdated offer (message42), and send it to STUN-PUB-2.14). Thisis received byoffer serves only to remove theSTUN server, decapsulated,candidate that was not selected andsent to STUN-PUB-2 from STUN-PUB-4.indicate the remote candidates; the m/c-line remains unchanged. Thisis again a loop withinoffer looks like: v=0 o=jdoe 2890844528 2890842809 IN IP4 10.0.1.1 s= c=IN IP4 192.0.2.3 t=0 0 a=ice-pwd:asd88fgpdd777uzjYhagZg a=ice-ufrag:8hhY m=audio 45664 RTP/AVP 0 a=remote-candidates 1 192.0.2.1 3478 a=rtpmap:0 PCMU/8000 a=candidate:2 1 UDP 5909 192.0.2.3 45664 typ srflx raddr 10.0.1.1 rport 8998 Agent R can construct thesame host, arriving at STUN-PUB-4. The contentsanswer. Since the remote-candidates listed in the offer match the ones that agent R had already selected for the m/c-line in the previous answer, there is no change there. Its answer therefore looks like: v=0 o=bob 2808844565 2808844566 IN IP4 192.0.2.1 s= c=IN IP4 192.0.2.1 t=0 0 a=ice-pwd:YH75Fviy6338Vbrhrlp8Yh a=ice-ufrag:9uB6 m=audio 3478 RTP/AVP 0 a=rtpmap:0 PCMU/8000 a=candidate:1 1 UDP 9909 192.0.2.1 3478 typ local Upon receipt of thepacket are sent tocheck from agent Lthrough a STUN Data Indication (message 45), which traverses the NAT(message46) to arrive at agent L. Since this call flow is already long enough, RTCP packet transmission is not shown. Approximately Tb seconds after it sends message 29,11), agentL goesR will generate its triggered check. This check happens to match the nexttransport address pair inone on itstransport address paircheckorderedlistthat is in the Waiting state. This will be the RTP candidate for the top priority- from its host candidatepair, which is L-PRIV-1 on agent L and R-PUB-1 onto agentR.L's server reflexive candidate. Thisis a local candidate for each agent. To perform the check,check (messages 16-19) will succeed. Consequently, agentL sendsR constructs aSTUN Binding Request from L-PRIV-1 to R-PUB-1 (message 47). Notenew candidate pair using theUSERNAME of R1:1:L1:1, which identifies this transportmapped addresspair. This traverses the NAT (message 48). Sincefrom theNAT hasresponse as theaddress and port dependent mapping property,local candidate (R-PUB-1) andthis is a newthe destinationIP address,of theNAT allocates a new transport address on its public side, NAT-PUB-3, and places this inrequest (NAT-PUB-1) as thesource IP address and port.remote candidate. Thispacket arrives at agent R. Agent R finds a matching transport addresspairin the Waiting state. The state machine transitions to the Send-Valid state. It sends the Binding response, with a XOR- MAPPED-ADDRESS indicating NAT-PUB-3 (message 49), which traverses the NAT and arrives at agent L (message 50). Agent R, in addition to sending the response, will also send a Binding Request. Itisimportant to remember that this Binding Request is sentadded to theremote address invalid list. Since this pair matches thetransport addresspair(L-PRIV-1), and NOT toin thesource IP address and portm/c-lines, agent R can send media as well. 15. Security Considerations There are several types ofthe Binding Request (NAT-PUB-3); that will happen later.attacks possible in an ICE system. This section considers these attacks and their countermeasures. 15.1. Attacks on Connectivity Checks An attacker might attemptis shown in message 51. However, sinceto disrupt theL-PRIV-1 is private,STUN connectivity checks. Ultimately, all of these attacks fool an agent into thinking something incorrect about thepacket is discarded inresults of thenetwork. Now, asconnectivity checks. The possible false conclusions an attacker can try and cause are: False Invalid: An attacker can fool aconsequencepair ofreceiving message 48, agent R will have constructedagents into thinking apeer-derived candidate. The candidate ID for thiscandidate pair isL1R1, andinvalid, when itinitially containsisn't. This can be used to cause an agent to prefer asingle transport address pair, NAT-PUB-3 and R-PUB-1. However,different candidate (such as one injected by the attacker), or to disrupt a call by forcing all candidates to fail. False Valid: An attacker can fool a pair of agents into thinking a candidateisn't yet usable untilpair is valid, when it isn't. This can cause an agent to proceed with a session, but then not be able to receive any media. False Peer-Reflexive Candidate: An attacker can cause an agent to discover a new peer reflexive candidate, when it shouldn't have. This can be used to redirect media streams to a DoS target or to the attacker, for eavesdropping or othercomponent gets added. Similarly,purposes. False Valid on False Candidate: An attacker has already convinced an agentL will have constructed the same peer-derived candidate, with the samethat there is a candidateID and the same transportwith an addresspair. Some Tb seconds after sending message 28,that doesn't actually route to that agentR will move(for example, by injecting a false peer reflexive candidate or false server reflexive candidate). It must then launch an attack that forces the agents to believe that this candidate is valid. Of thenext transport address pairvarious techniques for creating faked STUN messages described in [11], many are not applicable for thetransport address pair check ordered list whose state is Waiting. This is the RTCP componentconnectivity checks. Compromises of STUN servers are not much ofthe highest priority candidate pair. It will attemptaconnectivity check, from R-PUB-2 to L-PRIV-2 (message 52). Since L-PRIV-1 is private, this message is discarded. Some Tb seconds after sending message 47, agent L will move toconcern, since thenext transport address pairSTUN servers are embedded in endpoints and distributed throughout thetransport address pair check ordered list whose state is Waiting. Thisnetwork. Thus, compromising the STUN server is equivalent to comprimising theRTCP componentendpoint, and if that happens, far more problematic attacks are possible than those against ICE. Similarly, DNS attacks are usually irrelevant since STUN servers are not typically discovered via DNS, they are signaled via IP addresses embedded in SDP. Injection ofthe highest priority candidate pair. It will attempt a connectivity check, from L-PRIV-2 to R-PUB-2 (message 53), which operates nearly identically to messages 47-50,fake responses and relaying modified requests all can be handled in ICE with theexception ofcountermeasures discussed below. To force thespecific addresses. Here,false invalid result, theNAT will create a new bindingattacker has to wait for theRTCP, NAT-PUB-4, and this transport address is new for both participants. On receiptconnectivity check from one ofthis Binding Request at agent R (message 54), agent R constructs the candidate ID forthepeer-derived candidate, L1R1, and findsagents to be sent. When italready exists. As such, this new transport address is added, and the peer-derived candidate becomes complete and usable. Agent L does the same thing on receipt of message 56. This candidate will haveis, thesame priority as its generating candidate L1 (1.0), and is paired upattacker needs to inject a fake response withR1 (also at priority 1.0). Since L1R1 has the same priorityan unrecoverable error response, such asL1 itself,a 600. However, since theordering algorithmcandidate is, inSection 7.5 will usefact, valid, thereverse ASCII sort order oforiginal request may reach thecandidate ID iselfpeer agent, and result in a success response. The attacker needs todetermine order. L1R1 is larger than L1, so that the peer-derived candidate will come beforeforce this packet or itsgenerating candidate. Asresponse to be dropped, through aconsequence,DoS attack, layer 2 network disruption, or other technique. If it doesn't do this, thepeer-derived candidate pairsuccess response willhave a higher priority than its generating candidate, and appear just before it inalso reach thecandidate pair priority ordered list. Asoriginator, alerting it to aconsequence, after agent R sendspossible attack. Fortunately, this attack is mitigated completely through the STUN message55integrity mechanism. The attacker needs to inject a fake response, andcompletes the peer-derived candidate, it will movein order for this response to be processed, thetwo transport addresses inattacker needs thepeer derived candidate intopassword. If theSend-Valid state, and send a Binding Request for each in rapid succession (agent Loffer/answer signaling is secured, the attacker will not havemoved both intotheRecv-Valid state upon receipt of message 56).password. Forcing the fake valid result works in a similar way. Thefirst of these connectivity checks areagent needs to wait for theRTP component,Binding Request fromR-PUB-1each agent, and inject a fake success response. The attacker won't need toNAT-PUB-3 (message 57). Note the USERNAME in the STUN Binding Request, L1R1:1:R1:1, which identifiesworry about disrupting thepeer-derived transport address pair. This will succesfully traverseactual response since, if theNAT andcandidate is not valid, it presumably wouldn't bedelivered to agent L (message 58). The receipt of this request movesreceived anyway. However, like thestate machine forfake invalid attack, thistransport address pair from Recv- Valid to Valid, and a Binding Responseattack issent (message 59). This passesmitigated completely through theNATSTUN message integrity andarrives at agent R (message 60). This causes its state machine to enteroffer/answer security techniques. Forcing theValid state as well. Thefalse peer reflexivetransport address, R-PUB-1, is not newcandidate result can be done either with fake requests or responses, or with replays. We consider the fake requests and responses case first. It requires the attacker to send a Binding Request to one agentRwith a source IP address andthus does not result inport for thecreation of a new peer-derivedfalse candidate.Messages 61 through 64 showIn addition, the attacker must wait for a Binding Request from the other agent, and generate a fake response with a XOR-MAPPED-ADDRESS attribute containing the false candidate. Like thesame basic flow for RTCP. Upon receipt ofother attacks described here, this attack is mitigated by the STUN message64, both transport address pairs are Valid at both agents, causingintegrity mechanisms and secure offer/answer exchanges. Forcing the false peerderivedreflexive candidateto become valid. Timer Twsresult with packet replays isset at agent L,different. The attacker waits until one of the agents sends a check. It intercepts this request, andfires without any higher priority candidate pairs becoming validated. Atreplays it towards the other agentR, media can now be sent on this candidate pairwith a faked source IP address. It must also prevent the original request fromanswerer (agent R) to offerer (agent L). Agent L sends an updated offerreaching the remote agent, either by launching a DoS attack topromotecause thepeer-derived candidatepacket tooperating. This offer (message 65) looks like: v=0 o=jdoe 2890844526 2890842808 IN IP4 $L-PRIV-1.IP s= c=IN IP4 $NAT-PUB-3.IP t=0 0 a=ice-pwd:$LPASS m=audio $NAT-PUB-3.PORT RTP/AVP 0 a=rtpmap:0 PCMU/8000 a=rtcp:$NAT-PUB-4.PORT a=remote-candidate:R1 a=candidate:$L1 1 UDP 1.0 $L-PRIV-1.IP $L-PRIV-1.PORT a=candidate:$L1 2 UDP 1.0 $L-PRIV-2.IP $L-PRIV-2.PORT There are several important thingsbe dropped, or forcing it tonote in this offer. Firstly, note howbe dropped using layer 2 mechanisms. The replayed packet is received at them/c-line now contains NAT-PUB-3other agent, andNAT-PUB-4,accepted, since thepeer derived transport addresses it learned throughintegrity check passes (the integrity check cannot and does not cover theICE processing. Secondly, note how there remainssource IP address and port). It is then responded to. This response will contain acandidate encoded intoXOR- MAPPED-ADDRESS with thea=candidate attributes. This is candidate L1, NOT candidate L1R1. Recallfalse candidate, and will be sent to that false candidate. The attacker must then intercept it and relay it towards thepeer-derived candidates are never encoded into the SDP. Rather, their generating candidate is encoded. Thisoriginator. The other agent willcause keepalivesthen initiate a connectivity check towards that false candidate. This validation needs totake place forsucceed. This requires thegenerating candidate ifattacker to force a false valid(though its not) and anyon a false candidate. Injecting ofits derived candidates, whichfake requests or responses to achieve this goal iswhat we want. Finally, noticeprevented using theinclusionintegrity mechanisms of STUN and thea=remote-candidate attribute. Since agent L doesn't know whether agent R received messages 60 or 64, it doesnt know whetheroffer/answer exchange. Thus, this attack can only be launched through replays. To do that, thestate ofattacker must intercept thecandidate is Send-Valid or Valid at agent R. So,check towards this false candidate, and replay ithas to tell agent R that, in case its Send-Valid, to please usetowards the other agent. Then, itanyway. The answer generated by agent R looks like: v=0 o=bob 2808844564 2808844565 IN IP4 $R-PUB-1.IP s= c=IN IP4 $R-PUB-1.IP t=0 0 a=ice-pwd:$RPASS m=audio $R-PUB-1.PORT RTP/AVP 0 a=rtpmap:0 PCMU/8000 a=rtcp:$R-PUB-2.PORT a=candidate:$R1 1 UDP 1.0 $R-PUB-1.IP $R-PUB-1.PORT a=candidate:$R1 2 UDP 1.0 $R-PUB-2.IP $R-PUB-2.PORT With this, media can now flow directly between endpoints. The removal ofmust intercept therelayed candidates fromresponse and replay that back as well. This attack is very hard to launch unless theoffer/answer exchange will causeattacker themself is identified by theSTUN relay allocations to be removed. 12. Grammarfake candidate. Thisspecification defines three new SDP attributes -is because it requires the"candidate", "remote-candidate"attacker to intercept and"ice-pwd" attributes. The candidate attribute is a media-level attribute only. It contains a transport address for a candidate thatreplay packets sent by two different hosts. If both agents are on different networks (for example, across the public Internet), this attack can beused for connectivity checks. There may be multiple candidate attributes in a media block. There is no requirement that a=candidate attribute which indicate components forhard to coordinate, since it needs to occur against two different endpoints on different parts of the network at the same time. If the attacker themself is identified by the fake candidateappear one right aftertheother or in component ID order. The syntax of this attributeattack isdefined using Augmented BNF as defined in RFC 4234 [9]: candidate-attribute = "candidate" ":" candidate-id SP component-id SP transport SP qvalue SP ;qvalue from RFC 3261 connection-address SP ;from RFC 4566 port ;port from RFC 4566 [SP cand-type] [SP rel-addr] [SP rel-port] *(SP extension-att-name SP extension-att-value) transport = "UDP" / transport-extension transport-extension = token ; from RFC 3261 candidate-id = 1*base64-char base64-char = ALPHA / DIGIT / "+" / "/" component-id = 1*DIGIT cand-type = "typ" SP candidate-types candidate-types = "local" / "srflx" / "relay" / token rel-addr = "raddr" SP connection-address rel-port = "rport" SP port extension-att-name = byte-string ;from RFC 4566 extension-att-value = byte-string The candidate-ideasier to coordinate. However, if SRTP is usedto group together[21], thetransport addresses for a particular candidate. It MUSTattacker will not beconstructed with at least 24 bits of randomness. It MUST have the same value for all transport addresses withinable to play thesame candidate. It MUST have a different value for transport addresses within different candidates formedia packets, they will only be able to discard them, effectively disabling thesamemediastream. The candidate-id uses a syntaxstream for the call. However, this attack requires the agent to disrupt packets in order to block the connectivity check from reaching the target. In that case, if the goal isdefinedtobe equaldisrupt the media stream, its much easier to just disrupt it with thebase64 alphabet [3], which allowssame mechanism, rather than attack ICE. 15.2. Attacks on Address Gathering ICE endpoints make use of STUN for gathering candidates rom a STUN server in thecandidate-idnetwork. This is corresponds tobe generated by performing a base64 encodingthe Binding Discovery usage of STUN described in [11]. As arandomly generated value (note, however,consequence, the attacks against STUN itself thatthis does not meanare described in that specification can still be used against thecandidate-id or password is base64 decodedbinding discovery usage whenuse inutilized with ICE. However, the additional mechanisms provided by ICE actually counteract such attacks, making binding discovery with STUNmessages). In addition, if contentmore secure when combined with ICE than without ICE. Consider an attacker which isbase64 encodedable togenerate the candidate-id, it MUST NOT be paddedprovide an agent with'='. Section 2.2 of RFC 3548 indicates that some base64 usages do not require padding, and it requests that such usages call outa faked mapped address in a STUN Binding Request thatfact. ICEisone such usage.used for address gathering. This isbecausethedata is never decoded. The component-id isprimary attack primitive described in [11]. This address will be used as apositive integer, which identifies the specific component ofserver reflexive candidate in thecandidate. It MUST start at 1 and MUST increment by 1ICE exchange. For this candidate to actually be used foreach component ofmedia, the attacker must also attack the connectivity checks, and in particular, force aparticularfalse valid on a false candidate.The addr productionThis attack istaken from [10], allowing for IPv4 addresses, IPv6 addressesvery hard to launch if the false address identifies a third party, andFQDNs. The port production is taken from RFC 4566 [5]. The token productionistaken from RFC 3261 [2]. The transport production indicatesprevented by SRTP if it identifies thetransport protocol forattacker themself. If thecandidate. This specification only defines UDP.attacker elects not to attack the connectivity checks, the worst it can do is prevent the server reflexive candidate from being used. However,extensibilityif the peer agent has at least one candidate that isprovided to allow for future transport protocols toreachable by the agent under attack, the STUN connectivity checks themselves will provide a peer reflexive candidate that can be usedwith ICE, such as TCP or the Datagram Congestion Control Protocol (DCCP) [30]. The cand-type production encodesfor thetypeexchange oftransport address. This specification definesmedia. Peer reflexive candidates are generally preferred over server reflexive candidates. As such, an attack solely on thevalues "local" forSTUN address gathering will normally have no impact on alocal transport address, "srflx"session at all. 15.3. Attacks on the Offer/Answer Exchanges An attacker that can modify or disrupt the offer/answer exchanges themselves can readily launch a variety of attacks with ICE. They could direct media to a target of a DoS attack, they could insert themselves into the media stream, and so on. These are similar to the general security considerations fora server reflexive transport address,offer/answer exchanges, and"relay"the security considerations in RFC 3264 [4] apply. These require techniques fora relayed transport address. The set of candidate types is extensiblemessage integrity and encryption for offers and answers, which are satisfied by thefuture. Note that there is no value defined for peer reflexive transport addresses. ThisSIPS mechanism [3] when SIP isbecause these transport addresses are never carried inused. As such, theSDP itself; they are learned implicitly through connectivity checks. Inclusionusage of SIPS with ICE is RECOMMENDED. 15.4. Insider Attacks In addition to attacks where thecandidate typeattacker isoptional. The rel-addr and rel-port productions convey information on related transport addresses. Foraserver reflexive transport address,third party trying to insert fake offers, answers or stun messages, there are several attacks possible with ICE when therel-addrattacker is an authenticated andrel-port containvalid participant in theassociated local transport address. For a relayed transport address,ICE exchange. 15.4.1. The Voice Hammer Attack The voice hammer attack is an amplification attack. In this attack, therel-addrattacker initiates sessions to other agents, andrel-port containincludes theserver reflexive transportIP addresstowards the relay. If rel- addr is present, rel-port MUST be present,andif rel-port is present, rel-addr MUST be present. Ifport of a DoS target in thecandidate typem/c-line of their SDP. This causes substantial amplification; a single offer/answer exchange can create a continuing flood of media packets, possibly at high rates (consider video sources). This attack is not specific to ICE, but ICE can help provide remediation. Specifically, if ICE is"local", rel-addr and rel-port MUST NOT be present. Ifused, thecandidate type is "srflx" or "relayed", both rel-addr and rel-port MUST be present. The a=candidate attribute can itself be extended. The grammar allows for new name/value pairsagent receiving the malicious SDP will first peform connectivity checks tobe added attheendtarget ofthe attribute. An implementation MUST ignore any name/value pairsmedia before sending itdoesn't understand. The syntax ofthere. If this target is a third party host, the"remote-candidate" attributechecks will not succeed, and media isdefined using Augmented BNF as defined in RFC 4234 [9]: remote-candidate-att = "remote-candidate" ":" candidate-id This attribute MUST be presentnever sent. Unfortunately, ICE doesn't help if its not used, in which case an attacker could simply send the offerwhenwithout thecandidateICE parameters. However, in environments where them/c-line is partset ofa candidate pairclients are known, and limited to ones thatis insupport ICE, thevalidserver can reject any offers orpartially valid state.answers that don't indicate ICE support. 15.4.2. STUN Amplification Attack ThesyntaxSTUN amplification attack is similar to the voice hammer. However, instead of voice packets being directed to the"ice-pwd" attributetarget, STUN connectivity checks are directed to the target. This attack isdefined as: ice-pwd-att = "ice-pwd" ":" password password = 1*base64-characcomplished by having the offerer send an offer with a large number of candidates, say 50. The"ice-pwd" attribute can appear at eitheranswerer receives thesession-level or media-level. When present in both,offer, and starts its checks, which are directed at thevaluetarget, and consequently, never generate a response. The answerer will start a new connectivity check every 50ms, and each check is a STUN transaction consisting of 9 retransmits of a message 65 bytes in length (plus 28 bytes for themedia-level takes precedence. Thus,IP/UDP header) that runs for 7.9 seconds, for a total of 105 bytes/ second per transaction on average. In thevalueworst case, there can be 158 transactions in progress atthe session level is effectively a default that applies to all media streams, unless overridenonce (7.9 seconds divided by 50ms), for amedia-level value. It MUST have at least 128 bitstotal ofrandomness. Like the candidate ID, its syntax132 kbps, just for STUN requests. It istaken fromimpossible to eliminate thebase64 alphabet, allowingamplification, but thepassword tovolume can begenerted fromreduced through abase64 encodingvariety ofa 128 bit value. In addition, if content is base64 encoded to generateheuristics. For example, agents can limit thecandidate ID, it MUST NOT be padded with '='. 13. Security Considerations There are several typesnumber ofattacks possiblecandidates they'll accept in anICE system. This section considers these attacks and their countermeasures. 13.1. Attacks on Connectivity Checks An attacker might attempt to disruptoffer or answer, they can increase theSTUN-based connectivity checks. Ultimately, allvalue of Ta, or exponentially increase Ta as time goes on. All of theseattacks fool an agent into thinking something incorrect aboutultimately trade off theresults oftime for theconnectivity checks. The possible false conclusionsICE exchanges to complete, with the amount of traffic that gets sent. OPEN ISSUE: Need better remediation for this. Especially anattacker can try and cause are: False Invalid: An attacker can fool a pairissue if we reduce Ta to be as fast as media packets themselves, in which case this attack is as equally devastating as the voice hammer. 16. IANA Considerations This specification defines four new SDP attributes per the procedures ofagents into thinking aSection 8.2.4 of [10]. The required information for the registrations are included here. 16.1. candidatepairAttribute Contact Name: Jonathan Rosenberg, jdrosen@jdrosen.net. Attribute Name: candidate Long Form: candidate Type of Attribute: media level Charset Considerations: The attribute isinvalid, when it isn't.not subject to the charset attribute. Purpose: Thiscan beattribute is usedto cause an agent to prefer a different candidate (such aswith Interactive Connectivity Establishment (ICE), and provides oneinjected by the attacker), or to disrupt a call by forcing all candidates to fail. False Valid: An attacker can fool a pairofagents into thinking amany possible candidatepair is valid, when it isn't. This can causeaddresses for communication. These addresses are validated with anagentend-to-end connectivity check using Simple Traversal Underneath NAT (STUN). Appropriate Values: See Section 13 of RFC XXXX [Note toproceedRFC-ed: please replace XXXX witha session, but thenthe RFC number of this specification]. 16.2. remote-candidates Attribute Contact Name: Jonathan Rosenberg, jdrosen@jdrosen.net. Attribute Name: remote-candidates Long Form: remote-candidates Type of Attribute: media level Charset Considerations: The attribute is notbe able to receive any media. False Peer-Derived Candidate: An attacker can cause an agentsubject todiscover a new peer-derived candidate, when it shouldn't have.the charset attribute. Purpose: Thiscan beattribute is used with Interactive Connectivity Establishment (ICE), and provides the identity of the remote candidates that the offerer wishes the answerer toredirect media streams to a DoS target oruse in its answer. Appropriate Values: See Section 13 of RFC XXXX [Note to RFC-ed: please replace XXXX with theattacker, for eavesdroppingRFC number of this specification]. 16.3. ice-pwd Attribute Contact Name: Jonathan Rosenberg, jdrosen@jdrosen.net. Attribute Name: ice-pwd Long Form: ice-pwd Type of Attribute: session orother purposes. False Valid on False Candidate: An attacker has already convinced an agent that theremedia level Charset Considerations: The attribute isa candidate with an address that doesn't actually routenot subject tothat agent (for example, by injecting a false peer-derived candidate or false STUN-derived candidate). It must then launch an attack that forcestheagents to believe that this candidatecharset attribute. Purpose: This attribute isvalid. Ofused with Interactive Connectivity Establishment (ICE), and provides thevarious techniques for creating fakedpassword used to protect STUNmessages described in [12], many are not applicable for theconnectivity checks.CompromisesAppropriate Values: See Section 13 ofSTUN servers are not muchRFC XXXX [Note to RFC-ed: please replace XXXX with the RFC number ofa concern, sincethis specification]. 16.4. ice-ufrag Attribute Contact Name: Jonathan Rosenberg, jdrosen@jdrosen.net. Attribute Name: ice-ufrag Long Form: ice-ufrag Type of Attribute: session or media level Charset Considerations: The attribute is not subject to theSTUN servers are embedded in endpointscharset attribute. Purpose: This attribute is used with Interactive Connectivity Establishment (ICE), anddistributed throughout the network. Thus, compromisingprovides theSTUN server is equivalentfragments used tocomprimisingconstruct theendpoint, and if that happens, far more problematic attacks are possible than those against ICE. Similarly, DNS attacks are irrelevant since STUN servers are not discovered via DNS, they are signaled via SIP. Injection of fake responses and relaying modified requests all can be handledusername inICESTUN connectivity checks. Appropriate Values: See Section 13 of RFC XXXX [Note to RFC-ed: please replace XXXX with thecountermeasures discussed below. To force the false invalid result, the attackerRFC number of this specification]. 17. IAB Considerations The IAB hasto wait forstudied theconnectivity check for oneproblem of "Unilateral Self Address Fixing", which is theagentsgeneral process by which a agent attempts tobe sent. When it is,determine its address in another realm on theattacker needs to injectother side of afake response withNAT through a collaborative protocol reflection mechanism [19]. ICE is anunrecoverable error response, such asexample of a600. This attack only needs to be launched against oneprotocol that performs this type of function. Interestingly, theagents in order to invalidate the candidate pair. However, since the candidate is, in fact, valid, the original request may reach the peer agent,process for ICE is not unilateral, but bilateral, andresult inthe difference has asuccess response. The attacker needs to force this packet or its response tosignficant impact on the issues raised by IAB. Indeed, ICE can bedropped, throughconsidered aDoS attack, layer 2 network disruption, or other technique. If it doesn't do this, the success response will also reachB-SAF (Bilateral Self-Address Fixing) protocol, rather than an UNSAF protocol. Regardless, theoriginator, alerting it toIAB has mandated that any protocols developed for this purpose document apossible attack.specific set of considerations. Thiswill cause the agent to abandon the candidate, whichsection meets those requirements. 17.1. Problem Definition From RFC 3424 any UNSAF proposal must provide: Precise definition of a specific, limited-scope problem that is to be solved with thedesired result in any case. Fortunately,UNSAF proposal. A short term fix should not be generalized to solve other problems; thisattackismitigated completely through the STUN message integrity mechanism.why "short term fixes usually aren't". Theattacker needs to injectspecific problems being solved by ICE are: Provide afake response, and in ordermeans forthis responsetwo peers tobe processed, the attacker needsdetermine thepassword. Ifset of transport addresses which can be used for communication. Provide a means for resolving many of theoffer/answer signalinglimitations of other UNSAF mechanisms by wrapping them in an additional layer of processing (the ICE methodology). Provide a means for a agent to determine an address that issecured,reachable by another peer with which it wishes to communicate. 17.2. Exit Strategy From RFC 3424, any UNSAF proposal must provide: Description of an exit strategy/transition plan. The better short term fixes are theattackerones that willnot have the password. Forcingnaturally see less and less use as thefake valid result worksappropriate technology is deployed. ICE itself doesn't easily get phased out. However, it is useful even in asimilar way. The agent needsglobally connected Internet, to serve as a means for detecting whether a router failure has temporarily disrupted connectivity, for example. ICE also helps prevent certain security attacks which have nothing to do with NAT. However, what ICE does is help phase out other UNSAF mechanisms. ICE effectively selects amongst those mechanisms, prioritizing ones that are better, and deprioritizing ones that are worse. Local IPv6 addresses can be preferred. As NATs begin towait for the Binding Request from each agent,dissipate as IPv6 is introduced, server reflexive andinject a fake success response. The attacker won't needrelayed candidates (both forms of UNSAF mechanisms) simply never get used, because higher priority connectivity exists toworry about disrupting the actual response since, if the candidate is not valid, it presumably wouldn't be received anyway. However, likethefake invalid attack, this attack is mitigated completely throughnative host candidates. Therefore, theSTUN message integrityservers get used less andoffer/answer security techniques. Forcingless, and can eventually be remove when their usage goes to zero. Indeed, ICE can assist in thefalse peer-derived candidate resulttransition from IPv4 to IPv6. It can bedone either with fake requests or responses,used to determine whether to use IPv6 or IPv4 when two dual-stack hosts communicate withreplays. We consider the fake requests and responses case first.SIP (IPv6 gets used). Itrequires the attacker to sendcan also allow aBinding Requestnetwork with both 6to4 and native v6 connectivity toone agentdetermine which address to use when communicating with asource IP address and port for the false transport address. In addition, the attackerpeer. 17.3. Brittleness Introduced by ICE From RFC3424, any UNSAF proposal mustwait for a Binding Requestprovide: Discussion of specific issues that may render systems more "brittle". For example, approaches that involve using data at multiple network layers create more dependencies, increase debugging challenges, and make it harder to transition. ICE actually removes brittleness from existing UNSAF mechanisms. In particular, traditional STUN (as described in RFC 3489 [13]) has several points of brittleness. One of them is theother agent, and generate a fake response withdiscovery process which requires aXOR-MAPPED-ADDRESS attribute.agent to try and classify the type of NAT it is behind. Thisattackprocess isbest launched against a candidate pairerror-prone. With ICE, that discovery process islikely to be invalid, sosimply not used. Rather than unilaterally assessing theattacker doesnt need to contend withvalidity of theactual responsesaddress, its validity is dynamically determined by measuring connectivity tothe reala peer. The process of determining connectivitychecks. Like the other attacks described here, this attackismitigated by thevery robust. Another point of brittleness in traditional STUNmessage integrity mechanismsandsecure offer/answer exchanges. Forcing the false peer-derived candidate result with packet replaysany other unilateral mechanism isdifferent. The attacker waits until oneits absolute reliance on an additional server. ICE makes use ofthea server for allocating unilateral addresses, but allows agentssendsto directly connect if possible. Therefore, in some cases, the failure of aBinding RequestSTUN server would still allow foronea call to progress when ICE is used. Another point of brittleness in traditional STUN is that it assumes that thetransportSTUN server is on the public Internet. Interestingly, with ICE, that is not necessary. There can be a multitude of STUN servers in a variety of addresspairs. It then intercepts this request, and replays it towardsrealms. ICE will discover theotherone that has provided a usable address. The most troubling point of brittleness in traditional STUN is that it doesn't work in all network topologies. In cases where there is a shared NAT between each agentwith a faked source IP address. It must also prevent the original request from reachingand theremote agent, eitherSTUN server, traditional STUN may not work. With ICE, that restriction is removed. Traditional STUN also introduces some security considerations. Fortunately, those security considerations are also mitigated bylaunching a DoS attack to cause the packet to be dropped, or forcing itICE. Consequently, ICE serves tobe dropped using layer 2 mechanisms. The replayed packet is received atrepair the brittleness introduced in otheragent, and accepted, since the integrity check passes (the integrity check cannotUNSAF mechanisms, and does notcoverintroduce any additional brittleness into thesource IP address and port). It is then responded to. This response will containsystem. 17.4. Requirements for aXOR-MAPPED- ADDRESS with the false transport address. It is passedLong Term Solution From RFC 3424, any UNSAF proposal must provide: Identify requirements for longer term, sound technical solutions -- contribute to thethis false address. The attacker must then interceptprocess of finding the right longer term solution. Our conclusions from STUN remain unchanged. However, we feel ICE actually helps because we believe it can be part of the long term solution. 17.5. Issues with Existing NAPT Boxes From RFC 3424, any UNSAF proposal must provide: Discussion of the impact of the noted practical issues with existing, deployed NA[P]Ts andrelay it towardsexperience reports. A number of NAT boxes are now being deployed into theoriginator. The other agent will then initiatemarket which try and provide "generic" ALG functionality. These generic ALGs hunt for IP addresses, either in text or binary form within aconnectivity check towards that transport address.packet, and rewrite them if they match a binding. Thisvalidation needsinterferes with traditional STUN. However, the update tosucceed.STUN [11] uses an encoding which hides these binary addresses from generic ALGs. Since [11] is required for all ICE implementations, this NAPT problem does not impact ICE. Existing NAPT boxes have non-deterministic and typically short expiration times for UDP-based bindings. This requiresthe attackerimplementations toforce a false valid onsend periodic keepalives to maintain those bindings. ICE uses afalse candidate. Injectingdefault offake requests or responses15s, which is a very conservative estimate. Eventually, over time, as NAT boxes become compliant toachievebehave [30], thisgoal is prevented using the integrity mechanisms of STUNminimum keepalive will become deterministic and well-known, and theoffer/answer exchange. Thus, this attackICE timers canonlybelaunched through replays. To do that, the attacker must intercept the Binding Request towards this false transport address, and replay it towards the other agent. Then, it must intercept the response and replay that back as well. This attack is very hard to launch unless the attacker themself is identified by the fake transport address. This is because it requires the attackeradjusted. Having a way tointerceptdiscover andreplay packets sent by two different hosts. If both agents are on different networks (for example, acrosscontrol thepublic Internet), this attack canminimum keepalive interval would behardfar better still. 18. Acknowledgements The authors would like tocoordinate, since it needsthank Flemming Andreasen, Rohan Mahy, Dean Willis, Eric Cooper, Dan Wing, Douglas Otis, Tim Moore, and Francois Audet for their comments and input. A special thanks goes tooccur against two different endpoints on different partsBill May, who suggested several of thenetwork at the same time. Ifconcepts in this specification, Philip Matthews, who suggested many of theattacker themself is identified bykey performance optimizations in this specification, Eric Rescorla, who drafted thefake transport address,text in theattack is easier to coordinate. However, if SRTP is used [22],introduction, and Magnus Westerlund, for doing several detailed reviews on theattacker will not be ablevarious revisions of this specification. 19. References 19.1. Normative References [1] Bradner, S., "Key words for use in RFCs toplay the media packets, they will only be ableIndicate Requirement Levels", BCP 14, RFC 2119, March 1997. [2] Huitema, C., "Real Time Control Protocol (RTCP) attribute in Session Description Protocol (SDP)", RFC 3605, October 2003. [3] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002. [4] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002. [5] Casner, S., "Session Description Protocol (SDP) Bandwidth Modifiers for RTP Control Protocol (RTCP) Bandwidth", RFC 3556, July 2003. [6] Camarillo, G., Marshall, W., and J. Rosenberg, "Integration of Resource Management and Session Initiation Protocol (SIP)", RFC 3312, October 2002. [7] Camarillo, G. and P. Kyzivat, "Update todiscard them, effectively disablingthemedia streamSession Initiation Protocol (SIP) Preconditions Framework", RFC 4032, March 2005. [8] Crocker, D. and P. Overell, "Augmented BNF forthe call. However, this attack requires the agent to disrupt packetsSyntax Specifications: ABNF", RFC 4234, October 2005. [9] Rosenberg, J. and H. Schulzrinne, "Reliability of Provisional Responses in Session Initiation Protocol (SIP)", RFC 3262, June 2002. [10] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006. [11] Rosenberg, J., "Simple Traversal Underneath Network Address Translators (NAT) (STUN)", draft-ietf-behave-rfc3489bis-04 (work inorder to block the connectivity checkprogress), July 2006. [12] Rosenberg, J., "Obtaining Relay Addresses fromreaching the target. In that case, if the goal is to disrupt the media stream, its much easier to just disrupt it withSimple Traversal of UDP Through NAT (STUN)", draft-ietf-behave-turn-01 (work in progress), June 2006. 19.2. Informative References [13] Rosenberg, J., Weinberger, J., Huitema, C., and R. Mahy, "STUN - Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (NATs)", RFC 3489, March 2003. [14] Senie, D., "Network Address Translator (NAT)-Friendly Application Design Guidelines", RFC 3235, January 2002. [15] Srisuresh, P., Kuthan, J., Rosenberg, J., Molitor, A., and A. Rayhan, "Middlebox communication architecture and framework", RFC 3303, August 2002. [16] Rosenberg, J., Peterson, J., Schulzrinne, H., and G. Camarillo, "Best Current Practices for Third Party Call Control (3pcc) in thesame mechanism, rather than attack ICE. 13.2. Attacks onSession Initiation Protocol (SIP)", BCP 85, RFC 3725, April 2004. [17] Borella, M., Lo, J., Grabelsky, D., and G. Montenegro, "Realm Specific IP: Framework", RFC 3102, October 2001. [18] Borella, M., Grabelsky, D., Lo, J., and K. Taniguchi, "Realm Specific IP: Protocol Specification", RFC 3103, October 2001. [19] Daigle, L. and IAB, "IAB Considerations for UNilateral Self- AddressGathering ICE endpoints make useFixing (UNSAF) Across Network Address Translation", RFC 3424, November 2002. [20] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", RFC 3550, July 2003. [21] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. Norrman, "The Secure Real-time Transport Protocol (SRTP)", RFC 3711, March 2004. [22] Carpenter, B. and K. Moore, "Connection ofSTUNIPv6 Domains via IPv4 Clouds", RFC 3056, February 2001. [23] Zopf, R., "Real-time Transport Protocol (RTP) Payload forgathering addresses from a STUN serverComfort Noise (CN)", RFC 3389, September 2002. [24] Rosenberg, J., "The Session Initiation Protocol (SIP) UPDATE Method", RFC 3311, October 2002. [25] Camarillo, G. and H. Schulzrinne, "Early Media and Ringing Tone Generation in thenetwork. This is corresponds to the binding acquisition use case discussedSession Initiation Protocol (SIP)", RFC 3960, December 2004. [26] Andreasen, F., "Connectivity Preconditions for Session Description Protocol Media Streams", draft-ietf-mmusic-connectivity-precon-02 (work inSection 10.1 of [12]. As a consequence, the attacks against STUN itself that are describedprogress), June 2006. [27] Andreasen, F., "A No-Op Payload Format for RTP", draft-ietf-avt-rtp-no-op-00 (work inSection 12 [12] can still be used against the STUN address gathering operations that occurprogress), May 2005. [28] Kohler, E., Handley, M., and S. Floyd, "Datagram Congestion Control Protocol (DCCP)", RFC 4340, March 2006. [29] Hellstrom, G. and P. Jones, "RTP Payload for Text Conversation", RFC 4103, June 2005. [30] Audet, F. and C. Jennings, "NAT Behavioral Requirements for Unicast UDP", draft-ietf-behave-nat-udp-07 (work inICE. However, the additional mechanisms provided by ICE actually counteract such attacks, making binding acquisition with STUN more secure when combined with ICE than without ICE. Consider an attacker which is able to provide an agent with a faked XOR-MAPPED-ADDRESSprogress), June 2006. [31] Jennings, C. and R. Mahy, "Managing Client Initiated Connections ina STUN Binding Request that is used for address gathering. This istheprimary attack primitive describedSession Initiation Protocol (SIP)", draft-ietf-sip-outbound-04 (work inSection 12progress), June 2006. Appendix A. Design Motivations ICE contains a number of[12]. This address willnormative behaviors which may themselves beused as a STUN derived candidate in the ICE exchange. For this candidatesimple, but derive from complicated or non-obvious thinking or use cases which merit further discussion. Since these design motivations are not neccesary toactually be usedunderstand formedia, the attacker must also attack the connectivity checks, andpurposes of implementation, they are discussed here inparticular, force a false valid on a false candidate. This attack is very hardan appendix tolaunch ifthefalse address identifies a third party, andspecification. This section isprevented by SRTP if it identifies the attacker themself. If the attacker elects notnon-normative. A.1. Applicability toattack the connectivity checks, the worst it can do is prevent the STUN-derived address from being used. However, if the peer agent has at least one addressGateways and Servers Section 4.1 discusses procedures for gathering candidates, including host, server reflexive and relayed. In thatis reachable by thesection, recommendations are given for when an agentunder attack, the STUN connectivity checks themselves will provide a STUN-derived addressshould obtain each of these three types. In particular, for agents embedded in PSTN gateways, media servers, conferencing servers, and so on, ICE specifies that an agent canbe used for the exchange of media. Peer derived candidates are preferred overstick with just host candidates, since it has a public IP address. This leads to an important question - why would such an endpoint even bother with ICE? If it has a public IP address, what additional value do thecandidate theyICE procedures bring? There aregenerated frommany, actually. First, doing so greatly facilitates NAT traversal forthis reason. As such, an attack solely on the STUNclients that connect to it. Consider a PC softphone behind a NAT whose mapping policy is addressgathering will normally have no impact onand port dependent. The softphone initiates a callat all. 13.3. Attacks on the Offer/Answer Exchanges An attacker that can modify or disrupt the offer/answer exchanges themselves can readily launchthrough avariety of attacks withgateway that implements ICE.They could direct mediaThe gateway doesn't obtain any server reflexive or relayed candidates, but it implements ICE, and consequently, is prepared to receive STUN connectivity checks on its host candidates. The softphone will send atarget of a DoS attack, they could insert themselves into the media stream, and so on. These are similarSTUN connectivity check to thegeneral security considerations for offer/answer exchanges, and the security considerations in RFC 3264 [4] apply. These require techniques for message integrity and encryption for offers and answers,gateway, whichare satisfied bypasses through theSIPS mechanism [2] when SIP is used. As such,intervending NAT. This causes theusage of SIPS with ICE is RECOMMENDED. 13.4. Insider Attacks In additionNAT toattacks where the attacker isallocate athird party trying to insert fake offers, answers or stun messages, there are several attacks possible with ICE whennew binding for theattackersoftphone. The connectivity isan authenticatedreceived by the gateway, andvalid participant inwill cause it gateway to send a check back to theICE exchange. 13.4.1. The Voice Hammer Attack The voice hammer attack is an amplification attack. Insoftphone, at thisattack,newly created candidate. A successful response confirms that this candidate is usable, and theattacker initiates sessionsgateway can send media immediately toother agents, and includestheIP addresssoftphone. This allows direct media transmission between the gateway andport of a DoS target insoftphone, without them/c-line of their SDP. This causes substantial amplification; a single offer/answer exchange can createneed for relays, even though the softphone was behind acontinuing flood'bad' NAT. Second, implementation ofmedia packets, possibly at high rates (consider video sources). This attack is not speificthe STUN connectivity checks allows for NAT bindings along the way toICE, but ICE can help provide remediation. Specifically, if ICEbe kept open. Keeping these bindings open isused,essential for continued communications between theagent receivinggateway and softphone. Third, ICE prevents a fairly destructive attack in multimedia systems, called themalicious SDP will first peformvoice hammer. The STUN connectivitycheckscheck used by an ICE endpoint allows it to be certain that the target of mediabefore sending it there. Ifpackets is, in fact, the same entity that requested the packets through the offer/answer exchange. See Section 15 for a more complete discussion on thistarget isattack. A.2. Pacing of STUN Transactions STUN transactions used to gather candidates and to verify connectivity are paced out at an approximate rate of one new transaction every Ta seconds, where Ta has athird party host,default of 50ms. Why are these transactions paced, and why was 50ms chosen as default? Sending of these STUN requests will often have the effect of creating bindings on NAT devices between thechecks will not succeed,client andmedia is never sent. Unfortunately, ICE doesn't help if its not used, in which case an attacker could simply sendtheoffer withoutSTUN servers. Experience has shown that many NAT devices have upper limits on theICE parameters. However, in environments whererate at which they will create new bindings. Furthermore, transmission of these packets on thesetnetwork makes use ofclients are known,bandwidth andlimitedneeds toonesbe rate limited by the agent. As a consequence, the pacing ensures thatsupport ICE,theserver can reject any offers or answersNAT devices does not get overloaded and thatdon't indicate ICE support. 13.4.2. STUN Amplification Attack The STUN amplification attacktraffic issimilar to the voice hammer. However, insteadkept at a reasonable rate. Another aspect ofvoice packets being directed tothetarget,STUNconnectivity checks are directedrequests is their bandwidth usage. In ICE, each STUN request contains the STUN 20 byte header, in addition to thetarget.USERNAME, MESSAGE-INTEGRITY and PRIORITY attributes. The USERNAME attribute contains a 4-byte attribute overhead, plus the username value itself. Thisattackusername isaccomplished by havingtheofferer send an offer withconcatenation of the two fragments, plus alarge numbercolon. Each fragment is supposed to be at least 4 bytes long, making the total length ofcandidates, say 50.the USERNAME attribute (4*2 + 1 + 4) = 13 bytes. Theanswerer receivesMESSAGE-INTEGRITY attribute is 4 bytes of overhead plus 20 bytes value, for 24 bytes. The PRIORITY attribute is 4 bytes of overhead plus 4 bytes of value, for 8 bytes. Thus, theoffer, and starts its checks, which are directed attotal length of thetarget,STUN Binding Request is (20 + 13 + 24 + 8) = 65 bytes, with 28 bytes of overhead for IP andconsequently, never generateUDP for aresponse.total of 93 bytes. Theanswerer will start a new connectivity check every 50ms,response contains the STUN 20 byte header, the XOR- MAPPED-ADDRESS, and MESSAGE-INTEGRITY attributes. XOR-MAPPED-ADDRESS has 4 bytes overhead plus an 8 byte value, for a total of 12 bytes. Thus, eachcheckSTUN response is (20 + 12 + 24) = 56 bytes plus 28 bytes of UDP/IP overhead for aSTUN transaction consistingtotal of9 retransmits84 bytes. Checks typically fall into one of two cases. If amessage 64check works, each transaction has a single request and a single response, for a total of 2 packets and 177 bytesin length. This producesover one RTT interval. Assuming a fairlysubstantial 92agressive RTT of 70ms, this produces 20.23 kbps,just in STUN requests. It is impossible to eliminate the amplification,butthe volume can be reduced throughonly briefly. If avariety of heuristics. For example, agents can limit the number of candidates they'll accept in an offer or answer, they can increase the value of Tb, or exponentially increase Tb as time goes on. All of these ultimately trade off the time for the ICE exchanges to complete, withcheck fails because theamount of traffic that gets sent. 14. IANA Considerationspair is invalid, there will be nine requests and no responses. Thisspecification defines three new SDP attribute per the proceduresproduces 837 bytes over 7.9s, for a total ofSection 8.2.4105.9 bps, but over a long period of[5].time. OPEN ISSUE: Therequired information for the registrationsbandwidth computations areincluded here. 14.1. candidate Attribute Contact Name: Jonathan Rosenberg, jdrosen@jdrosen.net. Attribute Name: candidate Long Form: candidate Type of Attribute: media level Charset Considerations: The attributepretty complex because ICE is notsubjecta CBR stream, and its bandwidth utilization depends on how many transactions it ends up generating before it finishes. Need to work this model more. Given that these numbers are close to, if not greater than, thecharset attribute. Purpose: This attributebandwidths utilized by many voice codecs, this seems a reasonable value to use. OPEN ISSUE: There isusedsome debate about whether to reduce this pacing interval smaller, say 20ms, to speed up ICE, or perhaps make it equal to the bandwidth that would be utilized by the media streams themselves. A.3. Candidates withInteractive Connectivity Establishment (ICE), and provides one of many possible candidate addresses for communication. These addressesMultiple Bases Section 4.1 talks about merging together candidates that arevalidated withidentical but have different bases. When can anend-to-end connectivity check using Simple Traversalagent have two candidates that have the same IP address and port, but different bases? Consider the topology ofUDP withFigure 16: +----------+ | STUN Srvr| +----------+ | | ----- // \\ | | | B:net10 | | | \\ // ----- | | +----------+ | NAT(STUN). Appropriate Values: See Section 12 of RFC XXXX [Note to RFC-ed: please replace XXXX with| +----------+ | | ----- // \\ | A | |192.168/16 | | | \\ // ----- | | |192.168.1.1 ----- +----------+ // \\ +----------+ | | | | | | | Offerer |---------| C:net10 |---------| Answerer | | |10.0.1.1 | | 10.0.1.2 | | +----------+ \\ // +----------+ ----- Figure 16 In this case, theRFC number ofofferer is multi-homed. It has one interface, 10.0.1.1, on network C, which is a net 10 private network. The Answerer is on thisspecification]. 14.2. remote-candidate Attribute Contact Name: Jonathan Rosenberg, jdrosen@jdrosen.net. Attribute Name: remote-candidate Long Form: remote-candidate Type of Attribute: media level Charset Considerations:same network. The offerer is also connected to network A, which is 192.168/16. Theattributeofferer has an interface of 192.168.1.1 on this network. There is a NAT on this network, natting into network B, which is another net10 private network, but notsubjectconnected tothe charset attribute. Purpose: This attributenetwork C. There isused with Interactive Connectivity Establishment (ICE),a STUN server on network B. The offerer obtains a host candidate on its interface on network C (10.0.1.1:2498) andprovidesa host candidate on its interface on network A (192.168.1.1:3344). It performs a STUN query to its configured STUN server from 192.168.1.1:3344. This query passes through theidentity ofNAT, which happens to assign theremote candidate thatbinding 10.0.1.1:2498. The STUN server reflects this in theofferer wishesSTUN Binding Response. Now, theanswererofferer has obtained a server reflexive candidate with a transport address that is identical touse in its answer. Appropriate Values: See Section 12a host candidate (10.0.1.1:2498). However, the server reflexive candidate has a base ofRFC XXXX [Note to RFC-ed: please replace XXXX with192.168.1.1:3344, and theRFC numberhost candidate has a base ofthis specification]. 14.3. ice-pwd Attribute Contact Name: Jonathan Rosenberg, jdrosen@jdrosen.net. Attribute Name: ice-pwd Long Form: ice-pwd Type10.0.1.1:2498. A.4. Purpose ofAttribute: session level Charset Considerations: The attribute is not subject tothecharset attribute. Purpose: This attributeTranslation When a candidate isused with Interactive Connectivity Establishment (ICE),relayed, the SDP offer or answer contain both the relayed candidate andprovidesits translation. However, thepasswordtranslation is never usedto protect STUN connectivity checks. Appropriate Values: See Section 12 of RFC XXXX [Note to RFC-ed: please replace XXXX withby ICE itself. Why is it present in theRFC number of this specification]. 15. IAB Considerationsmessage? There are two motivations for its inclusion. TheIAB has studiedfirst is diagnostic. It is very useful to know theproblemrelationship between the different types of"Unilateral Self Address Fixing",candidates. By including the translation, an agent can know which relayed candidate isthe general process byassociated with which reflexive candidate, whicha agent attempts to determine its addressinanother realm on the other side of a NAT through a collaborative protocol reflection mechanism [20]. ICEturn isan example ofassociated with aprotocol that performs this type of function. Interestingly, the processspecific host candidate. When checks forICE is not unilateral, but bilateral,one candidate succeed and not thedifference has a signficant impactothers, this provides useful diagnostics on what is going on in theissues raised by IAB.network. TheIABsecond reason hasmandated that any protocols developed for this purpose document a specific set of considerations. This section meets those requirements. 15.1. Problem Definition From RFC 3424 any UNSAF proposal must provide: Precise definitionto do with off-path Quality ofa specific, limited-scope problem thatService (QoS) mechanisms. When ICE is used in environments such as PacketCable 2.0 [[TODO: need PC2.0 reference]], proxies will, in addition tobe solvedperforming normal SIP operations, inspect the SDP in SIP messages, and extract the IP address and port for media traffic. They can then interact, through policy servers, with access routers in theUNSAF proposal. A short term fix should not be generalizednetwork, tosolve other problems; thisestablish guaranteed QoS for the media flows. This QoS iswhy "short term fixes usually aren't". The specific problems being solvedprovided byICE are: Provideclassifying the RTP traffic based on 5-tuple, and then providing it ameansguaranteed rate, or marking its Diffserv codepoints appropriately. When a residential NAT is present, and a relayed candidate gets selected fortwo peers to determinemedia, this relayed candidate will be a transport address on an actual STUN relay. That address says nothing about theset ofactual transportaddresses which canaddress in the access router that would be used to classify packets forcommunication. Provide a means for resolving many ofQoS treatment. Rather, thelimitations of other UNSAF mechanisms by wrapping them in an additional layertranslation ofprocessing (the ICE methodology). Provide a means for a agent to determine an addressthat relayed address isreachable by another peer with which it wishesneeded. By carrying the translation in the SDP, the proxy can use that transport address tocommunicate. 15.2. Exit Strategy From RFC 3424, any UNSAF proposal must provide: Descriptionrequest QoS from the access router. A.5. Importance ofan exit strategy/transition plan.the STUN Username ICE requires the usage of message integrity with STUN using its short term credential functionality. Thebetteractual short termfixes are the ones that will naturally see less and less use as the appropriate technology is deployed. ICE itself doesn't easily get phased out. However, itcredential isuseful evenformed by exchanging username fragments ina globally connected Internet, to serve as a meansthe SDP offer/answer exchange. The need fordetecting whether a router failure has temporarily disrupted connectivity,this mechanism goes beyond just security; it is actual required forexample.correct operation of ICEalso helps prevent certain security attacksin the first place. Consider agents A, B, and C. A and B are within private enterprise 1, which is using 10.0.0.0/8. C is within private enterprise 2, which is also using 10.0.0.0/8. As it turns out, B and C both havenothingIP address 10.0.1.1. A sends an offer todoC. C, in its answer, provides A withNAT. However, what ICE does is help phase out other UNSAF mechanisms. ICE effectively selects amongstits host candidates. In this case, thosemechanisms, prioritizing ones thatcandidates arebetter,10.0.1.1:8866 anddeprioritizing ones that are worse. Local IPv6 addresses can be preferred.10.0.1.1:8877. AsNATs begin to dissipateit turns out, B is in a session at that same time, and is also using 10.0.1.1:8866 and 10.0.1.1:8877 asIPv6host candidates. This means that B isintroduced, derived transport addresses from other UNSAF mechanisms simply never get used, because higher priority connectivity exists. Therefore, the servers get used lessprepared to accept STUN messages on those ports, just as C is. A will send a STUN request to 10.0.1.1:8866 andless,andcan eventually be remove when their usage goesanother tozero. Indeed, ICE can assist in the transition from IPv410.0.1.1:8877. However, these do not go toIPv6. It can be usedC as expected. Instead, they go todetermine whetherB! If B just replied touse IPv6 or IPv4 when two dual-stack hosts communicate with SIP (IPv6 gets used). It can also allow a network with both 6to4 and native v6them, A would believe it has connectivity todetermine which address to useC, whencommunicating within fact it has connectivity to apeer. 15.3. Brittleness Introduced by ICE From RFC3424, any UNSAF proposal must provide: Discussion of specific issuescompletely different user, B. To fix this, the STUN short term credential mechanisms are used. The username fragments are sufficiently random thatmay render systems more "brittle". For example, approachesit is highly unlikely thatinvolveB would be usingdata at multiple network layers create more dependencies, increase debugging challenges, and make it harder to transition. ICE actually removes brittleness from existing UNSAF mechanisms.the same values as A. Consequently, B would reject the STUN request since the credentials were invalid. Inparticular, traditionalessence, the STUN(as described in [14]) has several pointsusername fragments provide a form ofbrittleness. Onetransient host identifiers, bound to a particular offer/answer session. An unfortunate consequence ofthemthe non-uniqueness of IP addresses is that, in thediscovery process which requires a agent to tryabove example, B might not even be an ICE agent. It could be any host, andclassifythetype of NAT it is behind. This processport to which the STUN packet iserror-prone. With ICE,directed could be any ephemeral port on thatdiscovery processhost. If there issimply not used. Rather than unilaterally assessing the validity of the address, its validityan application listening on this socket for packets, and it isdynamically determined by measuring connectivitynot prepared toa peer. The process of determining connectivityhandle malformed packets for whatever protocol isvery robust. Another pointin use, the operation ofbrittlenessthat application could be affected. Fortunately, since the ports exchanged inSTUNSDP are ephemeral andany other unilateral mechanismusually drawn from the dynamic or registered range, the odds are good that the port isits absolute reliance on an additional server. ICE makes use of a server for allocating unilateral addresses, but allows agentsnot used todirectly connect if possible. Therefore, inrun a server on host B, but rather is the agent side of somecases,protocol. This decreases thefailureprobability of hitting aSTUN server would still allow for a callport in-use, due toprogress when ICE is used. Another pointthe transient nature ofbrittlenessport usage intraditional STUN is that it assumes that the STUN server is onthis range. However, thepublic Internet. Interestingly, with ICE,possibility of a problem does exist, and network deployers should be prepared for it. Note that this is notnecessary. There can beamultitude of STUN servers inproblem specific to ICE; stray packets can arrive at avarietyport at any time for any type ofaddress realms. ICE will discoverprotocol, especially ones on theone that has provided a usable address. The most troubling point of brittleness in traditional STUN is that it doesn't work in all network topologies. In cases where therepublic Internet. As such, this requirement is just restating ashared NAT between each agent and the STUN server, traditional STUN may not work. With ICE, that restriction cangeneral design guideline for Internet applications - belifted. Traditional STUN also introduces some security considerations. Fortunately, those security considerations are also mitigated by ICE. Consequently, ICE serves to repair the brittleness introduced in other UNSAF mechanisms, and does not introduce any additional brittleness into the system. 15.4. Requirementsprepared fora Long Term Solution From RFC 3424,unknown packets on anyUNSAF proposal must provide: Identify requirements for longer term, sound technical solutions -- contribute toport. A.6. The Candidate Pair Sequence Number Formula The sequence number for a candidate pair has an odd form. It is: PAIR-SN = 10000*MAX(O-SN,A-SN) + MIN(O-SN,A-SN) + O-IP/SZ Why is this? When theprocess of findingcandidate pairs are sorted based on this value, theright longer term solution. Our conclusions from STUN remain unchanged. However, we feel ICE actually helps because we believe it can be partresulting sorting has the MAX/MIN property. This means that the pairs are first sorted based on increasing value of thelong term solution. 15.5. Issues with Existing NAPT Boxes From RFC 3424, any UNSAF proposal must provide: Discussionmaximum of theimpacttwo sequence numbers. For pairs that have the same value of thenoted practical issues with existing, deployed NA[P]Ts and experience reports. Amaximum sequence number, the minimum sequence numberof NAT boxes are now being deployed intois used to sort amongst them. If themarket which trymax andprovide "generic" ALG functionality. These generic ALGs hunt forthe min sequence numbers are the same, the IPaddresses, either in text or binary form within a packet, and rewrite them if they match a binding. This interferes with traditional STUN. However,address of theupdate to STUN [12] uses an encoding which hides these binary addresses from generic ALGs. Since [12] is required for all ICE implementations, this NAPT problem does not impact ICE. Existing NAPT boxes have non-deterministic and typically short expiration times for UDP-based bindings. This requires implementations to send periodic keepalives to maintain those bindings. ICE usesofferers candidate serves as adefaulttie breaker. The factor of15s, which1000 isa very conservative estimate. Eventually, over time, as NAT boxes become compliant to behave [32], this minimum keepaliveused since there willbecome deterministic and well-known,always be fewer than a 1000 candidates, and thus theICE timers can be adjusted. Havinglargest value away to discoversequence number (and thus the minimumkeepalive interval would be far better still. 16. Acknowledgements The authors would like to thank Flemming Andreasen, Rohan Mahy, Dean Willis, Dan Wing, Douglas Otis, Tim Moore, Francois Audet, Bill May and Philip Matthewssequence number) can have is always less than 1000. This creates the desired sorting property. Recall that candidate sequence numbers are assigned such that, fortheir comments and input. A special thanks goesa particular set of candidates of the same type, the RTP components have lower sequence numbers than the corresponding RTCP component. Also recall that, if an agent prefers host candidates toMagnus Westerlundserver reflexive to relayed, sequence numbers fordoing several detailed reviews on the various revisionshost candidates are always lower than server reflexive which are always lower than relayed. Because ofthis specification. His input ledthis, A.7. The Frozen State The Frozen state is used for two purposes. Firstly, it allows ICE tomany substantive improvements in this document. 17. References 17.1. Normative References [1] Huitema, C., "Real Time Control Protocol (RTCP) attribute in Session Description Protocol (SDP)", RFC 3605, October 2003. [2] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002. [3] Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", RFC 3548, July 2003. [4] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002. [5] Handley, M., "SDP: Session Description Protocol", draft-ietf-mmusic-sdp-new-26 (work in progress), January 2006. [6] Casner, S., "Session Description Protocol (SDP) Bandwidth Modifiersfirst perform checks for the first component of a media stream. Once a successful check has completed forRTP Control Protocol (RTCP) Bandwidth", RFC 3556, July 2003. [7] Camarillo, G., Marshall, W., and J. Rosenberg, "Integrationthe first component, the other components ofResource Management and Session Initiation Protocol (SIP)", RFC 3312, October 2002. [8] Camarillo, G.the same type andP. Kyzivat, "Updatelocal preference will get performed. Secondly, when there are multiple media streams, it allows ICE tothe Session Initiation Protocol (SIP) Preconditions Framework", RFC 4032, March 2005. [9] Crocker, D.first check candidates for a single media stream, andP. Overell, "Augmented BNFonce a set of candidates has been found, candidates of that same type forSyntax Specifications: ABNF", RFC 4234, October 2005. [10] Olson, S., Camarillo, G.,other media streams can be checked first. This effectively 'caches' the results of a check for one media stream, andA. Roach, "Supportapplies them to another. For example, if only the relayed candidates forIPv6audio (which were the last resort candidates) succeed, ICE will check the relayed candidates for video first. A.8. The remote-candidates attribute The a=remote-candidates attribute exists to eliminate a race condition between the updated offer and the response to the STUN Binding Request that moved a candidate into the Valid list. This race condition is shown inSession Description Protocol (SDP)", RFC 3266, June 2002. [11] Rosenberg, J.Figure 17. On receipt of message 4, agent A adds a candidate pair to the valid list. If there was only a single media stream with a single component, agent A could now send an updated offer. However, the check from agent B has not yet generated a response, andH. Schulzrinne, "Reliability of Provisional Responsesagent B receives the updated offer (message 7) before getting the response (message 10). Thus, it does not yet know that this particular pair is valid. To eliminate this condition, the actual candidates at B that were selected by the offerer (the remote candidates) are included inSession Initiation Protocol (SIP)", RFC 3262, June 2002. [12] Rosenberg, J., "Simple Traversal of UDP Throughthe offer itself. Note, however, that agent B will not send media until it has received this STUN response. Agent A NetworkAddress Translators (NAT) (STUN)", draft-ietf-behave-rfc3489bis-03 (work in progress), March 2006. [13] Rosenberg, J., "Obtaining Relay Addresses from Simple TraversalAgent B |(1) Offer | | |------------------------------------------>| |(2) Answer | | |<------------------------------------------| |(3) STUN Req. | | |------------------------------------------>| |(4) STUN Res. | | |<------------------------------------------| |(5) STUN Req. | | |<------------------------------------------| |(6) STUN Res. | | |-------------------->| | | |Lost | |(7) Offer | | |------------------------------------------>| |(8) Answer | | |<------------------------------------------| |(9) STUN Req. | | |<------------------------------------------| |(10) STUN Res. | | |------------------------------------------>| Figure 17 A.9. Why are Keepalives Needed? Once media begins flowing on a candidate pair, it is still necessary to keep the bindings alive at intermediate NATs for the duration ofUDP Through NAT (STUN)", draft-ietf-behave-turn-00 (workthe session. Normally, the media stream packets themselves (e.g., RTP) meet this objective. However, several cases merit further discussion. Firstly, in some RTP usages, such as SIP, the media streams can be "put on hold". This is accomplished by using the SDP "sendonly" or "inactive" attributes, as defined inprogress), March 2006. 17.2. Informative References [14] Rosenberg, J., Weinberger, J., Huitema, C., and R. Mahy, "STUN - Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (NATs)",RFC3489, March 2003. [15] Senie, D., "Network Address Translator (NAT)-Friendly Application Design Guidelines",3264 [4]. RFC3235, January 2002. [16] Rosenberg, J.3264 directs implementations to cease transmission of media in these cases. However, doing so may cause NAT bindings to timeout, andH. Schulzrinne, "Anmedia won't be able to come off hold. Secondly, some RTPPayload Formatpayload formats, such as the payload format forGeneric Forward Error Correction", RFC 2733, December 1999. [17] Srisuresh, P., Kuthan, J., Rosenberg, J., Molitor, A., and A. Rayhan, "Middlebox communication architecture and framework", RFC 3303, August 2002. [18] Borella, M., Lo, J., Grabelsky, D., and G. Montenegro, "Realm Specific IP: Framework", RFC 3102, October 2001. [19] Borella, M., Grabelsky, D., Lo, J.,text conversation [29], may send packets so infrequently that the interval exceeds the NAT binding timeouts. Thirdly, if silence suppression is in use, long periods of silence may cause media transmission to cease sufficiently long for NAT bindings to time out. For these reasons, the media packets themselves cannot be relied upon. ICE defines a simple periodic keepalive that operates indpendently of media transmission. This makes its bandwidth requirements highly predictable, andK. Taniguchi, "Realm Specific IP: Protocol Specification", RFC 3103, October 2001. [20] Daigle, L.thus amenable to QoS reservations. A.10. Why Prefer Peer Reflexive Candidates? Section 4.2 describes procedures for computing the priority of candidate based on its type andIAB, "IAB Considerationslocal preferences. That section requires that the type preference for peer reflexive candidates always be lower than server reflexive. Why is that? The reason has to do with the security considerations in Section 15. It is much easier forUNilateral Self- Address Fixing (UNSAF) Across Network Address Translation", RFC 3424, November 2002. [21] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocolan attacker to cause an agent to use a false server reflexive candidate than it is forReal-Time Applications", RFC 3550, July 2003. [22] Baugher, M., McGrew, D., Naslund, M., Carrara, E.,an attacker to cause an agent to use a false peer reflexive candidate. Consequently, attacks against the STUN binding discovery usage are thwarted by ICE by preferring the peer reflexive candidates. A.11. Why Can't Offerers Send Media When a Pair Validates Section 11.1 describes rules for sending media. The rules are asymmetric, andK. Norrman, "The Secure Real-time Transport Protocol (SRTP)", RFC 3711, March 2004. [23] Carpenter, B.not the same for offerers andK. Moore, "Connection of IPv6 Domains via IPv4 Clouds", RFC 3056, February 2001. [24] Zopf, R., "Real-time Transport Protocol (RTP) Payloadanswerers. In particular, an answerer can send media right away to a candidate pair once it validates, even if it doesnt match the pairs in the m/c-line. THe offerer cannot - it must wait forComfort Noise (CN)", RFC 3389, September 2002. [25] Rosenberg, J., "The Session Initiation Protocol (SIP) UPDATE Method", RFC 3311, October 2002. [26] Camarillo, G.an updated offer/answer exchange. Why is that? This, in fact, relates to a bigger question - why is the updated offer/answer exchange needed at all? Indeed, in a pure offer/answer environment, it would not be. The offerer andH. Schulzrinne, "Early Mediaanswerer will agree on the candidates to use through ICE, andRinging Tone Generationthen can begin using them. As far as the agents themselves are concerned, the updated offer/answer provides no new information. However, in practice, numerous components along the signaling path look at the SDP information. These include entities performing off-path QoS reservations, NAT traversal components such as ALGs and SessionInitiation Protocol (SIP)", RFC 3960, December 2004. [27] Andreasen, F., "Connectivity Preconditions for Session Description Protocol Media Streams", draft-ietf-mmusic-connectivity-precon-02 (work in progress), June 2006. [28] Andreasen, F., "A No-Op Payload FormatBorder Controllers (SBCs) and diagnostic tools that passively monitor the network. For these tools to continue to function without change, the core property of SDP - that the m/c-lines represent the addresses used forRTP", draft-ietf-avt-rtp-no-op-00 (workmedia - must be retained. For this reason, an updated offer must be sent. To ensure that an updated offerer is sent, ICE purposefully prevents the offerer from sending media until that offer is sent. It furthermore restricts the answerer inprogress), May 2005. [29] Huitema, C., "Teredo: Tunneling IPv6 over UDP through Network Address Translations (NATs)", RFC 4380, February 2006. [30] Kohler, E., Handley, M., and S. Floyd, "Datagram Congestion Control Protocol (DCCP)", RFC 4340, March 2006. [31] Hellstrom, G. and P. Jones, "RTP Payloadhow long it can send media until an updated offer is received. This provides protocol incentives forText Conversation", RFC 4103, June 2005. [32] Audet, F.sending the updated offer. The updated offer also helps ensure that ICE did the right thing. In very unusual cases, the offerer andC. Jennings, "NAT Behavioral Requirements for Unicast UDP", draft-ietf-behave-nat-udp-07 (workanswerer might not agree on the candidates selected by ICE. This would be detected inprogress), June 2006.the updated offer/answer exchange, allowing them to restart ICE procedures to fix the problem. Author's Address Jonathan Rosenberg Cisco Systems 600 Lanidex Plaza Parsippany, NJ 07054 US Phone: +1 973 952-5000 Email: jdrosen@cisco.com URI: http://www.jdrosen.net Intellectual Property Statement The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79. Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr. The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org. Disclaimer of Validity This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Copyright Statement Copyright (C) The Internet Society (2006). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights. Acknowledgment Funding for the RFC Editor function is currently provided by the Internet Society.