draft-ietf-avtcore-rtp-topologies-update-04.txt   draft-ietf-avtcore-rtp-topologies-update-05.txt 
Network Working Group M. Westerlund Network Working Group M. Westerlund
Internet-Draft Ericsson Internet-Draft Ericsson
Obsoletes: 5117 (if approved) S. Wenger Obsoletes: 5117 (if approved) S. Wenger
Intended status: Informational Vidyo Intended status: Informational Vidyo
Expires: February 19, 2015 August 18, 2014 Expires: May 16, 2015 November 12, 2014
RTP Topologies RTP Topologies
draft-ietf-avtcore-rtp-topologies-update-04 draft-ietf-avtcore-rtp-topologies-update-05
Abstract Abstract
This document discusses point to point and multi-endpoint topologies This document discusses point to point and multi-endpoint topologies
used in Real-time Transport Protocol (RTP)-based environments. In used in Real-time Transport Protocol (RTP)-based environments. In
particular, centralized topologies commonly employed in the video particular, centralized topologies commonly employed in the video
conferencing industry are mapped to the RTP terminology. conferencing industry are mapped to the RTP terminology.
This document is updated with additional topologies and is intended This document is updated with additional topologies and is intended
to replace RFC 5117. to replace RFC 5117.
skipping to change at page 1, line 37 skipping to change at page 1, line 37
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on February 19, 2015. This Internet-Draft will expire on May 16, 2015.
Copyright Notice Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 36 skipping to change at page 2, line 36
3.5.2. Media Translator . . . . . . . . . . . . . . . . . . 20 3.5.2. Media Translator . . . . . . . . . . . . . . . . . . 20
3.6. Point to Multipoint Using the RFC 3550 Mixer Model . . . 21 3.6. Point to Multipoint Using the RFC 3550 Mixer Model . . . 21
3.6.1. Media Mixing Mixer . . . . . . . . . . . . . . . . . 23 3.6.1. Media Mixing Mixer . . . . . . . . . . . . . . . . . 23
3.6.2. Media Switching . . . . . . . . . . . . . . . . . . . 26 3.6.2. Media Switching . . . . . . . . . . . . . . . . . . . 26
3.7. Selective Forwarding Middlebox . . . . . . . . . . . . . 28 3.7. Selective Forwarding Middlebox . . . . . . . . . . . . . 28
3.8. Point to Multipoint Using Video Switching MCUs . . . . . 31 3.8. Point to Multipoint Using Video Switching MCUs . . . . . 31
3.9. Point to Multipoint Using RTCP-Terminating MCU . . . . . 33 3.9. Point to Multipoint Using RTCP-Terminating MCU . . . . . 33
3.10. Split Component Terminal . . . . . . . . . . . . . . . . 34 3.10. Split Component Terminal . . . . . . . . . . . . . . . . 34
3.11. Non-Symmetric Mixer/Translators . . . . . . . . . . . . . 37 3.11. Non-Symmetric Mixer/Translators . . . . . . . . . . . . . 37
3.12. Combining Topologies . . . . . . . . . . . . . . . . . . 37 3.12. Combining Topologies . . . . . . . . . . . . . . . . . . 37
4. Comparing Topologies . . . . . . . . . . . . . . . . . . . . 38 4. Topology Properties . . . . . . . . . . . . . . . . . . . . . 38
4.1. Topology Properties . . . . . . . . . . . . . . . . . . . 38 4.1. All to All Media Transmission . . . . . . . . . . . . . . 38
4.1.1. All to All Media Transmission . . . . . . . . . . . . 38 4.2. Transport or Media Interoperability . . . . . . . . . . . 39
4.1.2. Transport or Media Interoperability . . . . . . . . . 39 4.3. Per Domain Bit-Rate Adaptation . . . . . . . . . . . . . 39
4.1.3. Per Domain Bit-Rate Adaptation . . . . . . . . . . . 39 4.4. Aggregation of Media . . . . . . . . . . . . . . . . . . 40
4.1.4. Aggregation of Media . . . . . . . . . . . . . . . . 40 4.5. View of All Session Participants . . . . . . . . . . . . 40
4.1.5. View of All Session Participants . . . . . . . . . . 40 4.6. Loop Detection . . . . . . . . . . . . . . . . . . . . . 40
4.1.6. Loop Detection . . . . . . . . . . . . . . . . . . . 41 4.7. Consistency between header extensions and RTCP . . . . . 41
4.2. Comparison of Topologies . . . . . . . . . . . . . . . . 41 5. Comparison of Topologies . . . . . . . . . . . . . . . . . . 41
5. Security Considerations . . . . . . . . . . . . . . . . . . . 41 6. Security Considerations . . . . . . . . . . . . . . . . . . . 42
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 43 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 44
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 44 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 44
8. References . . . . . . . . . . . . . . . . . . . . . . . . . 44 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 44
8.1. Normative References . . . . . . . . . . . . . . . . . . 44 9.1. Normative References . . . . . . . . . . . . . . . . . . 44
8.2. Informative References . . . . . . . . . . . . . . . . . 44 9.2. Informative References . . . . . . . . . . . . . . . . . 45
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 45 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 46
1. Introduction 1. Introduction
Real-time Transport Protocol (RTP) [RFC3550] topologies describe Real-time Transport Protocol (RTP) [RFC3550] topologies describe
methods for interconnecting RTP entities and their processing methods for interconnecting RTP entities and their processing
behavior of RTP and RTCP. This document tries to address past and behavior of RTP and RTCP. This document tries to address past and
existing confusion, especially with respect to terms not defined in existing confusion, especially with respect to terms not defined in
RTP but in common use in the conversational communication industry, RTP but in common use in the conversational communication industry,
such as the Multipoint Control Unit or MCU. such as the Multipoint Control Unit or MCU.
skipping to change at page 5, line 39 skipping to change at page 5, line 39
3.1. Point to Point 3.1. Point to Point
Shortcut name: Topo-Point-to-Point Shortcut name: Topo-Point-to-Point
The Point to Point (PtP) topology (Figure 1) consists of two End The Point to Point (PtP) topology (Figure 1) consists of two End
Points, communicating using unicast. Both RTP and RTCP traffic are Points, communicating using unicast. Both RTP and RTCP traffic are
conveyed endpoint-to-endpoint, using unicast traffic only (even if, conveyed endpoint-to-endpoint, using unicast traffic only (even if,
in exotic cases, this unicast traffic happens to be conveyed over an in exotic cases, this unicast traffic happens to be conveyed over an
IP-multicast address). IP-multicast address).
+---+ +---+ +---+ +---+
| A |<------->| B | | A |<------->| B |
+---+ +---+ +---+ +---+
Figure 1: Point to Point Figure 1: Point to Point
The main property of this topology is that A sends to B, and only B, The main property of this topology is that A sends to B, and only B,
while B sends to A, and only A. This avoids all complexities of while B sends to A, and only A. This avoids all complexities of
handling multiple End Points and combining the requirements stemming handling multiple End Points and combining the requirements stemming
from them. Note that an End Point can still use multiple RTP from them. Note that an End Point can still use multiple RTP
Synchronization Sources (SSRCs) in an RTP session. The number of RTP Synchronization Sources (SSRCs) in an RTP session. The number of RTP
sessions in use between A and B can also be of any number, subject sessions in use between A and B can also be of any number, subject
only to system level limitations like the number range of ports. only to system level limitations like the number range of ports.
skipping to change at page 7, line 37 skipping to change at page 7, line 37
transport translators. These middleboxes come in many variations transport translators. These middleboxes come in many variations
including NAT [RFC3022] traversal by pinning the media path to a including NAT [RFC3022] traversal by pinning the media path to a
public address domain relay, network topologies where the RTP stream public address domain relay, network topologies where the RTP stream
is required to pass a particular point for audit by employing is required to pass a particular point for audit by employing
relaying, or preserving privacy by hiding each peer's transport relaying, or preserving privacy by hiding each peer's transport
addresses to the other party. Other protocols or functionalities addresses to the other party. Other protocols or functionalities
that provide this behavior are TURN [RFC5766] servers, Session Border that provide this behavior are TURN [RFC5766] servers, Session Border
Gateways and Media Processing Nodes with media anchoring Gateways and Media Processing Nodes with media anchoring
functionalities. functionalities.
+---+ +---+ +---+ +---+ +---+ +---+
| A |<------>| T |<------->| B | | A |<------>| T |<------->| B |
+---+ +---+ +---+ +---+ +---+ +---+
Figure 2: Point to Point with Translator Figure 2: Point to Point with Translator
A common element in these functions is that they are normally A common element in these functions is that they are normally
transparent at the RTP level, i.e., they perform no changes on any transparent at the RTP level, i.e., they perform no changes on any
RTP or RTCP packet fields and only affect the lower layers. They may RTP or RTCP packet fields and only affect the lower layers. They may
affect, however, the path the RTP and RTCP packets are routed between affect, however, the path the RTP and RTCP packets are routed between
the End Points in the RTP session, and thereby indirectly affect the the End Points in the RTP session, and thereby indirectly affect the
RTP session. For this reason, one could believe that transport RTP session. For this reason, one could believe that transport
translator-type middleboxes do not need to be included in this translator-type middleboxes do not need to be included in this
skipping to change at page 9, line 42 skipping to change at page 9, line 42
A variant of translator behaviour worth pointing out is the one A variant of translator behaviour worth pointing out is the one
depicted in Figure 3 of an End Point A sending an RTP stream depicted in Figure 3 of an End Point A sending an RTP stream
containing media (only) to B. On the path there is a device T that containing media (only) to B. On the path there is a device T that
on A's behalf manipulates the RTP streams. One common example is on A's behalf manipulates the RTP streams. One common example is
that T adds a second RTP stream containing Forward Error Correction that T adds a second RTP stream containing Forward Error Correction
(FEC) information in order to protect A's (non FEC-protected) RTP (FEC) information in order to protect A's (non FEC-protected) RTP
stream. In this case, T needs to semantically bind the new FEC RTP stream. In this case, T needs to semantically bind the new FEC RTP
stream to A's media-carrying RTP stream, for example by using the stream to A's media-carrying RTP stream, for example by using the
same CNAME as A. same CNAME as A.
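As a minimal sketch of how a receiver such as B might use the CNAME binding described above, the Python fragment below groups SSRCs by the CNAME learned from RTCP SDES items, so that the FEC stream added by T is associated with A's media stream. The function name, SSRC values, and CNAME strings are hypothetical and are not taken from the draft or from any RTP library.

```python
from collections import defaultdict
from typing import Dict, List


def group_ssrcs_by_cname(sdes_cnames: Dict[int, str]) -> Dict[str, List[int]]:
    """Group SSRCs that share a CNAME (SSRC -> CNAME as learned from RTCP SDES)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for ssrc, cname in sorted(sdes_cnames.items()):
        groups[cname].append(ssrc)
    return dict(groups)


# Hypothetical values: 0x1111 is A's media SSRC, 0x2222 is the FEC SSRC
# that T added on A's behalf using A's CNAME, 0x3333 belongs to someone else.
observed = {
    0x1111: "a@example.com",
    0x2222: "a@example.com",
    0x3333: "b@example.com",
}
bound = group_ssrcs_by_cname(observed)
```

Here both 0x1111 and 0x2222 end up under the same CNAME, which is the semantic binding the text refers to.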
+------+ +------+ +------+ +------+ +------+ +------+
| | | | | | | | | | | |
| A |------->| T |-------->| B | | A |------->| T |-------->| B |
| | | |---FEC-->| | | | | |---FEC-->| |
+------+ +------+ +------+ +------+ +------+ +------+
Figure 3: Media Translator adding FEC Figure 3: Media Translator adding FEC
There may also be cases where information is added into the original There may also be cases where information is added into the original
RTP stream, while leaving most or all of the original RTP packets RTP stream, while leaving most or all of the original RTP packets
intact (with the exception of certain RTP header fields, such as the intact (with the exception of certain RTP header fields, such as the
sequence number). One example is the injection of meta-data into the sequence number). One example is the injection of meta-data into the
RTP stream, carried in their own RTP packets. RTP stream, carried in their own RTP packets.
Similarly, a Media Translator can sometimes remove information from Similarly, a Media Translator can sometimes remove information from
the RTP stream, while otherwise leaving teh remaining RTP packets the RTP stream, while otherwise leaving the remaining RTP packets
unchanged (again with the exception of certain RTP header fields). unchanged (again with the exception of certain RTP header fields).
Either type of functionality where T manipulates the RTP stream, or Either type of functionality where T manipulates the RTP stream, or
adds an accompanying RTP stream, on behalf of A is also covered under adds an accompanying RTP stream, on behalf of A is also covered under
the media translator definition. the media translator definition.
3.2.2. Back to Back RTP sessions 3.2.2. Back to Back RTP sessions
There exist middleboxes that interconnect two End Points A and B There exist middleboxes that interconnect two End Points A and B
through themselves (MB), but not by being part of a common RTP through themselves (MB), but not by being part of a common RTP
session. They establish instead two different RTP sessions, one session. They establish instead two different RTP sessions, one
between A and the middlebox and another between the middlebox and B. between A and the middlebox and another between the middlebox and B.
This topology is called Topo-Back-To-Back. This topology is called Topo-Back-To-Back.
|<--Session A-->| |<--Session B-->| |<--Session A-->| |<--Session B-->|
+------+ +------+ +------+ +------+ +------+ +------+
| A |------->| MB |-------->| B | | A |------->| MB |-------->| B |
+------+ +------+ +------+ +------+ +------+ +------+
Figure 4: Back-to-back RTP sessions through Middlebox Figure 4: Back-to-back RTP sessions through Middlebox
The middlebox acts as an application-level gateway and bridges the The middlebox acts as an application-level gateway and bridges the
two RTP sessions. This bridging can be as basic as forwarding the two RTP sessions. This bridging can be as basic as forwarding the
RTP payloads between the sessions, or more complex including media RTP payloads between the sessions, or more complex including media
transcoding. The difference of this topology relative to the single transcoding. The difference of this topology relative to the single
RTP session context is the handling of the SSRCs and the other RTP session context is the handling of the SSRCs and the other
session-related identifiers, such as CNAMEs. With two different RTP session-related identifiers, such as CNAMEs. With two different RTP
sessions these can be freely changed and it becomes the middlebox's sessions these can be freely changed and it becomes the middlebox's
skipping to change at page 11, line 24 skipping to change at page 11, line 24
Multicast (ASM) [RFC1112] where any multicast group participant can Multicast (ASM) [RFC1112] where any multicast group participant can
send to the group address and expect the packet to reach all group send to the group address and expect the packet to reach all group
participants; and Source Specific Multicast (SSM) [RFC3569], where participants; and Source Specific Multicast (SSM) [RFC3569], where
only a particular IP host sends to the multicast group. Both these only a particular IP host sends to the multicast group. Both these
models are discussed below in their respective sections. models are discussed below in their respective sections.
3.3.1. Any Source Multicast (ASM) 3.3.1. Any Source Multicast (ASM)
Shortcut name: Topo-ASM (was Topo-Multicast) Shortcut name: Topo-ASM (was Topo-Multicast)
+-----+ +-----+
+---+ / \ +---+ +---+ / \ +---+
| A |----/ \---| B | | A |----/ \---| B |
+---+ / Multi- \ +---+ +---+ / Multi- \ +---+
+ Cast + + Cast +
+---+ \ Network / +---+ +---+ \ Network / +---+
| C |----\ /---| D | | C |----\ /---| D |
+---+ \ / +---+ +---+ \ / +---+
+-----+ +-----+
Figure 5: Point to Multipoint Using Multicast Figure 5: Point to Multipoint Using Multicast
Point to Multipoint (PtM) is defined here as using a multicast Point to Multipoint (PtM) is defined here as using a multicast
topology as a transmission model, in which traffic from any multicast topology as a transmission model, in which traffic from any multicast
group participant reaches all the other multicast group participants, group participant reaches all the other multicast group participants,
except for cases such as: except for cases such as:
o packet loss, or o packet loss, or
skipping to change at page 13, line 5 skipping to change at page 13, line 5
3.3.2. Source Specific Multicast (SSM) 3.3.2. Source Specific Multicast (SSM)
In Any Source Multicast, any of the multicast group participants can In Any Source Multicast, any of the multicast group participants can
send to all the other multicast group participants, by sending a send to all the other multicast group participants, by sending a
packet to the multicast group. In contrast, Source Specific packet to the multicast group. In contrast, Source Specific
Multicast [RFC3569][RFC4607] refers to scenarios where only a single Multicast [RFC3569][RFC4607] refers to scenarios where only a single
source (Distribution Source) can send to the multicast group, source (Distribution Source) can send to the multicast group,
creating a topology that looks like the one below: creating a topology that looks like the one below:
+--------+ +-----+ +--------+ +-----+
|Media | | | Source-specific |Media | | | Source-specific
|Sender 1|<----->| D S | Multicast |Sender 1|<----->| D S | Multicast
+--------+ | I O | +--+----------------> R(1) +--------+ | I O | +--+----------------> R(1)
| S U | | | | | S U | | | |
+--------+ | T R | | +-----------> R(2) | +--------+ | T R | | +-----------> R(2) |
|Media |<----->| R C |->+ | : | | |Media |<----->| R C |->+ | : | |
|Sender 2| | I E | | +------> R(n-1) | | |Sender 2| | I E | | +------> R(n-1) | |
+--------+ | B | | | | | | +--------+ | B | | | | | |
: | U | +--+--> R(n) | | | : | U | +--+--> R(n) | | |
: | T +-| | | | | : | T +-| | | | |
: | I | |<---------+ | | | : | I | |<---------+ | | |
+--------+ | O |F|<---------------+ | | +--------+ | O |F|<---------------+ | |
|Media | | N |T|<--------------------+ | |Media | | N |T|<--------------------+ |
|Sender M|<----->| | |<-------------------------+ |Sender M|<----->| | |<-------------------------+
+--------+ +-----+ RTCP Unicast +--------+ +-----+ RTCP Unicast
FT = Feedback Target FT = Feedback Target
Transport from the Feedback Target to the Distribution Transport from the Feedback Target to the Distribution
Source is via unicast or multicast RTCP if they are not Source is via unicast or multicast RTCP if they are not
co-located. co-located.
Figure 6: Point to Multipoint using Source Specific Multicast Figure 6: Point to Multipoint using Source Specific Multicast
In the SSM topology (Figure 6) a number of RTP sending End Points In the SSM topology (Figure 6) a number of RTP sending End Points
(RTP sources henceforth) (1 to M) are allowed to send media to the (RTP sources henceforth) (1 to M) are allowed to send media to the
SSM group. These sources send media to a dedicated distribution SSM group. These sources send media to a dedicated distribution
source, which forwards the RTP streams to the multicast group on source, which forwards the RTP streams to the multicast group on
behalf of the original RTP sources. The RTP streams reach the behalf of the original RTP sources. The RTP streams reach the
receiving End Points (Receivers henceforth) (R(1) to R(n)). The receiving End Points (Receivers henceforth) (R(1) to R(n)). The
Receivers' RTCP messages cannot be sent to the multicast group, as Receivers' RTCP messages cannot be sent to the multicast group, as
skipping to change at page 14, line 38 skipping to change at page 14, line 38
not visually pleasing results. not visually pleasing results.
Security solutions for this type of group communications are also Security solutions for this type of group communications are also
challenging. First, the key-management and the security protocol challenging. First, the key-management and the security protocol
must support group communication. Source authentication becomes more must support group communication. Source authentication becomes more
difficult and requires specialized solutions. For more discussion on difficult and requires specialized solutions. For more discussion on
this please review Options for Securing RTP Sessions [RFC7201]. this please review Options for Securing RTP Sessions [RFC7201].
3.3.3. SSM with Local Unicast Resources 3.3.3. SSM with Local Unicast Resources
[RFC6285] "Unicast-Based Rapid Acquisition of Multicast RTP Sessions" "Unicast-Based Rapid Acquisition of Multicast RTP Sessions" [RFC6285]
results in additional extensions to SSM Topology. results in additional extensions to SSM Topology.
----------- -------------- ----------- --------------
| |------------------------------------>| | | |------------------------------------>| |
| |.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.->| | | |.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.->| |
| | | | | | | |
| Multicast | ---------------- | | | Multicast | ---------------- | |
| Source | | Retransmission | | | | Source | | Retransmission | | |
| |-------->| Server (RS) | | | | |-------->| Server (RS) | | |
| |.-.-.-.->| | | | | |.-.-.-.->| | | |
skipping to change at page 15, line 36 skipping to change at page 15, line 36
| ------------ | | | | ------------ | | |
| | | | | | | |
---------------- -------------- ---------------- --------------
-------> Multicast RTP Stream -------> Multicast RTP Stream
.-.-.-.> Multicast RTCP Stream .-.-.-.> Multicast RTCP Stream
.=.=.=.> Unicast RTCP Reports .=.=.=.> Unicast RTCP Reports
~~~~~~~> Unicast RTCP Feedback Messages ~~~~~~~> Unicast RTCP Feedback Messages
.......> Unicast RTP Stream .......> Unicast RTP Stream
Figure 7 Figure 7: SSM with Local Unicast Resources (RAMS)
The Rapid acquisition extension allows an End Point joining an SSM The Rapid acquisition extension allows an End Point joining an SSM
multicast session to request media starting with the last sync-point multicast session to request media starting with the last sync-point
(from where media can be decoded without requiring context (from where media can be decoded without requiring context
established by the decoding of prior packets) to be sent at high established by the decoding of prior packets) to be sent at high
speed until such time where, after decoding of these burst-delivered speed until such time where, after decoding of these burst-delivered
media packets, the correct media timing is established, i.e. media media packets, the correct media timing is established, i.e. media
packets are received within adequate buffer intervals for this packets are received within adequate buffer intervals for this
application. This is accomplished by first establishing a unicast application. This is accomplished by first establishing a unicast
PtP RTP session between the Burst/Retransmission Source (BRS, PtP RTP session between the Burst/Retransmission Source (BRS,
Figure 7) and the RTP Receiver. The unicast session is used to Figure 7) and the RTP Receiver. The unicast session is used to
transmit cached packets from the multicast group at higher than transmit cached packets from the multicast group at higher than
normal speed in order to synchronize the receiver to the ongoing normal speed in order to synchronize the receiver to the ongoing
multicast RTP stream. Once the RTP receiver and its decoder have multicast RTP stream. Once the RTP receiver and its decoder have
caught up with the multicast session's current delivery, the receiver caught up with the multicast session's current delivery, the receiver
switches over to receiving directly from the multicast group. The switches over to receiving directly from the multicast group. In
(still existing) PtP RTP session is, in many deployed applications, many deployed application, the (still existing) PtP RTP session is
be used as a repair channel, i.e., for RTP Retransmission traffic of used as a repair channel, i.e., for RTP Retransmission traffic of
those packets that were not received from the multicast group. those packets that were not received from the multicast group.
3.4. Point to Multipoint Using Mesh 3.4. Point to Multipoint Using Mesh
Shortcut name: Topo-Mesh Shortcut name: Topo-Mesh
+---+ +---+ +---+ +---+
| A |<---->| B | | A |<---->| B |
+---+ +---+ +---+ +---+
^ ^ ^ ^
\ / \ /
\ / \ /
v v v v
+---+ +---+
| C | | C |
+---+ +---+
Figure 8: Point to Multi-Point using Mesh Figure 8: Point to Multi-Point using Mesh
Based on the RTP session definition, it is clearly possible to have a Based on the RTP session definition, it is clearly possible to have a
joint RTP session involving three or more End Points over multiple joint RTP session involving three or more End Points over multiple
unicast transport flows, like the joint three End Point session unicast transport flows, like the joint three End Point session
depicted above. In this case, A needs to send its RTP streams and depicted above. In this case, A needs to send its RTP streams and
RTCP packets to both B and C over their respective transport flows. RTCP packets to both B and C over their respective transport flows.
As long as all End Points do the same, everyone will have a joint As long as all End Points do the same, everyone will have a joint
view of the RTP session. view of the RTP session.
This topology does not create any additional requirements beyond the This topology does not create any additional requirements beyond the
need to have multiple transport flows associated with a single RTP need to have multiple transport flows associated with a single RTP
session. Note that an End Point may use a single local port to session. Note that an End Point may use a single local port to
receive all these transport flows (in which case the sending port, IP receive all these transport flows (in which case the sending port, IP
address, or SSRC can be used to demultiplex), or it might have address, or SSRC can be used to demultiplex), or it might have
separate local reception ports for each of the End Points. separate local reception ports for each of the End Points.
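The single-port case can be sketched as follows. This is an illustrative Python fragment, not part of the draft: the class `FlowDemux`, its method names, and the addresses and SSRC values are all hypothetical. It assumes plain RTP packets (SSRC in bytes 8 to 11 of the fixed header, per RFC 3550) and tries the source transport address first, then falls back to the SSRC.

```python
import struct
from typing import Dict, Optional, Tuple

Addr = Tuple[str, int]  # (source IP address, source port)


class FlowDemux:
    """Maps packets arriving on one local port to the remote End Point."""

    def __init__(self) -> None:
        self.by_addr: Dict[Addr, str] = {}
        self.by_ssrc: Dict[int, str] = {}

    def register(self, endpoint: str, addr: Addr, ssrc: int) -> None:
        # Remember both identifiers so either one can be used to demultiplex.
        self.by_addr[addr] = endpoint
        self.by_ssrc[ssrc] = endpoint

    def classify(self, src: Addr, packet: bytes) -> Optional[str]:
        # First choice: the sending IP address and port.
        if src in self.by_addr:
            return self.by_addr[src]
        # Fallback: the SSRC field of the 12-byte fixed RTP header.
        if len(packet) >= 12:
            ssrc = int.from_bytes(packet[8:12], "big")
            return self.by_ssrc.get(ssrc)
        return None


def rtp_header(ssrc: int, seq: int = 1, ts: int = 0, pt: int = 96) -> bytes:
    """Build a minimal fixed RTP header (version 2, no CSRCs) for the example."""
    return struct.pack("!BBHII", 0x80, pt, seq, ts, ssrc)


demux = FlowDemux()
demux.register("B", ("192.0.2.1", 5004), 0x0B0B0B0B)
demux.register("C", ("192.0.2.2", 5004), 0x0C0C0C0C)
```

An End Point that instead uses separate local reception ports per peer would not need this lookup, since the receiving socket already identifies the flow.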
+-A--------------------+ +-A--------------------+
|+---+ | |+---+ |
||CAM| | +-B-----------+ ||CAM| | +-B-----------+
|+---+ +-UDP1------| |-UDP1------+ | |+---+ +-UDP1------| |-UDP1------+ |
| | | +-RTP1----| |-RTP1----+ | | | | | +-RTP1----| |-RTP1----+ | |
| V | | +-Video-| |-Video-+ | | | | V | | +-Video-| |-Video-+ | | |
|+----+ | | | |<----------------|BV1 | | | | |+----+ | | | |<----------------|BV1 | | | |
||ENC |----+-+-+--->AV1|---------------->| | | | | ||ENC |----+-+-+--->AV1|---------------->| | | | |
|+----+ | | +-------| |-------+ | | | |+----+ | | +-------| |-------+ | | |
| | | +---------| |---------+ | | | | | +---------| |---------+ | |
| | +-----------| |-----------+ | | | +-----------| |-----------+ |
| | | +-------------+ | | | +-------------+
| | | | | |
| | | +-C-----------+ | | | +-C-----------+
| | +-UDP2------| |-UDP2------+ | | | +-UDP2------| |-UDP2------+ |
| | | +-RTP1----| |-RTP1----+ | | | | | +-RTP1----| |-RTP1----+ | |
| | | | +-Video-| |-Video-+ | | | | | | | +-Video-| |-Video-+ | | |
| +-------+-+-+--->AV1|---------------->| | | | | | +-------+-+-+--->AV1|---------------->| | | | |
| | | | |<----------------|CV1 | | | | | | | | |<----------------|CV1 | | | |
| | | +-------| |-------+ | | | | | | +-------| |-------+ | | |
| | +---------| |---------+ | | | | +---------| |---------+ | |
| +-----------| |-----------+ | | +-----------| |-----------+ |
+----------------------+ +-------------+ +----------------------+ +-------------+
Figure 9: A Multi-unicast Mesh with a joint RTP session Figure 9: A Multi-unicast Mesh with a joint RTP session
A joint RTP session from End Point A's perspective for the Mesh A joint RTP session from End Point A's perspective for the Mesh
depicted in Figure 8 with a joint RTP session has multiple transport depicted in Figure 8 with a joint RTP session has multiple transport
flows, here enumerated as UDP1 and UDP2. However, there is only one flows, here enumerated as UDP1 and UDP2. However, there is only one
RTP session (RTP1). The Media Source (CAM) is encoded and RTP session (RTP1). The Media Source (CAM) is encoded and
transmitted over the SSRC (AV1) across both transport layers. transmitted over the SSRC (AV1) across both transport layers.
However, as this is a joint RTP session, the two streams must be the However, as this is a joint RTP session, the two streams must be the
same. Thus, a congestion control adaptation needed for the paths A same. Thus, a congestion control adaptation needed for the paths A
to B and A to C needs to use the most restricting path's properties. to B and A to C needs to use the most restricting path's properties.
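The constraint above amounts to taking the minimum over the per-path estimates. The Python fragment below is a minimal sketch under that reading; the function name and the bit-rate figures are hypothetical, and how each path estimate is obtained is outside its scope.

```python
from typing import Dict


def joint_session_bitrate(path_estimates_bps: Dict[str, int]) -> int:
    """Return the single send rate usable on every path of a joint RTP session.

    Because the same RTP stream goes to every peer, the rate is bounded by
    the most restricted path.
    """
    if not path_estimates_bps:
        raise ValueError("at least one path estimate is required")
    return min(path_estimates_bps.values())


# Hypothetical congestion control estimates for the two paths from A.
estimates = {"A->B": 1_200_000, "A->C": 400_000}
rate = joint_session_bitrate(estimates)
```

With these example figures the stream to B is also held to the rate that the A to C path can sustain.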
An alternative structure for establishing the above topology is to An alternative structure for establishing the above topology is to
use independent RTP sessions between each pair of peers, i.e., three use independent RTP sessions between each pair of peers, i.e., three
different RTP sessions. In some scenarios, the same RTP stream may different RTP sessions. In some scenarios, the same RTP stream may
be sent from the transmitting End Point, however it also supports be sent from the transmitting End Point, however it also supports
local adaptation taking place in one or more of the RTP streams, local adaptation taking place in one or more of the RTP streams,
rendering them non-identical. rendering them non-identical.
+-A----------------------+ +-B-----------+ +-A----------------------+ +-B-----------+
|+---+ | | | |+---+ | | |
||MIC| +-UDP1------| |-UDP1------+ | ||MIC| +-UDP1------| |-UDP1------+ |
|+---+ | +-RTP1----| |-RTP1----+ | | |+---+ | +-RTP1----| |-RTP1----+ | |
| | +----+ | | +-Audio-| |-Audio-+ | | | | | +----+ | | +-Audio-| |-Audio-+ | | |
| +->|ENC1|--+-+-+--->AA1|------------->| | | | | | +->|ENC1|--+-+-+--->AA1|------------->| | | | |
| | +----+ | | | |<-------------|BA1 | | | | | | +----+ | | | |<-------------|BA1 | | | |
| | | | +-------| |-------+ | | | | | | | +-------| |-------+ | | |
| | | +---------| |---------+ | | | | | +---------| |---------+ | |
| | +-----------| |-----------+ | | | +-----------| |-----------+ |
| | ------------| |-------------| | | ------------| |-------------|
| | | |-------------+ | | | |-------------+
| | | | | |
| | | +-C-----------+ | | | +-C-----------+
| | | | | | | | | |
| | +-UDP2------| |-UDP2------+ | | | +-UDP2------| |-UDP2------+ |
| | | +-RTP2----| |-RTP2----+ | | | | | +-RTP2----| |-RTP2----+ | |
| | +----+ | | +-Audio-| |-Audio-+ | | | | | +----+ | | +-Audio-| |-Audio-+ | | |
| +->|ENC2|--+-+-+--->AA2|------------->| | | | | | +->|ENC2|--+-+-+--->AA2|------------->| | | | |
| +----+ | | | |<-------------|CA1 | | | | | +----+ | | | |<-------------|CA1 | | | |
| | | +-------| |-------+ | | | | | | +-------| |-------+ | | |
| | +---------| |---------+ | | | | +---------| |---------+ | |
| +-----------| |-----------+ | | +-----------| |-----------+ |
+------------------------+ +-------------+ +------------------------+ +-------------+
Figure 10: A Multi-unicast Mesh with Independent RTP Sessions Figure 10: A Multi-unicast Mesh with Independent RTP Sessions
Let's review the topology when independent RTP sessions are used, from Let's review the topology when independent RTP sessions are used, from
A's perspective in Figure 8 by considering both how the media is a A's perspective in Figure 10 by considering both how the media is
handled and the RTP sessions that are set-up in Figure 10. A's handled and the RTP sessions that are set-up in Figure 10. A's
microphone is captured and the digital audio can then be fed into two microphone is captured and the audio is fed into two different
different encoder instances, as each being associated with two encoder instances, each being associated with two independent RTP
independent RTP sessions (RTP1 and RTP2). The SSRCs (AA1 and AA2) in sessions (RTP1 and RTP2). The SSRCs (AA1 and AA2) in each RTP
each RTP session are completely independent and the media bit-rate session are completely independent and the media bit-rate produced by
produced by the encoders can also be tuned differently to address any the encoders can also be tuned differently to address any congestion
congestion control requirements differing for the paths A to B control requirements differing for the paths A to B compared to A to
compared to A to C. C.
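The difference in congestion adaptation between the joint session and the independent sessions can be illustrated with a minimal sketch. The function names and the bit-rate figures are hypothetical and only serve the illustration; the drafts do not define any such API.

```python
def joint_session_bitrate(path_estimates_bps):
    """Single joint RTP session: the same stream goes to every peer,
    so the sender has to follow the most restrictive path."""
    return min(path_estimates_bps.values())


def independent_session_bitrates(path_estimates_bps):
    """One RTP session (and encoder instance) per peer: each encoder
    can be tuned to the estimate of its own path."""
    return dict(path_estimates_bps)


# Hypothetical congestion control estimates for the two paths from A.
paths = {"A->B": 256_000, "A->C": 64_000}

joint = joint_session_bitrate(paths)
independent = independent_session_bitrates(paths)
```

With these assumed estimates, the joint session limits both receivers to the A to C rate, while the independent sessions let ENC1 and ENC2 run at different rates.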
From a topologies viewpoint, an important difference exists in the From a topologies viewpoint, an important difference exists in the
behavior around RTCP. First, when a single RTP session spans all behavior around RTCP. First, when a single RTP session spans all
three End Points A, B, and C, and their connecting RTP streams, a three End Points A, B, and C, and their connecting RTP streams, a
common RTCP bandwidth is calculated and used for this single joint common RTCP bandwidth is calculated and used for this single joint
session. In contrast, when there are multiple independent RTP session. In contrast, when there are multiple independent RTP
sessions, each RTP session has its local RTCP bandwidth allocation. sessions, each RTP session has its local RTCP bandwidth allocation.
Further, when multiple sessions are used, End Points not directly Further, when multiple sessions are used, End Points not directly
involved in a session do not have any awareness of the conditions in involved in a session do not have any awareness of the conditions in
skipping to change at page 19, line 31 skipping to change at page 19, line 31
multipoint of Translators compared to the point to point only cases multipoint of Translators compared to the point to point only cases
in Section 3.2.1. in Section 3.2.1.
3.5.1. Relay - Transport Translator 3.5.1. Relay - Transport Translator
Shortcut name: Topo-PtM-Trn-Translator Shortcut name: Topo-PtM-Trn-Translator
This section discusses Transport Translator only usages to enable This section discusses Transport Translator only usages to enable
multipoint sessions. multipoint sessions.
+-----+ +-----+
+---+ / \ +------------+ +---+ +---+ / \ +------------+ +---+
| A |<---/ \ | |<---->| B | | A |<---/ \ | |<---->| B |
+---+ / Multi- \ | | +---+ +---+ / Multi- \ | | +---+
+ cast +->| Translator | + cast +->| Translator |
+---+ \ Network / | | +---+ +---+ \ Network / | | +---+
| C |<---\ / | |<---->| D | | C |<---\ / | |<---->| D |
+---+ \ / +------------+ +---+ +---+ \ / +------------+ +---+
+-----+ +-----+
Figure 11: Point to Multipoint Using Multicast Figure 11: Point to Multipoint Using Multicast
Figure 11 depicts an example of a Transport Translator performing at Figure 11 depicts an example of a Transport Translator performing at
least IP address translation. It allows the (non-multicast-capable) least IP address translation. It allows the (non-multicast-capable)
End Points B and D to take part in an any source multicast session End Points B and D to take part in an any source multicast session
involving End Points A and C, by having the Translator forward their involving End Points A and C, by having the Translator forward their
unicast traffic to the multicast addresses in use, and vice versa. unicast traffic to the multicast addresses in use, and vice versa.
It must also forward B's traffic to D, and vice versa, to provide It must also forward B's traffic to D, and vice versa, to provide
each of B and D with a complete view of the session. each of B and D with a complete view of the session.
+---+ +------------+ +---+ +---+ +------------+ +---+
| A |<---->| |<---->| B | | A |<---->| |<---->| B |
+---+ | | +---+ +---+ | | +---+
| Translator | | Translator |
+---+ | | +---+ +---+ | | +---+
| C |<---->| |<---->| D | | C |<---->| |<---->| D |
+---+ +------------+ +---+ +---+ +------------+ +---+
Figure 12: RTP Translator (Relay) with Only Unicast Paths Figure 12: RTP Translator (Relay) with Only Unicast Paths
Another Translator scenario is depicted in Figure 12. The Translator Another Translator scenario is depicted in Figure 12. The Translator
in this case connects multiple End Points through unicast. This can in this case connects multiple End Points through unicast. This can
be implemented using a very simple transport Translator which, in be implemented using a very simple transport Translator which, in
this document, is called a relay. The relay forwards all traffic it this document, is called a relay. The relay forwards all traffic it
receives, both RTP and RTCP, to all other End Points. In doing so, a receives, both RTP and RTCP, to all other End Points. In doing so, a
multicast network is emulated without relying on a multicast-capable multicast network is emulated without relying on a multicast-capable
network infrastructure. network infrastructure.
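As an illustration of the relay behavior described above, the following sketch forwards every received packet, unmodified, to all End Points other than the sender. The function name and the in-memory representation of End Points are assumptions made for the example; a real relay would operate on transport addresses and sockets.

```python
def relay(sender, packet, endpoints):
    """Return (destination, packet) pairs for one received packet.

    The packet (RTP or RTCP) is forwarded as-is to every End Point
    except the one it came from, emulating a multicast group.
    """
    return [(dst, packet) for dst in endpoints if dst != sender]


# Hypothetical four-party session as in Figure 12.
endpoints = ["A", "B", "C", "D"]
out = relay("B", b"rtp-or-rtcp-bytes", endpoints)
```

Because nothing in the packet is rewritten, each End Point sees the original SSRCs and RTCP reports of all other participants.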
skipping to change at page 21, line 19 skipping to change at page 21, line 19
3.6. Point to Multipoint Using the RFC 3550 Mixer Model 3.6. Point to Multipoint Using the RFC 3550 Mixer Model
Shortcut name: Topo-Mixer Shortcut name: Topo-Mixer
A Mixer is a middlebox that aggregates multiple RTP streams that are A Mixer is a middlebox that aggregates multiple RTP streams that are
part of a session by generating one or more new RTP streams and, in part of a session by generating one or more new RTP streams and, in
most cases, by manipulating the media data. One common application most cases, by manipulating the media data. One common application
for a Mixer is to allow a participant to receive a session with a for a Mixer is to allow a participant to receive a session with a
reduced amount of resources. reduced amount of resources.
+-----+ +-----+
+---+ / \ +-----------+ +---+ +---+ / \ +-----------+ +---+
| A |<---/ \ | |<---->| B | | A |<---/ \ | |<---->| B |
+---+ / Multi- \ | | +---+ +---+ / Multi- \ | | +---+
+ cast +->| Mixer | + cast +->| Mixer |
+---+ \ Network / | | +---+ +---+ \ Network / | | +---+
| C |<---\ / | |<---->| D | | C |<---\ / | |<---->| D |
+---+ \ / +-----------+ +---+ +---+ \ / +-----------+ +---+
+-----+ +-----+
Figure 13: Point to Multipoint Using the RFC 3550 Mixer Model Figure 13: Point to Multipoint Using the RFC 3550 Mixer Model
A Mixer can be viewed as a device terminating the RTP streams A Mixer can be viewed as a device terminating the RTP streams
received from other End Points in the same RTP session. Using the received from other End Points in the same RTP session. Using the
media data carried in the received RTP streams, a Mixer generates media data carried in the received RTP streams, a Mixer generates
derived RTP streams that are sent to the receiving End Points. derived RTP streams that are sent to the receiving End Points.
The content that the Mixer provides is the mixed aggregate of what The content that the Mixer provides is the mixed aggregate of what
the Mixer receives over the PtP or PtM paths, which are part of the the Mixer receives over the PtP or PtM paths, which are part of the
skipping to change at page 22, line 18 skipping to change at page 22, line 18
with its role. It is an RTP receiver and should therefore send RTCP with its role. It is an RTP receiver and should therefore send RTCP
receiver reports for the RTP streams it receives and terminates. In receiver reports for the RTP streams it receives and terminates. In
its role as an RTP sender, it should also generate RTCP sender its role as an RTP sender, it should also generate RTCP sender
reports for those RTP streams it sends. As specified in Section 7.3 reports for those RTP streams it sends. As specified in Section 7.3
of RFC 3550, a Mixer must not forward RTCP unaltered between the two of RFC 3550, a Mixer must not forward RTCP unaltered between the two
domains. domains.
The Mixer depicted in Figure 13 is involved in three domains that The Mixer depicted in Figure 13 is involved in three domains that
need to be separated: the any source multicast network (including End need to be separated: the any source multicast network (including End
Points A and C), End Point B, and End Point D. Assuming all four End Points A and C), End Point B, and End Point D. Assuming all four End
Points in the conference are interested in receiving content from Points in the conference are interested in receiving content from all
each other End Point, the Mixer produces different mixed RTP streams other End Points, the Mixer produces different mixed RTP streams for
for B and D, as the one to B may contain content received from D, and B and D, as the one to B may contain content received from D, and
vice versa. However, the Mixer may only need one SSRC per media type vice versa. However, the Mixer may only need one SSRC per media type
in each domain where it is the receiving entity and transmitter of in each domain where it is the receiving entity and transmitter of
mixed content. mixed content.
In the multicast domain, a Mixer still needs to provide a mixed view In the multicast domain, a Mixer still needs to provide a mixed view
of the other domains. This makes the Mixer simpler to implement and of the other domains. This makes the Mixer simpler to implement and
avoids any issues with advanced RTCP handling or loop detection, avoids any issues with advanced RTCP handling or loop detection,
which would be problematic if the Mixer were providing non-symmetric which would be problematic if the Mixer were providing non-symmetric
behavior. Please see Section 3.11 for more discussion on this topic. behavior. Please see Section 3.11 for more discussion on this topic.
The mixing operation, however, in each domain could potentially be The mixing operation, however, in each domain could potentially be
skipping to change at page 23, line 5 skipping to change at page 23, line 5
and transmission of RTCP feedback messages by the Mixer to the End and transmission of RTCP feedback messages by the Mixer to the End
Points in the other domain(s). In other cases, a message is handled Points in the other domain(s). In other cases, a message is handled
by the Mixer locally and therefore not forwarded to any other domain. by the Mixer locally and therefore not forwarded to any other domain.
When replacing the multicast network in Figure 13 (to the left of the When replacing the multicast network in Figure 13 (to the left of the
Mixer) with individual unicast paths as depicted in Figure 14, the Mixer) with individual unicast paths as depicted in Figure 14, the
Mixer model is very similar to the one discussed in Section 3.9 Mixer model is very similar to the one discussed in Section 3.9
below. Please see the discussion in Section 3.9 about the below. Please see the discussion in Section 3.9 about the
differences between these two models. differences between these two models.
+---+ +------------+ +---+ +---+ +------------+ +---+
| A |<---->| |<---->| B | | A |<---->| |<---->| B |
+---+ | | +---+ +---+ | | +---+
| Mixer | | Mixer |
+---+ | | +---+ +---+ | | +---+
| C |<---->| |<---->| D | | C |<---->| |<---->| D |
+---+ +------------+ +---+ +---+ +------------+ +---+
Figure 14: RTP Mixer with Only Unicast Paths Figure 14: RTP Mixer with Only Unicast Paths
We now discuss in more detail the different mixing operations that a We now discuss in more detail the different mixing operations that a
mixer can perform and how they can affect RTP and RTCP behavior. mixer can perform and how they can affect RTP and RTCP behavior.
3.6.1. Media Mixing Mixer 3.6.1. Media Mixing Mixer
The media mixing mixer is likely the one that most think of when they The media mixing mixer is likely the one that most think of when they
hear the term "mixer". Its basic mode of operation is that it hear the term "mixer". Its basic mode of operation is that it
skipping to change at page 24, line 36 skipping to change at page 24, line 36
the lossy nature of most commonly used media codecs. A third problem the lossy nature of most commonly used media codecs. A third problem
is the latency introduced by the media mixing, which can be is the latency introduced by the media mixing, which can be
substantial and annoyingly noticeable in case of video, or in case of substantial and annoyingly noticeable in case of video, or in case of
audio if that mixed audio is lip-synchronized with high latency video. audio if that mixed audio is lip-synchronized with high latency video.
The advantage of media mixing is that it is straightforward for the The advantage of media mixing is that it is straightforward for the
End Points to handle the single media stream (which includes the End Points to handle the single media stream (which includes the
mixed aggregate of many sources), as they don't need to handle mixed aggregate of many sources), as they don't need to handle
multiple decodings, local mixing and composition. In fact, mixers multiple decodings, local mixing and composition. In fact, mixers
were introduced in pre-RTP times so that legacy, single stream were introduced in pre-RTP times so that legacy, single stream
receiving endpoints (that, in some protocol environments, actually receiving endpoints (that, in some protocol environments, actually
didn't need to be aware of the multipoint nature of teh conference) didn't need to be aware of the multipoint nature of the conference)
could successfully participate in what a user would recognize as a could successfully participate in what a user would recognize as a
multiparty video conference. multiparty video conference.
+-A---------+ +-MIXER----------------------+ +-A---------+ +-MIXER----------------------+
| +-RTP1----| |-RTP1------+ +-----+ | | +-RTP1----| |-RTP1------+ +-----+ |
| | +-Audio-| |-Audio---+ | +---+ | | | | | +-Audio-| |-Audio---+ | +---+ | | |
| | | AA1|--------->|---------+-+-|DEC|->| | | | | | AA1|--------->|---------+-+-|DEC|->| | |
| | | |<---------|MA1 <----+ | +---+ | | | | | | |<---------|MA1 <----+ | +---+ | | |
| | | | |(BA1+CA1)|\| +---+ | | | | | | | |(BA1+CA1)|\| +---+ | | |
| | +-------| |---------+ +-|ENC|<-| B+C | | | | +-------| |---------+ +-|ENC|<-| B+C | |
| +---------| |-----------+ +---+ | | | | +---------| |-----------+ +---+ | | |
+-----------+ | | | | +-----------+ | | | |
| | M | | | | M | |
+-B---------+ | | E | | +-B---------+ | | E | |
| +-RTP2----| |-RTP2------+ | D | | | +-RTP2----| |-RTP2------+ | D | |
| | +-Audio-| |-Audio---+ | +---+ | I | | | | +-Audio-| |-Audio---+ | +---+ | I | |
| | | BA1|--------->|---------+-+-|DEC|->| A | | | | | BA1|--------->|---------+-+-|DEC|->| A | |
| | | |<---------|MA2 <----+ | +---+ | | | | | | |<---------|MA2 <----+ | +---+ | | |
| | +-------| |(BA1+CA1)|\| +---+ | | | | | +-------| |(AA1+CA1)|\| +---+ | | |
| +---------| |---------+ +-|ENC|<-| A+C | | | +---------| |---------+ +-|ENC|<-| A+C | |
+-----------+ |-----------+ +---+ | | | +-----------+ |-----------+ +---+ | | |
| | M | | | | M | |
+-C---------+ | | I | | +-C---------+ | | I | |
| +-RTP3----| |-RTP3------+ | X | | | +-RTP3----| |-RTP3------+ | X | |
| | +-Audio-| |-Audio---+ | +---+ | E | | | | +-Audio-| |-Audio---+ | +---+ | E | |
| | | CA1|--------->|---------+-+-|DEC|->| R | | | | | CA1|--------->|---------+-+-|DEC|->| R | |
| | | |<---------|MA3 <----+ | +---+ | | | | | | |<---------|MA3 <----+ | +---+ | | |
| | +-------| |(BA1+CA1)|\| +---+ | | | | | +-------| |(AA1+BA1)|\| +---+ | | |
| +---------| |---------+ +-|ENC|<-| A+B | | | +---------| |---------+ +-|ENC|<-| A+B | |
+-----------+ |-----------+ +---+ +-----+ | +-----------+ |-----------+ +---+ +-----+ |
+----------------------------+ +----------------------------+
Figure 15: Session and SSRC details for Media Mixer Figure 15: Session and SSRC details for Media Mixer
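The mix composition shown in Figure 15 can be sketched as follows: for each End Point, the mixer produces one stream under its own SSRC that contains every source except that End Point's own. The function name and the dictionary layout are hypothetical; the SSRC labels follow the figure.

```python
def mix_plan(endpoint_ssrcs):
    """Map each mixer SSRC (MA1, MA2, ...) to the sources it mixes.

    The mix sent to an End Point contains all sources except the
    End Point's own contribution.
    """
    plan = {}
    for i, endpoint in enumerate(endpoint_ssrcs, start=1):
        plan[f"MA{i}"] = [
            ssrc for ep, ssrc in endpoint_ssrcs.items() if ep != endpoint
        ]
    return plan


# Audio SSRCs of the three End Points in Figure 15.
plan = mix_plan({"A": "AA1", "B": "BA1", "C": "CA1"})
```

In this sketch MA1 (sent to A) carries BA1+CA1, MA2 (sent to B) carries AA1+CA1, and MA3 (sent to C) carries AA1+BA1.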
From an RTP perspective media mixing can be a very simple process, as From an RTP perspective media mixing can be a very simple process, as
can be seen in Figure 15. The mixer presents one SSRC towards the can be seen in Figure 15. The mixer presents one SSRC towards the
receiving End Point, e.g., MA1 to Peer A, where the associated stream receiving End Point, e.g., MA1 to Peer A, where the associated stream
is the media mix of the other End Points. As each peer, in this is the media mix of the other End Points. As each peer, in this
example, receives a different version of a mix from the mixer, there example, receives a different version of a mix from the mixer, there
is no actual relation between the different RTP sessions in terms of is no actual relation between the different RTP sessions in terms of
actual media or transport level information. There are, however, actual media or transport level information. There are, however,
skipping to change at page 27, line 14 skipping to change at page 27, line 14
otherwise media transcoding for codec compatibility would still be otherwise media transcoding for codec compatibility would still be
required. required.
We now consider the operation of a media switching mixer that We now consider the operation of a media switching mixer that
supports a video conference with six participating End Points (A-F) supports a video conference with six participating End Points (A-F)
where the two most recent speakers in the conference are shown to where the two most recent speakers in the conference are shown to
each receiving End Point. The mixer has thus two SSRCs sending video each receiving End Point. The mixer has thus two SSRCs sending video
to each peer, and each peer is capable of locally handling two video to each peer, and each peer is capable of locally handling two video
streams simultaneously. streams simultaneously.
+-A---------+ +-MIXER----------------------+ +-A---------+ +-MIXER----------------------+
| +-RTP1----| |-RTP1------+ +-----+ | | +-RTP1----| |-RTP1------+ +-----+ |
| | +-Video-| |-Video---+ | | | | | | +-Video-| |-Video---+ | | | |
| | | AV1|------------>|---------+-+------->| S | | | | | AV1|------------>|---------+-+------->| S | |
| | | |<------------|MV1 <----+-+-BV1----| W | | | | | |<------------|MV1 <----+-+-BV1----| W | |
| | | |<------------|MV2 <----+-+-EV1----| I | | | | | |<------------|MV2 <----+-+-EV1----| I | |
| | +-------| |---------+ | | T | | | | +-------| |---------+ | | T | |
| +---------| |-----------+ | C | | | +---------| |-----------+ | C | |
+-----------+ | | H | | +-----------+ | | H | |
| | | | | | | |
+-B---------+ | | M | | +-B---------+ | | M | |
| +-RTP2----| |-RTP2------+ | A | | | +-RTP2----| |-RTP2------+ | A | |
| | +-Video-| |-Video---+ | | T | | | | +-Video-| |-Video---+ | | T | |
| | | BV1|------------>|---------+-+------->| R | | | | | BV1|------------>|---------+-+------->| R | |
| | | |<------------|MV3 <----+-+-AV1----| I | | | | | |<------------|MV3 <----+-+-AV1----| I | |
| | | |<------------|MV4 <----+-+-EV1----| X | | | | | |<------------|MV4 <----+-+-EV1----| X | |
| | +-------| |---------+ | | | | | | +-------| |---------+ | | | |
| +---------| |-----------+ | | | | +---------| |-----------+ | | |
+-----------+ | | | | +-----------+ | | | |
: : : : : : : :
: : : : : : : :
+-F---------+ | | | | +-F---------+ | | | |
| +-RTP6----| |-RTP6------+ | | | | +-RTP6----| |-RTP6------+ | | |
| | +-Video-| |-Video---+ | | | | | | +-Video-| |-Video---+ | | | |
| | | CV1|------------>|---------+-+------->| | | | | | FV1|------------>|---------+-+------->| | |
| | | |<------------|MV11 <---+-+-AV1----| | | | | | |<------------|MV11 <---+-+-AV1----| | |
| | | |<------------|MV12 <---+-+-EV1----| | | | | | |<------------|MV12 <---+-+-EV1----| | |
| | +-------| |---------+ | | | | | | +-------| |---------+ | | | |
| +---------| |-----------+ +-----+ | | +---------| |-----------+ +-----+ |
+-----------+ +----------------------------+ +-----------+ +----------------------------+
Figure 16: Media Switching RTP Mixer Figure 16: Media Switching RTP Mixer
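A possible selection rule that reproduces the forwarding shown in Figure 16 is sketched below. It assumes that a receiver is never sent its own video, which is inferred from the figure rather than stated in the text, and that the mixer keeps a most-recent-first list of speakers; both the function name and that list are hypothetical.

```python
def select_streams(receiver, recent_speakers, count=2):
    """Pick the `count` most recent speakers, skipping the receiver.

    `recent_speakers` is ordered from most recent to least recent.
    """
    return [s for s in recent_speakers if s != receiver][:count]


# Assumed speaker history: A spoke most recently, then E, then B.
recent = ["A", "E", "B", "C", "D", "F"]

to_a = select_streams("A", recent)
to_b = select_streams("B", recent)
to_f = select_streams("F", recent)
```

Under this assumed history, A receives the streams of E and B, while B and F both receive the streams of A and E, matching the sources drawn behind MV1/MV2, MV3/MV4, and MV11/MV12.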
The Media Switching RTP mixer can, similarly to the Media Mixing The Media Switching RTP mixer can, similarly to the Media Mixing
Mixer, reduce the bit-rate required for media transmission towards Mixer, reduce the bit-rate required for media transmission towards
the different peers by selecting and forwarding only a sub-set of RTP the different peers by selecting and forwarding only a sub-set of RTP
streams it receives from the sending End Points. In cases where the mixer streams it receives from the sending End Points. In cases where the mixer
receives simulcast transmissions or a scalable encoding of the media receives simulcast transmissions or a scalable encoding of the media
source, the mixer has more degrees of freedom to select streams or source, the mixer has more degrees of freedom to select streams or
sub-sets of stream to forward to a receiving End Point, both based on sub-sets of stream to forward to a receiving End Point, both based on
skipping to change at page 29, line 5 skipping to change at page 29, line 5
3.7. Selective Forwarding Middlebox 3.7. Selective Forwarding Middlebox
Another method for handling media in the RTP mixer is to "project", Another method for handling media in the RTP mixer is to "project",
or make available, all potential RTP sources (SSRCs) into a per-End or make available, all potential RTP sources (SSRCs) into a per-End
Point, independent RTP session. The middlebox can select which of Point, independent RTP session. The middlebox can select which of
the potential sources that are currently actively transmitting media the potential sources that are currently actively transmitting media
will be sent to each of the End Points. This is similar to the media will be sent to each of the End Points. This is similar to the media
switching Mixer but has some important differences in RTP details. switching Mixer but has some important differences in RTP details.
+-A---------+ +-Middlebox-----------------+ +-A---------+ +-Middlebox-----------------+
| +-RTP1----| |-RTP1------+ +-----+ | | +-RTP1----| |-RTP1------+ +-----+ |
| | +-Video-| |-Video---+ | | | | | | +-Video-| |-Video---+ | | | |
| | | AV1|------------>|---------+-+------>| | | | | | AV1|------------>|---------+-+------>| | |
| | | |<------------|BV1 <----+-+-------| S | | | | | |<------------|BV1 <----+-+-------| S | |
| | | |<------------|CV1 <----+-+-------| W | | | | | |<------------|CV1 <----+-+-------| W | |
| | | |<------------|DV1 <----+-+-------| I | | | | | |<------------|DV1 <----+-+-------| I | |
| | | |<------------|EV1 <----+-+-------| T | | | | | |<------------|EV1 <----+-+-------| T | |
| | | |<------------|FV1 <----+-+-------| C | | | | | |<------------|FV1 <----+-+-------| C | |
| | +-------| |---------+ | | H | | | | +-------| |---------+ | | H | |
| +---------| |-----------+ | | | | +---------| |-----------+ | | |
+-----------+ | | M | | +-----------+ | | M | |
| | A | | | | A | |
+-B---------+ | | T | | +-B---------+ | | T | |
| +-RTP2----| |-RTP2------+ | R | | | +-RTP2----| |-RTP2------+ | R | |
| | +-Video-| |-Video---+ | | I | | | | +-Video-| |-Video---+ | | I | |
| | | BV1|------------>|---------+-+------>| X | | | | | BV1|------------>|---------+-+------>| X | |
| | | |<------------|AV1 <----+-+-------| | | | | | |<------------|AV1 <----+-+-------| | |
| | | |<------------|CV1 <----+-+-------| | | | | | |<------------|CV1 <----+-+-------| | |
| | | | : : : |: : : : : : : : :| | | | | | | : : : |: : : : : : : : :| | |
| | | |<------------|FV1 <----+-+-------| | | | | | |<------------|FV1 <----+-+-------| | |
| | +-------| |---------+ | | | | | | +-------| |---------+ | | | |
| +---------| |-----------+ | | | | +---------| |-----------+ | | |
+-----------+ | | | | +-----------+ | | | |
: : : : : : : :
: : : : : : : :
+-F---------+ | | | | +-F---------+ | | | |
| +-RTP6----| |-RTP6------+ | | | | +-RTP6----| |-RTP6------+ | | |
| | +-Video-| |-Video---+ | | | | | | +-Video-| |-Video---+ | | | |
| | | FV1|------------>|---------+-+------>| | | | | | FV1|------------>|---------+-+------>| | |
| | | |<------------|AV1 <----+-+-------| | | | | | |<------------|AV1 <----+-+-------| | |
| | | | : : : |: : : : : : : : :| | | | | | | : : : |: : : : : : : : :| | |
| | | |<------------|EV1 <----+-+-------| | | | | | |<------------|EV1 <----+-+-------| | |
| | +-------| |---------+ | | | | | | +-------| |---------+ | | | |
| +---------| |-----------+ +-----+ | | +---------| |-----------+ +-----+ |
+-----------+ +---------------------------+ +-----------+ +---------------------------+
Figure 17: Selective Forwarding Middlebox Figure 17: Selective Forwarding Middlebox
In the six End Point conference depicted above in (Figure 17) one can In the six End Point conference depicted above in (Figure 17) one can
see that End Point A is aware of five incoming SSRCs, BV1-FV1. If see that End Point A is aware of five incoming SSRCs, BV1-FV1. If
this middlebox intends to have a similar behavior as in Section 3.6.2 this middlebox intends to have a similar behavior as in Section 3.6.2
where the mixer provides the End Points with the two latest speaking where the mixer provides the End Points with the two latest speaking
End Points, then only two out of these five SSRCs need concurrently End Points, then only two out of these five SSRCs need concurrently
transmit media to A. As the middlebox selects the source in the transmit media to A. As the middlebox selects the source in the
different RTP sessions that transmit media to the End points, each different RTP sessions that transmit media to the End points, each
skipping to change at page 32, line 4 skipping to change at page 32, line 4
contexts, are one point of difference. The other is how the contexts, are one point of difference. The other is how the
identification is performed, where the Mixer uses CSRC to provide identification is performed, where the Mixer uses CSRC to provide
information on what is included in a particular RTP stream that information on what is included in a particular RTP stream that
represents a particular concept. Selective forwarding gets the source represents a particular concept. Selective forwarding gets the source
information through the SSRC, and instead has to use other mechanisms information through the SSRC, and instead has to use other mechanisms
to make clear the stream's current purpose. to make clear the stream's current purpose.
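The identification difference can be made concrete with a small sketch of the relevant RTP header fields. The class and function names are hypothetical, and SSRCs are shown as the labels used in the figures rather than as 32-bit integers.

```python
from dataclasses import dataclass, field


@dataclass
class RtpHeader:
    ssrc: str                                  # label stands in for a 32-bit SSRC
    csrc: list = field(default_factory=list)   # contributing sources


def switching_mixer_header(mixer_ssrc, source_ssrc):
    """Media switching Mixer: the stream is sent under the Mixer's own
    SSRC, and the original source is identified through the CSRC list."""
    return RtpHeader(ssrc=mixer_ssrc, csrc=[source_ssrc])


def selective_forwarding_header(source_ssrc):
    """Selective forwarding: the original source SSRC is kept, so the
    CSRC list is not needed to identify the source."""
    return RtpHeader(ssrc=source_ssrc)


mixed = switching_mixer_header("MV1", "BV1")
forwarded = selective_forwarding_header("BV1")
```

In the Mixer case the receiver sees a stable SSRC (MV1) whose content changes over time, whereas in the selective forwarding case the SSRC itself names the source and the stream's current role has to be signaled by other means.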
3.8. Point to Multipoint Using Video Switching MCUs 3.8. Point to Multipoint Using Video Switching MCUs
Shortcut name: Topo-Video-switch-MCU Shortcut name: Topo-Video-switch-MCU
+---+ +------------+ +---+ +---+ +------------+ +---+
| A |------| Multipoint |------| B | | A |------| Multipoint |------| B |
+---+ | Control | +---+ +---+ | Control | +---+
| Unit | | Unit |
+---+ | (MCU) | +---+ +---+ | (MCU) | +---+
| C |------| |------| D | | C |------| |------| D |
+---+ +------------+ +---+ +---+ +------------+ +---+
Figure 18: Point to Multipoint Using a Video Switching MCU Figure 18: Point to Multipoint Using a Video Switching MCU
This PtM topology was popular in early implementations of multipoint This PtM topology was popular in early implementations of multipoint
videoconferencing systems due to its simplicity, and the videoconferencing systems due to its simplicity, and the
corresponding middlebox design has been known as a "video switching corresponding middlebox design has been known as a "video switching
MCU". The more complex RTCP-terminating MCUs, discussed in the next MCU". The more complex RTCP-terminating MCUs, discussed in the next
section, became the norm, however, when technology allowed section, became the norm, however, when technology allowed
implementations at acceptable costs. implementations at acceptable costs.
skipping to change at page 33, line 18 skipping to change at page 33, line 18
appropriate CSRC values. Second, the MCU needs to modify the RTCP appropriate CSRC values. Second, the MCU needs to modify the RTCP
RRs it forwards between the domains. As a result, it is recommended RRs it forwards between the domains. As a result, it is recommended
that one implement a centralized video switching conference using a that one implement a centralized video switching conference using a
Mixer according to RFC 3550, instead of the shortcut implementation Mixer according to RFC 3550, instead of the shortcut implementation
described here. described here.
3.9. Point to Multipoint Using RTCP-Terminating MCU 3.9. Point to Multipoint Using RTCP-Terminating MCU
Shortcut name: Topo-RTCP-terminating-MCU Shortcut name: Topo-RTCP-terminating-MCU
+---+ +------------+ +---+ +---+ +------------+ +---+
| A |<---->| Multipoint |<---->| B | | A |<---->| Multipoint |<---->| B |
+---+ | Control | +---+ +---+ | Control | +---+
| Unit | | Unit |
+---+ | (MCU) | +---+ +---+ | (MCU) | +---+
| C |<---->| |<---->| D | | C |<---->| |<---->| D |
+---+ +------------+ +---+ +---+ +------------+ +---+
Figure 19: Point to Multipoint Using Content Modifying MCUs Figure 19: Point to Multipoint Using Content Modifying MCUs
In this PtM scenario, each End Point runs an RTP point-to-point In this PtM scenario, each End Point runs an RTP point-to-point
session between itself and the MCU. This is a very commonly deployed session between itself and the MCU. This is a very commonly deployed
topology in multipoint video conferencing. The content that the MCU topology in multipoint video conferencing. The content that the MCU
provides to each participant is either: provides to each participant is either:
a. a selection of the content received from the other End Points, or a. a selection of the content received from the other End Points, or
skipping to change at page 35, line 22 skipping to change at page 35, line 22
of these systems, the control stack subunit may also have its own of these systems, the control stack subunit may also have its own
network address. network address.
From an RTP viewpoint, each of the subunits terminates RTP, and acts From an RTP viewpoint, each of the subunits terminates RTP, and acts
as an End Point in the sense that each subunit includes its own, as an End Point in the sense that each subunit includes its own,
independent RTP stack. However, as the subunits are semantically independent RTP stack. However, as the subunits are semantically
part of the same terminal, it is appropriate that this semantic part of the same terminal, it is appropriate that this semantic
relationship is expressed in RTCP protocol elements, namely in the relationship is expressed in RTCP protocol elements, namely in the
CNAME. CNAME.
+---------------------+ +---------------------+
| Endpoint A | | Endpoint A |
| Local Area Network | | Local Area Network |
| +------------+ | | +------------+ |
| +->| Audio |<+-RTP---\ | +->| Audio |<+-RTP---\
| | +------------+ | \ +------+ | | +------------+ | \ +------+
| | +------------+ | +-->| | | | +------------+ | +-->| |
| +->| Video |<+-RTP-------->| B | | +->| Video |<+-RTP-------->| B |
| | +------------+ | +-->| | | | +------------+ | +-->| |
| | +------------+ | / +------+ | | +------------+ | / +------+
| +->| Control |<+-SIP---/ | +->| Control |<+-SIP---/
| +------------+ | | +------------+ |
+---------------------+ +---------------------+
Figure 20: Split Component Terminal Figure 20: Split Component Terminal
It is further sensible that the subunits share a common clock from It is further sensible that the subunits share a common clock from
which RTP and RTCP clocks are derived, to facilitate synchronization which RTP and RTCP clocks are derived, to facilitate synchronization
and avoid clock drift. and avoid clock drift.
To indicate that audio and video Source Streams generated by To indicate that audio and video Source Streams generated by
different sub-units share a common clock, and can be synchronized, different sub-units share a common clock, and can be synchronized,
the RTP streams generated from those Source Streams need to include the RTP streams generated from those Source Streams need to include
skipping to change at page 38, line 11 skipping to change at page 38, line 11
for both the CSRCs and the SSRCs. Thus, in a mixed domain, the only for both the CSRCs and the SSRCs. Thus, in a mixed domain, the only
SSRCs seen will be the ones present in the domain, while there can be SSRCs seen will be the ones present in the domain, while there can be
CSRCs from all the domains connected together with a combination of CSRCs from all the domains connected together with a combination of
Mixers and Translators. The combined SSRC and CSRC space is common Mixers and Translators. The combined SSRC and CSRC space is common
over any Translator or Mixer. It is important to facilitate loop over any Translator or Mixer. It is important to facilitate loop
detection, something that is likely to be even more important in detection, something that is likely to be even more important in
combined topologies due to the mixed behavior between the domains. combined topologies due to the mixed behavior between the domains.
Any hybrid, like the Topo-Video-switch-MCU or Topo-Asymmetric, Any hybrid, like the Topo-Video-switch-MCU or Topo-Asymmetric,
requires considerable thought on how RTCP is dealt with. requires considerable thought on how RTCP is dealt with.
4. Comparing Topologies 4. Topology Properties
The topologies discussed in Section 3 have different properties. The topologies discussed in Section 3 have different properties.
This section first describes these properties and then analyzes how This section describes these properties. Note that, even if a
these properties are supported by the different topologies. Note certain property is supported within a particular topology concept,
that, even if a certain property is supported within a particular the necessary functionality may be optional to implement.
topology concept, the necessary functionality may be optional to
implement.
4.1. Topology Properties
4.1.1. All to All Media Transmission 4.1. All to All Media Transmission
To recapitulate, multicast, and in particular Any Source Multicast To recapitulate, multicast, and in particular Any Source Multicast
(ASM), provides the functionality that everyone may send to, or (ASM), provides the functionality that everyone may send to, or
receive from, everyone else within the session. Source-specific receive from, everyone else within the session. Source-specific
Multicast (SSM) can provide a similar functionality by having anyone Multicast (SSM) can provide a similar functionality by having anyone
intending to participate as sender to send its media to the SSM intending to participate as sender to send its media to the SSM
distribution source. The SSM distribution source forwards the media distribution source. The SSM distribution source forwards the media
to all receivers subscribed to the multicast group. Mesh, MCUs, to all receivers subscribed to the multicast group. Mesh, MCUs,
Mixers, SFMs and Translators may all provide that functionality at Mixers, SFMs and Translators may all provide that functionality at
least on some basic level. However, there are some differences in least on some basic level. However, there are some differences in
which type of reachability they provide. which type of reachability they provide.
Closest to true IP-multicast-based, all-to-all transmission comes Closest to true IP-multicast-based, all-to-all transmission comes
perhaps the transport Translator function called "relay" in perhaps the transport Translator function called "relay" in
Section 3.5, as well as the Mesh with joint RTP sessions. Media Section 3.5, as well as the Mesh with joint RTP sessions. Media
Translators, Mesh with independent RTP Sessions, Mixers, SFUs and the Translators, Mesh with independent RTP Sessions, Mixers, SFUs and the
MCU variants do not provide a fully meshed forwarding on the MCU variants do not provide a fully meshed forwarding on the
transport level; instead, they only allow limited forwarding of transport level; instead, they only allow limited forwarding of
content from the other session participants. content from the other session participants.
The "all to all media transmission" requires that any media The "all to all media transmission" requires that any media
transmitting End Point considers the path to the least capable transmitting End Point considers the path to the least capable
receiving End Point. Otherwise, the media transmissions may overload receiving End Point. Otherwise, the media transmissions may overload
that path. Therefore, a sending End Point needs to monitor the path that path. Therefore, a sending End Point needs to monitor the path
skipping to change at page 39, line 24 skipping to change at page 39, line 20
causes N-1 streams of transmitted packets to traverse the first hop causes N-1 streams of transmitted packets to traverse the first hop
link from the End Point, in an N End Point mesh. How long the link from the End Point, in an N End Point mesh. How long the
different paths are common, is highly situation dependent. different paths are common, is highly situation dependent.
The transmission of RTCP by design adapts to any changes in the The transmission of RTCP by design adapts to any changes in the
number of participants due to the transmission algorithm, defined in number of participants due to the transmission algorithm, defined in
the RTP specification [RFC3550], and the extensions in AVPF [RFC4585] the RTP specification [RFC3550], and the extensions in AVPF [RFC4585]
(when applicable). That way, the resources utilized for RTCP stay (when applicable). That way, the resources utilized for RTCP stay
within the bounds configured for the session. within the bounds configured for the session.
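As an illustration of why the RTCP load stays bounded, the following is a minimal sketch of the deterministic part of the interval calculation in RTP [RFC3550]. It assumes the usual split in which senders get a 25% share of the RTCP bandwidth when they make up at most a quarter of the members, and a 5 second minimum interval. The randomization and timer reconsideration steps of the full algorithm, and the AVPF extensions, are deliberately left out. The function and parameter names are illustrative only.

```python
def rtcp_interval(members, senders, rtcp_bw, we_sent, avg_rtcp_size,
                  initial=False):
    """Deterministic RTCP reporting interval in seconds (simplified sketch).

    rtcp_bw is the RTCP bandwidth for the whole session in octets per
    second; avg_rtcp_size is the average compound RTCP packet size in
    octets. Randomization and reconsideration are omitted.
    """
    t_min = 2.5 if initial else 5.0
    n = members
    bw = rtcp_bw
    # When senders are a small minority they share 25% of the RTCP
    # bandwidth and the receivers share the remaining 75%.
    if senders <= members * 0.25:
        if we_sent:
            bw = rtcp_bw * 0.25
            n = senders
        else:
            bw = rtcp_bw * 0.75
            n = members - senders
    # The interval grows linearly with the number of participants that
    # share the bandwidth, so the aggregate RTCP rate stays constant.
    return max(t_min, avg_rtcp_size * n / bw)
```

With 1000 members, one sender, and 1000 octets per second of RTCP bandwidth, each receiver reports roughly every 133 seconds, so the 999 receivers together still consume only their 750 octets per second share.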
4.1.2. Transport or Media Interoperability 4.2. Transport or Media Interoperability
All Translators, Mixers, and RTCP-terminating MCU, and Mesh with All Translators, Mixers, and RTCP-terminating MCU, and Mesh with
individual RTP sessions, allow changing the media encoding or the individual RTP sessions, allow changing the media encoding or the
transport to other properties of the other domain, thereby providing transport to other properties of the other domain, thereby providing
extended interoperability in cases where the End Points lack a common extended interoperability in cases where the End Points lack a common
set of media codecs and/or transport protocols. Selective Forwarding set of media codecs and/or transport protocols. Selective Forwarding
Middleboxes can adopt the transport, and (at least) selectively Middleboxes can adopt the transport, and (at least) selectively
forward the encoded streams that match a receiving End Point's forward the encoded streams that match a receiving End Point's
capability. It requires an additional translator to change the media capability. It requires an additional translator to change the media
encoding if the encoded streams do not match the receiving End encoding if the encoded streams do not match the receiving End
Point's capabilities. Point's capabilities.
4.1.3. Per Domain Bit-Rate Adaptation 4.3. Per Domain Bit-Rate Adaptation
End Points are often connected to each other with a heterogeneous set End Points are often connected to each other with a heterogeneous set
of paths. This makes congestion control in a Point to Multipoint set of paths. This makes congestion control in a Point to Multipoint set
problematic. For the ASM, SSM, Mesh with common RTP session, and problematic. For the ASM, SSM, Mesh with common RTP session, and
Transport Relay scenario, each individual sending End Point has to Transport Relay scenario, each individual sending End Point has to
adapt to the receiving End Point behind the least capable path, adapt to the receiving End Point behind the least capable path,
yielding suboptimal quality for the End Points behind the more yielding suboptimal quality for the End Points behind the more
capable paths. This is no longer an issue when Media Translators, capable paths. This is no longer an issue when Media Translators,
Mixers, SFM or MCUs are involved, as each End Point only needs to Mixers, SFM or MCUs are involved, as each End Point only needs to
adapt to the slowest path within its own domain. The Translator, adapt to the slowest path within its own domain. The Translator,
Mixer, SFM, or MCU topologies all require their respective outgoing Mixer, SFM, or MCU topologies all require their respective outgoing
RTP streams to adjust the bit-rate, packet-rate, etc., to adapt to RTP streams to adjust the bit-rate, packet-rate, etc., to adapt to
the least capable path in each of the other domains. That way one the least capable path in each of the other domains. That way one
can avoid lowering the quality to the least-capable End Point in all can avoid lowering the quality to the least-capable End Point in all
the domains at the cost (complexity, delay, equipment) of the Mixer, the domains at the cost (complexity, delay, equipment) of the Mixer,
SFM or Translator, and potentially media sender (multicast/layered SFM or Translator, and potentially media sender (multicast/layered
encoding and sending the different representations). encoding and sending the different representations).
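The difference between adapting to the least capable path overall and adapting per domain can be sketched as follows. This is a toy model, not a congestion control algorithm: the path capacities, domain names, and helper names are hypothetical, and a real middlebox would derive the rates from RTCP feedback rather than from static numbers.

```python
def all_to_all_rate(domains):
    """Rate a sender must use when one stream reaches every receiver
    unchanged (ASM, SSM, Mesh with common RTP session, Transport Relay):
    the least capable path across all domains."""
    return min(cap for caps in domains.values() for cap in caps)


def per_domain_rates(domains):
    """Rate used towards each domain when a Media Translator, Mixer,
    SFM, or MCU adapts its outgoing RTP streams per domain: the least
    capable path within that domain only."""
    return {name: min(caps) for name, caps in domains.items()}


# Hypothetical path capacities in kbit/s, grouped by domain.
domains = {"office": [2000, 1500], "mobile": [300]}
```

In this example `all_to_all_rate(domains)` limits every receiver to 300 kbit/s, whereas `per_domain_rates(domains)` lets the office domain receive 1500 kbit/s while the mobile domain still gets 300 kbit/s.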
4.1.4. Aggregation of Media 4.4. Aggregation of Media
In the all-to-all media property mentioned above and provided by ASM, In the all-to-all media property mentioned above and provided by ASM,
SSM, Mesh with common RTP session, and relay, all simultaneous media SSM, Mesh with common RTP session, and relay, all simultaneous media
transmissions share the available bit-rate. For End Points with transmissions share the available bit-rate. For End Points with
limited reception capabilities, this may result in a situation where limited reception capabilities, this may result in a situation where
even a minimal acceptable media quality cannot be accomplished, even a minimal acceptable media quality cannot be accomplished,
because multiple RTP streams need to share the same resources. One because multiple RTP streams need to share the same resources. One
solution to this problem is to provide for a Mixer, or MCU to solution to this problem is to provide for a Mixer, or MCU to
aggregate the multiple RTP streams into a single one, where the aggregate the multiple RTP streams into a single one, where the
single RTP stream takes up less resources in terms of bit-rate. This single RTP stream takes up less resources in terms of bit-rate. This
aggregation can be performed according to different methods. Mixing aggregation can be performed according to different methods. Mixing
or selection are two common methods. Selection is almost always or selection are two common methods. Selection is almost always
possible and easy to implement. Mixing requires resources in the possible and easy to implement. Mixing requires resources in the
mixer, and may be relatively easy and not impairing the quality too mixer, and may be relatively easy and not impairing the quality too
badly (audio) or quite difficult (video tiling, which is not only badly (audio) or quite difficult (video tiling, which is not only
computationally complex but also reduces the pixel count per stream, computationally complex but also reduces the pixel count per stream,
with corresponding loss in perceptual quality). with corresponding loss in perceptual quality).
4.1.5. View of All Session Participants 4.5. View of All Session Participants
The RTP protocol includes functionality to identify the session The RTP protocol includes functionality to identify the session
participants through the use of the SSRC and CSRC fields. In participants through the use of the SSRC and CSRC fields. In
addition, it is capable of carrying some further identity information addition, it is capable of carrying some further identity information
about these participants using the RTCP Source Descriptors (SDES). about these participants using the RTCP Source Descriptors (SDES).
In topologies that provide a full all-to-all functionality, i.e. In topologies that provide a full all-to-all functionality, i.e.
ASM, Mesh with common RTP session, and Relay, a compliant RTP ASM, Mesh with common RTP session, and Relay, a compliant RTP
implementation offers the functionality directly as specified in RTP. implementation offers the functionality directly as specified in RTP.
In topologies that do not offer all-to-all communication, it is In topologies that do not offer all-to-all communication, it is
necessary that RTCP is handled correctly in the domain bridging function. necessary that RTCP is handled correctly in the domain bridging function.
skipping to change at page 41, line 5 skipping to change at page 40, line 47
text. However, the MCU described in Section 3.8 cannot offer the text. However, the MCU described in Section 3.8 cannot offer the
full functionality for session participant identification through RTP full functionality for session participant identification through RTP
means. The topologies that create independent RTP sessions per End means. The topologies that create independent RTP sessions per End
Point or pair of End Points, like Back-to-Back RTP session, MESH with Point or pair of End Points, like Back-to-Back RTP session, MESH with
independent RTP sessions, and the RTCP terminating MCU independent RTP sessions, and the RTCP terminating MCU
(Section 3.9) do not support RTP based identification (Section 3.9) do not support RTP based identification
of session participants. In all those cases, other non-RTP based of session participants. In all those cases, other non-RTP based
mechanisms need to be implemented if such knowledge is required or mechanisms need to be implemented if such knowledge is required or
desirable. desirable.
4.1.6. Loop Detection 4.6. Loop Detection
In complex topologies with multiple interconnected domains, it is In complex topologies with multiple interconnected domains, it is
possible to unintentionally form media loops. RTP and RTCP support possible to unintentionally form media loops. RTP and RTCP support
detecting such loops, as long as the SSRC and CSRC identities are detecting such loops, as long as the SSRC and CSRC identities are
maintained and correctly set in forwarded packets. Loop detection maintained and correctly set in forwarded packets. Loop detection
will work in ASM, SSM, Mesh with joint RTP session, and Relay. It is will work in ASM, SSM, Mesh with joint RTP session, and Relay. It is
likely that loop detection works for the video switching MCU likely that loop detection works for the video switching MCU
Section 3.8, at least as long as it forwards the RTCP between the End Section 3.8, at least as long as it forwards the RTCP between the End
Points. However, the Back-to-Back RTP sessions, Mesh with Points. However, the Back-to-Back RTP sessions, Mesh with
independent RTP sessions, SFM, will definitely break the loop independent RTP sessions, SFM, will definitely break the loop
detection mechanism. detection mechanism.
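The dependence of loop detection on preserved SSRC and CSRC values can be illustrated with a minimal sketch. It loosely follows the idea in RTP [RFC3550] that a known SSRC arriving from a different source transport address, or one's own identifier appearing in a received packet, indicates a collision or a loop. The class name, return values, and address representation are assumptions made for this example, and timeouts and RTCP handling are omitted.

```python
class LoopDetector:
    """Tracks which transport address each SSRC was first seen from."""

    def __init__(self, own_ssrc):
        self.own_ssrc = own_ssrc
        self.seen = {}  # ssrc -> source transport address (ip, port)

    def check(self, ssrc, csrcs, source_addr):
        """Classify one received packet.

        Returns "own-identifier" if our own SSRC appears as SSRC or
        CSRC (our packets came back, or a collision), "conflict" if a
        known SSRC arrives from a new transport address (collision or
        loop), and "ok" otherwise.
        """
        if ssrc == self.own_ssrc or self.own_ssrc in csrcs:
            return "own-identifier"
        known = self.seen.setdefault(ssrc, source_addr)
        if known != source_addr:
            return "conflict"
        return "ok"
```

A middlebox that rewrites SSRCs or terminates the RTP session hides exactly the information this check relies on, which is why such topologies break loop detection.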
4.2. Comparison of Topologies 4.7. Consistency between header extensions and RTCP
Some RTP header extensions have relevance not only end-to-end, but
also hop-to-hop, meaning at least some of the middleboxes in the path
are aware of their potential presence through signaling, intercept
and interpret such header extensions and potentially also rewrite or
generate them. Modern header extensions generally follow RFC 5285
[RFC5285], which allows for all of the above. Examples of such
header extensions include the mid (media ID) in
[I-D.ietf-mmusic-sdp-bundle-negotiation]. There is also a
generalization of mapping RTCP SDES into an RTP header extension
[I-D.westerlund-avtext-sdes-hdr-ext].
When such header extensions are in use, any middlebox that
understands them must ensure consistency between the extensions it sees
and/or generates, and the RTCP it receives and generates. For
example, the mid of bundle is sent in an RTP header extension and
also in an RTCP SDES message. This apparent redundancy was
introduced as unaware middleboxes may choose to discard RTP header
extensions. Obviously, inconsistency between the media ID sent in
the RTP header extension and in the RTCP SDES message could lead to
undesirable results, and, therefore, consistency is needed.
Middleboxes unaware of the nature of a header extension, as specified
in RFC 5285 [RFC5285], are free to forward or discard header
extensions.
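A middlebox that is aware of the mid could perform the consistency check roughly as sketched below. The sketch assumes the middlebox has already parsed both sources into per-SSRC mappings; the function name and the dictionary representation are hypothetical and not part of any specification.

```python
def mid_mismatches(mid_from_header_ext, mid_from_sdes):
    """Return the SSRCs whose mid differs between the two sources.

    Both arguments map an SSRC to the mid value observed for it, one
    taken from RTP header extensions and one from RTCP SDES. SSRCs
    seen in only one of the two sources are not reported, because an
    unaware middlebox may legitimately have discarded the header
    extension.
    """
    common = mid_from_header_ext.keys() & mid_from_sdes.keys()
    return sorted(
        ssrc for ssrc in common
        if mid_from_header_ext[ssrc] != mid_from_sdes[ssrc]
    )
```

For example, `mid_mismatches({0x11: "audio", 0x22: "video"}, {0x11: "audio", 0x22: "slides"})` reports SSRC 0x22 as inconsistent.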
5. Comparison of Topologies
The table below attempts to summarize the properties of the different The table below attempts to summarize the properties of the different
topologies. The legend to the topology abbreviations are: Topo- topologies. The legend to the topology abbreviations are: Topo-
Point-to-Point (PtP), Topo-ASM (ASM), Topo-SSM (SSM), Topo-Trns- Point-to-Point (PtP), Topo-ASM (ASM), Topo-SSM (SSM), Topo-Trns-
Translator (TT), Topo-Media-Translator (including Transport Translator (TT), Topo-Media-Translator (including Transport
Translator) (MT), Topo-Mesh with joint session (MJS), Topo-Mesh with Translator) (MT), Topo-Mesh with joint session (MJS), Topo-Mesh with
individual sessions (MIS), Topo-Mixer (Mix), Topo-Asymmetric (ASY), individual sessions (MIS), Topo-Mixer (Mix), Topo-Asymmetric (ASY),
Topo-Video-switch-MCU (VSM), and Topo-RTCP-terminating-MCU (RTM), Topo-Video-switch-MCU (VSM), and Topo-RTCP-terminating-MCU (RTM),
Selective Forwarding Middlebox (SFM). In the table below, Y Selective Forwarding Middlebox (SFM). In the table below, Y
indicates Yes or full support, N indicates No support, (Y) indicates indicates Yes or full support, N indicates No support, (Y) indicates
skipping to change at page 41, line 43 skipping to change at page 42, line 17
All to All media N Y (Y) Y Y Y (Y) (Y) (Y) (Y) (Y) (Y) All to All media N Y (Y) Y Y Y (Y) (Y) (Y) (Y) (Y) (Y)
Interoperability N/A N N Y Y Y Y Y Y N Y Y Interoperability N/A N N Y Y Y Y Y Y N Y Y
Per Domain Adaptation N/A N N N Y N Y Y Y N Y Y Per Domain Adaptation N/A N N N Y N Y Y Y N Y Y
Aggregation of media N N N N N N N Y (Y) Y Y N Aggregation of media N N N N N N N Y (Y) Y Y N
Full Session View Y Y Y Y Y Y N Y Y (Y) N Y Full Session View Y Y Y Y Y Y N Y Y (Y) N Y
Loop Detection Y Y Y Y Y Y N Y Y (Y) N N Loop Detection Y Y Y Y Y Y N Y Y (Y) N N
Please note that the Media Translator also includes the transport Please note that the Media Translator also includes the transport
Translator functionality. Translator functionality.
5. Security Considerations 6. Security Considerations
The use of Mixers, SFMs and Translators has impact on security and The use of Mixers, SFMs and Translators has impact on security and
the security functions used. The primary issue is that Mixers, the security functions used. The primary issue is that Mixers,
SFMs and Translators modify packets, thus preventing the use of SFMs and Translators modify packets, thus preventing the use of
integrity and source authentication, unless they are trusted devices integrity and source authentication, unless they are trusted devices
that take part in the security context, e.g., the device can send that take part in the security context, e.g., the device can send
Secure Realtime Transport Protocol (SRTP) and Secure Realtime Secure Realtime Transport Protocol (SRTP) and Secure Realtime
Transport Control Protocol (SRTCP) [RFC3711] packets to End Points in Transport Control Protocol (SRTCP) [RFC3711] packets to End Points in
the Communication Session. If encryption is employed, the media the Communication Session. If encryption is employed, the media
Translator, SFM and Mixer need to be able to decrypt the media to Translator, SFM and Mixer need to be able to decrypt the media to
skipping to change at page 43, line 48 skipping to change at page 44, line 22
There exist a number of different mechanisms to provide keys to the There exist a number of different mechanisms to provide keys to the
different participants. One example is the choice between group keys different participants. One example is the choice between group keys
and unique keys per SSRC. The appropriate keying model is impacted and unique keys per SSRC. The appropriate keying model is impacted
by the topologies one intends to use. The final security properties by the topologies one intends to use. The final security properties
are dependent on both the topologies in use and the keying are dependent on both the topologies in use and the keying
mechanisms' properties, and need to be considered by the application. mechanisms' properties, and need to be considered by the application.
Exactly which mechanisms are used is outside of the scope of this Exactly which mechanisms are used is outside of the scope of this
document. Please review RTP Security Options [RFC7201] to get a document. Please review RTP Security Options [RFC7201] to get a
better understanding of most of the available options. better understanding of most of the available options.
6. IANA Considerations 7. IANA Considerations
This document makes no request of IANA. This document makes no request of IANA.
Note to RFC Editor: this section may be removed on publication as an Note to RFC Editor: this section may be removed on publication as an
RFC. RFC.
7. Acknowledgements 8. Acknowledgements
The authors would like to thank Mark Baugher, Bo Burman, Umesh The authors would like to thank Mark Baugher, Bo Burman, Umesh
Chandra, Alex Eleftheriadis, Roni Even, Ladan Gharai, Geoff Hunt, Chandra, Alex Eleftheriadis, Roni Even, Ladan Gharai, Geoff Hunt,
Keith Lantz, and Colin Perkins for their help in reviewing and Keith Lantz, Jonathan Lennox, Colin Perkins, and Suhas Nandakumar for
improving this document. their help in reviewing and improving this document.
8. References 9. References
8.1. Normative References 9.1. Normative References
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
Jacobson, "RTP: A Transport Protocol for Real-Time Jacobson, "RTP: A Transport Protocol for Real-Time
Applications", STD 64, RFC 3550, July 2003. Applications", STD 64, RFC 3550, July 2003.
[RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
Norrman, "The Secure Real-time Transport Protocol (SRTP)", Norrman, "The Secure Real-time Transport Protocol (SRTP)",
RFC 3711, March 2004. RFC 3711, March 2004.
[RFC4575] Rosenberg, J., Schulzrinne, H., and O. Levin, "A Session [RFC4575] Rosenberg, J., Schulzrinne, H., and O. Levin, "A Session
Initiation Protocol (SIP) Event Package for Conference Initiation Protocol (SIP) Event Package for Conference
State", RFC 4575, August 2006. State", RFC 4575, August 2006.
[RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
"Extended RTP Profile for Real-time Transport Control "Extended RTP Profile for Real-time Transport Control
Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July
2006. 2006.
8.2. Informative References 9.2. Informative References
[I-D.ietf-avtcore-rtp-multi-stream-optimisation] [I-D.ietf-avtcore-rtp-multi-stream-optimisation]
Lennox, J., Westerlund, M., Wu, W., and C. Perkins, Lennox, J., Westerlund, M., Wu, W., and C. Perkins,
"Sending Multiple Media Streams in a Single RTP Session: "Sending Multiple Media Streams in a Single RTP Session:
Grouping RTCP Reception Statistics and Other Feedback", Grouping RTCP Reception Statistics and Other Feedback",
draft-ietf-avtcore-rtp-multi-stream-optimisation-03 (work draft-ietf-avtcore-rtp-multi-stream-optimisation-04 (work
in progress), July 2014. in progress), August 2014.
[I-D.ietf-mmusic-sdp-bundle-negotiation]
Holmberg, C., Alvestrand, H., and C. Jennings,
"Negotiating Media Multiplexing Using the Session
Description Protocol (SDP)", draft-ietf-mmusic-sdp-bundle-
negotiation-12 (work in progress), October 2014.
[I-D.westerlund-avtext-sdes-hdr-ext]
Westerlund, M., Even, R., and M. Zanaty, "RTP Header
Extension for RTCP Source Description Items", draft-
westerlund-avtext-sdes-hdr-ext-03 (work in progress),
November 2014.
[RFC1112] Deering, S., "Host extensions for IP multicasting", STD 5, [RFC1112] Deering, S., "Host extensions for IP multicasting", STD 5,
RFC 1112, August 1989. RFC 1112, August 1989.
[RFC3022] Srisuresh, P. and K. Egevang, "Traditional IP Network [RFC3022] Srisuresh, P. and K. Egevang, "Traditional IP Network
Address Translator (Traditional NAT)", RFC 3022, January Address Translator (Traditional NAT)", RFC 3022, January
2001. 2001.
[RFC3569] Bhattacharyya, S., "An Overview of Source-Specific [RFC3569] Bhattacharyya, S., "An Overview of Source-Specific
Multicast (SSM)", RFC 3569, July 2003. Multicast (SSM)", RFC 3569, July 2003.
[RFC4607] Holbrook, H. and B. Cain, "Source-Specific Multicast for [RFC4607] Holbrook, H. and B. Cain, "Source-Specific Multicast for
IP", RFC 4607, August 2006. IP", RFC 4607, August 2006.
[RFC5104] Wenger, S., Chandra, U., Westerlund, M., and B. Burman, [RFC5104] Wenger, S., Chandra, U., Westerlund, M., and B. Burman,
"Codec Control Messages in the RTP Audio-Visual Profile "Codec Control Messages in the RTP Audio-Visual Profile
with Feedback (AVPF)", RFC 5104, February 2008. with Feedback (AVPF)", RFC 5104, February 2008.
[RFC5285] Singer, D. and H. Desineni, "A General Mechanism for RTP
Header Extensions", RFC 5285, July 2008.
[RFC5760] Ott, J., Chesterfield, J., and E. Schooler, "RTP Control [RFC5760] Ott, J., Chesterfield, J., and E. Schooler, "RTP Control
Protocol (RTCP) Extensions for Single-Source Multicast Protocol (RTCP) Extensions for Single-Source Multicast
Sessions with Unicast Feedback", RFC 5760, February 2010. Sessions with Unicast Feedback", RFC 5760, February 2010.
[RFC5766] Mahy, R., Matthews, P., and J. Rosenberg, "Traversal Using [RFC5766] Mahy, R., Matthews, P., and J. Rosenberg, "Traversal Using
Relays around NAT (TURN): Relay Extensions to Session Relays around NAT (TURN): Relay Extensions to Session
Traversal Utilities for NAT (STUN)", RFC 5766, April 2010. Traversal Utilities for NAT (STUN)", RFC 5766, April 2010.
[RFC6285] Ver Steeg, B., Begen, A., Van Caenegem, T., and Z. Vax,
"Unicast-Based Rapid Acquisition of Multicast RTP
Sessions", RFC 6285, June 2011.
[RFC6465] Ivov, E., Marocco, E., and J. Lennox, "A Real-time [RFC6465] Ivov, E., Marocco, E., and J. Lennox, "A Real-time
Transport Protocol (RTP) Header Extension for Mixer-to- Transport Protocol (RTP) Header Extension for Mixer-to-
Client Audio Level Indication", RFC 6465, December 2011. Client Audio Level Indication", RFC 6465, December 2011.
[RFC7201] Westerlund, M. and C. Perkins, "Options for Securing RTP [RFC7201] Westerlund, M. and C. Perkins, "Options for Securing RTP
Sessions", RFC 7201, April 2014. Sessions", RFC 7201, April 2014.
Authors' Addresses Authors' Addresses
Magnus Westerlund Magnus Westerlund
 End of changes. 52 change blocks. 
317 lines changed or deleted 359 lines changed or added
