draft-ietf-nvo3-hpvr2nve-cp-req-11.txt   draft-ietf-nvo3-hpvr2nve-cp-req-12.txt 
NVO3 Working Group Y. Li NVO3 Working Group Y. Li
INTERNET-DRAFT D. Eastlake INTERNET-DRAFT D. Eastlake
Intended Status: Informational Huawei Technologies Intended Status: Informational Huawei Technologies
L. Kreeger L. Kreeger
Arrcus, Inc Arrcus, Inc
T. Narten T. Narten
IBM IBM
D. Black D. Black
EMC EMC
Expires: July 12, 2018 January 8, 2018 Expires: July 17, 2018 January 13, 2018
Split-NVE Control Plane Requirements Split Network Virtualization Edge (Split-NVE) Control Plane Requirements
draft-ietf-nvo3-hpvr2nve-cp-req-11 draft-ietf-nvo3-hpvr2nve-cp-req-12
Abstract Abstract
In a Split-NVE architecture, the functions of the NVE (Network In a Split Network Virtualization Edge (Split-NVE) architecture, the
Virtualization Edge) are split across a server and an external functions of the NVE (Network Virtualization Edge) are split across a
network equipment which is called an external NVE. The server- server and an external network equipment which is called an external
resident control plane functionality resides in control software, NVE. The server-resident control plane functionality resides in
which may be part of a hypervisor or container management software; control software, which may be part of hypervisor or container
for simplicity, this draft refers to the hypervisor as the location management software; for simplicity, this document refers to the
of this software. hypervisor as the location of this software.
Control plane protocol(s) between a hypervisor and its associated Control plane protocol(s) between a hypervisor and its associated
external NVE(s) are used by the hypervisor to distribute its virtual external NVE(s) are used by the hypervisor to distribute its virtual
machine networking state to the external NVE(s) for further handling. machine networking state to the external NVE(s) for further handling.
This document illustrates the functionality required by this type of This document illustrates the functionality required by this type of
control plane signaling protocol and outlines the high level control plane signaling protocol and outlines the high level
requirements. Virtual machine states as well as state transitioning requirements. Virtual machine states as well as state transitioning
are summarized to help clarify the needed protocol requirements. are summarized to help clarify the protocol requirements.
Status of this Memo Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as other groups may also distribute working documents as
Internet-Drafts. Internet-Drafts.
skipping to change at page 3, line 6 skipping to change at page 3, line 6
3.3 TSI Disassociate and Deactivate . . . . . . . . . . . . . . 15 3.3 TSI Disassociate and Deactivate . . . . . . . . . . . . . . 15
4. Hypervisor-to-NVE Control Plane Protocol Requirements . . . . . 16 4. Hypervisor-to-NVE Control Plane Protocol Requirements . . . . . 16
5. VDP Applicability and Enhancement Needs . . . . . . . . . . . . 17 5. VDP Applicability and Enhancement Needs . . . . . . . . . . . . 17
6. Security Considerations . . . . . . . . . . . . . . . . . . . . 19 6. Security Considerations . . . . . . . . . . . . . . . . . . . . 19
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 19 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 19
8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 19 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 19
8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 20 8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 20
8.1 Normative References . . . . . . . . . . . . . . . . . . . 20 8.1 Normative References . . . . . . . . . . . . . . . . . . . 20
8.2 Informative References . . . . . . . . . . . . . . . . . . 20 8.2 Informative References . . . . . . . . . . . . . . . . . . 20
Appendix A. IEEE 802.1Qbg VDP Illustration (For information Appendix A. IEEE 802.1Qbg VDP Illustration (For information
only) . . . . . . . . . . . . . . . . . . . . . . . . . . 20 only) . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 23 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 23
1. Introduction 1. Introduction
In the Split-NVE architecture shown in Figure 1, the functionality of In the Split-NVE architecture shown in Figure 1, the functionality of
the NVE (Network Virtualization Edge) is split across an end device the NVE (Network Virtualization Edge) is split across an end device
supporting virtualization and an external network device which is supporting virtualization and an external network device which is
called an external NVE. The portion of the NVE functionality located called an external NVE. The portion of the NVE functionality located
on the end device is called the tNVE and the portion located on the on the end device is called the tNVE and the portion located on the
external NVE is called the nNVE in this document. Overlay external NVE is called the nNVE in this document. Overlay
encapsulation/decapsulation functions are normally off-loaded to the encapsulation/decapsulation functions are normally off-loaded to the
nNVE on the external NVE. nNVE on the external NVE.
The tNVE is normally implemented as a part of hypervisor or container The tNVE is normally implemented as a part of hypervisor or container
and/or virtual switch in a virtualized end device. This document and/or virtual switch in a virtualized end device. This document
uses the term "hypervisor" throughout when describing the Split-NVE uses the term "hypervisor" throughout when describing the Split-NVE
scenario where part of the NVE functionality is off-loaded to a scenario where part of the NVE functionality is off-loaded to a
separate device from the "hypervisor" that contains a VM connected to separate device from the "hypervisor" that contains a VM (Virtual
a VN. In this context, the term "hypervisor" is meant to cover any Machine) connected to a VN (Virtual Network). In this context, the
device type where part of the NVE functionality is off-loaded in this term "hypervisor" is meant to cover any device type where part of the
fashion, e.g., a Network Service Appliance or Linux Container. NVE functionality is off-loaded in this fashion, e.g., a Network
Service Appliance or Linux Container.
The problem statement [RFC7364] discusses the needs for a control The NVO3 problem statement [RFC7364] discusses the needs for a
plane protocol (or protocols) to populate each NVE with the state control plane protocol (or protocols) to populate each NVE with the
needed to perform the required functions. In one scenario, an NVE state needed to perform the required functions. In one scenario, an
provides overlay encapsulation/decapsulation packet forwarding NVE provides overlay encapsulation/decapsulation packet forwarding
services to Tenant Systems (TSs) that are co-resident within the NVE services to Tenant Systems (TSs) that are co-resident within the NVE
on the same End Device (e.g. when the NVE is embedded within a on the same End Device (e.g. when the NVE is embedded within a
hypervisor or a Network Service Appliance). In such cases, there is hypervisor or a Network Service Appliance). In such cases, there is
no need for a standardized protocol between the hypervisor and NVE, no need for a standardized protocol between the hypervisor and NVE,
as the interaction is implemented via software on a single device. as the interaction is implemented via software on a single device.
While in the Split-NVE architecture scenarios, as shown in figure 2 While in the Split-NVE architecture scenarios, as shown in figure 2
to figure 4, control plane protocol(s) between a hypervisor and its to figure 4, control plane protocol(s) between a hypervisor and its
associated external NVE(s) are required for the hypervisor to associated external NVE(s) are required for the hypervisor to
distribute the virtual machines networking states to the NVE(s) for distribute the virtual machines networking states to the NVE(s) for
further handling. The protocol is an NVE-internal protocol and runs further handling. The protocol is an NVE-internal protocol and runs
between tNVE and nNVE logical entities. This protocol is mentioned in between tNVE and nNVE logical entities. This protocol is mentioned in
NVO3 problem statement [RFC7364] and appears as the third work item. the NVO3 problem statement [RFC7364] and appears as the third work
item.
Virtual machine states and state transitioning are summarized in this Virtual machine states and state transitioning are summarized in this
document to show events where the NVE needs to take specific actions. document showing events where the NVE needs to take specific actions.
Such events might correspond to actions the control plane signaling Such events might correspond to actions the control plane signaling
protocol(s) need to take between tNVE and nNVE in Split-NVE scenario. protocol(s) need to take between tNVE and nNVE in the Split-NVE
The high level requirements to be fulfilled are stated. scenario. The high level requirements to be fulfilled are stated.
+-- -- -- -- Split-NVE -- -- -- --+ +-- -- -- -- Split-NVE -- -- -- --+
| |
| |
+---------------|-----+ +---------------|-----+
| +------------- ----+| | | +------------- ----+| |
| | +--+ +---\|/--+|| +------ --------------+ | | +--+ +---\|/--+|| +------ --------------+
| | |VM|---+ ||| | \|/ | | | |VM|---+ ||| | \|/ |
| | +--+ | ||| |+--------+ | | | +--+ | ||| |+--------+ |
| | +--+ | tNVE |||----- - - - - - -----|| | | | | +--+ | tNVE |||----- - - - - - -----|| | |
skipping to change at page 5, line 32 skipping to change at page 5, line 32
Figure 1 Split-NVE structure Figure 1 Split-NVE structure
This document uses VMs as an example of Tenant Systems (TSs) in order This document uses VMs as an example of Tenant Systems (TSs) in order
to describe the requirements, even though a VM is just one type of to describe the requirements, even though a VM is just one type of
Tenant System that may connect to a VN. For example, a service Tenant System that may connect to a VN. For example, a service
instance within a Network Service Appliance is another type of TS, as instance within a Network Service Appliance is another type of TS, as
are systems running on OS-level virtualization technologies like are systems running on OS-level virtualization technologies like
containers. The fact that VMs have lifecycles (e.g., can be created containers. The fact that VMs have lifecycles (e.g., can be created
and destroyed, can be moved, and can be started or stopped) results and destroyed, can be moved, and can be started or stopped) results
in a general set of protocol requirements, most of which are in a general set of protocol requirements, most of which are
applicable to other forms of TSs. Note that not all of the applicable to other forms of TSs although not all of the requirements
requirements are applicable to all forms of TSs. are applicable to all forms of TSs.
Section 2 describes VM states and state transitioning in the VM's Section 2 describes VM states and state transitioning in the VM's
lifecycle. Section 3 introduces Hypervisor-to-NVE control plane lifecycle. Section 3 introduces Hypervisor-to-NVE control plane
protocol functionality derived from VM operations and network events. protocol functionality derived from VM operations and network events.
Section 4 outlines the requirements of the control plane protocol to Section 4 outlines the requirements of the control plane protocol to
achieve the required functionality. achieve the required functionality.
1.1 Terminology 1.1 Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119] and document are to be interpreted as described in RFC 2119 [RFC2119] and
RFC 8174 [RFC8174]. RFC 8174 [RFC8174].
This document uses the same terminology as found in [RFC7365]. This This document uses the same terminology as found in [RFC7365]. This
section defines additional terminology used by this document. section defines additional terminology used by this document.
Split-NVE: a type of NVE (Network Virtualization Edge) that the Split-NVE: a type of NVE (Network Virtualization Edge) where the
functionalities of it are split across an end device supporting functionalities are split across an end device supporting
virtualization and an external network device. virtualization and an external network device.
tNVE: the portion of Split-NVE functionalities located on the end tNVE: the portion of Split-NVE functionalities located on the end
device supporting virtualization. It interacts with tenant system by device supporting virtualization. It interacts with a tenant system
internal interface in end device. through an internal interface in the end device.
nNVE: the portion of Split-NVE functionalities located on the network nNVE: the portion of Split-NVE functionalities located on the network
device which is directly or indirectly connects to the end device device that is directly or indirectly connected to the end device
holding the corresponding tNVE. nNVE normally performs encapsulation holding the corresponding tNVE. nNVE normally performs encapsulation
and decapsulation to the overlay network. to and decapsulation from the overlay network.
External NVE: the physical network device holding nNVE External NVE: the physical network device holding the nNVE
Hypervisor: the logical collection of software, firmware and/or Hypervisor: the logical collection of software, firmware and/or
hardware that allows the creation and running of server or service hardware that allows the creation and running of server or service
appliance virtualization. tNVE is located on Hypervisor. It is appliance virtualization. tNVE is located under a Hypervisor.
loosely used in this document to refer to the end device supporting Hypervisor is loosely used in this document to refer to the end
the virtualization. For simplicity, we also use Hypervisor in this device supporting the virtualization. For simplicity, we also use
document to represent both hypervisor and container. Hypervisor to represent both hypervisor and container.
Container: Please refer to Hypervisor. This document uses hypervisors Container: Please refer to Hypervisor. For simplicity this document
to represent both hypervisor and container for simplicity. uses the term hypervisor to represent both hypervisor and container.
VN Profile: Meta data associated with a VN (Virtual Network) that is VN Profile: Meta data associated with a VN (Virtual Network) that is
applied to any attachment point to the VN. That is, VAP (Virtual applied to any attachment point to the VN. That is, VAP (Virtual
Access Point) properties that are applied to all VAPs associated with Access Point) properties that are applied to all VAPs associated with
a given VN and used by an NVE when ingressing/egressing packets a given VN and used by an NVE when ingressing/egressing packets
to/from a specific VN. Meta data could include such information as to/from a specific VN. Meta data could include such information as
ACLs, QoS settings, etc. The VN Profile contains parameters that ACLs, QoS settings, etc. The VN Profile contains parameters that
apply to the VN as a whole. Control protocols between the NVE and apply to the VN as a whole. Control protocols between the NVE and
NVA (Network Virtualization Authority) could use the VN ID or VN Name NVA (Network Virtualization Authority) could use the VN ID or VN Name
to obtain the VN Profile. to obtain the VN Profile.
VSI: Virtual Station Interface. [IEEE 802.1Qbg] VSI: Virtual Station Interface. [IEEE 802.1Qbg]
VDP: VSI Discovery and Configuration Protocol [IEEE 802.1Qbg] VDP: VSI Discovery and Configuration Protocol [IEEE 802.1Qbg]
1.2 Target Scenarios 1.2 Target Scenarios
In the Split-NVE architecture, an external NVE can provide an offload In the Split-NVE architecture, an external NVE can provide an offload
of the encapsulation / decapsulation functions and network policy of the encapsulation / decapsulation functions and network policy
enforcement as well as the VN Overlay protocol overhead. This enforcement as well as the VN Overlay protocol overhead. This
offloading may provide performance improvements and/or resource offloading may improve performance and/or save resources in the End
savings to the End Device (e.g. hypervisor) making use of the Device (e.g. hypervisor) using the external NVE.
external NVE.
The following figures give example scenarios of a Split-NVE The following figures give example scenarios of a Split-NVE
architecture. architecture.
Hypervisor Access Switch Hypervisor Access Switch
+------------------+ +-----+-------+ +------------------+ +-----+-------+
| +--+ +-------+ | | | | | +--+ +-------+ | | | |
| |VM|---| | | VLAN | | | | |VM|---| | | VLAN | | |
| +--+ | tNVE |---------+ nNVE| +--- Underlying | +--+ | tNVE |---------+ nNVE| +--- Underlying
| +--+ | | | Trunk | | | Network | +--+ | | | Trunk | | | Network
skipping to change at page 7, line 29 skipping to change at page 7, line 29
Hypervisor L2 Switch Hypervisor L2 Switch
+---------------+ +-----+ +----+---+ +---------------+ +-----+ +----+---+
| +--+ +----+ | | | | | | | +--+ +----+ | | | | | |
| |VM|---| | |VLAN | |VLAN | | | | |VM|---| | |VLAN | |VLAN | | |
| +--+ |tNVE|-------+ +-----+nNVE| +--- Underlying | +--+ |tNVE|-------+ +-----+nNVE| +--- Underlying
| +--+ | | |Trunk| |Trunk| | | Network | +--+ | | |Trunk| |Trunk| | | Network
| |VM|---| | | | | | | | | |VM|---| | | | | | | |
| +--+ +----+ | | | | | | | +--+ +----+ | | | | | |
+---------------+ +-----+ +----+---+ +---------------+ +-----+ +----+---+
Figure 3 Hypervisor with an External NVE Figure 3 Hypervisor with an External NVE
across an Ethernet Access Switch connected through an Ethernet Access Switch
Network Service Appliance Access Switch Network Service Appliance Access Switch
+--------------------------+ +-----+-------+ +--------------------------+ +-----+-------+
| +------------+ | \ | | | | | +------------+ | \ | | | |
| |Net Service |----| \ | | | | | |Net Service |----| \ | | | |
| |Instance | | \ | VLAN | | | | |Instance | | \ | VLAN | | |
| +------------+ |tNVE| |------+nNVE | +--- Underlying | +------------+ |tNVE| |------+nNVE | +--- Underlying
| +------------+ | | | Trunk| | | Network | +------------+ | | | Trunk| | | Network
| |Net Service |----| / | | | | | |Net Service |----| / | | | |
| |Instance | | / | | | | | |Instance | | / | | | |
skipping to change at page 8, line 18 skipping to change at page 8, line 18
have a complex address hierarchy. This implies that if a given TSI have a complex address hierarchy. This implies that if a given TSI
disassociates from one VN, all the MAC and/or IP addresses are also disassociates from one VN, all the MAC and/or IP addresses are also
disassociated. There is no need to signal the deletion of every MAC disassociated. There is no need to signal the deletion of every MAC
or IP when the TSI is brought down or deleted. In the majority of or IP when the TSI is brought down or deleted. In the majority of
cases, a VM will be acting as a simple host that will have a single cases, a VM will be acting as a simple host that will have a single
TSI and single MAC and IP visible to the external NVE. TSI and single MAC and IP visible to the external NVE.
Figures 2 through 4 show the use of VLANs to separate traffic for Figures 2 through 4 show the use of VLANs to separate traffic for
multiple VNs between the tNVE and nNVE; VLANs are not strictly multiple VNs between the tNVE and nNVE; VLANs are not strictly
necessary if only one VN is involved, but multiple VNs are expected necessary if only one VN is involved, but multiple VNs are expected
in most cases, and hence this draft assumes their presence. in most cases. Hence this draft assumes the presence of VLANs.
2. VM Lifecycle 2. VM Lifecycle
Figure 2 of [RFC7666] shows the state transition of a VM. Some of the Figure 2 of [RFC7666] shows the state transition of a VM. Some of the
VM states are of interest to the external NVE. This section VM states are of interest to the external NVE. This section
illustrates the relevant phases and events in the VM lifecycle. Note illustrates the relevant phases and events in the VM lifecycle. Note
that the following subsections do not give an exhaustive traversal of that the following subsections do not give an exhaustive traversal of
VM lifecycle state. They are intended as the illustrative examples VM lifecycle state. They are intended as the illustrative examples
which are relevant to Split-NVE architecture, not as prescriptive which are relevant to Split-NVE architecture, not as prescriptive
text; the goal is to capture sufficient detail to set a context for text; the goal is to capture sufficient detail to set a context for
the signaling protocol functionality and requirements described in the signaling protocol functionality and requirements described in
the following sections. the following sections.
2.1 VM Creation Event 2.1 VM Creation Event
The VM creation event causes the VM state transition from Preparing The VM creation event causes the VM state transition from Preparing
to Shutdown and then to Running [RFC7666]. The end device allocates to Shutdown and then to Running [RFC7666]. The end device allocates
and initializes local virtual resources like storage in the VM and initializes local virtual resources like storage in the VM
Preparing state. In the Shutdown state, the VM has everything ready Preparing state. In the Shutdown state, the VM has everything ready
except that CPU execution is not scheduled by the hypervisor and VM's except that CPU execution is not scheduled by the hypervisor and VM's
memory is not resident in the hypervisor. From the Shutdown state to memory is not resident in the hypervisor. The transition from the
the Running state, normally it requires human action or a system Shutdown state to the Running state normally requires human action or
triggered event. Running state indicates the VM is in the normal a system triggered event. Running state indicates the VM is in the
execution state. As part of transitioning the VM to the Running normal execution state. As part of transitioning the VM to the
state, the hypervisor must also provision network connectivity for Running state, the hypervisor must also provision network
the VM's TSI(s) so that Ethernet frames can be sent and received connectivity for the VM's TSI(s) so that Ethernet frames can be sent
correctly. No ongoing migration, suspension or shutdown is in and received correctly. Initially, when Running, no ongoing
process. migration, suspension or shutdown is in process.
In the VM creation phase, the VM's TSI has to be associated with the In the VM creation phase, the VM's TSI has to be associated with the
external NVE. Association here indicates that hypervisor and the external NVE. Association here indicates that hypervisor and the
external NVE have signaled each other and reached some agreement. external NVE have signaled each other and reached some agreement.
Relevant networking parameters or information have been provisioned Relevant networking parameters or information have been provisioned
properly. The External NVE SHOULD be informed of the VM's TSI MAC properly. The External NVE should be informed of the VM's TSI MAC
address and/or IP address. In addition to external network address and/or IP address. In addition to external network
connectivity, the hypervisor may provide local network connectivity connectivity, the hypervisor may provide local network connectivity
between the VM's TSI and other VM's TSI that are co-resident on the between the VM's TSI and other VM's TSI that are co-resident on the
same hypervisor. When the intra or inter-hypervisor connectivity is same hypervisor. When the intra- or inter-hypervisor connectivity is
extended to the external NVE, a locally significant tag, e.g. VLAN extended to the external NVE, a locally significant tag, e.g. VLAN
ID, SHOULD be used between the hypervisor and the external NVE to ID, should be used between the hypervisor and the external NVE to
differentiate each VN's traffic. Both the hypervisor and external NVE differentiate each VN's traffic. Both the hypervisor and external NVE
sides must agree on that tag value for traffic identification, sides must agree on that tag value for traffic identification,
isolation, and forwarding. isolation, and forwarding.
The external NVE may need to do some preparation before it signals The external NVE may need to do some preparation before it signals
successful association with TSI. Such preparation may include locally successful association with the TSI. Such preparation may include
saving the states and binding information of the tenant system locally saving the states and binding information of the tenant
interface and its VN, communicating with the NVA for network system interface and its VN, communicating with the NVA for network
provisioning, etc. provisioning, etc.
Tenant System interface association SHOULD be performed before the VM Tenant System interface association should be performed before the VM
enters the Running state, preferably in the Shutdown state. If enters the Running state, preferably in the Shutdown state. If
association with external NVE fails, the VM SHOULD NOT go into the association with an external NVE fails, the VM should not go into the
Running state. Running state.
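The ordering described in Section 2.1 (associate the TSI while the VM is in Shutdown, and enter Running only if association succeeds) can be sketched in Python. This is an illustrative sketch and not part of either draft version: the names ExternalNve, associate, and start_vm, the starting VLAN ID of 100, and the rule that association fails for an unknown VN are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class ExternalNve:
    """Hypothetical model of the external NVE's association state."""
    known_vns: Set[str]                                       # VNs this NVE can serve
    vn_to_vlan: Dict[str, int] = field(default_factory=dict)  # VN -> local tag
    tsi_to_vn: Dict[str, str] = field(default_factory=dict)   # TSI MAC -> VN
    next_vlan: int = 100                                      # arbitrary first tag

    def associate(self, tsi_mac: str, vn: str) -> Optional[int]:
        """Bind a TSI to a VN; return the agreed VLAN ID, or None on failure."""
        if vn not in self.known_vns:
            return None
        if vn not in self.vn_to_vlan:
            # One locally significant tag per VN, shared by all TSIs of that VN.
            self.vn_to_vlan[vn] = self.next_vlan
            self.next_vlan += 1
        self.tsi_to_vn[tsi_mac] = vn
        return self.vn_to_vlan[vn]


def start_vm(nve: ExternalNve, tsi_mac: str, vn: str) -> str:
    """Attempt association while the VM is in Shutdown; return the resulting state."""
    vlan_id = nve.associate(tsi_mac, vn)
    return "Running" if vlan_id is not None else "Shutdown"
```

In this sketch both sides agree on the tag simply because the hypervisor uses the value returned by associate; a real protocol would need an explicit exchange to reach that agreement.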
2.2 VM Live Migration Event 2.2 VM Live Migration Event
Live migration is sometimes referred to as "hot" migration in that, Live migration is sometimes referred to as "hot" migration in that,
from an external viewpoint, the VM appears to continue to run while from an external viewpoint, the VM appears to continue to run while
being migrated to another server (e.g., TCP connections generally being migrated to another server (e.g., TCP connections generally
survive this class of migration). In contrast, "cold" migration survive this class of migration). In contrast, "cold" migration
consists of shutting down VM execution on one server and restarting consists of shutting down VM execution on one server and restarting
it on another. For simplicity, the following abstract summary about it on another. For simplicity, the following abstract summary of live
live migration assumes shared storage, so that the VM's storage is migration assumes shared storage, so that the VM's storage is
accessible to the source and destination servers. Assume VM live accessible to the source and destination servers. Assume VM live
migrates from hypervisor 1 to hypervisor 2. Such migration event migrates from hypervisor 1 to hypervisor 2. Such a migration event
involves the state transition on both hypervisors, source hypervisor involves state transitions on both source hypervisor 1 and
1 and destination hypervisor 2. VM state on source hypervisor 1 destination hypervisor 2. The VM state on source hypervisor 1
transits from Running to Migrating and then to Shutdown [RFC7666]. VM transits from Running to Migrating and then to Shutdown [RFC7666].
state on destination hypervisor 2 transits from Shutdown to Migrating The VM state on destination hypervisor 2 transits from Shutdown to
and then Running. Migrating and then Running.
The external NVE connected to destination hypervisor 2 has to The external NVE connected to destination hypervisor 2 has to
associate the migrating VM's TSI with it by discovering the TSI's MAC associate the migrating VM's TSI with it by discovering the TSI's MAC
and/or IP addresses, its VN, locally significant VLAN ID if any, and and/or IP addresses, its VN, locally significant VLAN ID if any, and
provisioning other network related parameters of the TSI. The provisioning other network related parameters of the TSI. The
external NVE may be informed about the VM's peer VMs, storage devices external NVE may be informed about the VM's peer VMs, storage devices
and other network appliances with which the VM needs to communicate and other network appliances with which the VM needs to communicate
or is communicating. The migrated VM on destination hypervisor 2 or is communicating. The migrated VM on destination hypervisor 2
SHOULD not go to Running state before all the network provisioning should not go to Running state until all the network provisioning and
and binding has been done. binding has been done.
The migrating VM SHOULD not be in Running state at the same time on The migrating VM should not be in Running state at the same time on
the source hypervisor and destination hypervisor during migration. the source hypervisor and destination hypervisor during migration.
The VM on the source hypervisor does not transition into Shutdown The VM on the source hypervisor does not transition into Shutdown
state until the VM successfully enters the Running state on the state until the VM successfully enters the Running state on the
destination hypervisor. It is possible that VM on the source destination hypervisor. It is possible that the VM on the source
hypervisor stays in Migrating state for a while after VM on the hypervisor stays in Migrating state for a while after the VM on the
destination hypervisor is in Running state. destination hypervisor enters Running state.
2.3 VM Termination Event 2.3 VM Termination Event
VM termination event is also referred to as "powering off" a VM. VM A VM termination event is also referred to as "powering off" a VM. A
termination event leads to its state going to Shutdown. There are two VM termination event leads to its state becoming Shutdown. There are
possible causes of VM termination [RFC7666]. One is the normal "power two possible causes of VM termination [RFC7666]. One is the normal
off" of a running VM; the other is that VM has been migrated to "power off" of a running VM; the other is that the VM has been
another hypervisor and the VM image on the source hypervisor has to migrated to another hypervisor and the VM image on the source
stop executing and to be shut down. hypervisor has to stop executing and be shut down.
In VM termination, the external NVE connecting to that VM needs to In VM termination, the external NVE connecting to that VM needs to
deprovision the VM, i.e. delete the network parameters associated deprovision the VM, i.e. delete the network parameters associated
with that VM. In other words, the external NVE has to de-associate with that VM. In other words, the external NVE has to de-associate
the VM's TSI. the VM's TSI.
2.4 VM Pause, Suspension and Resumption Events 2.4 VM Pause, Suspension and Resumption Events
The VM pause event leads to the VM transiting from Running state to A VM pause event leads to the VM transiting from Running state to
Paused state. The Paused state indicates that the VM is resident in Paused state. The Paused state indicates that the VM is resident in
memory but no longer scheduled to execute by the hypervisor memory but no longer scheduled to execute by the hypervisor
[RFC7666]. The VM can be easily re-activated from Paused state to [RFC7666]. The VM can be easily re-activated from Paused state to
Running state. Running state.
The VM suspension event leads to the VM transiting from Running state A VM suspension event leads to the VM transiting from Running state
to Suspended state. The VM resumption event leads to the VM to Suspended state. A VM resumption event leads to the VM transiting
transiting state from Suspended state to Running state. Suspended state from Suspended state to Running state. Suspended state means
state means the memory and CPU execution state of the virtual machine the memory and CPU execution state of the virtual machine are saved
are saved to persistent store. During this state, the virtual to persistent store. During this state, the virtual machine is not
machine is not scheduled to execute by the hypervisor [RFC7666]. scheduled to execute by the hypervisor [RFC7666].
In the Split-NVE architecture, the external NVE should not In the Split-NVE architecture, the external NVE SHOULD NOT
disassociate the paused or suspended VM as the VM can return to disassociate the paused or suspended VM as the VM can return to
Running state at any time. Running state at any time.
3. Hypervisor-to-NVE Control Plane Protocol Functionality 3. Hypervisor-to-NVE Control Plane Protocol Functionality
The following subsections show the illustrative examples of the state The following subsections show illustrative examples of the state
transitions on external NVE which are relevant to Hypervisor-to-NVE transitions of an external NVE which are relevant to Hypervisor-to-
Signaling protocol functionality. It should be noted they are not NVE Signaling protocol functionality. It should be noted this is not
prescriptive text for full state machines. prescriptive text for the full state machine.
3.1 VN Connect and Disconnect 3.1 VN Connect and Disconnect
In Split-NVE scenario, a protocol is needed between the End Device In the Split-NVE scenario, a protocol is needed between the End
(e.g. Hypervisor) making use of the external NVE and the external NVE Device (e.g. Hypervisor) and the external NVE it is using in order to
in order to make the external NVE aware of the changing VN membership make the external NVE aware of the changing VN membership
requirements of the Tenant Systems within the End Device. requirements of the Tenant Systems within the End Device.
A key driver for using a protocol rather than using static A key driver for using a protocol rather than using static
configuration of the external NVE is that the VN connectivity configuration of the external NVE is that the VN connectivity
requirements can change frequently as VMs are brought up, moved, and requirements can change frequently as VMs are brought up, moved, and
brought down on various hypervisors throughout the data center or brought down on various hypervisors throughout the data center or
external cloud. external cloud.
+---------------+ Recv VN_connect; +-------------------+ +---------------+ Receive VN_connect; +-------------------+
|VN_Disconnected| return Local_Tag value |VN_Connected | |VN_Disconnected| return Local_Tag value |VN_Connected |
+---------------+ for VN if successful; +-------------------+ +---------------+ for VN if successful; +-------------------+
|VN_ID; |-------------------------->|VN_ID; | |VN_ID; |-------------------------->|VN_ID; |
|VN_State= | |VN_State=connected;| |VN_State= | |VN_State=connected;|
|disconnected; | |Num_TSI_Associated;| |disconnected; | |Num_TSI_Associated;|
| |<----Recv VN_disconnect----|Local_Tag; | | |<--Receive VN_disconnect---|Local_Tag; |
+---------------+ |VN_Context; | +---------------+ |VN_Context; |
+-------------------+ +-------------------+
Figure 5. State Transition Example of a VAP Instance Figure 5. State Transition Example of a VAP Instance
on an External NVE on an External NVE
Figure 5 shows the state transition for a VAP on the external NVE. An Figure 5 shows the state transition for a VAP on the external NVE. An
NVE that supports the hypervisor to NVE control plane protocol should NVE that supports the hypervisor to NVE control plane protocol should
support one instance of the state machine for each active VN. The support one instance of the state machine for each active VN. The
state transition on the external NVE is normally triggered by the state transition on the external NVE is normally triggered by the
hypervisor-facing side events and behaviors. Some of the interleaved hypervisor-facing side events and behaviors. Some of the interleaved
interaction between NVE and NVA will be illustrated for better interaction between NVE and NVA will be illustrated to better explain
understanding of the whole procedure; while others of them may not be the whole procedure; while others of them may not be shown.
shown.
The external NVE MUST be notified when an End Device requires The external NVE MUST be notified when an End Device requires
connection to a particular VN and when it no longer requires connection to a particular VN and when it no longer requires
connection. In addition, the external NVE must provide a local tag connection. In addition, the external NVE must provide a local tag
value for each connected VN to the End Device to use for exchange of value for each connected VN to the End Device to use for exchanging
packets between the End Device and the external NVE (e.g. a locally packets between the End Device and the external NVE (e.g. a locally
significant 802.1Q tag value). How "local" the significance is significant [IEEE 802.1Q] tag value). How "local" the significance is
depends on whether the Hypervisor has a direct physical connection to depends on whether the Hypervisor has a direct physical connection to
the external NVE (in which case the significance is local to the the external NVE (in which case the significance is local to the
physical link), or whether there is an Ethernet switch (e.g. a blade physical link), or whether there is an Ethernet switch (e.g. a blade
switch) connecting the Hypervisor to the NVE (in which case the switch) connecting the Hypervisor to the NVE (in which case the
significance is local to the intervening switch and all the links significance is local to the intervening switch and all the links
connected to it). connected to it).
These VLAN tags are used to differentiate between different VNs as These VLAN tags are used to differentiate between different VNs as
packets cross the shared access network to the external NVE. When the packets cross the shared access network to the external NVE. When the
external NVE receives packets, it uses the VLAN tag to identify the external NVE receives packets, it uses the VLAN tag to identify their
VN of packets coming from a given TSI, strips the tag, and adds the VN coming from a given TSI, strips the tag, adds the appropriate
appropriate overlay encapsulation for that VN and sends it towards overlay encapsulation for that VN, and sends it towards the
the corresponding remote NVE across the underlying IP network. corresponding remote NVE across the underlying IP network.
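The tag handling described above can be sketched as follows. This is a minimal illustration, not part of the protocol requirements: the class and field names (TaggedFrame, EncapsulatedPacket, ExternalNveDataPath) are hypothetical, the lookup is keyed by (port, tag) as one assumed way to model local tag significance, and the overlay encapsulation is reduced to attaching a VN Context and a remote NVE address.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class TaggedFrame:
    port: str        # NVE port the frame arrived on (hypothetical field)
    vlan_tag: int    # locally significant tag carried on the access network
    payload: bytes   # tenant frame without the tag


@dataclass
class EncapsulatedPacket:
    vn_context: int      # overlay identifier for the VN
    remote_nve_ip: str   # underlay destination
    payload: bytes       # tenant frame, tag already stripped


class ExternalNveDataPath:
    def __init__(self) -> None:
        # (port, local tag) -> VN Context; keying by port is an assumption
        self.tag_to_vn: Dict[Tuple[str, int], int] = {}

    def bind(self, port: str, vlan_tag: int, vn_context: int) -> None:
        self.tag_to_vn[(port, vlan_tag)] = vn_context

    def ingress(self, frame: TaggedFrame,
                remote_nve_ip: str) -> Optional[EncapsulatedPacket]:
        # Identify the VN from the tag; unknown tags are dropped here.
        vn_context = self.tag_to_vn.get((frame.port, frame.vlan_tag))
        if vn_context is None:
            return None
        # Strip the tag (only the payload is carried forward) and add
        # the overlay encapsulation for that VN.
        return EncapsulatedPacket(vn_context, remote_nve_ip, frame.payload)
```

How the remote NVE address is resolved is out of scope for this sketch; it is passed in by the caller.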
The Identification of the VN in this protocol could either be through The Identification of the VN in this protocol could either be through
a VN Name or a VN ID. A globally unique VN Name facilitates a VN Name or a VN ID. A globally unique VN Name facilitates
portability of a Tenant's Virtual Data Center. Once an external NVE portability of a Tenant's Virtual Data Center. Once an external NVE
receives a VN connect indication, the NVE needs a way to get a VN receives a VN connect indication, the NVE needs a way to get a VN
Context allocated (or receive the already allocated VN Context) for a Context allocated (or receive the already allocated VN Context) for a
given VN Name or ID (as well as any other information needed to given VN Name or ID (as well as any other information needed to
transmit encapsulated packets). How this is done is the subject of transmit encapsulated packets). How this is done is the subject of
the NVE-to-NVA protocol which is part of work items 1 and 2 in the NVE-to-NVA protocol which is part of work items 1 and 2 in
[RFC7364]. [RFC7364].
The VN_connect message can be explicit or implicit. Explicit means The VN_connect message can be explicit or implicit. Explicit means
the hypervisor sending a message explicitly to request for the the hypervisor sends a request message explicitly for the connection
connection to a VN. Implicit means the external NVE receives other to a VN. Implicit means the external NVE receives other messages,
messages, e.g. the very first TSI associate message (see the next e.g. the very first TSI associate message (see the next subsection) for a
subsection) for a given VN, to implicitly indicate its interest to given VN, that implicitly indicate its interest in connecting to a
connect to a VN. VN.
A VN_disconnect message will indicate that the NVE can release all A VN_disconnect message indicates that the NVE can release all the
the resources for that disconnected VN and transit to VN_disconnected resources for that disconnected VN and transit to VN_disconnected
state. The local tag assigned for that VN can possibly be reclaimed state. The local tag assigned for that VN can possibly be reclaimed
by another VN. for use by another VN.
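The VAP behavior in Figure 5 can be sketched as a small state machine. This is an illustrative sketch only: the class names, the simple free-list tag allocator, and the choice to return None on failure are assumptions, and the NVE-to-NVA interaction needed to obtain a VN Context is omitted.

```python
from typing import Dict, List, Optional


class VapInstance:
    """One state machine instance per VN on the external NVE."""

    def __init__(self, vn_id: str) -> None:
        self.vn_id = vn_id
        self.state = "disconnected"
        self.local_tag: Optional[int] = None
        self.num_tsi_associated = 0


class ExternalNve:
    def __init__(self, tag_pool: List[int]) -> None:
        self.free_tags = list(tag_pool)   # hypothetical tag allocator
        self.vaps: Dict[str, VapInstance] = {}

    def vn_connect(self, vn_id: str) -> Optional[int]:
        """Handle VN_connect; return the Local_Tag if successful."""
        vap = self.vaps.setdefault(vn_id, VapInstance(vn_id))
        if vap.state == "connected":
            return vap.local_tag          # already connected, same tag
        if not self.free_tags:
            return None                   # no tag available, stay disconnected
        vap.local_tag = self.free_tags.pop(0)
        vap.state = "connected"
        return vap.local_tag

    def vn_disconnect(self, vn_id: str) -> None:
        """Handle VN_disconnect; release resources and reclaim the tag."""
        vap = self.vaps.get(vn_id)
        if vap is None or vap.state != "connected":
            return
        self.free_tags.append(vap.local_tag)
        vap.local_tag = None
        vap.num_tsi_associated = 0
        vap.state = "disconnected"
```

With a two-tag pool, a third VN cannot connect until one of the first two disconnects, at which point the released tag is reused for the new VN.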
3.2 TSI Associate and Activate 3.2 TSI Associate and Activate
Typically, a TSI is assigned a single MAC address and all frames Typically, a TSI is assigned a single MAC address and all frames
transmitted and received on that TSI use that single MAC address. As transmitted and received on that TSI use that single MAC address. As
mentioned earlier, it is also possible for a Tenant System to mentioned earlier, it is also possible for a Tenant System to
exchange frames using multiple MAC addresses or packets with multiple exchange frames using multiple MAC addresses or packets with multiple
IP addresses. IP addresses.
Particularly in the case of a TS that is forwarding frames or packets Particularly in the case of a TS that is forwarding frames or packets
from other TSs, the external NVE will need to communicate the mapping from other TSs, the external NVE will need to communicate the mapping
between the NVE's IP address (on the underlying network) and ALL the between the NVE's IP address on the underlying network and ALL the
addresses the TS is forwarding on behalf of for the corresponding VN addresses the TS is forwarding on behalf of the corresponding VN to
to the NVA. the NVA.
The NVE has two ways in which it can discover the tenant addresses The NVE has two ways it can discover the tenant addresses for which
for which frames are to be forwarded to a given End Device (and frames are to be forwarded to a given End Device (and ultimately to
ultimately to the TS within that End Device). the TS within that End Device).
1. It can glean the addresses by inspecting the source addresses in 1. It can glean the addresses by inspecting the source addresses in
packets it receives from the End Device. packets it receives from the End Device.
2. The hypervisor can explicitly signal the address associations of 2. The hypervisor can explicitly signal the address associations of
a TSI to the external NVE. The address association includes all the a TSI to the external NVE. An address association includes all the
MAC and/or IP addresses possibly used as source addresses in a packet MAC and/or IP addresses possibly used as source addresses in a packet
sent from the hypervisor to external NVE. The external NVE may sent from the hypervisor to external NVE. The external NVE may
further use this information to filter the future traffic from the further use this information to filter the future traffic from the
hypervisor. hypervisor.
To perform the second approach above, the "hypervisor-to-NVE" To use the second approach above, the "hypervisor-to-NVE" protocol
protocol requires a means to allow End Devices to communicate new must support End Devices communicating new tenant address
tenant address associations for a given TSI within a given VN. associations for a given TSI within a given VN.
Figure 6 shows the example of a state transition for a TSI connecting Figure 6 shows the example of a state transition for a TSI connecting
to a VAP on the external NVE. An NVE that supports the hypervisor to to a VAP on the external NVE. An NVE that supports the hypervisor to
NVE control plane protocol may support one instance of the state NVE control plane protocol may support one instance of the state
machine for each TSI connecting to a given VN. machine for each TSI connecting to a given VN.
disassociate; +--------+ disassociate disassociate +--------+ disassociate
+--------------->| Init |<--------------------+ +--------------->| Init |<--------------------+
| +--------+ | | +--------+ |
| | | | | | | |
| | | | | | | |
| +--------+ | | +--------+ |
| | | | | | | |
| associate | | activate | | associate | | activate |
| +-----------+ +-----------+ | | +-----------+ +-----------+ |
| | | | | | | |
| | | | | | | |
| \|/ \|/ | | \|/ \|/ |
+--------------------+ +---------------------+ +--------------------+ +---------------------+
| Associated | | Activated | | Associated | | Activated |
+--------------------+ +---------------------+ +--------------------+ +---------------------+
|TSI_ID; | |TSI_ID; | |TSI_ID; | |TSI_ID; |
|Port; |-----activate---->|Port; | |Port; |-----activate---->|Port; |
|VN_ID; | |VN_ID; | |VN_ID; | |VN_ID; |
|State=associated; | |State=activated ; |-+ |State=associated; | |State=activated ; |-+
+-|Num_Of_Addr; |<---deactivate;---|Num_Of_Addr; | | +-|Num_Of_Addr; |<---deactivate ---|Num_Of_Addr; | |
| |List_Of_Addr; | |List_Of_Addr; | | | |List_Of_Addr; | |List_Of_Addr; | |
| +--------------------+ +---------------------+ | | +--------------------+ +---------------------+ |
| /|\ /|\ | | /|\ /|\ |
| | | | | | | |
+---------------------+ +-------------------+ +---------------------+ +-------------------+
add/remove/updt addr; add/remove/updt addr; add/remove/updt addr; add/remove/updt addr;
or update port; or update port; or update port; or update port;
Figure 6 State Transition Example of a TSI Instance Figure 6 State Transition Example of a TSI Instance
on an External NVE on an External NVE
The Associated state of a TSI instance on an external NVE indicates The Associated state of a TSI instance on an external NVE indicates
all the addresses for that TSI have already been associated with the VAP all the addresses for that TSI have already been associated with the VAP
of the external NVE on port p for a given VN but no real traffic to of the external NVE on a given port e.g. on port p for a given VN but
and from the TSI is expected and allowed to pass through. An NVE has no real traffic to and from the TSI is expected and allowed to pass
reserved all the necessary resources for that TSI. An external NVE through. An NVE has reserved all the necessary resources for that
may report the mappings of its' underlay IP address and the TSI. An external NVE may report the mappings of its underlay IP
associated TSI addresses to NVA and relevant network nodes may save address and the associated TSI addresses to NVA and relevant network
such information to its mapping table but not forwarding table. A NVE nodes may save such information to their mapping tables but not their
may create ACL or filter rules based on the associated TSI addresses forwarding tables. An NVE may create ACL or filter rules based on the
on the attached port p but not enable them yet. Local tag for the VN associated TSI addresses on that attached port p but not enable them
corresponding to the TSI instance should be provisioned on port p to yet. The local tag for the VN corresponding to the TSI instance
receive packets. should be provisioned on port p to receive packets.
VM migration event (discussed in section 2) may cause the hypervisor to The VM migration event (discussed in section 2) may cause the hypervisor
send an associate message to the NVE connected to the destination to send an associate message to the NVE connected to the destination
hypervisor the VM migrates to. VM creation event may also lead to the hypervisor of the migration. A VM creation event may also lead to
same practice. the same practice.
The Activated state of a TSI instance on an external NVE indicates The Activated state of a TSI instance on an external NVE indicates
that all the addresses for that TSI are functioning correctly on port that all the addresses for that TSI are functioning correctly on a
p and traffic can be received from and sent to that TSI via the NVE. given port e.g. port p and traffic can be received from and sent to
The mappings of the NVE's underlay IP address and the associated TSI that TSI via the NVE. The mappings of the NVE's underlay IP address
addresses should be put into the forwarding table rather than the and the associated TSI addresses should be put into the forwarding
mapping table on relevant network nodes. ACL or filter rules based on table rather than the mapping table on relevant network nodes. ACL or
the associated TSI addresses on the attached port p in NVE are filter rules based on the associated TSI addresses on the attached
enabled. The local tag for the VN corresponding to the TSI instance port p in the NVE are enabled. The local tag for the VN corresponding
MUST be provisioned on port p to receive packets. to the TSI instance MUST be provisioned on port p to receive packets.
The Activate message makes the state transit from Init or Associated The Activate message makes the state transit from Init or Associated
to Activated. VM creation, VM migration and VM resumption events to Activated. VM creation, VM migration, and VM resumption events
discussed in Section 2 may trigger the Activate message to be sent discussed in Section 2 may trigger sending the Activate message from
from the hypervisor to the external NVE. the hypervisor to the external NVE.
TSI information may get updated either in Associated or Activated TSI information may get updated in either the Associated or Activated
state. The following are considered updates to the TSI information: state. The following are considered updates to the TSI information:
add or remove the associated addresses, update current associated add or remove the associated addresses, update the current associated
addresses (for example updating IP for a given MAC), update NVE port addresses (for example updating IP for a given MAC), and update the
information based on where the NVE receives messages. Such updates do NVE port information based on where the NVE receives messages. Such
not change the state of TSI. When any address associated to a given updates do not change the state of TSI. When any address associated
TSI changes, the NVE should inform the NVA to update the mapping with a given TSI changes, the NVE should inform the NVA to update the
information on NVE's underlying address and the associated TSI mapping information for NVE's underlying address and the associated
addresses. The NVE should also change its local ACL or filter TSI addresses. The NVE should also change its local ACL or filter
settings accordingly for the relevant addresses. Port information settings accordingly for the relevant addresses. Port information
update will cause the local tag for the VN corresponding to the TSI updates will cause the provisioning of the local tag for the VN
instance to be provisioned on the new port p and removed from the old corresponding to the TSI instance on the new port and removal from the
port. old port.
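The TSI state transitions of Figure 6, together with the update rules above, can be sketched as a transition table. This is a non-normative illustration: the names TsiState, TRANSITIONS, and TsiInstance are hypothetical, rejecting an unexpected message with an exception is an assumed policy, and the NVA notification and ACL changes are only indicated in comments.

```python
from enum import Enum
from typing import Iterable, Set


class TsiState(Enum):
    INIT = "init"
    ASSOCIATED = "associated"
    ACTIVATED = "activated"


# (current state, message) -> next state, following Figure 6
TRANSITIONS = {
    (TsiState.INIT, "associate"): TsiState.ASSOCIATED,
    (TsiState.INIT, "activate"): TsiState.ACTIVATED,
    (TsiState.ASSOCIATED, "activate"): TsiState.ACTIVATED,
    (TsiState.ACTIVATED, "deactivate"): TsiState.ASSOCIATED,
    (TsiState.ASSOCIATED, "disassociate"): TsiState.INIT,
    (TsiState.ACTIVATED, "disassociate"): TsiState.INIT,
}


class TsiInstance:
    def __init__(self, tsi_id: str, vn_id: str, port: str) -> None:
        self.tsi_id = tsi_id
        self.vn_id = vn_id
        self.port = port
        self.state = TsiState.INIT
        self.addresses: Set[str] = set()

    def handle(self, message: str) -> None:
        next_state = TRANSITIONS.get((self.state, message))
        if next_state is None:
            raise ValueError(f"{message!r} not valid in {self.state.name}")
        self.state = next_state
        if next_state is TsiState.INIT:
            # Returning to Init releases everything held for the TSI.
            self.addresses.clear()

    def update_addresses(self, add: Iterable[str] = (),
                         remove: Iterable[str] = ()) -> None:
        # Address updates are allowed in Associated or Activated state and
        # do not change the state. A real NVE would also inform the NVA
        # and adjust its ACL or filter settings here.
        if self.state is TsiState.INIT:
            raise ValueError("no TSI information to update in INIT")
        self.addresses |= set(add)
        self.addresses -= set(remove)

    def update_port(self, new_port: str) -> None:
        # The local tag would be provisioned on new_port and removed
        # from the old port; the state is unchanged.
        if self.state is TsiState.INIT:
            raise ValueError("no TSI information to update in INIT")
        self.port = new_port
```

Keeping the transitions in a table makes it easy to compare the sketch against the figure: each arrow in Figure 6 corresponds to exactly one entry.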
3.3 TSI Disassociate and Deactivate 3.3 TSI Disassociate and Deactivate
Disassociate and deactivate conceptually are the reverse behaviors of Disassociate and deactivate behaviors are conceptually the reverse of
associate and activate. From Activated state to Associated state, the associate and activate.
external NVE needs to make sure the resources are still reserved but
the addresses associated to the TSI are not functioning and no From Activated state to Associated state, the external NVE needs to
traffic to and from the TSI is expected and allowed to pass through. make sure the resources are still reserved but the addresses
For example, the NVE needs to inform the NVA to remove the relevant associated to the TSI are not functioning. No traffic to or from the
addresses mapping information from forwarding or routing table. ACL TSI is expected or allowed to pass through. For example, the NVE
or filtering rules regarding the relevant addresses should be needs to tell the NVA to remove the relevant addresses mapping
disabled. From Associated or Activated state to the Init state, the information from forwarding and routing tables. ACL and filtering
NVE will release all the resources relevant to TSI instances. The NVE rules regarding the relevant addresses should be disabled.
should also inform the NVA to remove the relevant entries from
mapping table. ACL or filtering rules regarding the relevant From Associated or Activated state to the Init state, the NVE
addresses should be removed. Local tag provisioning on the connecting releases all the resources relevant to TSI instances. The NVE should
port on NVE SHOULD be cleared. also inform the NVA to remove the relevant entries from mapping
table. ACL or filtering rules regarding the relevant addresses should
be removed. Local tag provisioning on the connecting port on NVE
SHOULD be cleared.
A VM suspension event (discussed in section 2) may cause the relevant A VM suspension event (discussed in section 2) may cause the relevant
TSI instance(s) on the NVE to transit from Activated to Associated TSI instance(s) on the NVE to transit from Activated to Associated
state. A VM pause event normally does not affect the state of the state.
relevant TSI instance(s) on the NVE as the VM is expected to run
again soon. The VM shutdown event will normally cause the relevant
TSI instance(s) on NVE transition to Init state from Activated state.
All resources should be released.
A VM migration will lead the TSI instance on the source NVE to leave A VM pause event normally does not affect the state of the relevant
TSI instance(s) on the NVE as the VM is expected to run again soon.
A VM shutdown event will normally cause the relevant TSI instance(s)
on the NVE to transition to Init state from Activated state. All
resources should be released.
A VM migration will cause the TSI instance on the source NVE to leave
Activated state. When a VM migrates to another hypervisor connecting Activated state. When a VM migrates to another hypervisor connecting
to the same NVE, i.e. source and destination NVE are the same, the NVE to the same NVE, i.e. source and destination NVE are the same, the NVE
should use TSI_ID and incoming port to differentiate the two TSI should use TSI_ID and incoming port to differentiate the two TSI
instance. instances.
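The typical relationship between VM events and TSI messages described above can be summarized in a lookup table. This is a hedged sketch: the event names and the function tsi_message_for are hypothetical, and the text only says these events "may" or "normally" cause the transitions, so a real hypervisor could choose differently (for example, sending associate before activate on VM creation). Migration is left out because the message depends on whether the NVE is the source or the destination.

```python
from typing import Optional

# VM event -> TSI message the hypervisor would typically send.
# None means the event normally does not change the TSI state.
VM_EVENT_TO_TSI_MESSAGE = {
    "create": "activate",        # Init -> Activated
    "resume": "activate",        # Associated -> Activated
    "suspend": "deactivate",     # Activated -> Associated
    "pause": None,               # VM expected to run again soon
    "shutdown": "disassociate",  # Activated -> Init, release resources
}


def tsi_message_for(vm_event: str) -> Optional[str]:
    """Return the TSI message for a VM event, or None if no message."""
    if vm_event not in VM_EVENT_TO_TSI_MESSAGE:
        raise KeyError(f"unknown VM event: {vm_event}")
    return VM_EVENT_TO_TSI_MESSAGE[vm_event]
```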
Although the triggering messages for state transition shown in Figure Although the triggering messages for the state transition shown in
6 do not indicate the difference between VM creation/shutdown event Figure 6 do not indicate the difference between a VM
and VM migration arrival/departure event, the external NVE can make creation/shutdown event and a VM migration arrival/departure event,
optimizations if it is notified of such information. For example, if the external NVE can make optimizations if it is given such
the NVE knows the incoming activate message is caused by migration information. For example, if the NVE knows the incoming activate
rather than VM creation, some mechanisms may be employed or triggered message is caused by migration rather than VM creation, some
to make sure the dynamic configurations or provisionings on the mechanisms may be employed or triggered to make sure the dynamic
destination NVE are the same as those on the source NVE for the configurations or provisionings on the destination NVE are the same
migrated VM. For example an IGMP query [RFC2236] can be triggered by as those on the source NVE for the migrated VM. For example an IGMP
the destination external NVE to the migrated VM on destination query [RFC2236] can be triggered by the destination external NVE to
hypervisor so that the VM is forced to answer an IGMP report to the the migrated VM so that the VM is forced to send an IGMP report to the
multicast router. Then a multicast router can correctly send the multicast router. Then a multicast router can correctly route the
multicast traffic to the new external NVE for those multicast groups multicast traffic to the new external NVE for those multicast groups
the VM had joined before the migration. the VM joined before the migration.
4. Hypervisor-to-NVE Control Plane Protocol Requirements 4. Hypervisor-to-NVE Control Plane Protocol Requirements
Req-1: The protocol MUST support a bridged network connecting End Req-1: The protocol MUST support a bridged network connecting End
Devices to External NVE. Devices to the External NVE.
Req-2: The protocol MUST support multiple End Devices sharing the Req-2: The protocol MUST support multiple End Devices sharing the
same External NVE via the same physical port across a bridged same External NVE via the same physical port across a bridged
network. network.
Req-3: The protocol MAY support an End Device using multiple external Req-3: The protocol MAY support an End Device using multiple external
NVEs simultaneously, but only one external NVE for each VN. NVEs simultaneously, but only one external NVE for each VN.
Req-4: The protocol MAY support an End Device using multiple external Req-4: The protocol MAY support an End Device using multiple external
NVEs simultaneously for the same VN. NVEs simultaneously for the same VN.
Req-5: The protocol MUST allow the End Device initiating a request to Req-5: The protocol MUST allow the End Device to initiate a request
its associated External NVE to be connected/disconnected to a given to its associated External NVE to be connected/disconnected to a
VN. given VN.
Req-6: The protocol MUST allow an External NVE to initiate a request Req-6: The protocol MUST allow an External NVE to initiate a request
to its connected End Devices to be disconnected to a given VN. to its connected End Devices to be disconnected from a given VN.
Req-7: When a TS attaches to a VN, the protocol MUST allow for an End Req-7: When a TS attaches to a VN, the protocol MUST allow for an End
Device and its external NVE to negotiate one or more locally- Device and its external NVE to negotiate one or more locally-
significant tag(s) for carrying traffic associated with a specific VN significant tag(s) for carrying traffic associated with a specific VN
(e.g., 802.1Q tags). (e.g., [IEEE 802.1Q] tags).
Req-8: The protocol MUST allow an End Device to initiate a request to Req-8: The protocol MUST allow an End Device to initiate a request to
associate/disassociate and/or activate/deactivate some or all associate/disassociate and/or activate/deactivate some or all
address(es) of a TSI instance to a VN on an NVE port. address(es) of a TSI instance to a VN on an NVE port.
Req-9: The protocol MUST allow the External NVE to initiate a request Req-9: The protocol MUST allow the External NVE to initiate a request
to disassociate and/or deactivate some or all address(es) of a TSI to disassociate and/or deactivate some or all address(es) of a TSI
instance to a VN on an NVE port. instance to a VN on an NVE port.
Req-10: The protocol MUST allow an End Device to initiate a request to Req-10: The protocol MUST allow an End Device to initiate a request to
skipping to change at page 17, line 33 skipping to change at page 17, line 40
the external NVE. Addresses can be expressed in different formats, the external NVE. Addresses can be expressed in different formats,
for example, MAC, IP or pair of IP and MAC. for example, MAC, IP or pair of IP and MAC.
Req-11: The protocol MUST allow the External NVE to authenticate the Req-11: The protocol MUST allow the External NVE to authenticate the
End Device connected. End Device connected.
Req-12: The protocol MUST be able to run over L2 links between the Req-12: The protocol MUST be able to run over L2 links between the
End Device and its External NVE. End Device and its External NVE.
Req-13: The protocol SHOULD support the End Device indicating if an Req-13: The protocol SHOULD support the End Device indicating if an
associate or activate request from it results from a VM hot migration associate or activate request from it is the result of a VM hot
event. migration event.
5. VDP Applicability and Enhancement Needs 5. VDP Applicability and Enhancement Needs
Virtual Station Interface (VSI) Discovery and Configuration Protocol Virtual Station Interface (VSI) Discovery and Configuration Protocol
(VDP) [IEEE 802.1Qbg] can be the control plane protocol running (VDP) [IEEE 802.1Qbg] can be the control plane protocol running
between the hypervisor and the external NVE. Appendix A illustrates between the hypervisor and the external NVE. Appendix A illustrates
VDP for the reader's information. VDP for the reader's information.
VDP facilitates the automatic discovery and configuration for Edge VDP facilitates the automatic discovery and configuration of Edge
Virtual Bridging (EVB) station and Edge Virtual Bridging (EVB) Virtual Bridging (EVB) stations and Edge Virtual Bridging (EVB)
bridge. EVB station is normally an end station running multiple VMs. bridges. An EVB station is normally an end station running multiple
It is conceptually equivalent to hypervisor in this document. And EVB VMs. It is conceptually equivalent to a hypervisor in this document.
bridge is conceptually equivalent to the external NVE. An EVB bridge is conceptually equivalent to the external NVE.
VDP is able to pre-associate/associate/de-associate a VSI on an EVB VDP is able to pre-associate/associate/de-associate a VSI on an EVB
station to a port on the EVB bridge. VSI is approximately the concept station with a port on the EVB bridge. A VSI is approximately the
of a virtual port a VM connects to the hypervisor in this document concept of a virtual port by which a VM connects to the hypervisor in
context. The EVB station and the EVB bridge can reach agreement on this document's context. The EVB station and the EVB bridge can reach
VLAN ID(s) assigned to a VSI via VDP message exchange. Other agreement on VLAN ID(s) assigned to a VSI via VDP message exchange.
configuration parameters can be exchanged via VDP as well. VDP is Other configuration parameters can be exchanged via VDP as well. VDP
carried over the Edge Control Protocol (ECP) [IEEE8021Qbg] which is carried over the Edge Control Protocol (ECP) [IEEE 802.1Qbg] which
provides reliable transport over a layer 2 network. provides reliable transport over a layer 2 network.
VDP protocol needs some extensions to fulfill the requirements listed VDP protocol needs some extensions to fulfill the requirements listed
in this document. Table 1 shows the needed extensions and/or in this document. Table 1 shows the needed extensions and/or
clarifications in the NVO3 context. clarifications in the NVO3 context.
+------+-----------+-----------------------------------------------+ +------+-----------+-----------------------------------------------+
| Req | Supported | remarks | | Req | Supported | remarks |
| | by VDP? | | | | by VDP? | |
+------+-----------+-----------------------------------------------+ +------+-----------+-----------------------------------------------+
| Req-1| | | | Req-1| | |
+------+ |Needs extension. Must be able to send to a | +------+ |Needs extension. Must be able to send to a |
| Req-2| |specific unicast MAC and should be able to send| | Req-2| |specific unicast MAC and should be able to send|
+------+ Partially |to a non-reserved well known multicast address | +------+ Partially |to a non-reserved well known multicast address |
| Req-3| |other than the nearest customer bridge address | | Req-3| |other than the nearest customer bridge address.|
+------+ | | +------+ | |
| Req-4| | | | Req-4| | |
+------+-----------+-----------------------------------------------+ +------+-----------+-----------------------------------------------+
| Req-5| Yes |VN is indicated by GroupID | | Req-5| Yes |VN is indicated by GroupID |
+------+-----------+-----------------------------------------------+ +------+-----------+-----------------------------------------------+
| Req-6| Yes |Bridge sends De-Associate | | Req-6| Yes |Bridge sends De-Associate |
+------+-----------+------------------------+----------------------+ +------+-----------+------------------------+----------------------+
| | |VID==NULL in request and bridge returns the | | | |VID==NULL in request and bridge returns the |
| Req-7| Yes |assigned value in response or specify GroupID | | Req-7| Yes |assigned value in response or specify GroupID |
| | |in request and get VID assigned in returning | | | |in request and get VID assigned in returning |
| | |response. Multiple VLANs per group is allowed | | | |response. Multiple VLANs per group are allowed.|
+------+-----------+------------------------+----------------------+ +------+-----------+------------------------+----------------------+
| | | requirements | VDP equivalence | | | | requirements | VDP equivalence |
| | +------------------------+----------------------+ | | +------------------------+----------------------+
| | | associate/disassociate|pre-asso/de-associate | | | | associate/disassociate|pre-asso/de-associate |
| Req-8| Partially | activate/deactivate |associate/de-associate| | Req-8| Partially | activate/deactivate |associate/de-associate|
| | +------------------------+----------------------| | | +------------------------+----------------------|
| | |Needs extension to allow associate->pre-assoc | | | |Needs extension to allow associate->pre-assoc |
+------+-----------+------------------------+----------------------+ +------+-----------+------------------------+----------------------+
| Req-9| Yes | VDP bridge initiates de-associate | | Req-9| Yes | VDP bridge initiates de-associate |
+------+-----------+-----------------------------------------------+ +------+-----------+-----------------------------------------------+
|Req-10| Partially |Needs extension for IPv4/IPv6 address. Add a | |Req-10| Partially |Needs extension for IPv4/IPv6 address. Add a |
| | |new "filter info format" type | | | |new "filter info format" type. |
+------+-----------+-----------------------------------------------+ +------+-----------+-----------------------------------------------+
|Req-11| No |Out-of-band mechanism is preferred, e.g. MACSec| |Req-11| No |Out-of-band mechanism is preferred, e.g. MACSec|
| | |or 802.1X. | | | |or 802.1X. |
+------+-----------+-----------------------------------------------+ +------+-----------+-----------------------------------------------+
|Req-12| Yes |L2 protocol naturally | |Req-12| Yes |L2 protocol naturally |
+------+-----------+-----------------------------------------------+ +------+-----------+-----------------------------------------------+
| | |M bit for migrated VM on destination hypervisor| | | |M bit for migrated VM on destination hypervisor|
| | |and S bit for that on source hypervisor. | | | |and S bit for that on source hypervisor. |
|Req-13| Partially |It is indistinguishable when M/S is 0 between | |Req-13| Partially |It is indistinguishable when M/S is 0 between |
| | |no guidance and events not caused by migration | | | |no guidance and events not caused by migration |
| | |where NVE may act differently. Needs new | | | |where NVE may act differently. Needs new |
| | |bits for migration indication in new | | | |bits for migration indication in new |
| | |"filter info format" type | | | |"filter info format" type. |
+------+-----------+-----------------------------------------------+ +------+-----------+-----------------------------------------------+
Table 1 Compare VDP with the requirements Table 1 Compare VDP with the requirements
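The Req-8 row of Table 1 can be read as a lookup from requirement-level operations to the closest existing VDP message. The following is a minimal sketch of that reading, not part of the draft. The state labels and the function name state_after are hypothetical and chosen only for illustration.

```python
# Requirement-level operation -> closest existing VDP message (Table 1, Req-8).
REQ_TO_VDP = {
    "associate": "pre-associate",
    "disassociate": "de-associate",
    "activate": "associate",
    "deactivate": "de-associate",
}

# VSI state reached after each VDP message. These labels are hypothetical.
VDP_RESULT_STATE = {
    "pre-associate": "associated",   # known to the bridge, not carrying traffic
    "associate": "activated",        # configuration applied to traffic
    "de-associate": "unassociated",  # association and resources released
}


def state_after(operation):
    """Return the VSI state after mapping a requirement operation onto VDP."""
    return VDP_RESULT_STATE[REQ_TO_VDP[operation]]
```

Under this reading, state_after("deactivate") yields "unassociated" rather than "associated", which is the gap the table's "Needs extension to allow associate->pre-assoc" remark points at.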
By simply adding the ability to carry layer 3 addresses, VDP can serve By simply adding the ability to carry layer 3 addresses, VDP can serve
the Hypervisor-to-NVE control plane functions well. Other the Hypervisor-to-NVE control plane functions well. Other
extensions are the improvement of the protocol capabilities for extensions are the improvement of the protocol capabilities for
better fit in NVO3 network. better fit in an NVO3 network.
6. Security Considerations 6. Security Considerations
NVEs must ensure that only properly authorized Tenant Systems are NVEs must ensure that only properly authorized Tenant Systems are
allowed to join and become a part of any specific Virtual Network. In allowed to join and become a part of any particular Virtual Network.
addition, NVEs will need appropriate mechanisms to ensure that any In addition, NVEs will need appropriate mechanisms to ensure that any
hypervisor wishing to use the services of an NVE is properly hypervisor wishing to use the services of an NVE is properly
authorized to do so. One design point is whether the hypervisor authorized to do so. One design point is whether the hypervisor
should supply the NVE with necessary information (e.g., VM addresses, should supply the NVE with necessary information (e.g., VM addresses,
VN information, or other parameters) that the NVE uses directly, or VN information, or other parameters) that the NVE uses directly, or
whether the hypervisor should only supply a VN ID and an identifier whether the hypervisor should only supply a VN ID and an identifier
for the associated VM (e.g., its MAC address), with the NVE using for the associated VM (e.g., its MAC address), with the NVE using
that information to obtain the information needed to validate the that information to obtain the information needed to validate the
hypervisor-provided parameters or obtain related parameters in a hypervisor-provided parameters or obtain related parameters in a
secure manner. secure manner.
skipping to change at page 20, line 13 skipping to change at page 20, line 19
The authors would like to specially thank Lucy Yong and Jon Hudson The authors would like to specially thank Lucy Yong and Jon Hudson
for their generous help in improving this document. for their generous help in improving this document.
8. References 8. References
8.1 Normative References 8.1 Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2236] Fenner, W., "Internet Group Management Protocol, Version
2", RFC 2236, November 1997.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, May 2017. 2119 Key Words", BCP 14, RFC 8174, May 2017.
8.2 Informative References 8.2 Informative References
[RFC4122] Leach, P., Mealling, M., and R. Salz, "A Universally
Unique IDentifier (UUID) URN Namespace", RFC 4122, July
2005.
[RFC7364] Narten, T., Gray, E., Black, D., Fang, L., Kreeger, L., and [RFC7364] Narten, T., Gray, E., Black, D., Fang, L., Kreeger, L., and
M. Napierala, "Problem Statement: Overlays for Network M. Napierala, "Problem Statement: Overlays for Network
Virtualization", October 2014. Virtualization", October 2014.
[RFC7365] Lasserre, M., Balus, F., Morin, T., Bitar, N., and Y. [RFC7365] Lasserre, M., Balus, F., Morin, T., Bitar, N., and Y.
Rekhter, "Framework for DC Network Virtualization", Rekhter, "Framework for DC Network Virtualization",
October 2014. October 2014.
[RFC8014] Black, D., Hudson, J., Kreeger, L., Lasserre, M., Narten, [RFC8014] Black, D., Hudson, J., Kreeger, L., Lasserre, M., Narten,
T., "An Architecture for Data-Center Network T., "An Architecture for Data-Center Network
Virtualization over Layer 3 (NVO3)", December 2016. Virtualization over Layer 3 (NVO3)", December 2016.
[RFC7666] Asai H., MacFaden M., Schoenwaelder J., Shima K., Tsou T., [RFC7666] Asai H., MacFaden M., Schoenwaelder J., Shima K., Tsou T.,
"Management Information Base for Virtual Machines "Management Information Base for Virtual Machines
Controlled by a Hypervisor", October 2015. Controlled by a Hypervisor", October 2015.
[IEEE 802.1Qbg] IEEE, "Media Access Control (MAC) Bridges and Virtual [IEEE 802.1Qbg] IEEE, "Media Access Control (MAC) Bridges and Virtual
Bridged Local Area Networks - Amendment 21: Edge Virtual Bridged Local Area Networks - Amendment 21: Edge Virtual
Bridging", IEEE Std 802.1Qbg, 2012 Bridging", IEEE Std 802.1Qbg, 2012
[8021Q] IEEE, "Media Access Control (MAC) Bridges and Virtual Bridged [IEEE 802.1Q] IEEE, "Media Access Control (MAC) Bridges and Virtual
Local Area Networks", IEEE Std 802.1Q-2011, August, 2011 Bridged Local Area Networks", IEEE Std 802.1Q-2014,
November 2014.
Appendix A. IEEE 802.1Qbg VDP Illustration (For information only) Appendix A. IEEE 802.1Qbg VDP Illustration (For information only)
VDP (VSI Discovery and Configuration Protocol) messages The VDP (VSI Discovery and Configuration Protocol [IEEE
are formatted as a TLV as shown in Figure A.1. Virtual Station Interface 802.1Qbg]) can be considered as a controlling protocol running between
(VSI) is an interface to a virtual station that is attached to a the hypervisor and the external bridge. VDP association TLV structure
downlink port of an internal bridging function in a server. VSI's VDP is formatted as shown in Figure A.1.
packet will be handled by an external bridge. VDP is the controlling
protocol running between the hypervisor and the external bridge.
+--------+--------+------+----+----+------+------+------+-----------+ +--------+--------+------+-----+--------+------+------+-------+------+
|TLV type|TLV info|Status|VSI |VSI |VSIID | VSIID|Filter|Filter Info| |TLV type|TLV info|Status|VSI |VSI Type|VSI ID|VSI ID|Filter |Filter|
| 7b |str len | |Type|Type|Format| | Info | | | |string | |Type |Version |Format| |Info |Info |
| | 9b | 1oct |ID |Ver | | |format| | | |length | |ID | | | |format | |
| | | |3oct|1oct| 1oct |16oct |1oct | M oct | +--------+--------+------+-----+--------+------+------+-------+------+
+--------+--------+------+----+----+------+------+------+-----------+ | | |<----VSI type&instance----->|<--Filter---->|
| | | | | | | |<-------------VSI attributes-------------->|
| | |<--VSI type&instance-->|<----Filter------>| |<--TLV header--->|<-----------TLV information string -------------->|
| | |<------------VSI attributes-------------->|
|<--TLV header--->|<-------TLV info string = 23 + M octets--------->|
Figure A.1: VDP TLV definitions Figure A.1: VDP association TLV
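The field widths given in the -11 column of Figure A.1 (7-bit type, 9-bit length, then 23 + M octets of information string) can be sketched as a packing routine. This is an illustrative sketch only. It assumes network byte order and that the TLV type occupies the upper 7 bits of the 2-octet header; the function name build_vdp_tlv is hypothetical, and the numeric TLV type values are not given in this document, so the caller supplies them.

```python
import struct


def build_vdp_tlv(tlv_type, status, vsi_type_id, vsi_type_version,
                  vsiid_format, vsiid, filter_info_format, filter_info):
    """Pack one VDP TLV using the field widths shown in Figure A.1."""
    if not 0 <= tlv_type < 2 ** 7:
        raise ValueError("TLV type must fit in 7 bits")
    if not 0 <= vsi_type_id < 2 ** 24:
        raise ValueError("VSI Type ID must fit in 3 octets")
    if len(vsiid) != 16:
        raise ValueError("VSIID must be exactly 16 octets")
    info = (
        bytes([status])                       # Status, 1 octet
        + vsi_type_id.to_bytes(3, "big")      # VSI Type ID, 3 octets
        + bytes([vsi_type_version])           # VSI Type Version, 1 octet
        + bytes([vsiid_format])               # VSIID Format, 1 octet
        + vsiid                               # VSIID, 16 octets
        + bytes([filter_info_format])         # Filter Info format, 1 octet
        + filter_info                         # Filter Info, M octets
    )
    if len(info) >= 2 ** 9:
        raise ValueError("information string must fit a 9-bit length")
    # Assumed header layout: 7-bit type followed by 9-bit length.
    header = struct.pack("!H", (tlv_type << 9) | len(info))
    return header + info
```

With a 4-octet Filter Info field (M = 4), the information string is 27 octets and the whole TLV is 29 octets.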
There are basically four TLV types. There are basically four TLV types.
1. Pre-Associate: Pre-Associate is used to pre-associate a VSI instance 1. Pre-associate: Pre-associate is used to pre-associate a VSI instance
with a bridge port. The bridge validates the request and returns a with a bridge port. The bridge validates the request and returns a
failure Status in case of errors. Successful pre-association does not failure Status in case of errors. Successful pre-associate does not
imply that the indicated VSI Type or provisioning will be applied to any imply that the indicated VSI Type or provisioning will be applied to any
traffic flowing through the VSI. The pre-associate enables faster traffic flowing through the VSI. The pre-associate enables faster
response to an associate, by allowing the bridge to obtain the VSI Type response to an associate, by allowing the bridge to obtain the VSI Type
prior to an association. prior to an association.
2. Pre-Associate with resource reservation: Pre-Associate with Resource 2. Pre-associate with resource reservation: Pre-associate with Resource
Reservation involves the same steps as Pre-Associate, but on successful Reservation involves the same steps as Pre-associate, but on success it
pre-association also reserves resources in the Bridge to prepare for a also reserves resources in the bridge to prepare for a subsequent
subsequent Associate request. Associate request.
3. Associate: The Associate creates and activates an association between 3. Associate: Associate creates and activates an association between a
a VSI instance and a bridge port. The Bridge allocates any required VSI instance and a bridge port. The bridge allocates any required bridge
bridge resources for the referenced VSI. The Bridge activates the resources for the referenced VSI. The bridge activates the configuration
configuration for the VSI Type ID. This association is then applied to for the VSI Type ID. This association is then applied to the traffic
the traffic flow to/from the VSI instance. flow to/from the VSI instance.
4. Deassociate: The de-associate is used to remove an association 4. De-associate: The de-associate is used to remove an association
between a VSI instance and a bridge port. Pre-Associated and Associated between a VSI instance and a bridge port. Pre-associated and associated
VSIs can be de-associated. De-associate releases any resources that were VSIs can be de-associated. De-associate releases any resources that were
reserved as a result of prior Associate or Pre-Associate operations for reserved as a result of prior associate or pre-associate operations for
that VSI instance. that VSI instance.
Deassociate can be initiated by either side and the rest types of De-associate can be initiated by either side and the other types can
messages can only be initiated by the server side. only be initiated by the server side.
Some important flag values in VDP Status field: Some important flag values in VDP Status field:
1. M-bit (Bit 5): Indicates that the user of the VSI (e.g., the VM) is 1. M-bit (Bit 5): Indicates that the user of the VSI (e.g., the VM) is
migrating (M-bit = 1) or provides no guidance on the migration of the migrating (M-bit = 1) or provides no guidance on the migration of the
user of the VSI (M-bit = 0). The M-bit is used as an indicator relative user of the VSI (M-bit = 0). The M-bit is used as an indicator relative
to the VSI that the user is migrating to. to the VSI that the user is migrating to.
2. S-bit (Bit 6): Indicates that the VSI user (e.g., the VM) is 2. S-bit (Bit 6): Indicates that the VSI user (e.g., the VM) is
suspended (S-bit = 1) or provides no guidance as to whether the user of suspended (S-bit = 1) or provides no guidance as to whether the user of
the VSI is suspended (S-bit = 0). A keep-alive Associate request with the VSI is suspended (S-bit = 0). A keep-alive Associate request with
S-bit = 1 can be sent when the VSI user is suspended. The S-bit is used S-bit = 1 can be sent when the VSI user is suspended. The S-bit is used
as an indicator relative to the VSI that the user is migrating from. as an indicator relative to the VSI that the user is migrating from.
The filter information format currently supports 4 types as the The filter information format currently defines 4 types. Each
following. filter information format is shown in detail below.
1. VID Filter Info format 1. VID Filter Info format
+---------+------+-------+--------+ +---------+------+-------+--------+
| #of | PS | PCP | VID | | #of | PS | PCP | VID |
|entries |(1bit)|(3bits)|(12bits)| |entries |(1bit)|(3bits)|(12bits)|
|(2octets)| | | | |(2octets)| | | |
+---------+------+-------+--------+ +---------+------+-------+--------+
|<--Repeated per entry->| |<--Repeated per entry->|
Figure A.2 VID Filter Info format Figure A.2 VID Filter Info format
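The VID Filter Info layout in Figure A.2 can be sketched as a 2-octet entry count followed by one 16-bit word per entry. This sketch assumes network byte order and that PS, PCP and VID are packed in the order drawn, with PS as the most significant bit; the function names are hypothetical.

```python
import struct


def pack_vid_entry(ps, pcp, vid):
    """Pack PS (1 bit), PCP (3 bits) and VID (12 bits) into 2 octets."""
    if ps not in (0, 1) or not 0 <= pcp <= 7 or not 0 <= vid <= 0xFFF:
        raise ValueError("field out of range")
    return struct.pack("!H", (ps << 15) | (pcp << 12) | vid)


def pack_vid_filter(entries):
    """Pack a list of (ps, pcp, vid) tuples, prefixed by the entry count."""
    body = b"".join(pack_vid_entry(*entry) for entry in entries)
    return struct.pack("!H", len(entries)) + body


def unpack_vid_filter(data):
    """Inverse of pack_vid_filter: return a list of (ps, pcp, vid) tuples."""
    (count,) = struct.unpack_from("!H", data, 0)
    entries = []
    for i in range(count):
        (word,) = struct.unpack_from("!H", data, 2 + 2 * i)
        entries.append((word >> 15, (word >> 12) & 0x7, word & 0xFFF))
    return entries
```

Packing and then unpacking the same entry list returns the original tuples, which is a convenient self-check for the bit layout assumed here.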
2. MAC/VID filter format 2. MAC/VID Filter Info format
+---------+--------------+------+-------+--------+ +---------+--------------+------+-------+--------+
| #of | MAC address | PS | PCP | VID | | #of | MAC address | PS | PCP | VID |
|entries | (6 octets) |(1bit)|(3bits)|(12bits)| |entries | (6 octets) |(1bit)|(3bits)|(12bits)|
|(2octets)| | | | | |(2octets)| | | | |
+---------+--------------+------+-------+--------+ +---------+--------------+------+-------+--------+
|<--------Repeated per entry---------->| |<--------Repeated per entry---------->|
Figure A.3 MAC/VID filter format Figure A.3 MAC/VID filter format
3. GroupID/VID filter format 3. GroupID/VID Filter Info format
+---------+--------------+------+-------+--------+ +---------+--------------+------+-------+--------+
| #of | GroupID | PS | PCP | VID | | #of | GroupID | PS | PCP | VID |
|entries | (4 octets) |(1bit)|(3bits)|(12bits)| |entries | (4 octets) |(1bit)|(3bits)|(12bits)|
|(2octets)| | | | | |(2octets)| | | | |
+---------+--------------+------+-------+--------+ +---------+--------------+------+-------+--------+
|<--------Repeated per entry---------->| |<--------Repeated per entry---------->|
Figure A.4 GroupID/VID filter format Figure A.4 GroupID/VID filter format
4. GroupID/MAC/VID filter format 4. GroupID/MAC/VID Filter Info format
+---------+----------+-------------+------+-----+--------+ +---------+----------+-------------+------+-----+--------+
| #of | GroupID | MAC address | PS | PCP | VID | | #of | GroupID | MAC address | PS | PCP | VID |
|entries |(4 octets)| (6 octets) |(1bit)|(3b )|(12bits)| |entries |(4 octets)| (6 octets) |(1bit)|(3b )|(12bits)|
|(2octets)| | | | | | |(2octets)| | | | | |
+---------+----------+-------------+------+-----+--------+ +---------+----------+-------------+------+-----+--------+
|<-------------Repeated per entry------------->| |<-------------Repeated per entry------------->|
Figure A.5 GroupID/MAC/VID filter format Figure A.5 GroupID/MAC/VID filter format
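The GroupID/MAC/VID layout in Figure A.5 extends each entry to 12 octets: a 4-octet GroupID, a 6-octet MAC address, and the 2-octet PS/PCP/VID word. The following sketch assumes network byte order and PS as the most significant bit of that word; the function name is hypothetical.

```python
import struct


def pack_group_mac_vid_filter(entries):
    """Pack a list of (group_id, mac, ps, pcp, vid) tuples per Figure A.5."""
    out = [struct.pack("!H", len(entries))]   # number of entries, 2 octets
    for group_id, mac, ps, pcp, vid in entries:
        if len(mac) != 6:
            raise ValueError("MAC address must be 6 octets")
        if ps not in (0, 1) or not 0 <= pcp <= 7 or not 0 <= vid <= 0xFFF:
            raise ValueError("PS/PCP/VID out of range")
        word = (ps << 15) | (pcp << 12) | vid
        # 4-octet GroupID, 6-octet MAC, 2-octet PS/PCP/VID, no padding.
        out.append(struct.pack("!I6sH", group_id, mac, word))
    return b"".join(out)
```

The MAC/VID and GroupID/VID formats in Figures A.3 and A.4 follow the same pattern with the GroupID or the MAC address field omitted.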
The null VID can be used in the VDP Request sent from the hypervisor to The null VID can be used in the VDP Request sent from the station to the
the external bridge. Use of the null VID indicates that the set of VID external bridge. Use of the null VID indicates that the set of VID
values associated with the VSI is expected to be supplied by the Bridge. values associated with the VSI is expected to be supplied by the bridge.
The Bridge can obtain VID values from the VSI Type whose identity is The set of VID values is returned to the station via the VDP Response.
specified by the VSI Type information in the VDP Request. The set of VID The returned VID value can be a locally significant value. When GroupID
values is returned to the station via the VDP Response. The returned VID is used, it is equivalent to the VN ID in NVO3. GroupID will be provided
value can be a locally significant value. When GroupID is used, it is by the station to the bridge. The bridge maps GroupID to a locally
equivalent to the VN ID in NVO3. GroupID will be provided by the
hypervisor to the bridge. The bridge will map GroupID to a locally
significant VLAN ID. significant VLAN ID.
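The bridge-side behavior described above, where a null VID in the request asks the bridge to supply a locally significant VLAN ID for the GroupID, can be sketched as a small allocator. This is an illustration under stated assumptions, not the protocol's specified procedure: the class and method names are hypothetical, and NULL_VID = 0 is assumed to be the null VID value.

```python
NULL_VID = 0  # assumed value of the null VID


class BridgeVidAllocator:
    """Maps each GroupID (VN ID) to a locally significant VLAN ID."""

    def __init__(self, vid_pool):
        self._free = list(vid_pool)   # VLAN IDs the bridge may hand out
        self._by_group = {}           # GroupID -> assigned VLAN ID

    def resolve(self, group_id, requested_vid=NULL_VID):
        """Return the VID to place in the VDP Response."""
        if requested_vid != NULL_VID:
            # The station supplied a VID itself; nothing to assign.
            return requested_vid
        if group_id not in self._by_group:
            if not self._free:
                raise RuntimeError("no free VLAN IDs on this bridge")
            self._by_group[group_id] = self._free.pop(0)
        return self._by_group[group_id]
```

Repeated requests for the same GroupID return the same VLAN ID, so every VSI in one VN on this bridge shares a local VLAN.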
The VSIID in VDP request that identifies a VM can be one of the following The VSI ID in VDP association TLV that identifies a VM can be one of the
formats: IPv4 address, IPv6 address, MAC address, UUID or locally following formats: IPv4 address, IPv6 address, MAC address, UUID
defined. [RFC4122], or locally defined.
Authors' Addresses Authors' Addresses
Yizhou Li Yizhou Li
Huawei Technologies Huawei Technologies
101 Software Avenue, 101 Software Avenue,
Nanjing 210012 Nanjing 210012
China China
Phone: +86-25-56625409 Phone: +86-25-56625409
 End of changes. 99 change blocks. 
276 lines changed or deleted 284 lines changed or added
