Internet Engineering Task Force                              M. Hamilton
Internet-Draft                                     BreakingPoint Systems
Intended status: Informational                                  S. Banks
Expires: September 13, 2012                                Cisco Systems
                                                          March 12, 2012

      Benchmarking Methodology for Content-Aware Network Devices
                   draft-ietf-bmwg-ca-bench-meth-01
Abstract

   This document defines a set of test scenarios and metrics that can be
   used to benchmark content-aware network devices.  The scenarios in
   the following document are intended to more accurately predict the
   performance of these devices when subjected to dynamic traffic
   patterns.  This document will operate within the constraints of the
   Benchmarking Working Group charter, namely black box characterization
   in a laboratory environment.
Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 13, 2012.
Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
skipping to change at page 2, line 17
Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  4
     1.1.  Requirements Language  . . . . . . . . . . . . . . . . . .  5
   2.  Scope  . . . . . . . . . . . . . . . . . . . . . . . . . . . .  5
   3.  Test Setup . . . . . . . . . . . . . . . . . . . . . . . . . .  5
     3.1.  Test Considerations  . . . . . . . . . . . . . . . . . . .  6
     3.2.  Clients and Servers  . . . . . . . . . . . . . . . . . . .  6
     3.3.  Traffic Generation Requirements  . . . . . . . . . . . . .  6
     3.4.  Discussion of Network Limitations  . . . . . . . . . . . .  6
     3.5.  Framework for Traffic Specification  . . . . . . . . . . .  8
     3.6.  Multiple Client/Server Testing . . . . . . . . . . . . . .  8
     3.7.  Device Configuration Considerations  . . . . . . . . . . .  8
       3.7.1.  Network Addressing . . . . . . . . . . . . . . . . . .  8
       3.7.2.  Network Address Translation  . . . . . . . . . . . . .  9
       3.7.3.  TCP Stack Considerations . . . . . . . . . . . . . . .  9
       3.7.4.  Other Considerations . . . . . . . . . . . . . . . . .  9
   4.  Benchmarking Tests . . . . . . . . . . . . . . . . . . . . . .  9
     4.1.  Maximum Application Session Establishment Rate . . . . . .  9
       4.1.1.  Objective  . . . . . . . . . . . . . . . . . . . . . . 10
       4.1.2.  Setup Parameters . . . . . . . . . . . . . . . . . . . 10
         4.1.2.1.  Application-Layer Parameters . . . . . . . . . . . 10
       4.1.3.  Procedure  . . . . . . . . . . . . . . . . . . . . . . 10
       4.1.4.  Measurement  . . . . . . . . . . . . . . . . . . . . . 10
         4.1.4.1.  Maximum Application Flow Rate  . . . . . . . . . . 10
         4.1.4.2.  Application Flow Duration  . . . . . . . . . . . . 11
         4.1.4.3.  Packet Loss  . . . . . . . . . . . . . . . . . . . 11
         4.1.4.4.  Application Flow Latency . . . . . . . . . . . . . 11
     4.2.  Application Throughput . . . . . . . . . . . . . . . . . . 11
       4.2.1.  Objective  . . . . . . . . . . . . . . . . . . . . . . 11
       4.2.2.  Setup Parameters . . . . . . . . . . . . . . . . . . . 11
         4.2.2.1.  Parameters . . . . . . . . . . . . . . . . . . . . 11
       4.2.3.  Procedure  . . . . . . . . . . . . . . . . . . . . . . 11
       4.2.4.  Measurement  . . . . . . . . . . . . . . . . . . . . . 11
         4.2.4.1.  Maximum Throughput . . . . . . . . . . . . . . . . 12
         4.2.4.2.  Packet Loss  . . . . . . . . . . . . . . . . . . . 12
         4.2.4.3.  Maximum Application Flow Rate  . . . . . . . . . . 12
         4.2.4.4.  Application Flow Duration  . . . . . . . . . . . . 12
         4.2.4.5.  Packet Loss  . . . . . . . . . . . . . . . . . . . 12
         4.2.4.6.  Application Flow Latency . . . . . . . . . . . . . 12
     4.3.  Malformed Traffic Handling . . . . . . . . . . . . . . . . 12
       4.3.1.  Objective  . . . . . . . . . . . . . . . . . . . . . . 12
       4.3.2.  Setup Parameters . . . . . . . . . . . . . . . . . . . 12
       4.3.3.  Procedure  . . . . . . . . . . . . . . . . . . . . . . 13
       4.3.4.  Measurement  . . . . . . . . . . . . . . . . . . . . . 13
   5.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 13
   6.  Security Considerations  . . . . . . . . . . . . . . . . . . . 13
   7.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 14
     7.1.  Normative References . . . . . . . . . . . . . . . . . . . 14
     7.2.  Informative References . . . . . . . . . . . . . . . . . . 14
   Appendix A.  Example Traffic Mix . . . . . . . . . . . . . . . . . 15
   Appendix B.  Malformed Traffic Algorithm . . . . . . . . . . . . . 17
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 19
1.  Introduction
   Content-aware and deep packet inspection (DPI) device deployments
   have grown significantly in recent years.  No longer are devices
   simply using Ethernet and IP headers to make forwarding decisions.
   This class of device now uses application-specific data to make these
   decisions.  For example, a web-application firewall (WAF) may use
   search criteria upon the HTTP uniform resource indicator (URI) [1] to
   decide whether an HTTP GET method may traverse the network.  In the
   case of lawful/legal intercept technology, a device could use the
   phone number within the Session Description Protocol [13] to
   determine whether a voice-over-IP phone may be allowed to connect.
   In addition to the development of entirely new classes of devices,
   devices that could historically be classified as 'stateless' or raw
   forwarding devices are now performing DPI functionality.  Devices
   such as core and edge routers are now being developed with DPI
   functionality to make more intelligent routing and forwarding
   decisions.
   The Benchmarking Working Group (BMWG) has historically produced
   Internet-Drafts and Requests for Comment that are focused
   specifically on creating output metrics derived from a very specific
   and well-defined set of input parameters that are completely and
   unequivocally reproducible from test bed to test bed.  The end goal
   of such methodologies is to, in the words of RFC 2544 [2], reduce
   "specsmanship" in the industry and hold vendors accountable for
   performance claims.
   The end goal of this methodology is to generate performance metrics
   in a lab environment that will more closely relate to actual observed
   performance on production networks.  By utilizing dynamic traffic
   patterns relevant to modern networks, this methodology should be able
   to more closely tie laboratory and production metrics.  It should be
   further noted that any metrics acquired from production networks
   SHOULD be captured according to the policies and procedures of the
   IPPM or PMOL working groups.
   An explicit non-goal of this document is to replace existing
   methodology/terminology pairs such as RFC 2544 [2]/RFC 1242 [3] or
   RFC 3511 [4]/RFC 2647 [5].  The explicit goal of this document is to
   create a methodology more suited for modern devices while
   complementing the data acquired using existing BMWG methodologies.
   This document does not assume completely repeatable input stimulus.
   The nature of application-driven networks is such that a single
   dropped packet inherently changes the input stimulus from a network
   perspective.  While application flows will be specified in great
   detail, it simply is not practical to require totally repeatable
   input stimulus.
1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [6].
2.  Scope

   Content-aware devices take many forms, shapes, and architectures.
   These devices are advanced network interconnect devices that inspect
   deep into the application payload of network data packets to do
   classification.  They may be as simple as a firewall that uses
   application data inspection for rule set enforcement, or they may
   have advanced functionality such as performing protocol decoding and
   validation, anti-virus, anti-spam, and even application exploit
   filtering.  This document will universally refer to these devices as
   middleboxes, as defined by RFC 3234 [7].
   This document is strictly focused on examining performance and
   robustness across a focused set of metrics: throughput (min/max/avg/
   sample std dev), transaction rates (successful/failed), application
   response times, concurrent flows, and unidirectional packet latency.
   None of the metrics captured through this methodology are specific to
   a device, and the results are independent of the DUT implementation.
   Functional testing of the DUT is outside the scope of this
   methodology.
   Devices such as firewalls, intrusion detection and prevention
   devices, wireless LAN controllers, application delivery controllers,
   deep packet inspection devices, wide-area network (WAN) optimization
   devices, and unified threat management systems generally fall into
   the content-aware category.  While this list may become obsolete,
   these are a subset of devices that fall under this scope of testing.
3.  Test Setup

   This document will be applicable to most test configurations and will
   not be confined to a discussion on specific test configurations.
   Since each DUT/SUT will have its own unique configuration, users
   SHOULD configure their device with the same parameters that would be
   used in the actual deployment of the device, or a typical deployment
   if the actual deployment is unknown.  In order to improve
   repeatability, the DUT configuration SHOULD be published with the

skipping to change at page 6, line 18
   Content-aware device testing SHOULD involve multiple clients and
   multiple servers.  As with RFC 3511 [4], this methodology will use
   the terms virtual clients/servers because both the client and server
   will be represented by the tester and not actual clients/servers.
   As similarly defined in RFC 3511 [4], a data source may emulate
   multiple clients and/or servers within the context of the same test
   scenario.  The test report SHOULD indicate the number of virtual
   clients/servers used during the test.  IANA has reserved address
   ranges for laboratory characterization.  These are defined for IPv4
   and IPv6 by RFC 2544 Appendix C [2] and RFC 5180 Section 5.2 [8],
   respectively, and SHOULD be consulted prior to testing.
3.3.  Traffic Generation Requirements

   The explicit purposes of content-aware devices vary widely, but these
   devices use information deeper inside the application flow to make
   decisions and classify traffic.  This methodology will utilize
   traffic flows that resemble real application traffic without
   utilizing captures from live production networks.  Application
   Flows, as defined in RFC 2722 [9], can be well defined without
   simply referring to a network capture.  An example traffic template
   is defined and listed in Appendix A of this document.  A user of
   this methodology is free to utilize the example mix as provided in
   the appendix.  If a user of this methodology understands the traffic
   patterns in their production network, that user MAY use the template
   provided in Appendix A to describe a traffic mix appropriate for
   their environment.

   The test tool SHOULD be able to create application flows between
   every client and server, regardless of direction.  The tester SHOULD
   be able to open TCP connections on multiple destination ports and
   SHOULD be able to direct UDP traffic to multiple destination ports.
3.4.  Discussion of Network Limitations

   Prior to executing the methodology as outlined in the following
   sections, it is imperative to understand the implications of
   utilizing representative application flows for the traffic content of
   the benchmarking effort.  One interesting aspect of utilizing
   application flows is that each flow is inherently different from
   every other application flow.  The content of each flow will vary
   from application to application and, in most cases, even varies
   within the same type of application flow.  The following methodology
   will individually benchmark every individual type and subset of
   application flow prior to performing similar tests with a traffic
   mix, as specified either by the example mix in Appendix A or as
   defined by the user of this methodology.
   The purpose of this process is to ensure that any performance
   implications discovered during the mixed testing aren't due to the
   inherent physical network limitations.  As an example of this
   phenomenon, it is useful to examine a network device inserted into a
   single path, as illustrated in the following diagram.

                     +----------+
         +---+  1gE  |   DUT/   |  1gE  +---+
         |C/S|-------|   SUT    |-------|C/S|
         +---+       +----------+       +---+

             Simple Inline DUT Configuration

          Figure 1: Simple Middle-box Example
   For the purpose of this discussion, let's take a hypothetical
   application flow that utilizes UDP for the transport layer.  Assume
   that the sample transaction we will use to model this particular
   flow requires 10 UDP datagrams to complete the transaction.  For
   simplicity, each datagram within the flow is exactly 64 bytes,
   including associated Ethernet, IP, and UDP overhead.  With any
   network device, there are always three metrics which interact with
   each other: the number of concurrent application flows, the number
   of application flows per second, and layer-7 throughput.

   Our example test bed is a single-path device connected by 1 gigabit
   Ethernet links.  The purpose of this benchmark effort is to quantify
   the number of application flows per second that may be processed
   through our device under test.  Let's assume that the result from
   our scenario is that the DUT is able to process 10,000 application
   flows per second.  The question is whether that ceiling is the
   actual ceiling of the device, or whether it is actually being
   limited by one of the other metrics.  If we do the appropriate math,
   10,000 flows per second, with each flow at 640 total bytes, means
   that we are achieving an aggregate bitrate of roughly 49 Mbps.  This
   is dramatically less than the 1 gigabit physical link we are using.
   We can conclude that 10,000 flows per second is in fact the
   performance limit of the device.
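   The sanity check above can be sketched in a few lines of Python.
   The flow sizes, rates, and link speed are the worked-example values
   from this section, not parameters of the methodology; note that the
   "roughly 49 Mbps" figure in the text appears to count binary
   megabits (2^20 bits), while the same load is 51.2 Mbps in decimal
   units.

```python
# Sanity check from Section 3.4: is an observed flows-per-second
# ceiling a genuine DUT limit, or an artifact of link capacity?
# All concrete numbers below come from the worked example in the text.

def aggregate_bps(flows_per_second, bytes_per_flow):
    """Offered load, in bits per second, implied by a flow rate."""
    return flows_per_second * bytes_per_flow * 8

def link_limited(flows_per_second, bytes_per_flow, link_bps):
    """True when the implied load saturates the physical link, so the
    flow-rate ceiling cannot be attributed to the DUT alone."""
    return aggregate_bps(flows_per_second, bytes_per_flow) >= link_bps

GBE = 10**9  # 1 gigabit Ethernet, in bits per second

# Case 1: 10 datagrams x 64 bytes = 640 bytes per flow.
# 10,000 flows/s -> 51,200,000 bps, far below 1 GbE:
# the DUT is the bottleneck.
print(aggregate_bps(10_000, 640))        # 51200000
print(link_limited(10_000, 640, GBE))    # False

# Case 2: 10 datagrams x 1312 bytes = 13,120 bytes (104,960 bits)
# per flow.  10,000 flows/s -> ~1.05 Gbps: the link, not the DUT,
# may now be the limit, so the result is inconclusive at 1 GbE.
print(link_limited(10_000, 13_120, GBE)) # True
```

   The same check applies to any flow definition: compute the bits per
   complete flow, multiply by the observed flow rate, and compare
   against the physical link capacity before attributing a ceiling to
   the DUT.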
   If we change the example slightly and increase the size of each
   datagram to 1312 bytes, then it becomes necessary to recompute the
   math.  Assuming the same observed DUT limitation of 10,000 flows per
   second, it must be ensured that this is an artifact of the DUT, and
   not of physical limitations.  For each flow, we'll require 104,960
   bits.  10,000 flows per second implies a throughput of roughly
   1 Gbps.  At this point, we cannot definitively answer whether the
   DUT is actually limited to 10,000 flows per second.  If we are able
   to modify the scenario and utilize 10 Gigabit interfaces, then
   perhaps

skipping to change at page 8, line 49
3.7.  Device Configuration Considerations

   The configuration of the DUT may have an effect on the observed
   results of the following methodology.  A comprehensive, but certainly
   not exhaustive, list of potential considerations is listed below.

3.7.1.  Network Addressing

   The IANA has issued a range of IP addresses to the BMWG for purposes
   of benchmarking.  Please refer to RFC 2544 [2] and RFC 5180 [8] for
   more details.  If more IPv4 addresses are required than the RFC 2544
   allotment provides, then allocations from the private address space
   as defined in RFC 1918 [10] may be used.
3.7.2. Network Address Translation
Many content-aware devices are capable of performing Network Address
Translation (NAT) [5].  If the final deployment of the DUT will have
this functionality enabled, then the DUT SHOULD also have it enabled
during the execution of this methodology.  It MAY be beneficial to
perform the test series in both modes in order to determine the
performance differential when using NAT.  The test report SHOULD
indicate whether NAT was enabled during the testing process.
3.7.3. TCP Stack Considerations
The IETF has historically provided guidance and information on TCP
stack considerations.  This methodology is strictly focused on
performance metrics at layers above 4, and thus does not specifically
define any TCP stack configuration parameters of either the tester or
the DUTs.  The TCP configuration of the tester MUST remain constant
across all DUTs in order to ensure comparable results.  While the
following list of references is not exhaustive, each document
contains a relevant discussion on TCP stack considerations.
Congestion control algorithms are discussed in Section 2 of RFC 3148
[11], with even more detailed references.  TCP receive and congestion
window sizes are discussed in detail in RFC 6349 [12].
3.7.4. Other Considerations
Various content-aware devices will have widely varying feature sets.
In the interest of representative test results, the DUT features that
will likely be enabled in the final deployment SHOULD be used.  This
methodology is not intended to advise on which features should be
enabled, but to suggest using actual deployment configurations.
4. Benchmarking Tests
Each of the following benchmark scenarios SHOULD be run with each of
the single application flow templates.  Upon completion of all
iterations, the mixed test SHOULD be completed, subject to the
traffic mix as defined by the user.
4.1. Maximum Application Session Establishment Rate
4.1.1. Objective
To determine the maximum rate at which a device is able to
establish and complete application flows as defined by
draft-ietf-bmwg-ca-bench-term-00.
4.1.2. Setup Parameters
The following parameters SHOULD be used and reported for all tests:
For each application protocol in use during the test run, the table
provided in Section 3.5 SHOULD be published.
4.1.3. Procedure
The test SHOULD generate application network traffic that meets the
conditions of Section 3.3.  The traffic pattern SHOULD begin with an
application flow rate of 10% of the expected maximum.  The test
SHOULD be configured to increase the attempt rate in units of 10% up
through 110% of the expected maximum.  In the case where the expected
maximum is limited by the physical link rate, the maximum session
rate attempted will be 100% of the expected maximum, that is, link
capacity.  The duration of each loading phase SHOULD be at least 30
seconds.  This test MAY be repeated, with each subsequent iteration
beginning at 5% of the expected maximum and increasing the session
establishment rate to 110% of the maximum observed in the previous
test run.
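The loading ramp above can be sketched as a small schedule generator.  This is a sketch, not a conformance tool; the function name and parameters are our own.

```python
# Sketch of the 4.1.3 loading schedule: steps of 10% of the expected
# maximum, from 10% up through 110%, each phase held >= 30 seconds.
# A repeat run would use start_pct=5 against the observed maximum.
def load_phases(expected_max, start_pct=10, stop_pct=110, step_pct=10):
    """Return the attempted session-establishment rate per phase."""
    return [expected_max * pct / 100
            for pct in range(start_pct, stop_pct + 1, step_pct)]

phases = load_phases(10_000)       # flows/s, from the earlier example
print(phases[0], phases[-1])       # 1000.0 11000.0
print(len(phases))                 # 11 phases of at least 30 s each
```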
This procedure MAY be repeated any number of times with the results
being averaged together.
4.1.4. Measurement
The following metrics MAY be determined from this test, and SHOULD be
observed for each application protocol within the traffic mix:
4.1.4.1. Maximum Application Flow Rate
4.1.4.3. Packet Loss
The test tool SHOULD report the number of flow packets lost or
dropped from source to destination.
4.1.4.4. Application Flow Latency
The test tool SHOULD report the minimum, maximum, and average amount
of time an application flow member takes to traverse the DUT, as
defined by RFC 1242 [3], Section 3.8.  These values SHOULD be
reported individually for each application protocol present within
the traffic mix.
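As an illustration of the per-protocol latency report described above, the aggregation might look like the following sketch; the function name and sample observations are invented.

```python
# Illustrative aggregation for 4.1.4.4: minimum, maximum, and average
# flow latency, keyed by application protocol.  Sample data is invented.
from collections import defaultdict

def latency_report(samples):
    """samples: iterable of (protocol, latency_ms) observations."""
    by_proto = defaultdict(list)
    for proto, ms in samples:
        by_proto[proto].append(ms)
    return {p: {"min": min(v), "max": max(v), "avg": sum(v) / len(v)}
            for p, v in by_proto.items()}

report = latency_report([("HTTP", 1.2), ("HTTP", 3.0), ("DNS", 0.4)])
print(report["HTTP"]["max"])   # 3.0
```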
4.2. Application Throughput
4.2.1. Objective
To determine the maximum rate at which a device is able to forward
bits when using application flows as defined in the previous
sections.
dropped from source to destination.
4.2.4.6. Application Flow Latency
The test tool SHOULD report the minimum, maximum, and average amount
of time an application flow member takes to traverse the DUT, as
defined by RFC 1242 [3], Section 3.8.  These values SHOULD be
reported individually for each application protocol present within
the traffic mix.
4.3. Malformed Traffic Handling
4.3.1. Objective
To determine the effects on performance and stability that malformed
traffic may have on the DUT.
4.3.2. Setup Parameters
The same transport-layer and application-layer parameters previously
specified in Section 4.1.2 and Section 4.2.2 SHOULD be used.
4.3.3. Procedure
This test will utilize the procedures specified previously in
Section 4.1.3 and Section 4.2.3.  When performing the procedures
listed previously, the tester should generate malformed traffic at
all protocol layers.  This is commonly known as fuzzed traffic.
Fuzzing techniques generally modify portions of packets, including
checksum errors, invalid protocol options, and improper protocol
conformance.
The process by which the tester SHOULD generate the malformed traffic
is outlined in detail in Appendix B.
4.3.4. Measurement
For each protocol present in the traffic mix, the metrics specified
by Section 4.1.4 and Section 4.2.4 MAY be determined.  This data may
be used to ascertain the effects of fuzzed traffic on the DUT.
5. IANA Considerations

This memo includes no request to IANA.

All drafts are required to have an IANA considerations section (see
the update of RFC 2434 [14] for a guide).  If the draft does not
require IANA to do anything, the section contains an explicit
statement that this is the case (as above).  If there are no
requirements for IANA, the section will be removed during conversion
into an RFC by the RFC Editor.

6. Security Considerations

Benchmarking activities as described in this memo are limited to
technology characterization using controlled stimuli in a laboratory
environment, with dedicated address space and the other constraints
of RFC 2544 [2].

The benchmarking network topology will be an independent test setup
and MUST NOT be connected to devices that may forward the test
traffic into a production network, or mis-route traffic to the test
management network.

7. References

7.1. Normative References

[1] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
    Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986,
    January 2005.

[2] Bradner, S. and J. McQuaid, "Benchmarking Methodology for
    Network Interconnect Devices", RFC 2544, March 1999.

[3] Bradner, S., "Benchmarking terminology for network
    interconnection devices", RFC 1242, July 1991.

[4] Hickman, B., Newman, D., Tadjudin, S., and T. Martin,
    "Benchmarking Methodology for Firewall Performance", RFC 3511,
    April 2003.

[5] Newman, D., "Benchmarking Terminology for Firewall
    Performance", RFC 2647, August 1999.

[6] Bradner, S., "Key words for use in RFCs to Indicate Requirement
    Levels", BCP 14, RFC 2119, March 1997.

[7] Carpenter, B. and S. Brim, "Middleboxes: Taxonomy and Issues",
    RFC 3234, February 2002.

[8] Popoviciu, C., Hamza, A., Van de Velde, G., and D. Dugatkin,
    "IPv6 Benchmarking Methodology for Network Interconnect
    Devices", RFC 5180, May 2008.

[9] Brownlee, N., Mills, C., and G. Ruth, "Traffic Flow
    Measurement: Architecture", RFC 2722, October 1999.

[10] Rekhter, Y., Moskowitz, R., Karrenberg, D., de Groot, G., and
     E. Lear, "Address Allocation for Private Internets", BCP 5,
     RFC 1918, February 1996.

[11] Mathis, M. and M. Allman, "A Framework for Defining Empirical
     Bulk Transfer Capacity Metrics", RFC 3148, July 2001.

[12] Constantine, B., Forget, G., Geib, R., and R. Schrage,
     "Framework for TCP Throughput Testing", RFC 6349, August 2011.

7.2. Informative References

[13] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
     Description Protocol", RFC 4566, July 2006.

[14] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA
     Considerations Section in RFCs", BCP 26, RFC 5226, May 2008.

Appendix A. Example Traffic Mix
This appendix shows an example case of a protocol mix that may be
used with this methodology.
+---------------------------+-----------------------+-------------+
| Application Flow          | Options               | Value       |
+---------------------------+-----------------------+-------------+
| Web 1kB                   |                       |             |
|                           | Flow Size (L7)        | 1kB         |
|                           | Flow Percentage       | 15%         |
| ...                       | ...                   | ...         |
|                           | Flow Percentage       | 10%         |
|                           | Transport Protocol(s) | UDP         |
|                           | Destination Port(s)   | 53          |
| RTP                       |                       |             |
|                           | Flow Size (L7)        | 100 MB      |
|                           | Flow Percentage       | 10%         |
|                           | Transport Protocol(s) | UDP         |
|                           | Destination Port(s)   | 20000-65535 |
+---------------------------+-----------------------+-------------+
              Table 1: Example Traffic Pattern
Appendix B. Malformed Traffic Algorithm

Each application flow will be broken into multiple transport
segments, IP packets, and Ethernet frames.  The malformed traffic
algorithm looks very similar to the IP Stack Integrity Checker
project at http://isic.sourceforge.net.

The algorithm is very simple and starts by defining each of the
fields within the TCP/IP stack that will be malformed during
transmission.  The following table illustrates the Ethernet, IPv4,
IPv6, TCP, and UDP fields which are able to be malformed by the
algorithm.  The first column lists the protocol, the second column
shows the actual header field name, and the third column shows the
percentage of packets that should have the field modified by the
malformation algorithm.
+--------------+--------------------------+-------------+
| Protocol     | Header Field             | Malformed % |
+--------------+--------------------------+-------------+
| Total Frames |                          | 1%          |
| Ethernet     |                          |             |
|              | Destination MAC          | 0%          |
|              | Source MAC               | 1%          |
|              | Ethertype                | 1%          |
|              | CRC                      | 1%          |
| IP Version 4 |                          |             |
|              | Version                  | 1%          |
|              | IHL                      | 1%          |
|              | Type of Service          | 1%          |
|              | Total Length             | 1%          |
|              | Identification           | 1%          |
|              | Flags                    | 1%          |
|              | Fragment Offset          | 1%          |
|              | Time to Live             | 1%          |
|              | Protocol                 | 1%          |
|              | Header Checksum          | 1%          |
|              | Source Address           | 1%          |
|              | Destination Address      | 1%          |
|              | Options                  | 1%          |
|              | Padding                  | 1%          |
| UDP          |                          |             |
|              | Source Port              | 1%          |
|              | Destination Port         | 1%          |
|              | Length                   | 1%          |
|              | Checksum                 | 1%          |
| TCP          |                          |             |
|              | Source Port              | 1%          |
|              | Destination Port         | 1%          |
|              | Sequence Number          | 1%          |
|              | Acknowledgement Number   | 1%          |
|              | Data Offset              | 1%          |
|              | Reserved(3 bit)          | 1%          |
|              | Flags(9 bit)             | 1%          |
|              | Window Size              | 1%          |
|              | Checksum                 | 1%          |
|              | Urgent Pointer           | 1%          |
|              | Options(Variable Length) | 1%          |
+--------------+--------------------------+-------------+
              Table 2: Malformed Header Values
This algorithm is to be used across the regular application flows
used throughout the rest of the methodology.  As each frame is
emitted from the test tool, a pseudo-random number generator will
indicate whether the frame is to be malformed by creating a number
between 0 and 100.  If the number is less than the percentage defined
in the table, then that frame will be malformed.  If the frame is to
be malformed, then each of the header fields in the table present
within the frame will follow the same process.  If it is determined
that a header field should be malformed, the same pseudo-random
number generator will be used to create a random value for the
specified header field.
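The selection process above can be sketched as follows.  This is a sketch under the stated percentages, not the authors' implementation; the field names are a small illustrative subset of Table 2, and the function names are our own.

```python
# Sketch of the Appendix B selection process: a PRNG first decides
# whether a frame is malformed at all (1% of total frames), then each
# header field present rolls independently against its own percentage.
import random

FRAME_MALFORM_PCT = 1
FIELD_PCT = {                        # subset of Table 2, for illustration
    "ipv4.checksum": 1,
    "ipv4.ttl": 1,
    "tcp.flags": 1,
}

def malform_plan(rng=random):
    """Return the fields to corrupt in one frame, or None if intact."""
    if rng.uniform(0, 100) >= FRAME_MALFORM_PCT:
        return None                  # frame transmitted unmodified
    return [field for field, pct in FIELD_PCT.items()
            if rng.uniform(0, 100) < pct]

random.seed(7)
plans = [malform_plan() for _ in range(100_000)]
selected = sum(p is not None for p in plans)
print(selected)    # close to 1,000, i.e. roughly 1% of the frames
```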
Authors' Addresses

Mike Hamilton
BreakingPoint Systems
Austin, TX 78717
US

Phone: +1 512 636 2303
Email: mhamilton@breakingpoint.com