draft-ietf-ipsecme-ipsec-ha-00.txt   draft-ietf-ipsecme-ipsec-ha-01.txt 
Network Working Group Y. Nir Network Working Group Y. Nir
Internet-Draft Check Point Internet-Draft Check Point
Intended status: Informational February 25, 2010 Intended status: Informational April 14, 2010
Expires: August 29, 2010 Expires: October 16, 2010
IPsec High Availability and Load Sharing Problem Statement IPsec High Availability and Load Sharing Problem Statement
draft-ietf-ipsecme-ipsec-ha-00 draft-ietf-ipsecme-ipsec-ha-01
Abstract Abstract
This document describes a requirement from IKE and IPsec to allow for This document describes a requirement from IKE and IPsec to allow for
more scalable and available deployments for VPNs. It defines more scalable and available deployments for VPNs. It defines
terminology for high availability and load sharing clusters terminology for high availability and load sharing clusters
implementing IKE and IPsec, and describes gaps in the existing implementing IKE and IPsec, and describes gaps in the existing
standards. standards.
Status of this Memo Status of this Memo
skipping to change at page 1, line 40 skipping to change at page 1, line 40
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on August 29, 2010. This Internet-Draft will expire on October 16, 2010.
Copyright Notice Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the Copyright (c) 2010 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 3, line 7 skipping to change at page 3, line 7
modifications of such material outside the IETF Standards Process. modifications of such material outside the IETF Standards Process.
Without obtaining an adequate license from the person(s) controlling Without obtaining an adequate license from the person(s) controlling
the copyright in such materials, this document may not be modified the copyright in such materials, this document may not be modified
outside the IETF Standards Process, and derivative works of it may outside the IETF Standards Process, and derivative works of it may
not be created outside the IETF Standards Process, except to format not be created outside the IETF Standards Process, except to format
it for publication as an RFC or to translate it into languages other it for publication as an RFC or to translate it into languages other
than English. than English.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1. Conventions Used in This Document . . . . . . . . . . . . . 4 1.1. Conventions Used in This Document . . . . . . . . . . . . 4
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4
3. The Problem Statement . . . . . . . . . . . . . . . . . . . . . 5 3. The Problem Statement . . . . . . . . . . . . . . . . . . . . 6
3.1. Lots of Long Lived State . . . . . . . . . . . . . . . . . 5 3.1. Lots of Long Lived State . . . . . . . . . . . . . . . . . 6
3.2. IKE and IPsec Counters . . . . . . . . . . . . . . . . . . 6 3.2. IKE Counters . . . . . . . . . . . . . . . . . . . . . . . 6
3.3. Missing Synch Messages . . . . . . . . . . . . . . . . . . 7 3.3. Outbound SA Counters . . . . . . . . . . . . . . . . . . . 7
3.4. Simultaneous use of IKE and IPsec SAs by Different 3.4. Inbound SA Counters . . . . . . . . . . . . . . . . . . . 7
Members . . . . . . . . . . . . . . . . . . . . . . . . . . 7 3.5. Missing Synch Messages . . . . . . . . . . . . . . . . . . 8
4. Security Considerations . . . . . . . . . . . . . . . . . . . . 8 3.6. Simultaneous use of IKE and IPsec SAs by Different
5. Change Log . . . . . . . . . . . . . . . . . . . . . . . . . . 8 Members . . . . . . . . . . . . . . . . . . . . . . . . . 8
6. Informative References . . . . . . . . . . . . . . . . . . . . 9 3.6.1. Outbound SAs using counter modes . . . . . . . . . . . 9
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 9 4. Security Considerations . . . . . . . . . . . . . . . . . . . 10
5. Change Log . . . . . . . . . . . . . . . . . . . . . . . . . . 10
6. Informative References . . . . . . . . . . . . . . . . . . . . 10
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 11
1. Introduction 1. Introduction
IKEv2, as described in [RFC4306] and [RFC4718], and IPsec, as IKEv2, as described in [RFC4306] and [RFC4718], and IPsec, as
described in [RFC4301] and others, allows deployment of VPNs between described in [RFC4301] and others, allows deployment of VPNs between
different sites as well as from VPN clients to protected networks. different sites as well as from VPN clients to protected networks.
As VPNs become increasingly important to the organizations deploying As VPNs become increasingly important to the organizations deploying
them, there is a demand to make IPsec solutions more scalable and them, there is a demand to make IPsec solutions more scalable and
less prone to down time, by using more than one physical gateway to less prone to down time, by using more than one physical gateway to
skipping to change at page 4, line 44 skipping to change at page 4, line 44
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. document are to be interpreted as described in [RFC2119].
2. Terminology 2. Terminology
"Single Gateway" is an implementation of IKE and IPsec enforcing a "Single Gateway" is an implementation of IKE and IPsec enforcing a
certain policy, as described in [RFC4301]. certain policy, as described in [RFC4301].
"Cluster" is a set of two or more gateways, implementing the same "Cluster" is a set of two or more gateways, implementing the same
security policy, and protecting the same domain. security policy, and protecting the same domain. Clusters exist to
provide both high availability through redundancy, and scalability
through load sharing.
"Member" is one gateway in a cluster. "Member" is one gateway in a cluster.
"High Availability Cluster", or "HA Cluster" is a cluster where only "High Availability" is a condition of a system, not a configuration
one of the members is active at any one time. This member is also type. A system is said to have high availability if its expected
referred to as the "active", whereas the others are referred to down time is low. High availability can be achieved in various ways,
as "stand-bys". one of which is clustering. All the clusters described in this
document achieve high availability.
"Fault Tolerance" is a condition related to high availability, where
a system maintains service availability, even when a specified set of
fault conditions occur. In clusters, we expect the system to
maintain service availability, when one or more of the cluster
members fails.
"Completely Transparent Cluster" is a cluster where the occurrence of
a fault is never visible to the peers.
"Partially Transparent Cluster" is a cluster where the occurrence of a
fault may be visible to the peers.
"Hot Standby Cluster", or "HS Cluster" is a cluster where only one of
the members is active at any one time. This member is also referred
to as the "active", whereas the others are referred to as
"stand-bys". [VRRP] is one method of building such a cluster.
"Load Sharing Cluster", or "LS Cluster" is a cluster where more than "Load Sharing Cluster", or "LS Cluster" is a cluster where more than
one of the members may be active at the same time. one of the members may be active at the same time. The term "load
balancing" is also common, but it implies that the load is actually
balanced between the members, and this document does not treat
balanced load as a requirement.
"Failover" is the event where a stand-by member becomes active, and "Failover" is the event where one member takes over some load from
the formerly active member becomes a stand-by. some other member. In a hot standby cluster, this happens when a
standby member becomes active due to a failure of the former active
member, or because of an administrator command. In a load sharing
cluster this usually happens because of a failure of one of the
members, but certain load-balancing technologies may allow a
particular load (an SA) to move from one member to another to even
out the load, even without any failures.
"Tight Cluster" is a cluster where all the members share an IP "Tight Cluster" is a cluster where all the members share an IP
address. This could be accomplished using configured interfaces with address. This could be accomplished using configured interfaces with
specialized protocols or hardware, such as [VRRP], or through the use specialized protocols or hardware, such as VRRP, or through the use
of multicast addresses, but in any case, peers need only be of multicast addresses, but in any case, peers need only be
configured with one IP address in the PAD. configured with one IP address in the PAD.
"Loose Cluster" is a cluster where each member has a different IP "Loose Cluster" is a cluster where each member has a different IP
address. Peers find the correct member using some method such as DNS address. Peers find the correct member using some method such as DNS
queries or [REDIRECT]. queries or [REDIRECT].
"Synch Channel" is a communications channel among the cluster "Synch Channel" is a communications channel among the cluster
members, used to transfer state information. The synch channel may members, used to transfer state information. The synch channel may
or may not be IP based, may or may not be encrypted, and may work or may not be IP based, may or may not be encrypted, and may work
skipping to change at page 6, line 4 skipping to change at page 6, line 32
3.1. Lots of Long Lived State 3.1. Lots of Long Lived State
IKE and IPsec have a lot of long lived state: IKE and IPsec have a lot of long lived state:
o IKE SAs last for minutes, hours, or days, and carry keys and other o IKE SAs last for minutes, hours, or days, and carry keys and other
information. Some gateways may carry thousands to hundreds of information. Some gateways may carry thousands to hundreds of
thousands of IKE SAs. thousands of IKE SAs.
o IPsec SAs last for minutes or hours, and carry keys, selectors and o IPsec SAs last for minutes or hours, and carry keys, selectors and
other information. Some gateways may carry hundreds of thousands other information. Some gateways may carry hundreds of thousands
such IPsec SAs. such IPsec SAs.
o SPD Cache entries. While the SPD is unchanging, the SPD cache o SPD Cache entries. While the SPD is unchanging, the SPD cache
changes on the fly due to narrowing. Entries last at least as changes on the fly due to narrowing. Entries last at least as
long as the SAD entries, but tend to last even longer than that long as the SAD entries, but tend to last even longer than that.
A naive implementation of a high availability cluster would have no A naive implementation of a high availability cluster would have no
synchronized state, and a failover would produce an effect similar to synchronized state, and a failover would produce an effect similar to
that of a rebooted gateway. [resumption] describes how new IKE and that of a rebooted gateway. [resumption] describes how new IKE and
IPsec SAs can be recreated in such a case. IPsec SAs can be recreated in such a case.
3.2. IKE and IPsec Counters 3.2. IKE Counters
We can overcome the first problem described in Section 3.1, by We can overcome the first problem described in Section 3.1, by
synchronizing states - whenever an SA is created, we can share this synchronizing states - whenever an SA is created, we can synch this
new state with all other members. There is, however, another new state to all other members. However, those states are not only
problem. Those states are not only long-lived, but they are ever long-lived, they are also ever changing.
changing.
IKE has message counters. A peer may not process message n until IKE has message counters. A peer may not process message n until
after it has processed message n-1. Skipping message IDs is not after it has processed message n-1. Skipping message IDs is not
allowed. So a newly-active member needs to know the last message IDs allowed. So a newly-active member needs to know the last message IDs
both received and transmitted. both received and transmitted.
ESP and AH have an anti-replay feature, where every encrypted packet Often, it is feasible to synchronize the IKE message counters for
carries a counter number. Repeating counter numbers is considered an every IKE exchange. This way, the newly active member knows what
attack, so the newly-active member SHOULD NOT use a replay counter messages it is allowed to process, and what message IDs to use on IKE
number that has already been used. requests, so that peers process them.
In some cases, it is feasible to synchronize the IKE message counters 3.3. Outbound SA Counters
for every IKE exchange, but it is almost never feasible to
synchronize the IPsec message counters for every IPsec packet
transmitted or received. So we have to assume that at least for
IPsec, the replay counter will not be up-to-date on the newly-active
member.
A possible solution to the IPsec problem is to send replay counter ESP and AH have an optional anti-replay feature, where every
information not for each packet processed, but only at regular protected packet carries a counter number. Repeating counter numbers
intervals, say, every 10,000 packets. After a failover, the newly- is considered an attack, so the newly-active member must not use a
active member advances the counters for outbound SAs by 10,000. To replay counter number that has already been used. The peer will drop
the peer this looks like up to 10,000 packets were lost, but this those packets as duplicates and/or warn of an attack.
should be acceptable, as neither ESP nor AH are reliable protocols.
This still has the problem of what to do with inbound IPsec packets,
for which the newly-active member is unable to determine if they are
replayed or not.
Another possible solution to the IPsec problem is to rekey all child Though it may be feasible to synchronize the IKE message counters, it
SAs following a failover. This may or may not be feasible depending is almost never feasible to synchronize the IPsec packet counters for
on the implementation and the configuration. every IPsec packet transmitted. So we have to assume that at least
for IPsec, the replay counter will not be up-to-date on the newly-
active member, and the newly-active member may repeat a counter.
3.3. Missing Synch Messages A possible solution is to synch replay counter information, not for
each packet emitted, but only at regular intervals, say, every 10,000
packets or every 0.5 seconds. After a failover, the newly-active
member advances the counters for outbound SAs by 10,000. To the peer
this looks like up to 10,000 packets were lost, but this should be
acceptable, as neither ESP nor AH guarantee reliable delivery.
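The periodic synch for outbound SAs can be sketched as follows. This
is an illustration under stated assumptions: OutboundSa and
counter_after_failover are hypothetical names, the synch message is
modelled as a plain assignment, only the packet-count trigger is shown
(not the 0.5 second timer), and the SA is assumed to be synched when it
is created.

```python
SYNCH_INTERVAL = 10_000  # packets between synch messages, as in the text


class OutboundSa:
    """Outbound SA sequence counter on the active member (illustrative)."""

    def __init__(self):
        self.seq = 0           # last sequence number sent
        self.last_synched = 0  # last value published on the synch channel

    def send_packet(self):
        self.seq += 1
        if self.seq - self.last_synched >= SYNCH_INTERVAL:
            self.last_synched = self.seq  # stands in for a synch message
        return self.seq


def counter_after_failover(last_synched):
    # The failed member sent fewer than SYNCH_INTERVAL packets after its
    # last synch, so skipping a full interval cannot reuse a number.
    return last_synched + SYNCH_INTERVAL
```

For example, if the active member fails after sending 12,345 packets,
the last synched value is 10,000. The newly-active member resumes from
20,000, so its first packet carries 20,001. To the peer this looks like
a gap of 7,655 lost packets.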
3.4. Inbound SA Counters
An even tougher issue is the synchronization of packet counters for
inbound SAs. If a packet arrives at a newly-active member, there is
no way to determine whether this packet is a replay or not. The
periodic synch does not solve the problem. Suppose we synchronize
every 10,000 packets, and the last synch before the failover had the
counter at 170,000. It is probable, though not certain, that packet
number 180,000 has not yet been processed, but if packet 175,000
arrives at the newly-active member, it has no way of determining
whether that packet has already been processed. The synchronization
does prevent the processing of really old packets, such as those with
counter number 165,000. Ignoring all counters below 180,000 won't
work either, because that means up to 10,000 dropped packets, which
may be very noticeable.
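The arithmetic in the example above can be written as a small
classification function. This is only a sketch of the reasoning:
classify_inbound and its labels are hypothetical, and the anti-replay
window itself is ignored.

```python
def classify_inbound(seq, last_synched, interval=10_000):
    """Say what a newly-active member can know about an inbound packet."""
    if seq <= last_synched:
        return "stale"         # at or below the synched counter: safe to drop
    if seq < last_synched + interval:
        return "unknown"       # may or may not have been processed already
    return "probably new"      # beyond one full synch interval
```

With the last synch at 170,000, packet 165,000 is "stale", packet
175,000 is "unknown", and packet 180,000 is "probably new".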
The easiest solution is to learn the replay counter from the incoming
traffic. This is allowed by the standards, because replay counter
verification is an optional feature. The case can even be made that
it is relatively secure, because non-attack traffic will reset the
counters to what they should be, so an attacker faces the dual
challenge of a very narrow window for attack, and the need to time
the attack to a failover event. Unless the attacker can actually
cause the failover, this would be very difficult. Note, though, that
although RFC 4301 permits this solution, whether to accept it is a
matter of policy.
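Learning the replay counter from incoming traffic can be sketched as a
replay window that starts out empty on the newly-active member. This
is a minimal illustration, not a definitive implementation:
LearningReplayWindow is a hypothetical name, the window size of 64 is
an assumption, and the caller is assumed to pass in only packets that
already passed the integrity check.

```python
WINDOW = 64  # assumed anti-replay window size, in packets


class LearningReplayWindow:
    """Replay window that learns its position from the first packet."""

    def __init__(self):
        self.top = None  # highest sequence number seen; None until learned
        self.seen = 0    # bitmask; bit i set means (top - i) was seen

    def accept(self, seq):
        if self.top is None:
            # First authenticated packet after failover: learn the counter.
            self.top, self.seen = seq, 1
            return True
        if seq > self.top:
            # Packet is ahead of the window: slide the window forward.
            shift = seq - self.top
            if shift >= WINDOW:
                self.seen = 1
            else:
                self.seen = ((self.seen << shift) | 1) & ((1 << WINDOW) - 1)
            self.top = seq
            return True
        offset = self.top - seq
        if offset >= WINDOW or (self.seen >> offset) & 1:
            return False  # too old, or already seen
        self.seen |= 1 << offset
        return True
```

The first packet after the failover is always accepted, which is the
narrow opening for a replay that the text describes. Once one genuine
packet has been seen, replay checking behaves as usual.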
Another possible solution to the inbound SA problem is to rekey all
child SAs following a failover. This may or may not be feasible
depending on the implementation and the configuration.
3.5. Missing Synch Messages
The synch channel is unlikely to be infallible. Before The synch channel is unlikely to be infallible. Before
failover is detected, some synchronization messages may have been failover is detected, some synchronization messages may have been
missed. For example, the active member may have created a new Child missed. For example, the active member may have created a new Child
SA using message n. The new information (entry in the SAD and update SA using message n. The new information (entry in the SAD and update
to counters of the IKE SA) is sent on the synch channel. Still, with to counters of the IKE SA) is sent on the synch channel. Still, with
every possible technology, the update may be missed before the every possible technology, the update may be missed before the
failover. failover.
This is a bad situation, because the IKE SA is doomed. The newly- This is a bad situation, because the IKE SA is doomed. The newly-
skipping to change at page 7, line 34 skipping to change at page 8, line 44
retransmissions and rejections, the whole IKE SA with all retransmissions and rejections, the whole IKE SA with all
associated IPsec SAs will get dropped. associated IPsec SAs will get dropped.
The above scenario may be rare enough that it is acceptable that on a The above scenario may be rare enough that it is acceptable that on a
configuration with thousands of IKE SAs, a few will need to be configuration with thousands of IKE SAs, a few will need to be
recreated from scratch or using session resumption techniques. recreated from scratch or using session resumption techniques.
However, detecting this may take a long time (several minutes) and However, detecting this may take a long time (several minutes) and
this negates the goal of creating a high availability cluster in the this negates the goal of creating a high availability cluster in the
first place. first place.
3.4. Simultaneous use of IKE and IPsec SAs by Different Members 3.6. Simultaneous use of IKE and IPsec SAs by Different Members
For load sharing clusters, all active members may need to use the For load sharing clusters, all active members may need to use the
same SAs, both IKE and IPsec. This is an even greater problem than same SAs, both IKE and IPsec. This is an even greater problem than
in the case of HA, because consecutive packets may need to be sent by in the case of HA, because consecutive packets may need to be sent by
different members to the same peer gateway. different members to the same peer gateway.
The solution to the IKE SA issue is up to the application. It's The solution to the IKE SA issue is up to the application. It's
possible to create some locking mechanism over the synch channel, or possible to create some locking mechanism over the synch channel, or
else have one member "own" the IKE SA and manage the child SAs for else have one member "own" the IKE SA and manage the child SAs for
all other members. For IPsec, solutions fall into two broad all other members. For IPsec, solutions fall into two broad
skipping to change at page 8, line 32 skipping to change at page 9, line 43
cause problems with current gateways. It is also impossible to cause problems with current gateways. It is also impossible to
mandate against this, because the definition of "flow" varies from mandate against this, because the definition of "flow" varies from
one implementation to another. one implementation to another.
o Reply packets may arrive with an IPsec SA that is not "matched" to o Reply packets may arrive with an IPsec SA that is not "matched" to
the one used for the outgoing packets. Also, they might arrive at the one used for the outgoing packets. Also, they might arrive at
a different member. This problem is beyond the scope of this a different member. This problem is beyond the scope of this
document and should be solved by the application, perhaps by document and should be solved by the application, perhaps by
forwarding misdirected packets to the correct gateway for deep forwarding misdirected packets to the correct gateway for deep
inspection. inspection.
3.6.1. Outbound SAs using counter modes
For SAs involving counter mode ciphers such as [CTR] or [GCM] there
is yet another complication. The initial vector for such modes must
never be repeated under the same key, and senders use methods such as
counters or LFSRs to ensure this. When an SA is shared between more
than one active member, or even fails over from one member to
another, the members need to make sure that they do not generate the
same initial vector. See [COUNTER_MODES] for a discussion of this
problem in another context.
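One way to keep members from generating the same initial vector is to
reserve part of the IV for a member identifier. This is a sketch of
that idea only, and the text above does not prescribe it. The name
make_iv and the 8-bit member field are assumptions. The 64-bit IV
length matches the explicit IV carried in ESP by [CTR] and [GCM].

```python
MEMBER_BITS = 8                  # assumed: up to 256 cluster members
COUNTER_BITS = 64 - MEMBER_BITS  # remaining bits of the 64-bit IV


def make_iv(member_id, counter):
    """Build an 8-octet IV whose top bits identify the sending member."""
    if not 0 <= member_id < (1 << MEMBER_BITS):
        raise ValueError("member_id out of range")
    if not 0 <= counter < (1 << COUNTER_BITS):
        raise ValueError("counter exhausted; the SA must be rekeyed")
    iv = (member_id << COUNTER_BITS) | counter
    return iv.to_bytes(8, "big")
```

Because each member draws from its own part of the IV space, two
members sharing an SA never collide. A member taking over after a
failover uses its own identifier, so it does not need to know the
failed member's last counter value.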
4. Security Considerations 4. Security Considerations
Implementations running on clusters MUST be as secure as Implementations running on clusters MUST be as secure as
implementations running on single gateways. In other words, no implementations running on single gateways. In other words, no
extension or interpretation used to allow operation in a cluster may extension or interpretation used to allow operation in a cluster may
facilitate attacks that are not possible for single gateways. facilitate attacks that are not possible for single gateways.
Moreover, thought must be given to the synching requirements of any Moreover, thought must be given to the synching requirements of any
protocol extension, to make sure that it does not create an protocol extension, to make sure that it does not create an
opportunity for denial of service attacks on the cluster. opportunity for denial of service attacks on the cluster.
As mentioned in Section 3.4, allowing an inbound child SA to fail
over to another member has the effect of disabling replay counter
protection for a short time. Though the threat is arguably low, it
is a policy decision whether this is acceptable.
5. Change Log 5. Change Log
This is the first version, re-spun as a WG document This is the first version, re-spun as a WG document
6. Informative References 6. Informative References
[COUNTER_MODES]
McGrew, D. and B. Weis, "Using Counter Modes with
Encapsulating Security Payload (ESP) and Authentication
Header (AH) to Protect Group Traffic",
draft-ietf-msec-ipsec-group-counter-modes (work in
progress), March 2010.
[CTR] Housley, R., "Using Advanced Encryption Standard (AES)
Counter Mode With IPsec Encapsulating Security Payload
(ESP)", RFC 3686, January 2004.
[GCM] Viega, J. and D. McGrew, "The Use of Galois/Counter Mode
(GCM) in IPsec Encapsulating Security Payload (ESP)",
RFC 4106, June 2005.
[REDIRECT] [REDIRECT]
Devarapalli, V. and K. Weniger, "Redirect Mechanism for Devarapalli, V. and K. Weniger, "Redirect Mechanism for
IKEv2", draft-ietf-ipsecme-ikev2-redirect (work in IKEv2", RFC 5685, November 2009.
progress), August 2009.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC4301] Kent, S. and K. Seo, "Security Architecture for the [RFC4301] Kent, S. and K. Seo, "Security Architecture for the
Internet Protocol", RFC 4301, December 2005. Internet Protocol", RFC 4301, December 2005.
[RFC4306] Kaufman, C., "Internet Key Exchange (IKEv2) Protocol", [RFC4306] Kaufman, C., "Internet Key Exchange (IKEv2) Protocol",
RFC 4306, December 2005. RFC 4306, December 2005.
[RFC4718] Eronen, P. and P. Hoffman, "IKEv2 Clarifications and [RFC4718] Eronen, P. and P. Hoffman, "IKEv2 Clarifications and
Implementation Guidelines", RFC 4718, October 2006. Implementation Guidelines", RFC 4718, October 2006.
[VRRP] Hinden, R., "Virtual Router Redundancy Protocol (VRRP)", [VRRP] Hinden, R., "Virtual Router Redundancy Protocol (VRRP)",
RFC 3768, April 2004. RFC 3768, April 2004.
[resumption] [resumption]
Sheffer, Y. and H. Tschofenig, "IKEv2 Session Resumption", Sheffer, Y. and H. Tschofenig, "IKEv2 Session Resumption",
draft-ietf-ipsecme-ikev2-resumption (work in progress), RFC 5723, January 2010.
June 2009.
Author's Address Author's Address
Yoav Nir Yoav Nir
Check Point Software Technologies Ltd. Check Point Software Technologies Ltd.
5 Hasolelim st. 5 Hasolelim st.
Tel Aviv 67897 Tel Aviv 67897
Israel Israel
Email: ynir@checkpoint.com Email: ynir@checkpoint.com
 End of changes. 24 change blocks. 
61 lines changed or deleted 149 lines changed or added
