--- 1/draft-ietf-ipsecme-ddos-protection-01.txt 2015-07-04 23:15:00.742755152 -0700
+++ 2/draft-ietf-ipsecme-ddos-protection-02.txt 2015-07-04 23:15:00.798756522 -0700
@@ -1,20 +1,20 @@
 IPSecME Working Group                                             Y. Nir
 Internet-Draft                                               Check Point
 Intended status: Standards Track                              V. Smyslov
-Expires: September 9, 2015                                    ELVIS-PLUS
-                                                           March 8, 2015
+Expires: January 6, 2016                                      ELVIS-PLUS
+                                                            July 5, 2015

  Protecting Internet Key Exchange (IKE) Implementations from Distributed
                        Denial of Service Attacks
-                 draft-ietf-ipsecme-ddos-protection-01
+                 draft-ietf-ipsecme-ddos-protection-02

 Abstract

    This document recommends implementation and configuration best
    practices for Internet-connected IPsec Responders, to allow them to
    resist Denial of Service and Distributed Denial of Service attacks.
    Additionally, the document introduces a new mechanism called "Client
    Puzzles" that help accomplish this task.

 Status of This Memo

@@ -25,21 +25,21 @@
    Internet-Drafts are working documents of the Internet Engineering
    Task Force (IETF). Note that other groups may also distribute
    working documents as Internet-Drafts. The list of current Internet-
    Drafts is at http://datatracker.ietf.org/drafts/current/.

    Internet-Drafts are draft documents valid for a maximum of six months
    and may be updated, replaced, or obsoleted by other documents at any
    time. It is inappropriate to use Internet-Drafts as reference
    material or to cite them other than as "work in progress."

-   This Internet-Draft will expire on September 9, 2015.
+   This Internet-Draft will expire on January 6, 2016.

 Copyright Notice

    Copyright (c) 2015 IETF Trust and the persons identified as the
    document authors. All rights reserved.

    This document is subject to BCP 78 and the IETF Trust's Legal
    Provisions Relating to IETF Documents
    (http://trustee.ietf.org/license-info) in effect on the date of
    publication of this document.
    Please review these documents
@@ -200,33 +200,33 @@
    At this point, filling up the half-open SA database is no longer the
    most efficient DoS attack. The attacker has two ways to do better:

    1.  Go back to spoofed addresses and try to overwhelm the CPU that
        deals with generating cookies, or

    2.  Take the attack to the next level by also sending an
        Authentication request.

-   I don't think the first thing is something we can deal with at the
-   IKE level. It's probably better left to Intrusion Prevention System
-   (IPS) technology.
+   It seems that the first thing cannot be dealt with at the IKE level.
+   It's probably better left to Intrusion Prevention System (IPS)
+   technology.

-   Sending an Authentication request is surprisingly cheap. It requires
-   a proper IKE header with the correct IKE SPIs, and it requires a
-   single encrypted payload. The content of the payload might as well
-   be junk. The responder has to perform the relatively expensive key
-   derivation, only to find that the Authentication request does not
-   decrypt. Depending on the responder implementation, this can be
-   repeated with the same half-open SA (if the responder does not delete
-   the half-open SA following an unsuccessful decryption - see
-   discussion in Section 4).
+   On the other hand sending an Authentication request is surprisingly
+   cheap. It requires a proper IKE header with the correct IKE SPIs,
+   and it requires a single encrypted payload. The content of the
+   payload might as well be junk. The responder has to perform the
+   relatively expensive key derivation, only to find that the
+   Authentication request does not decrypt. Depending on the responder
+   implementation, this can be repeated with the same half-open SA (if
+   the responder does not delete the half-open SA following an
+   unsuccessful decryption - see discussion in Section 4).

    Here too, the number of half-open SAs that the attacker can achieve
    is crucial, because each one of them allows the attacker to waste
    some CPU time.
    So making it hard to make many half-open SAs is important.

    A strategy against DDoS has to rely on at least 4 components:

    1.  Hardening the half-open SA database by reducing retention time.

@@ -238,53 +238,51 @@
    4.  Increasing cost of half-open SA up to what is tolerable for
        legitimate clients.

    Puzzles have their place as part of #4.

 3.  Puzzles

    The puzzle introduced here extends the cookie mechanism from RFC
    7296. It is loosely based on the proof-of-work technique used in
-   BitCoins ([bitcoins]). Future versions of this document will have
-   the exact bit structure of the notification payloads, but for now, I
-   will only describe the semantics of the content.
+   BitCoins ([bitcoins]).

    A puzzle is sent to the Initiator in two cases:

    o  The Responder is so overloaded that no half-open SAs are allowed
       to be created without the puzzle, or

    o  The Responder is not too loaded, but the rate-limiting in Section 5
       prevents half-open SAs from being created with this particular
       peer address or prefix without first solving a puzzle.

    When the Responder decides to send the challenge notification in
    response to an IKE_SA_INIT request, the notification includes three
    fields:

    1.  Cookie - this is calculated the same as in RFC 7296. As in RFC
        7296, the process of generating the cookie is not specified.

    2.  Algorithm, this is the identifier of a PRF algorithm, one of
        those proposed by the Initiator in the SA payload.

-   3.  Zero Bit Count. This is a number between 8 and 255 that
-       represents the length of the zero-bit run at the end of the
-       output of the PRF function calculated over the Keyed-Cookie
-       payload that the Initiator is to send. Since the mechanism is
-       supposed to be stateless for the Responder, the same value is
-       sent to all Initiators who are receiving this challenge. The
-       values 0 and 1-8 are explicitly excluded, because the value zero
-       is meaningless, and the values 1-8 create a puzzle that is too
-       easy to solve for it to make any difference in mitigating DDoS
-       attacks.
+   3.  Zero Bit Count. This is a number between 8 and 255 (or a special
+       value - 0, see Section 8.1.1.1) that represents the length of the
+       zero-bit run at the end of the output of the PRF function
+       calculated over the Keyed-Cookie payload that the Initiator is to
+       send. Since the mechanism is supposed to be stateless for the
+       Responder, either the same value is sent to all Initiators who
+       are receiving this challenge or the value is somehow encoded in
+       the cookie. The values 1-8 are explicitly excluded, because they
+       create a puzzle that is too easy to solve for it to make any
+       difference in mitigating DDoS attacks.

    Upon receiving this challenge payload, the Initiator attempts to
    calculate the PRF using different keys. When a key is found such
    that the resulting PRF output has a sufficient number of trailing
    zero bits, that result is sent to the Responder in a Keyed-Cookie
    notification, as described in Section 3.1.

    When receiving a request with a Keyed-Cookie, the Responder verifies
    two things:

@@ -326,21 +324,21 @@
    | 0b13cd9a | 00b97bb323d6d33350000000 |    28    |          247.914 |
    | 37dc96e4 | 1e24babc92234aa3a0000000 |    29    |         1237.170 |
    | 7a1a56d8 | c98f0061e380a49e00000000 |    33    |         2726.150 |
    +----------+--------------------------+----------+------------------+

         Table 1: COOKIE=fdbcfa5a430d7201282358a2a034de0013cfe2ae

    The figures above were obtained on a 2.4 GHz single core i5. Run
    times can be halved or quartered with multi-core code, but would be
    longer on mobile phone processors, even if those are multi-core as
-   well. With these figures I believe that 20 bits is a reasonable
+   well. With these figures 20 bits is believed to be a reasonable
    choice for puzzle level difficulty for all Initiators, with 24 bits
    acceptable for specific hosts/prefixes.

 3.1.  The Keyed-Cookie Notification

    To be added

 3.2.  The Puzzle-Required Notification

    To be added

@@ -495,21 +493,21 @@
    Initiators.
    This will force the attacker to use real source addresses, and help
    avoid the need to impose a greater burden in the form of cookies on
    the general population of initiators. This makes the per-node or
    per-prefix soft limit more effective.

    When Cookies are activated for all requests and the attacker is
    still managing to consume too many resources, the Responder MAY
    increase the difficulty of puzzles imposed on IKE_SA_INIT requests
    coming from suspicious nodes/prefixes. It should still be doable by
    all legitimate peers, but it can degrade experience, for example by
-   taking up to 10 seconds to calculate the cookie extension.
+   taking up to 10 seconds to solve the puzzle.

    If the load on the Responder is still too great, and there are many
    nodes causing multiple half-open SAs or IKE_AUTH failures, the
    Responder MAY impose hard limits on those nodes.

    If it turns out that the attack is very widespread and the hard caps
    are not solving the issue, a puzzle MAY be imposed on all
    Initiators. Note that this is the last step, and the Responder
    should avoid this if possible.

@@ -522,109 +520,103 @@
    the IKE_SESSION_RESUME response message, as allowed by RFC 5723,
    Sec. 4.3.2. Note that the Responder SHOULD cache tickets for a
    short time to reject reused tickets (Sec. 4.3.1), and therefore
    there should be no issue of half-open SAs resulting from replayed
    IKE_SESSION_RESUME messages.

 7.  Operational Considerations

    [This section needs a lot of expanding]

-   Not all Initiators support the puzzles, but all initiators are
-   supposed to support stateless cookies. If this notification is sent
-   to a non-supporting but legitimate initiator, the exchange will fail.
-   Responders are advised to first try to mitigate the DoS using
-   stateless cookies, even imposing them generally before resorting to
-   using puzzles.
-
    The difficulty level should be set by balancing the requirement to
    minimize the latency for legitimate initiators and making things
    difficult for attackers.
    A good rule of thumb is taking about 1 second to solve the puzzle.
    A typical initiator or bot-net member in 2014 can perform slightly
    less than a million hashes per second per core, so setting the
    difficulty level to n=20 is a good compromise. It should be noted
    that mobile initiators, especially phones, are considerably weaker
    than that. Implementations should allow administrators to set the
    difficulty level, and/or be able to set the difficulty level
    dynamically in response to load.

    Initiators should set a maximum difficulty level beyond which they
    won't try to solve the puzzle and log or display a failure message
    to the administrator or user.

 8.  Using Puzzles in the Protocol

 8.1.  Puzzles in IKE_SA_INIT Exchange

-   IKE initiator indicates the desire to create new IKE SA by sending
+   IKE initiator indicates the desire to create a new IKE SA by sending
    IKE_SA_INIT request message. The message may optionally contain
-   COOKIE notification if this is a repeated request after the responder
-   asked initiator to return a cookie.
+   COOKIE notification if this is a repeated request performed after the
+   responder's demand to return a cookie.

    HDR, [N(COOKIE),] SA, KE, Ni, [V+][N+]   -->

    According to the plan, described in Section 6, IKE responder should
    monitor incoming requests to detect whether it is under attack. If
    the responder learns that (D)DoS attack is likely to be in progress,
-   then it either requests the initiator to return cookie or, if the
+   then it either requests the initiator to return a cookie or, if the
    volume is so high, that puzzles need to be used for defense, it
-   requests the initiator to solve the puzzle.
+   requests the initiator to solve a puzzle.

    The responder MAY choose to process some fraction of IKE_SA_INIT
    requests without presenting a puzzle even being under attack to allow
-   legacy clients, that don't support puzzles, to have chances be
+   legacy clients, that don't support puzzles, to have chances to be
    served.
    The decision whether to process any particular request must be
    probabilistic, with the probability depending on the responder's
-   load (i.e. on the volume of the attack). Only those requests, that
+   load (i.e. on the volume of attack). Only those requests, that
    contain COOKIE notification, must participate in this lottery. In
    other words, the responder MUST first perform return routability
    check before allowing any legacy client to be served if it is under
    attack. See Section 8.1.3 for details.

 8.1.1.  Presenting Puzzle

    If the responder takes a decision to use puzzles, then it includes
    two notifications in its response message - the COOKIE notification
    and the PUZZLE notification. The format of the PUZZLE notification
    is described in Section 10.1.

                           <--   HDR, N(COOKIE), N(PUZZLE), [V+][N+]

-   The presence of these notifications in IKE_SA_INIT response message
-   indicates to the initiator that it should solve the puzzle to get
-   better chances to be served.
+   The presence of these notifications in an IKE_SA_INIT response
+   message indicates to the initiator that it should solve the puzzle to
+   get better chances to be served.

 8.1.1.1.  Selecting Puzzle Difficulty Level

    The PUZZLE notification contains the difficulty level of the puzzle -
    the minimum number of trailing zero bits that the result of PRF must
    contain. In diverse environments it is next to impossible for the
    responder to set any specific difficulty level that will result in
    roughly the same amount of work for all initiators, because
    computation power of different initiators may vary by an order of
    magnitude, or even more. The responder may set difficulty level to
    0, meaning that the initiator is requested to spend as much power to
    solve puzzle, as it can afford. In this case no specific number of
    trailing zero bits is required from the initiator, however the more
    bits initiator is able to get, the higher chances it will have to be
    served by the responder.
    In diverse environments it is RECOMMENDED
-   that the initiator sets difficulty level to 0.
+   that the initiator sets difficulty level to 0, unless the attack
+   volume is very high.

    If the responder sets non-zero difficulty level, then the level
    should be determined by analyzing the volume of the attack. The
    responder MAY set different difficulty levels to different requests
    depending on the IP address the request has come from.

 8.1.1.2.  Selecting Puzzle Algorithm

    The PUZZLE notification also contains the identifier of the algorithm,
-   that must be used by initiator in puzzle solution.
+   that must be used by initiator to compute puzzle.

    Cryptographic algorithm agility is considered an important feature
    for modern protocols ([ALG-AGILITY]). This feature ensures that
    protocol doesn't rely on a single built-in set of cryptographic
    algorithms, but has a means to replace one set with another and
    negotiate new set with the peer. IKEv2 fully supports cryptographic
    algorithm agility for its core operations.

    To support this feature in case of puzzles the algorithm, that is
    used to compute puzzle, needs to be negotiated during IKE_SA_INIT
@@ -641,21 +633,21 @@
    reason to return a puzzle. In this case the responder returns
    NO_PROPOSAL_CHOSEN notification. Note that PRF is a mandatory
    transform type for IKE SA (see Sections 3.3.2 and 3.3.3 of [RFC7296])
    and at least one transform of this type must always be present in SA
    payload in IKE_SA_INIT exchange.

 8.1.1.3.  Generating Cookie

    If responder supports puzzles then cookie should be computed in such
    a manner, that the responder is able to learn some important
-   information from the sole cookie, when it is returned back by
+   information from the sole cookie, when it is later returned back by
    initiator. In particular - the responder should be able to learn the
    following information:

    o  Whether the puzzle was given to the initiator or only the cookie
       was requested.

    o  The difficulty level of the puzzle given to the initiator.
    o  The number of consecutive puzzles given to the initiator.

@@ -751,23 +743,23 @@
    First, the responder determines if it requested only a cookie, or
    presented a puzzle to the initiator. If no puzzle was given, then it
    means that at the time the responder requested a cookie it didn't
    detect the (D)DoS attack or the attack volume was low. In this case
    the received request message must not contain the PS payload, and
    this payload MUST be ignored if for any reason the message contains
    it. Since no puzzle was given, the responder marks the request with
    the lowest priority since the initiator spent few resources
    creating it.

-   If the responder learns from the cookie that the puzzle was given to
-   the initiator, then it looks for the PS payload to determine whether
-   its request to solve the puzzle was honored or not. If the incoming
+   If the responder learns from the cookie that puzzle was given to the
+   initiator, then it looks for the PS payload to determine whether its
+   request to solve the puzzle was honored or not. If the incoming
    message doesn't contain PS payload, then it means that the initiator
    either doesn't support puzzles or doesn't want to deal with them. In
    either case the request is marked with the lowest priority since the
    initiator spent few resources creating it.

    If PS payload is found in the message then the responder MUST verify
    the puzzle solution that it contains. The result must contain at
    least the requested number of trailing zero bits (that is also
    learned from the cookie, as well as the PRF algorithm used in puzzle
    solution). If the result of the solution contains fewer bits, than
@@ -810,21 +802,21 @@
    The responder SHOULD accept incoming request if its priority is
    high - it means that the initiator spent quite a lot of resources.
    The responder MAY also accept some of low-priority requests where
    the initiators don't support puzzles. The percentage of accepted
    legacy requests depends on the responder's current load.

    If initiator solved the puzzle, but didn't spend many resources for
    it (the selected puzzle difficulty level appeared to be low and the
    initiator solved it quickly), then the responder SHOULD give it
-   another puzzle. The more puzzles the initiator solve the higher
+   another puzzle. The more puzzles the initiator solves the higher
    would be its chances to be served.

    The details of how the responder takes decision on any particular
    request are implementation dependent. The responder can collect all
    the incoming requests for some short period of time, sort them out
    based on their priority, calculate the number of available memory
    slots for half-open IKE SAs and then serve that number of the
    requests from the head of the sorted list. The rest of requests can
    be either discarded or responded to with new puzzles.

@@ -833,28 +825,28 @@
    priority and the available resources.

 8.2.  Puzzles in IKE_AUTH Exchange

    Once the IKE_SA_INIT exchange is completed, the responder has created
    a state and is waiting for the first message of the IKE_AUTH
    exchange from initiator. At this point the initiator has already
    passed return routability check and has proved that it has performed
    some work to complete IKE_SA_INIT exchange. However, the initiator
    is not yet authenticated and this fact allows malicious initiator to
-   conduct an attack, described in Section 2. Unlike DoS attack in
+   perform an attack, described in Section 2. Unlike DoS attack in
    IKE_SA_INIT exchange, which is targeted on the responder's memory
    resources, the goal of this attack is to exhaust responder's CPU
    power. The attack is performed by sending the first IKE_AUTH message
    containing garbage. This costs nothing to the initiator, but the
    responder has to do relatively costly operations of computing the
    Diffie-Hellman shared secret and deriving SK_* keys to be able to
-   verify authenticity of the message. If the responder doesn't save
+   verify authenticity of the message. If the responder doesn't keep
    the computed keys after unsuccessful verification of IKE_AUTH
    message, then the attack can be repeated several times on the same
    IKE SA.

    The responder can use puzzles to make this attack more costly for the
    initiator. The idea is that the responder includes puzzle in the
    IKE_SA_INIT response message and the initiator includes puzzle
    solution in the first IKE_AUTH request message outside the Encrypted
    payload, so that the responder is able to verify puzzle solution
    before computing Diffie-Hellman shared secret. The difficulty level
@@ -878,23 +870,24 @@
    puzzle has been previously presented and solved in the preceding
    IKE_SA_INIT exchange.

                      <--   HDR, SA, KE, Nr, N(PUZZLE), [V+][N+]

 8.2.1.1.  Selecting Puzzle Difficulty Level

    The difficulty level of the puzzle in IKE_AUTH should be chosen so,
    that the initiator would spend more time to solve the puzzle, than
    the responder to compute Diffie-Hellman shared secret and the keys,
-   needed to decrypt and verify IKE_AUTH message. On the other hand,
-   the difficulty level should not be too high, otherwise the legitimate
-   clients would experience additional delay while establishing IKE SA.
+   needed to decrypt and verify the IKE_AUTH request message. On the
+   other hand, the difficulty level should not be too high, otherwise
+   the legitimate clients would experience additional delay while
+   establishing IKE SA.

    Note, that since puzzles in the IKE_AUTH exchange are only allowed to
    be used if they were used in the preceding IKE_SA_INIT exchange, the
    responder would be able to estimate the computing power of the
    initiator and to select the difficulty level accordingly. Unlike
    puzzles in IKE_SA_INIT, the requested difficulty level for IKE_AUTH
    puzzles MUST NOT be zero. In other words, the responder must always
    set specific difficulty level and must not let the initiator to
    choose it on its own.
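
Editorial illustration (not part of either draft revision): the puzzle
mechanics quoted in the hunks above, where the Initiator tries
different keys until the PRF output over the cookie ends in the
requested number of zero bits and the Responder checks that result,
can be sketched roughly as follows. This is a minimal sketch under
stated assumptions: HMAC-SHA-256 stands in for whichever PRF is
negotiated, the 8-octet counter-style key is an arbitrary choice, and
the names trailing_zero_bits, puzzle_output, verify and solve are
hypothetical.

```python
import hashlib
import hmac
import itertools


def trailing_zero_bits(data: bytes) -> int:
    """Count the zero bits at the end of a byte string."""
    value = int.from_bytes(data, "big")
    if value == 0:
        return len(data) * 8
    # value & -value isolates the lowest set bit.
    return (value & -value).bit_length() - 1


def puzzle_output(key: bytes, s: bytes) -> bytes:
    # Assumption: HMAC-SHA-256 stands in for the negotiated PRF.
    return hmac.new(key, s, hashlib.sha256).digest()


def verify(key: bytes, s: bytes, difficulty: int) -> bool:
    """Responder side: one PRF call, then a bit count."""
    return trailing_zero_bits(puzzle_output(key, s)) >= difficulty


def solve(s: bytes, difficulty: int, key_len: int = 8) -> bytes:
    """Initiator side: brute-force search over candidate keys.

    Hypothetical key encoding: a big-endian counter of key_len octets.
    """
    for counter in itertools.count():
        key = counter.to_bytes(key_len, "big")
        if verify(key, s, difficulty):
            return key
```

Each attempt succeeds with probability 2^-n for a difficulty of n
bits, so a solver of this kind needs about 2^n PRF calls on average.
For n=20 that is 1,048,576 calls, which matches the draft's remark
that n=20 costs roughly one second at slightly under a million hashes
per second. Verification stays at a single PRF call regardless of n.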
@@ -915,27 +908,27 @@
    puzzle is solved the initiator sends the IKE_AUTH request message,
    containing the Puzzle Solution payload.

    HDR, PS, SK {IDi, [CERT,] [CERTREQ,] [IDr,] AUTH, SA, TSi, TSr} -->

    The Puzzle Solution payload is placed outside the Encrypted payload,
    so that the responder would be able to verify the puzzle before
    calculating the Diffie-Hellman shared secret and the SK_* keys.

-   If IKE Fragmentation is used, then the PS payload MUST be present
-   only in the first IKE Fragment message, in accordance with the
-   Section 2.5.3 of [RFC7383]. Note, that calculation of the puzzle in
-   the IKE_AUTH exchange doesn't depend on the content of the IKE_AUTH
-   message (see Section 8.2.2.1). Thus the responder has to solve the
-   puzzle only once and the solution is valid for both unfragmented and
-   fragmented IKE messages.
+   If IKE Fragmentation [RFC7383] is used in IKE_AUTH exchange, then the
+   PS payload MUST be present only in the first IKE Fragment message, in
+   accordance with the Section 2.5.3 of RFC7383. Note, that calculation
+   of the puzzle in the IKE_AUTH exchange doesn't depend on the content
+   of the IKE_AUTH message (see Section 8.2.2.1). Thus the responder
+   has to solve the puzzle only once and the solution is valid for both
+   unfragmented and fragmented IKE messages.

 8.2.2.1.  Computing Puzzle

    The puzzle in the IKE_AUTH exchange is computed differently, than in
    the IKE_SA_INIT exchange (see Section 8.1.2.1). The general
    principle is the same, the difference is in constructing of the
    string S. Unlike the IKE_SA_INIT exchange, where S is the cookie, in
    the IKE_AUTH exchange S is a concatenation of Nr and SPIr. In other
    words, the task for IKE initiator is to find the key K for the agreed
    upon PRF such that the result of PRF(K,Nr | SPIr) has sufficient
@@ -946,50 +939,50 @@
 8.2.3.  Receiving Puzzle Solution

    If the responder requested the initiator to solve puzzle in the
    IKE_AUTH exchange, then it SHOULD silently discard all the IKE_AUTH
    request messages without the Puzzle Solution payload.

    Once the message containing solution for the puzzle is received the
    responder SHOULD verify the solution before performing
    computationally intensive operations - computing the Diffie-Hellman
    shared secret and the SK_* keys. The responder MUST silently discard
    the received
-   message if the puzzle solution is not correct. If the puzzle is
-   successfully verified and the SK_* keys are calculated, but the
-   message authenticity check fails, the responder SHOULD save the
-   calculated keys in the IKE SA state while waiting for the
-   retransmissions from the initiator. In this case the responder may
-   skip verification of the puzzle solution and ignore the Puzzle
-   Solution payload in the retransmitted messages.
+   message if the puzzle solution is not correct (has insufficient
+   number of trailing zero bits). If the puzzle is successfully
+   verified and the SK_* keys are calculated, but the message
+   authenticity check fails, the responder SHOULD save the calculated
+   keys in the IKE SA state while waiting for the retransmissions from
+   the initiator. In this case the responder may skip verification of
+   the puzzle solution and ignore the Puzzle Solution payload in the
+   retransmitted messages.

    If the initiator uses IKE Fragmentation, then it is possible, that
    due to packet loss and/or reordering the responder would receive
    non-first IKE Fragment messages before receiving the first one,
    containing the PS payload. In this case the responder MAY choose to
    keep the received fragments until the first fragment containing the
    solution to the puzzle is received.
    However in this case the
-   responder SHOULD NOT try to verify authenticity (that would require
-   the calculation of the SK_* keys) until the first fragment with the
-   PS payload is received and the solution to the puzzle is verified.
-   After successful verification of the puzzle the responder would
-   calculate the SK_* keys and verify authenticity of the collected
-   fragments.
+   responder SHOULD NOT try to verify authenticity of the kept fragments
+   until the first fragment with the PS payload is received and the
+   solution to the puzzle is verified. After successful verification of
+   the puzzle the responder would calculate the SK_* keys and verify
+   authenticity of the collected fragments.

 9.  DoS Protection after IKE SA is created

    Once IKE SA is created there is usually not much traffic over it. In
    most cases this traffic consists of exchanges aimed to create
    additional Child SAs, rekey or delete them and check the liveness of
    the peer. With a typical setup and typical Child SA lifetimes there
    must be no more than a few such exchanges in a minute, often less.
    Some of these exchanges require relatively little resources (like
-   liveness check), while others may be resourse consuming (like
+   liveness check), while others may be resource consuming (like
    creating or rekeying Child SA with Diffie-Hellman exchange).

    Since any endpoint can initiate new exchange, there is a possibility
    that a peer would initiate too many exchanges, that could exhaust
    host resources. For example the peer can perform endless continuous
    Child SA rekeying or create overwhelming number of Child SAs with the
    same Traffic Selectors etc. Such behaviour may be caused by buggy
    implementation, misconfiguration or be intentional. The latter
    becomes more real threat if the peer uses NULL Authentication,
    described in [NULL-AUTH]. In this case the peer remains anonymous,
@@ -1059,21 +1052,22 @@
    o  PRF (2 octets) - Transform ID of the PRF algorithm that must be
       used to solve the puzzle.
       Readers should refer to the section "Transform Type 2 -
       Pseudo-random Function Transform IDs" in [IKEV2-IANA] for the
       list of possible values.

    o  Difficulty (1 octet) - Difficulty Level of the puzzle. Specifies
       minimum number of trailing zero bits, that the result of PRF must
       contain. Value 0 means that the responder doesn't request any
       specific difficulty level and the initiator is free to select
-      appropriate difficulty level of its own.
+      appropriate difficulty level of its own (see Section 8.1.1.1 for
+      details).

    This notification contains no data.

 10.2.  Puzzle Solution Payload

    The solution to the puzzle is returned back to the responder in a
    dedicated payload, called Puzzle Solution payload and denoted as PS
    in this document.

                         1                   2                   3
@@ -1135,27 +1129,27 @@
    [RFC5723]  Sheffer, Y. and H. Tschofenig, "Internet Key Exchange
               Protocol Version 2 (IKEv2) Session Resumption", RFC 5723,
               January 2010.

    [bitcoins]
               Nakamoto, S., "Bitcoin: A Peer-to-Peer Electronic Cash
               System", October 2008, .

    [ALG-AGILITY]
               Housley, R., "Guidelines for Cryptographic Algorithm
-              Agility", draft-iab-crypto-alg-agility-02 (work in
+              Agility", draft-iab-crypto-alg-agility-05 (work in
               progress), December 2014.

    [NULL-AUTH]
               Smyslov, V. and P. Wouters, "The NULL Authentication Method
               in IKEv2 Protocol", draft-ietf-ipsecme-ikev2-null-
-              auth-02 (work in progress), January 2015.
+              auth-07 (work in progress), January 2015.

 Authors' Addresses

    Yoav Nir
    Check Point Software Technologies Ltd.
    5 Hasolelim st.
    Tel Aviv 6789735
    Israel

    EMail: ynir.ietf@gmail.com