HTTPbis Working Group                                            R. Peon
Internet-Draft                                               Google, Inc
Intended status: Informational                                H. Ruellan
Expires: February 22, 2014                                     Canon CRF
                                                         August 21, 2013

                                 HPACK
                draft-ietf-httpbis-header-compression-02

Abstract

   This document describes HPACK, a format adapted to efficiently
   represent HTTP headers in the context of HTTP/2.0.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 22, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Overview
     2.1.  Outline
   3.  Header Encoding
     3.1.  Encoding Concepts
       3.1.1.  Encoding Context
       3.1.2.  Header Table
       3.1.3.  Reference Set
       3.1.4.  Header set
       3.1.5.  Header Representation
       3.1.6.  Header Emission
     3.2.  Header Set Processing
       3.2.1.  Header Representation Processing
       3.2.2.  Reference Set Emission
       3.2.3.  Header Set Completion
       3.2.4.  Header Table Management
       3.2.5.  Specific Use Cases
   4.  Detailed Format
     4.1.  Low-level representations
       4.1.1.  Integer representation
       4.1.2.  Header Name Representation
       4.1.3.  Header Value Representation
     4.2.  Indexed Header Representation
     4.3.  Literal Header Representation
       4.3.1.  Literal Header without Indexing
       4.3.2.  Literal Header with Incremental Indexing
       4.3.3.  Literal Header with Substitution Indexing
   5.  Parameter Negotiation
   6.  Security Considerations
   7.  IANA Considerations
   8.  Informative References
   Appendix A.  Change Log (to be removed by RFC Editor before
                publication)
     A.1.  Since draft-ietf-httpbis-header-compression-01
   Appendix B.  Initial Header Tables
     B.1.  Requests
     B.2.  Responses
   Appendix C.  Example
     C.1.  First header set
     C.2.  Second header set
   Authors' Addresses

1.  Introduction

   This document describes HPACK, a format adapted to efficiently
   represent HTTP headers in the context of HTTP/2.0.

2.  Overview

   In HTTP/1.X, HTTP headers, which are necessary for the functioning of
   the protocol, are transmitted with no transformations.
   Unfortunately, the amount of redundancy in both the keys and the
   values of these headers is high, and is the cause of increased
   latency on lower bandwidth links.  This indicates that an alternate,
   more compact encoding for headers would be beneficial to latency, and
   that is what is proposed here.

   As shown by SPDY [SPDY], Deflate compresses HTTP very effectively.
   However, the use of a compression scheme which allows for arbitrary
   matches against the previously encoded data (such as Deflate) exposes
   users to security issues.  In particular, the compression of
   sensitive data, together with other data controlled by an attacker,
   may lead to leakage of that sensitive data, even when the resultant
   bytes are transmitted over an encrypted channel.

   Another consideration is that processing and memory costs of a
   compressor such as Deflate may also be too high for some classes of
   devices, for example when doing forward or reverse proxying.

2.1.  Outline

   The HTTP header encoding described in this document is based on a
   header table that maps (name, value) pairs to index values.  This
   scheme is believed to be safe against all known attacks on the
   compression context today.  Header tables are incrementally updated
   during the HTTP/2.0 session.  Two independent header tables are used
   during an HTTP/2.0 session, one for HTTP request headers and one for
   HTTP response headers.

   The encoder is responsible for deciding which headers to insert as
   new entries in the header table.  The decoder then does exactly what
   the encoder prescribes, ending in a state that exactly matches the
   encoder's state.  This enables decoders to remain simple and
   understand a wide variety of encoders.

   A header may be represented as a literal or as an index.  If
   represented as a literal, the representation specifies whether this
   header is used to update the header table.  The different
   representations are described in Section 3.1.5.

   As two consecutive sets of headers often have headers in common, each
   set of headers is coded as a difference from the previous set of
   headers.  The goal is to only encode the changes (headers present in
   one of the sets and not in the other) between the two sets of
   headers.

   An example illustrating the use of these different mechanisms to
   represent headers is available in Appendix C.

3.  Header Encoding

3.1.  Encoding Concepts

   The encoding and decoding of headers rely on some components and
   concepts.  The set of components used forms an encoding context.

   Header Table:  The header table (see Section 3.1.2) is a component
      used to associate headers to index values.

   Reference Set:  The reference set (see Section 3.1.3) is a component
      containing a group of headers used as a reference for the
      differential encoding of a new set of headers.

   Header Set:  A header set (see Section 3.1.4) is a group of headers
      that are encoded jointly.  A complete set of key-value pairs as
      encoded in an HTTP request or response is a header set.

   Header Representation:  A header can be represented in encoded form
      either as a literal or as an index (see Section 3.1.5).  The
      indexed representation is based on the header table.

   Header Emission:  When decoding a set of headers, some operations
      emit a header (see Section 3.1.6).  An emitted header is added to
      the set of headers.  Once emitted, a header can't be removed from
      the set of headers.

3.1.1.  Encoding Context

   The set of components used to encode or decode a header set form an
   encoding context: an encoding context contains a header table and a
   reference set.

   Using HTTP, messages are exchanged between a client and a server in
   both directions.  To keep the encoding of headers in each direction
   independent from the other direction, there is one encoding context
   for each direction.

   The headers contained in a PUSH_PROMISE frame sent by a server to a
   client are encoded within the same context as the headers contained
   in the HEADERS frame corresponding to a response sent from the server
   to the client.

3.1.2.  Header Table

   A header table consists of an ordered list of (name, value) pairs.
   The first entry of a header table is assigned the index 0.

   A header can be represented by an entry of the header table if they
   match.  A header and an entry match if both their name and their
   value match.  A header name and an entry name match if they are equal
   using a character-based, _case insensitive_ comparison (the case
   insensitive comparison is used because HTTP header names are defined
   in a case insensitive way).  A header value and an entry value match
   if they are equal using a character-based, _case sensitive_
   comparison.

   Generally, the header table will not contain duplicate entries.
   However, implementations MUST be prepared to accept duplicates
   without signalling an error.  If duplicates are added to the table,
   they MUST be treated as distinct entries with their own index
   positions.
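
   The matching rules above can be illustrated with a short Python
   sketch.  It is not part of the format: the function names and the
   choice of a list of (name, value) tuples for the header table are
   assumptions made only for this example.

```python
def names_match(header_name: str, entry_name: str) -> bool:
    # Header names are ASCII, so lower() is sufficient for the
    # case-insensitive, character-based comparison.
    return header_name.lower() == entry_name.lower()


def values_match(header_value: str, entry_value: str) -> bool:
    # Header values are compared case-sensitively.
    return header_value == entry_value


def find_entry(table, name, value):
    """Return the index of the first matching entry, or None.

    `table` is assumed to be a list of (name, value) tuples whose
    first element has index 0.
    """
    for index, (entry_name, entry_value) in enumerate(table):
        if names_match(name, entry_name) and values_match(value, entry_value):
            return index
    return None
```

   Because duplicates are distinct entries, this sketch simply returns
   the lowest matching index; an encoder is free to pick another one.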

   Initially, a header table contains a list of common headers.  Two
   initial lists of headers are provided in Appendix B.  One list is for
   headers transmitted from a client to a server, the other for the
   reverse direction.

   A header table is modified by either adding a new entry at the end of
   the table, or by replacing an existing entry.

   The encoder decides how to update the header table and as such can
   control how much memory is used by the header table.  To limit the
   memory requirements on the decoder side, the header table size is
   bounded (see the SETTINGS_MAX_BUFFER_SIZE in Section 5).

   The size of an entry is the sum of its name's length in bytes (as
   defined in Section 4.1.2), of its value's length in bytes
   (Section 4.1.3) and of 32 bytes.  The 32 bytes are an accounting for
   the entry structure overhead.  For example, an entry structure using
   two 64-bit pointers to reference the name and the value of the
   entry, and two 64-bit integers for counting the number of references
   to this name and value would use 32 bytes.

   The size of a header table is the sum of the size of its entries.
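
   As a worked illustration of the size accounting, the following Python
   sketch computes entry and table sizes.  The names ENTRY_OVERHEAD,
   entry_size and table_size are hypothetical, and the table is assumed
   to be a list of (name, value) string pairs.

```python
ENTRY_OVERHEAD = 32  # fixed per-entry accounting, in bytes


def entry_size(name: str, value: str) -> int:
    # Name length in bytes (ASCII) + value length in bytes (UTF-8)
    # + the 32-byte overhead.
    return len(name.encode("ascii")) + len(value.encode("utf-8")) + ENTRY_OVERHEAD


def table_size(table) -> int:
    # The table size is the sum of the sizes of its entries.
    return sum(entry_size(name, value) for name, value in table)
```

   For instance, the entry ("accept", "*/*") accounts for 6 + 3 + 32 =
   41 bytes.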

3.1.3.  Reference Set

   A reference set is defined as an unordered set of references to
   entries of the header table.

   The initial reference set is the empty set.

   The reference set is updated during the processing of a set of
   headers.

   Using the differential encoding, a header that is not present in the
   reference set can be encoded either with an indexed representation
   (if the header is present in the header table), or with a literal
   representation (if the header is not present in the header table).

   A header that is to be removed from the reference set is encoded with
   an indexed representation.

3.1.4.  Header set

   A header set is a group of header fields that are encoded as a whole.
   Each header field is a (name, value) pair.

   A header set is encoded using an ordered list of zero or more header
   representations.  All the header representations describing a header
   set are grouped into a header block.

3.1.5.  Header Representation

   A header can be represented either as a literal or as an index.

   Literal Representation:  A literal representation defines a new
      header.  The header name is represented either literally or as a
      reference to an entry of the header table.  The header value is
      represented literally.

      Three different literal representations are provided:

      *  A literal representation that does not add the header to the
         header table (see Section 4.3.1).

      *  A literal representation that adds the header at the end of the
         header table (see Section 4.3.2).

      *  A literal representation that uses the header to replace an
         existing entry of the header table (see Section 4.3.3).

   Indexed Representation:  The indexed representation defines a header
      as a match to an entry in the header table (see Section 4.2).

3.1.6.  Header Emission

   The emission of a header is the process of adding a header to the
   current set of headers.  Once a header is emitted, it can't be
   removed from the current set of headers.

   The concept of header emission allows a decoder to know when it can
   pass a header safely to a higher level on the receiver side.  This
   allows a decoder to be implemented in a streaming way, and as such to
   only keep in memory the header table and the reference set.  With
   such an implementation, the amount of memory used by the decoder is
   bounded, even in presence of a very large set of headers.  The
   management of memory for handling very large sets of headers can
   therefore be deferred to the application, which may be able to emit
   the header to the wire and thus free up memory quickly.

3.2.  Header Set Processing

   The processing of an encoded header set to obtain a list of headers
   is defined in this section.  To ensure a correct decoding of a header
   set, a decoder MUST obey the following rules.

3.2.1.  Header Representation Processing

   All the header representations contained in a header block are
   processed in the order in which they are presented, as specified
   below.

   An _indexed representation_ corresponding to an entry _not present_
   in the reference set entails the following actions:

   o  The header corresponding to the entry is emitted.

   o  The entry is added to the reference set.

   An _indexed representation_ corresponding to an entry _present_ in
   the reference set entails the following actions:

   o  The entry is removed from the reference set.

   A _literal representation_ that is _not added_ to the header table
   entails the following action:

   o  The header is emitted.

   A _literal representation_ that is _added_ to the header table
   entails the following actions:

   o  The header is emitted.

   o  The header is added to the header table, at the location defined
      by the representation.

   o  The new entry is added to the reference set.

3.2.2.  Reference Set Emission

   Once all the representations contained in a header block have been
   processed, the headers that are in common with the previous header
   set are emitted, during the reference set emission.

   For the reference set emission, each header contained in the
   reference set that has not been emitted during the processing of the
   header block is emitted.

3.2.3.  Header Set Completion

   Once all of the header representations have been processed, and the
   remaining items in the reference set have been emitted, the header
   set is complete.
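
   The processing rules of Sections 3.2.1 to 3.2.3 can be sketched in
   Python as follows.  This is a simplified illustration under stated
   assumptions, not a complete decoder: representations are assumed to
   be already parsed into hypothetical tuples, the reference set is
   modelled as a set of header table indexes, literals are only added
   at the end of the table, and neither eviction nor substitution is
   handled.

```python
def process_header_block(representations, table, reference_set):
    """Decode one header block and return the emitted headers.

    representations: list of ("indexed", index) or
                     ("literal", name, value, add_to_table) tuples.
    table:           list of (name, value) pairs (mutated).
    reference_set:   set of table indexes (mutated).
    """
    emitted = []
    emitted_refs = set()  # referenced entries already emitted in this block

    for rep in representations:
        if rep[0] == "indexed":
            index = rep[1]
            if index in reference_set:
                # Entry present: it is removed from the reference set.
                reference_set.discard(index)
                emitted_refs.discard(index)
            else:
                # Entry not present: emit the header, then reference it.
                emitted.append(table[index])
                reference_set.add(index)
                emitted_refs.add(index)
        else:
            _, name, value, add_to_table = rep
            emitted.append((name, value))
            if add_to_table:
                table.append((name, value))
                index = len(table) - 1
                reference_set.add(index)
                emitted_refs.add(index)

    # Reference set emission: headers kept from the previous header set.
    for index in sorted(reference_set - emitted_refs):
        emitted.append(table[index])
    return emitted
```

   The headers are returned as a list only to keep the example short; a
   streaming decoder would hand each header to the application as soon
   as it is emitted.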

3.2.4.  Header Table Management

   The header table can be modified by either adding a new entry to it
   or by replacing an existing one.  Before doing such a modification,
   it has to be ensured that the header table size will stay lower than
   or equal to the SETTINGS_MAX_BUFFER_SIZE limit (see Section 5).  To
   achieve this, repeatedly, the first entry of the header table is
   removed, until enough space is available for the modification.

   A consequence of removing one or more entries at the beginning of the
   header table is that the remaining entries are renumbered.  The first
   entry of the header table is always associated to the index 0.

   When the modification of the header table is the replacement of an
   existing entry, the replaced entry is the one indicated in the
   literal representation before any entry is removed from the header
   table.  If the entry to be replaced is removed from the header table
   when performing the size adjustment, the replacement entry is
   inserted at the beginning of the header table.

   The addition of a new entry with a size greater than the
   SETTINGS_MAX_BUFFER_SIZE limit causes all the entries from the header
   table to be dropped and the new entry not to be added to the header
   table.  The replacement of an existing entry with a new entry with a
   size greater than the SETTINGS_MAX_BUFFER_SIZE has the same
   consequences.
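
   The eviction rule for the addition of a new entry can be sketched as
   follows in Python.  The function names are hypothetical, the table
   is assumed to be a list of (name, value) string pairs, and max_size
   stands for the SETTINGS_MAX_BUFFER_SIZE limit.  Replacement of an
   existing entry and the corresponding reference set updates are left
   out of this sketch.

```python
def entry_size(name: str, value: str) -> int:
    # Name bytes + value bytes + 32 bytes of accounting overhead.
    return len(name.encode("ascii")) + len(value.encode("utf-8")) + 32


def add_entry(table, name, value, max_size):
    """Append (name, value) to `table`, evicting from the front as needed.

    Returns the number of evicted entries.  If the new entry alone is
    larger than `max_size`, the table is emptied and nothing is added.
    """
    new_size = entry_size(name, value)
    if new_size > max_size:
        evicted = len(table)
        table.clear()
        return evicted

    evicted = 0
    current = sum(entry_size(n, v) for n, v in table)
    while table and current + new_size > max_size:
        old_name, old_value = table.pop(0)  # drop the first entry (index 0)
        current -= entry_size(old_name, old_value)
        evicted += 1
    table.append((name, value))
    return evicted
```

   Since entries are removed from the front of the list, the remaining
   entries are renumbered implicitly: list position and table index
   stay identical.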

3.2.5.  Specific Use Cases

   Three occurrences of the same indexed representation, corresponding
   to an entry not present in the reference set, emit the associated
   header twice:

   o  The first occurrence emits the header a first time and adds the
      corresponding entry to the reference set.

   o  The second occurrence removes the header's entry from the
      reference set.

   o  The third occurrence emits the header a second time and adds again
      its entry to the reference set.

   This allows for header sets which include duplicate header entries
   to be encoded efficiently and faithfully.

   The first occurrence of the indexed representation can be replaced by
   a literal representation creating an entry for the header.

4.  Detailed Format

4.1.  Low-level representations

4.1.1.  Integer representation

   Integers are used to represent name indexes, pair indexes or string
   lengths.  To allow for optimized processing, an integer
   representation always finishes at the end of a byte.

   An integer is represented in two parts: a prefix that fills the
   current byte and an optional list of bytes that are used if the
   integer value does not fit in the prefix.  The number of bits of the
   prefix (called N) is a parameter of the integer representation.

   The N-bit prefix allows filling the current byte.  If the value is
   small enough (strictly less than 2^N-1), it is encoded within the
   N-bit prefix.  Otherwise all the bits of the prefix are set to 1 and
   the value is encoded using an unsigned variable length integer [1]
   representation.

   The algorithm to represent an integer I is as follows:

   If I < 2^N - 1, encode I on N bits
   Else
       encode 2^N - 1 on N bits
       I = I - (2^N - 1)
       While I >= 128
            Encode (I % 128 + 128) on 8 bits
            I = I / 128
       encode (I) on 8 bits
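
   The algorithm above translates almost line for line into Python.  The
   sketch below is illustrative only: the function name and signature
   are assumptions, and the bits of the first byte that lie outside the
   N-bit prefix are left at zero so that a caller can OR in the pattern
   of the enclosing representation.

```python
def encode_integer(value: int, prefix_bits: int) -> bytes:
    """Encode `value` with an N-bit prefix (N = prefix_bits, 1..8)."""
    if value < 0 or not 1 <= prefix_bits <= 8:
        raise ValueError("unsupported value or prefix size")
    limit = (1 << prefix_bits) - 1  # 2^N - 1
    if value < limit:
        return bytes([value])       # fits in the prefix
    out = bytearray([limit])        # prefix filled with ones
    value -= limit
    while value >= 128:
        out.append(value % 128 + 128)  # 7 payload bits, continuation bit set
        value //= 128
    out.append(value)               # last byte, continuation bit clear
    return bytes(out)
```

   With a 5-bit prefix, encode_integer(10, 5) yields the single byte
   0x0a and encode_integer(1337, 5) yields the bytes 0x1f 0x9a 0x0a.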

4.1.1.1.  Example 1: Encoding 10 using a 5-bit prefix

   The value 10 is to be encoded with a 5-bit prefix.

   o  10 is less than 31 (= 2^5 - 1) and is represented using the 5-bit
      prefix.

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | X | X | X | 0 | 1 | 0 | 1 | 0 |   10 stored on 5 bits
   +---+---+---+---+---+---+---+---+

4.1.1.2.  Example 2: Encoding 1337 using a 5-bit prefix

   The value I=1337 is to be encoded with a 5-bit prefix.

      1337 is greater than 31 (= 2^5 - 1).

         The 5-bit prefix is filled with its max value (31).

      I = 1337 - (2^5 - 1) = 1306.

         I (1306) is greater than or equal to 128, the while loop body
         executes:

            I % 128 == 26

            26 + 128 == 154

            154 is encoded in 8 bits as: 10011010

            I is set to 10 (1306 / 128 == 10)

            I is no longer greater than or equal to 128, the while loop
            terminates.

         I, now 10, is encoded on 8 bits as: 00001010

      The process ends.

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | X | X | X | 1 | 1 | 1 | 1 | 1 |   Prefix = 31, I = 1306
   | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 |   1306>=128, encode(154), I=1306/128
   | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 |   10<128, encode(10), done
   +---+---+---+---+---+---+---+---+
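
   The reverse operation is not spelled out in this section; a decoder
   consistent with the encoding above could look like the following
   Python sketch.  The function name, the (value, next position) return
   convention and the absence of any overflow limit are assumptions of
   this example.  Truncated input raises IndexError.

```python
def decode_integer(data: bytes, prefix_bits: int, pos: int = 0):
    """Decode an integer with an N-bit prefix starting at data[pos].

    Returns (value, position of the first byte after the integer).
    """
    limit = (1 << prefix_bits) - 1
    value = data[pos] & limit   # ignore the bits outside the prefix
    pos += 1
    if value < limit:
        return value, pos       # the value fitted in the prefix
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value += (byte & 0x7F) << shift  # low 7 bits carry the payload
        shift += 7
        if byte < 128:          # continuation bit clear: last byte
            return value, pos
```

   Applied to the three bytes of Example 2, this returns the value 1337
   and the position 3, whatever the value of the bits marked X.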

4.1.2.  Header Name Representation

   Header names are sequences of ASCII characters that MUST conform to
   the following header-name ABNF construction:

     LOWERALPHA = %x61-7A
     header-char = "!" / "#" / "$" / "%" / "&" / "'" /
                   "*" / "+" / "-" / "." / "^" / "_" /
                   "`" / "|" / "~" / DIGIT / LOWERALPHA
     header-name = [":"] 1*header-char

   They are encoded in two parts:

   1.  The length of the text, defined as the number of octets of
       storage required to store the text, represented as a variable-
       length-quantity (Section 4.1.1).

   2.  The specific sequence of ASCII octets
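
   As an illustration, the header-name ABNF above can be checked with a
   regular expression.  The Python sketch below is one possible
   transcription; the names HEADER_NAME_RE and is_valid_header_name are
   hypothetical.

```python
import re

# Optional leading ":" followed by one or more header-char:
# "!" "#" "$" "%" "&" "'" "*" "+" "-" "." "^" "_" "`" "|" "~",
# DIGIT (0-9) and LOWERALPHA (a-z).
HEADER_NAME_RE = re.compile(r":?[!#$%&'*+\-.^_`|~0-9a-z]+")


def is_valid_header_name(name: str) -> bool:
    # fullmatch() requires the whole string to conform to the pattern.
    return HEADER_NAME_RE.fullmatch(name) is not None
```

   Upper-case letters are rejected because only LOWERALPHA is allowed in
   header-char.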

4.1.3.  Header Value Representation

   Header values are encoded as sequences of UTF-8 encoded text.  They
   are encoded in two parts:

   1.  The length of the text, defined as the number of octets of
       storage required to store the text, represented as a variable-
       length-quantity (Section 4.1.1).

   2.  The specific sequence of octets representing the UTF-8 text.

   Invalid UTF-8 octet sequences, "over-long" UTF-8 encodings, and UTF-8
   octets that represent invalid Unicode Codepoints MUST NOT be used.

4.2.  Indexed Header Representation

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 1 |        Index (7+)         |
   +---+---------------------------+

                              Indexed Header

   This representation starts with the '1' 1-bit pattern, followed by
   the index of the matching pair, represented as an integer with a
   7-bit prefix.
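
   A minimal Python sketch of this representation is given below.  The
   helper encode_integer follows the algorithm of Section 4.1.1; both
   function names are assumptions made for the example.

```python
def encode_integer(value: int, prefix_bits: int) -> bytes:
    # Integer representation of Section 4.1.1 with an N-bit prefix.
    limit = (1 << prefix_bits) - 1
    if value < limit:
        return bytes([value])
    out = bytearray([limit])
    value -= limit
    while value >= 128:
        out.append(value % 128 + 128)
        value //= 128
    out.append(value)
    return bytes(out)


def encode_indexed_header(index: int) -> bytes:
    # Index as an integer with a 7-bit prefix, then the '1' bit pattern
    # set in the most significant bit of the first byte.
    encoded = bytearray(encode_integer(index, 7))
    encoded[0] |= 0x80
    return bytes(encoded)
```

   For example, index 3 is encoded as the single byte 0x83, while index
   200 no longer fits in the prefix and is encoded as 0xff 0x49.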

4.3.  Literal Header Representation

4.3.1.  Literal Header without Indexing

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 1 |    Index (5+)     |
   +---+---+---+-------------------+
   |       Value Length (8+)       |
   +-------------------------------+
   | Value String (Length octets)  |
   +-------------------------------+

              Literal Header without Indexing - Indexed Name

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 1 |         0         |
   +---+---+---+-------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +-------------------------------+
   |       Value Length (8+)       |
   +-------------------------------+
   | Value String (Length octets)  |
   +-------------------------------+

                Literal Header without Indexing - New Name

   This representation, which does not involve updating the header
   table, starts with the '011' 3-bit pattern.

   If the header name matches the header name of a (name, value) pair
   stored in the Header Table, the index of the pair increased by one
   (index + 1) is represented as an integer with a 5-bit prefix.  Note
   that if the index is strictly below 30 (so that index + 1 is strictly
   below 31), one byte is used.

   If the header name does not match a header name entry, the value 0 is
   represented on 5 bits followed by the header name, represented as a
   literal string (Section 4.1.2).

   Header name representation is followed by the header value
   represented as a literal string as described in Section 4.1.3.
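
   The two layouts above can be sketched as follows.  This is an
   illustration under assumptions, not a reference encoder: the names
   are hypothetical, and the integer helper assumes that values too
   large for the prefix continue in 7-bit groups, low-order group first
   (see Section 4.1.1 for the actual rules).

```python
from typing import Optional


def encode_int(value: int, prefix_bits: int, pattern: int = 0) -> bytes:
    """N-bit prefix integer; prefix_bits == 0 means no prefix octet at all."""
    limit = (1 << prefix_bits) - 1
    out = bytearray()
    if prefix_bits > 0:
        if value < limit:
            return bytes([pattern | value])
        out.append(pattern | limit)
    rest = value - limit
    while rest >= 128:
        out.append((rest % 128) | 0x80)
        rest //= 128
    out.append(rest)
    return bytes(out)


def encode_string(text: str) -> bytes:
    data = text.encode("utf-8")
    return encode_int(len(data), 0) + data


def encode_literal_without_indexing(value: str,
                                    name_index: Optional[int] = None,
                                    name: Optional[str] = None) -> bytes:
    """'011' pattern, then index + 1 (indexed name) or 0 and the name."""
    if name_index is not None:
        head = encode_int(name_index + 1, 5, 0x60)
    elif name is not None:
        head = encode_int(0, 5, 0x60) + encode_string(name)
    else:
        raise ValueError("either name_index or name is required")
    return head + encode_string(value)
```

   The header table is never touched here, which is the defining
   property of this representation.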

4.3.2.  Literal Header with Incremental Indexing

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |    Index (5+)     |
   +---+---+---+-------------------+
   |       Value Length (8+)       |
   +-------------------------------+
   | Value String (Length octets)  |
   +-------------------------------+

                Literal Header with Incremental Indexing -
                               Indexed Name

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |         0         |
   +---+---+---+-------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +-------------------------------+
   |       Value Length (8+)       |
   +-------------------------------+
   | Value String (Length octets)  |
   +-------------------------------+

                Literal Header with Incremental Indexing -
                                 New Name

   This representation starts with the '010' 3-bit pattern.

   If the header name matches the header name of a (name, value) pair
   stored in the Header Table, the index of the pair increased by one
   (index + 1) is represented as an integer with a 5-bit prefix.  Note
   that if the index is strictly below 30 (so that index + 1 is strictly
   below 31), one byte is used.

   If the header name does not match a header name entry, the value 0 is
   represented on 5 bits followed by the header name, represented as a
   literal string (Section 4.1.2).

   Header name representation is followed by the header value
   represented as a literal string as described in Section 4.1.3.
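
   The sketch below pairs an encoder for this representation with the
   table update it implies.  It assumes, as the example in Appendix C
   suggests, that the new entry is appended at the end of the header
   table; eviction is deliberately left out.  The function names and
   the continuation-byte layout of encode_int are assumptions of this
   sketch.

```python
from typing import List, Optional, Tuple

Entry = Tuple[str, str]


def encode_int(value: int, prefix_bits: int, pattern: int = 0) -> bytes:
    """N-bit prefix integer; prefix_bits == 0 means no prefix octet at all."""
    limit = (1 << prefix_bits) - 1
    out = bytearray()
    if prefix_bits > 0:
        if value < limit:
            return bytes([pattern | value])
        out.append(pattern | limit)
    rest = value - limit
    while rest >= 128:
        out.append((rest % 128) | 0x80)
        rest //= 128
    out.append(rest)
    return bytes(out)


def encode_string(text: str) -> bytes:
    data = text.encode("utf-8")
    return encode_int(len(data), 0) + data


def encode_literal_incremental(value: str,
                               name_index: Optional[int] = None,
                               name: Optional[str] = None) -> bytes:
    """'010' pattern, then index + 1 (indexed name) or 0 and the name."""
    if name_index is not None:
        head = encode_int(name_index + 1, 5, 0x40)
    elif name is not None:
        head = encode_int(0, 5, 0x40) + encode_string(name)
    else:
        raise ValueError("either name_index or name is required")
    return head + encode_string(value)


def apply_incremental(table: List[Entry], name: str, value: str) -> int:
    """Append the pair (assumed position: end of table); return its index."""
    table.append((name, value))
    return len(table) - 1
```

   Encoder and decoder must both perform the table update so that later
   indexes refer to the same entries on each side.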

4.3.3.  Literal Header with Substitution Indexing

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |      Index (6+)       |
   +---+---+-----------------------+
   |    Substituted Index (8+)     |
   +-------------------------------+
   |       Value Length (8+)       |
   +-------------------------------+
   | Value String (Length octets)  |
   +-------------------------------+

                Literal Header with Substitution Indexing -
                               Indexed Name

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |           0           |
   +---+---+-----------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +-------------------------------+
   |    Substituted Index (8+)     |
   +-------------------------------+
   |       Value Length (8+)       |
   +-------------------------------+
   | Value String (Length octets)  |
   +-------------------------------+

                Literal Header with Substitution Indexing -
                                 New Name

   This representation starts with the '00' 2-bit pattern.

   If the header name matches the header name of a (name, value) pair
   stored in the Header Table, the index of the pair increased by one
   (index + 1) is represented as an integer with a 6-bit prefix.  Note
   that if the index is strictly below 62, one byte is used.

   If the header name does not match a header name entry, the value 0 is
   represented on 6 bits followed by the header name, represented as a
   literal string (Section 4.1.2).

   The index of the substituted (name, value) pair is inserted after the
   header name representation as a 0-bit prefix integer.

   The index of the substituted pair MUST correspond to a position in
   the header table containing a non-void entry.  An index for the
   substituted pair that corresponds to an empty position in the header
   table MUST be treated as an error.

   This index is followed by the header value represented as a literal
   string as described in Section 4.1.3.
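
   A sketch of substitution indexing follows, covering both the octets
   on the wire and the check on the substituted index.  Modelling an
   empty table position as None is a choice of this sketch, the names
   are hypothetical, and the continuation-byte layout in encode_int is
   an assumption about Section 4.1.1.

```python
from typing import List, Optional, Tuple

Entry = Optional[Tuple[str, str]]  # None models an empty position


def encode_int(value: int, prefix_bits: int, pattern: int = 0) -> bytes:
    """N-bit prefix integer; prefix_bits == 0 means no prefix octet at all."""
    limit = (1 << prefix_bits) - 1
    out = bytearray()
    if prefix_bits > 0:
        if value < limit:
            return bytes([pattern | value])
        out.append(pattern | limit)
    rest = value - limit
    while rest >= 128:
        out.append((rest % 128) | 0x80)
        rest //= 128
    out.append(rest)
    return bytes(out)


def encode_string(text: str) -> bytes:
    data = text.encode("utf-8")
    return encode_int(len(data), 0) + data


def encode_literal_substitution(value: str, substituted_index: int,
                                name_index: Optional[int] = None,
                                name: Optional[str] = None) -> bytes:
    """'00' pattern, name reference, substituted index, then the value."""
    if name_index is not None:
        head = encode_int(name_index + 1, 6, 0x00)
    elif name is not None:
        head = encode_int(0, 6, 0x00) + encode_string(name)
    else:
        raise ValueError("either name_index or name is required")
    return head + encode_int(substituted_index, 0) + encode_string(value)


def apply_substitution(table: List[Entry], substituted_index: int,
                       name: str, value: str) -> None:
    """Replace an existing entry; an empty or missing position is an error."""
    if not 0 <= substituted_index < len(table):
        raise ValueError("substituted index outside the header table")
    if table[substituted_index] is None:
        raise ValueError("substituted index refers to an empty position")
    table[substituted_index] = (name, value)
```

   Rejecting the bad index before touching the table leaves the table
   unchanged when the error is reported.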

5.  Parameter Negotiation

   A few parameters can be used to accommodate client and server
   processing and memory requirements.  [[anchor3: These settings are
   currently not supported as they have not been integrated in the main
   specification.  Therefore, the maximum buffer size for the header
   table is fixed at 4096 bytes.]]

   SETTINGS_MAX_BUFFER_SIZE:  Allows the sender to inform the remote
      endpoint of the maximum size it accepts for the header table.
      The default value is 4096 bytes.
      [[anchor4: Is this default value OK?  Do we need a maximum size?
      Do we want to allow infinite buffer?]]
      When the remote endpoint receives a SETTINGS frame containing a
      SETTINGS_MAX_BUFFER_SIZE setting with a value smaller than the one
      currently in use, it MUST send as soon as possible a HEADERS frame
      with a stream identifier of 0x0 containing a value smaller than or
      equal to the received setting value.
      [[anchor5: This changes slightly the behaviour of the HEADERS
      frame, which should be updated as follows:]]
      A HEADERS frame with a stream identifier of 0x0 indicates that the
      sender has reduced the maximum size of the header table.  The new
      maximum size of the header table is encoded on 32 bits.  The
      decoder MUST reduce its own header table by dropping entries from
      it until the size of the header table is lower than or equal to
      the transmitted maximum size.
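
   The table reduction described above might look as follows.  Several
   points are assumptions of this sketch and are not settled by this
   section: the size of an entry is taken as the octet length of its
   name plus that of its value plus 32 octets of overhead, entries are
   dropped starting from the lowest index, and the 32-bit maximum size
   is read in network byte order.  All names are hypothetical.

```python
from typing import List, Tuple

Entry = Tuple[str, str]

ENTRY_OVERHEAD = 32  # assumed per-entry overhead, in octets


def entry_size(name: str, value: str) -> int:
    return len(name.encode("utf-8")) + len(value.encode("utf-8")) + ENTRY_OVERHEAD


def table_size(table: List[Entry]) -> int:
    return sum(entry_size(name, value) for name, value in table)


def parse_max_size(payload: bytes) -> int:
    """Read the new maximum size from 4 octets (assumed big-endian)."""
    if len(payload) < 4:
        raise ValueError("expected 32 bits")
    return int.from_bytes(payload[:4], "big")


def shrink_table(table: List[Entry], max_size: int) -> List[Entry]:
    """Drop entries (assumed order: lowest index first) until the table fits.

    Returns the dropped entries so that the caller can also remove them
    from any other state that refers to them.
    """
    dropped = []
    size = table_size(table)
    while table and size > max_size:
        name, value = table.pop(0)
        size -= entry_size(name, value)
        dropped.append((name, value))
    return dropped
```

   Note that dropping from the front shifts the index of every
   remaining entry in this list-based model.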

6.  Security Considerations

   This compressor exists to solve security issues present in stream
   compressors such as DEFLATE whereby the compression context can be
   efficiently probed to reveal secrets.  A conformant implementation of
   this specification should be fairly safe against that kind of attack,
   as the reaping of any information from the compression context
   requires more work than guessing and verifying the plaintext data
   directly with the server.  As with any secret, however, the longer
   the length of the secret, the more difficult the secret is to guess.
   It is inadvisable to have short cookies that are relied upon to
   remain secret for any duration of time.

   A proper security-conscious implementation will also need to prevent
   timing attacks by ensuring that the amount of time it takes to do
   string comparisons is always a function of the total length of the
   strings, and not a function of the number of matched characters.
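
   As one possible approach in Python, the standard library function
   hmac.compare_digest compares two byte strings without stopping at
   the first difference.  The table lookup around it is a hypothetical
   helper that likewise scans every entry instead of returning at the
   first match.

```python
import hmac
from typing import List, Tuple


def headers_equal(a: str, b: str) -> bool:
    """Compare without short-circuiting on the first differing octet."""
    return hmac.compare_digest(a.encode("utf-8"), b.encode("utf-8"))


def find_index(table: List[Tuple[str, str]], name: str, value: str) -> int:
    """Return the index of the first matching pair, or -1.

    Every entry is compared, and both name and value are always
    compared, so the work done does not depend on where a match occurs.
    """
    found = -1
    for i, (entry_name, entry_value) in enumerate(table):
        # '&' on booleans evaluates both operands, unlike 'and'.
        match = headers_equal(entry_name, name) & headers_equal(entry_value, value)
        if match and found < 0:
            found = i
    return found
```

   When the two inputs differ in length, the comparison may still
   reveal that fact, which is consistent with the timing being a
   function of the total length of the strings.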

   Another common security problem is when the remote endpoint
   successfully causes the local endpoint to exhaust its memory.  This
   compressor attempts to deal with the most obvious ways that this
   could occur by limiting both the peak and the steady-state amount of
   memory consumed in the compressor state, by providing ways for the
   application to consume/flush the emitted headers in small chunks, and
   by considering overhead in the state size calculation.  Implementors
   must still be careful in the creation of APIs to an implementation of
   this compressor by ensuring that header keys and values are either
   emitted as a stream, or that the compression implementation has a
   limit on the maximum size of a key or value.  Failure to implement
   these kinds of safeguards may still result in a scenario where the
   local endpoint exhausts its memory.

7.  IANA Considerations

   This memo includes no request to IANA.

8.  Informative References

   [SPDY]  Belshe, M. and R. Peon, "SPDY Protocol", February 2012,
           <http://tools.ietf.org/html/draft-mbelshe-httpbis-spdy>.

   [1]  <http://en.wikipedia.org/wiki/Variable-length_quantity>

Appendix A.  Change Log (to be removed by RFC Editor before publication)

A.1.  Since draft-ietf-httpbis-header-compression-01

   o  Refactored the Header Encoding section: split definitions and
      processing rules.

   o  Backward incompatible change: Updated reference set management as
      per issue #214.  This changes how the interaction between the
      reference set and eviction works.  This also changes the working
      of the reference set in some specific cases.

   o  Backward incompatible change: modified initial header list, as per
      issue #188.

   o  Added example of 32 bytes entry structure (issue #191).

   o  Added Header Set Completion section.  Reflowed some text.
      Clarified some writing which was awkward.  Added text about
      duplicate header entry encoding.  Clarified some language with
      respect to Header Set.  Changed x-my-header to mynewheader.  Added
      text in the HeaderEmission section indicating that the application
      may also be able to free up memory more quickly.  Added
      information in Security Considerations section.

Appendix B.  Initial Header Tables

   [[anchor9: The tables in this section should be updated based on
   statistical analysis of header names frequency and specific HTTP 2.0
   header rules (like removal of some headers).]]
   [[anchor10: These tables are not adapted for headers contained in
   PUSH_PROMISE frames.  Either the tables can be merged, or the table
   for responses can be updated.]]

B.1.  Requests

   The following table lists the pre-defined headers that make up the
   initial header table used to represent requests sent from a client to
   a server.

              +-------+---------------------+--------------+
              | Index | Header Name         | Header Value |
              +-------+---------------------+--------------+
              | 0     | :scheme             | http         |
              | 1     | :scheme             | https        |
              | 2     | :host               |              |
              | 3     | :path               | /            |
              | 4     | :method             | GET          |
              | 5     | accept              |              |
              | 6     | accept-charset      |              |
              | 7     | accept-encoding     |              |
              | 8     | accept-language     |              |
              | 9     | cookie              |              |
              | 10    | if-modified-since   |              |
              | 11    | user-agent          |              |
              | 12    | referer             |              |
              | 13    | authorization       |              |
              | 14    | allow               |              |
              | 15    | cache-control       |              |
              | 16    | connection          |              |
              | 17    | content-length      |              |
              | 18    | content-type        |              |
              | 19    | date                |              |
              | 20    | expect              |              |
              | 21    | from                |              |
              | 22    | if-match            |              |
              | 23    | if-none-match       |              |
              | 24    | if-range            |              |
              | 25    | if-unmodified-since |              |
              | 26    | max-forwards        |              |
              | 27    | proxy-authorization |              |
              | 28    | range               |              |
              | 29    | via                 |              |
              +-------+---------------------+--------------+

                Table 1: Initial Header Table for Requests

B.2.  Responses

   The following table lists the pre-defined headers that make up the
   initial header table used to represent responses sent from a server
   to a client.  The same header table is also used to represent request
   headers sent from a server to a client in a PUSH_PROMISE frame.

          +-------+-----------------------------+--------------+
          | Index | Header Name                 | Header Value |
          +-------+-----------------------------+--------------+
          | 0     | :status                     | 200          |
          | 1     | age                         |              |
          | 2     | cache-control               |              |
          | 3     | content-length              |              |
          | 4     | content-type                |              |
          | 5     | date                        |              |
          | 6     | etag                        |              |
          | 7     | expires                     |              |
          | 8     | last-modified               |              |
          | 9     | server                      |              |
          | 10    | set-cookie                  |              |
          | 11    | vary                        |              |
          | 12    | via                         |              |
          | 13    | access-control-allow-origin |              |
          | 14    | accept-ranges               |              |
          | 15    | allow                       |              |
          | 16    | connection                  |              |
          | 17    | content-disposition         |              |
          | 18    | content-encoding            |              |
          | 19    | content-language            |              |
          | 20    | content-location            |              |
          | 21    | content-range               |              |
          | 22    | link                        |              |
          | 23    | location                    |              |
          | 24    | proxy-authenticate          |              |
          | 25    | refresh                     |              |
          | 26    | retry-after                 |              |
          | 27    | strict-transport-security   |              |
          | 28    | transfer-encoding           |              |
          | 29    | www-authenticate            |              |
          +-------+-----------------------------+--------------+

                Table 2: Initial Header Table for Responses

Appendix C.  Example

   Here is an example that illustrates different representations and how
   tables are updated.  [[anchor13: This section needs to be updated to
   better reflect the new processing of header fields, and include more
   examples.]]

C.1.  First header set

   The first header set to represent is the following:

   :path: /my-example/index.html
   user-agent: my-user-agent
   mynewheader: first

   The header table holds only its initial entries, so all headers are
   represented as literal headers with incremental indexing.  The
   'mynewheader' header name is not in the header table and is encoded
   literally.  This gives the following representation:

   0x44      (literal header with incremental indexing, name index = 3)
   0x16      (header value string length = 22)
   /my-example/index.html
   0x4D      (literal header with incremental indexing, name index = 12)
   0x0D      (header value string length = 13)
   my-user-agent
   0x40      (literal header with incremental indexing, new name)
   0x0B      (header name string length = 11)
   mynewheader
   0x05      (header value string length = 5)
   first
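
   The octets listed in this appendix can be checked with a small
   decoder such as the sketch below.  It dispatches on the leading bit
   pattern of each representation and leaves name references as table
   indexes, without resolving them.  The function names are
   hypothetical, the integer decoding assumes 7-bit continuation groups
   with the low-order group first, and no bounds checking is done.

```python
from typing import List, Tuple


def decode_int(buf: bytes, pos: int, prefix_bits: int) -> Tuple[int, int]:
    """Decode an N-bit prefix integer; returns (value, next position)."""
    limit = (1 << prefix_bits) - 1
    value = 0
    if prefix_bits > 0:
        value = buf[pos] & limit
        pos += 1
        if value < limit:
            return value, pos
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value += (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


def decode_string(buf: bytes, pos: int) -> Tuple[str, int]:
    length, pos = decode_int(buf, pos, 0)
    return buf[pos:pos + length].decode("utf-8"), pos + length


def decode_headers(buf: bytes) -> List[tuple]:
    """Return one tuple per representation found in `buf`."""
    ops = []
    pos = 0
    while pos < len(buf):
        first = buf[pos]
        if first & 0x80:                      # '1': indexed header
            index, pos = decode_int(buf, pos, 7)
            ops.append(("indexed", index))
            continue
        if first & 0x40:                      # '011' or '010'
            kind = "without-indexing" if first & 0x20 else "incremental"
            code, pos = decode_int(buf, pos, 5)
        else:                                 # '00'
            kind = "substitution"
            code, pos = decode_int(buf, pos, 6)
        if code == 0:                         # new name follows
            name, pos = decode_string(buf, pos)
        else:                                 # index + 1 was transmitted
            name = code - 1
        if kind == "substitution":
            substituted, pos = decode_int(buf, pos, 0)
            value, pos = decode_string(buf, pos)
            ops.append((kind, name, substituted, value))
        else:
            value, pos = decode_string(buf, pos)
            ops.append((kind, name, value))
    return ops
```

   Applied to the representation above, the decoder reports three
   literal headers with incremental indexing, the first two naming
   indexes 3 and 12 and the third carrying the new name.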

   The header table is as follows after the processing of these headers:

   Header table
   +---------+----------------+---------------------------+
   |  Index  | Header Name    | Header Value              |
   +---------+----------------+---------------------------+
   |    0    | :scheme        | http                      |
   +---------+----------------+---------------------------+
   |    1    | :scheme        | https                     |
   +---------+----------------+---------------------------+
   |   ...   | ...            | ...                       |
   +---------+----------------+---------------------------+
   |   37    | warning        |                           |
   +---------+----------------+---------------------------+
   |   38    | :path          | /my-example/index.html    | added header
   +---------+----------------+---------------------------+
   |   39    | user-agent     | my-user-agent             | added header
   +---------+----------------+---------------------------+
   |   40    | mynewheader    | first                     | added header
   +---------+----------------+---------------------------+

   As all the headers in the first header set are indexed in the header
   table, all are kept in the reference set of headers, which is:

   Reference Set:
   :path, /my-example/index.html
   user-agent, my-user-agent
   mynewheader, first

C.2.  Second header set

   The second header set to represent is the following:

   :path: /my-example/resources/script.js
   user-agent: my-user-agent
   mynewheader: second

   Comparing this second header set to the reference set, the first and
   third headers from the reference set are not present in this second
   header set and must be removed.  In addition, in this new set, the
   first and third headers have to be encoded.  The path header is
   represented as a literal header with substitution indexing.  The
   mynewheader header will be represented as a literal header with
   incremental indexing.

   0xa6       (indexed header, index = 38: removal from reference set)
   0xa8       (indexed header, index = 40: removal from reference set)
   0x04       (literal header, substitution indexing, name index = 3)
   0x26       (replaced entry index = 38)
   0x1f       (header value string length = 31)
   /my-example/resources/script.js
   0x5f 0x0a  (literal header, incremental indexing, name index = 40)
   0x06       (header value string length = 6)
   second

   The header table is updated as follows:

   Header table
   +---------+----------------+---------------------------+
   |  Index  | Header Name    | Header Value              |
   +---------+----------------+---------------------------+
   |    0    | :scheme        | http                      |
   +---------+----------------+---------------------------+
   |    1    | :scheme        | https                     |
   +---------+----------------+---------------------------+
   |   ...   | ...            | ...                       |
   +---------+----------------+---------------------------+
   |   37    | warning        |                           |
   +---------+----------------+---------------------------+
   |   38    | :path          | /my-example/resources/    | replaced
   |         |                |     script.js             | header
   +---------+----------------+---------------------------+
   |   39    | user-agent     | my-user-agent             |
   +---------+----------------+---------------------------+
   |   40    | mynewheader    | first                     |
   +---------+----------------+---------------------------+
   |   41    | mynewheader    | second                    | added header
   +---------+----------------+---------------------------+

   All the headers in this second header set are indexed in the header
   table, therefore, all are kept in the reference set of headers, which
   becomes:

   Reference Set:
   :path, /my-example/resources/script.js
   user-agent, my-user-agent
   mynewheader, second
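
   The comparison between a header set and the reference set used in
   this example can be sketched as follows.  The function name is
   hypothetical, and the sketch only computes the two lists; choosing a
   representation for each header to encode is a separate step.

```python
from typing import List, Tuple

Header = Tuple[str, str]


def diff_against_reference(reference: List[Header],
                           header_set: List[Header]
                           ) -> Tuple[List[Header], List[Header]]:
    """Return (headers to remove from the reference set, headers to encode)."""
    to_remove = [h for h in reference if h not in header_set]
    to_encode = [h for h in header_set if h not in reference]
    return to_remove, to_encode


reference = [
    (":path", "/my-example/index.html"),
    ("user-agent", "my-user-agent"),
    ("mynewheader", "first"),
]
second_set = [
    (":path", "/my-example/resources/script.js"),
    ("user-agent", "my-user-agent"),
    ("mynewheader", "second"),
]
to_remove, to_encode = diff_against_reference(reference, second_set)
```

   Here the user-agent header appears in neither list: it is already in
   the reference set and needs no octets on the wire.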

Authors' Addresses

   Roberto Peon
   Google, Inc

   EMail: fenix@google.com

   Herve Ruellan
   Canon CRF

   EMail: herve.ruellan@crf.canon.fr