Chapter 8. HPACK header compression
This chapter covers

- Why HTTP/2 compresses headers, and why HTTP/1 never did
- How lossless compression techniques (lookup tables, Huffman coding, lookback compression) work
- How the HPACK format compresses HTTP/2 headers
- Decoding HPACK headers by hand with Wireshark
The next topic is header compression. HTTP/1 has always allowed HTTP bodies to be compressed, but only since HTTP/2 has it been possible to compress the HTTP headers too.
It’s true that, in general, HTTP headers are relatively small in comparison with HTTP bodies, but they’re still chatty and repetitive. A typical HTTP/2 GET request from Chrome looks like this:
```
:authority: www.example.com
:method: GET
:path: /url
:scheme: https
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
accept-encoding: gzip, deflate, br
accept-language: en-GB,en-US;q=0.9,en;q=0.8
upgrade-insecure-requests: 1
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36
```
Only two parts of this request are liable to change for the next request sent to this server: the method (GET) and the path (/url). Of the 403 characters in this request, only 7 won't be repeated next time. In fact, even the method is probably set to GET for the large majority of web page requests, though web services might use the other methods. Realistically, then, only the :path or URL is liable to change, so 399 characters of every Chrome request are duplicated each time: a huge amount of waste.
The problem is exacerbated because some of these headers are long, such as the accept and user-agent headers shown in the preceding code, and cookie headers can get even longer. Following is the request header to Twitter (with the cookie values obfuscated):
```
:authority: twitter.com
:method: GET
:path: /
:scheme: https
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
accept-encoding: gzip, deflate, br
accept-language: en-GB,en-US;q=0.9,en;q=0.8
cookie: _ga=GA1.2.123432087.1234567890; eu_cn=1; dnt=1; kdt=rmnAfbecvko41234oRYSzztq7n12345abcdABCD12; remember_checked_on=1; personalization_id="v1_k0123451/EKaVeysDnuhKg=="; guest_id=v1%3A152383314680123456; ads_prefs="HBESAAA="; twid="u=3374717733"; auth_token=12791876dfc0e57eae12345897b7940f55ac7dfd; tfw_exp=1; csrf_same_site_set=1; csrf_same_site=1; lang=en; _twitter_sess=BAh7CSIKZ12345678zonQWN0aW9uQ29udHJvbGxlcjo6Rmxhc2g6OkZsYXNo%250ASGFzaHsABjoKQHVzZWR7ADoPY3JlYXRlZF9hdGwrCPdpPx12345HaWQiJWY4%250AZGUwOGM3ZjRiYzJmYjRiAbCdEfGwNjIyZTk1Ogxjc3JmX2lkIiVkYjg3%2501234kZTVkMDdlMTAxMGI2YTgyZDFhNTA0MmZiNQ%253D%253D--fd52ba1537f8fb9bf35dbd6080a6cd413edc6cd2; ct0=713653a06266b507960945523226bcc4; _gid=GA1.2.1893258168.1525002762; external_referer=1234567890w%3D|0|S381234567896Dak8Eqj76tqsc12345Lq4vYdCl5zxIvK6Q123vRkA%3D%3D; app_shell_visited=1
referer: https://twitter.com/
upgrade-insecure-requests: 1
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36
```
This code is now 1,278 characters, and again, it’s likely that only the :path pseudo-header will change with the next request, so 1,277 characters are pretty much wasted for each request. Responses are no better. Figure 8.1 shows the response for Twitter.
This response is a frankly ridiculous 5,804 characters, made up mostly of a detailed (and large) content-security-policy header, a security feature that allows the website to tell the browser what type of content it’s allowed to load on this site.[1]
Computer scientists abhor repetition, and all this repetition is one of the reasons why HTTP header compression was a key part of HTTP/2 and why it was built into its predecessor (SPDY) from the beginning. Compressing and decompressing data takes time and processing power, but that cost is small compared with the time required to send the extra bytes across the network, so compressing data before sending it is almost always worthwhile. Also, HTTPS requires encryption, which is more computationally expensive than compression; it's better to compress first and then encrypt the smaller amount of data.
To understand the rest of this section on HTTP compression, you need a little background on data compression. This topic is fairly high-level, and I deliberately avoid the complex mathematics, but this section should give you enough detail to understand how and why HTTP/2 implements header compression the way it does.
Some compression is lossy: detail can be discarded because it's not needed. This type of compression is typically used for media; music files, images, and videos can be compressed heavily without losing the overall meaning of the data. Compress too much, though, and you lose detail, so that an image can no longer be zoomed in on, for example. Lossy compression is a careful balance between reducing size and maintaining quality.
HTTP headers are important data, even if they're repeated a lot, so lossy compression isn't an option, even though it usually delivers better compression. Lossless compression works by removing repetitive data that's easily added back later when the data needs to be uncompressed. You have three ways to do this:

- Lookup tables, which replace common values with shorter references
- More efficient encoding techniques, such as Huffman coding
- Lookback compression, which replaces repeated data with references to earlier occurrences
I describe each of these methods in the following sections.
The first method involves taking long, repeated pieces of data and replacing them with shorter references to a lookup table. Uncompressing involves replacing the references with the original text from the lookup table. This process can be dynamic but works particularly well for data that's structured the same way. Look at a simple GET request:
```
:authority: www.example.com
:method: GET
:path: /url
:scheme: https
```
This request is 64 characters. If you had a lookup table like this
```
$1=:authority:
$2=:method:
$3=:path:
$4=:scheme:
```
you could encode this text like this:
```
$1 www.example.com
$2 GET
$3 /url
$4 https
```
This request is 39 characters—a 40% improvement! The lookup table, however, may need to be included in the compressed version, depending on whether it’s a standard lookup table known by the format or a dynamic one generated by the specific text. If the table is included, the overall size may be the same size or larger than the original text, which would defeat the point. Lookup tables benefit you only if the values are repeated often.
In this simple example, the lookups are commonly used HTTP header names, so a preagreed static lookup table could be used for them and would not need to be sent each time. This table could be supplemented by a dynamic lookup table for extra values to use on top of that preagreed list. The domain www.example.com, for example, is likely to be reused in subsequent requests, so it could be added to a dynamic list for subsequent referencing.
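To make the substitution idea concrete, here's a toy encoder in Perl (the same language as listing 8.1 later in this chapter). The table contents and the `$1`-style text markers are invented for illustration; a real format would use compact binary references rather than readable markers.

```perl
#!/usr/bin/perl
#A toy lookup-table compressor: known header names are replaced with
#short references. The table and the $1-style markers are invented for
#illustration only.
use strict;
use warnings;

my %table = (
    ':authority:' => '$1',
    ':method:'    => '$2',
    ':path:'      => '$3',
    ':scheme:'    => '$4',
);

my $request = ":authority: www.example.com :method: GET :path: /url :scheme: https";

my $encoded = $request;
foreach my $name (keys %table) {
    #\Q...\E quotes the metacharacters in the header name.
    $encoded =~ s/\Q$name\E/$table{$name}/g;
}

printf("Original: %d characters\n", length($request));
printf("Encoded : %d characters: %s\n", length($encoded), $encoded);
```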
Several techniques fall into this category, but they all recognize that the data being compressed can be represented in less space if a representation that's more specific to that data is used.
Pixel-based images, for example, don't always need a full color palette. If you have a picture that contains only red and yellow, you could use an 8-bit color palette and use only two of those colors, or you could use a 1-bit palette and state at the beginning that 0 is yellow and 1 is red. Similarly, for text you can use ASCII (7 bits), UTF-8 (8 bits for the ASCII characters, 16 bits for commonly used Western characters, 24 bits for most other common characters, and up to 32 bits for other characters), or UTF-16 (16 bits for most characters and up to 32 bits for other characters). If you're writing in English, UTF-8 makes the most sense, but if you're writing in non-Western languages, UTF-16 may make more sense, as UTF-8 often uses 24 bits for these languages, compared with the 16 bits in UTF-16. Picking the appropriate format can result in more efficient encoding.
For even more savings, look at variable-length encoding. Most encoding techniques involve fixed-size encoding. ASCII, for example, uses 7-bit characters, as shown in table 8.1.
Table 8.1. A subset of the ASCII codes
| Binary | Character |
|---|---|
| 01000001 | A |
| 01000010 | B |
| 01000011 | C |
| 01000100 | D |
| 01000101 | E |
| 01000110 | F |
| 01000111 | G |
| And so on | |
This arrangement is nice and simple but not too efficient, because each letter takes up the same space regardless of how often it’s used. In English, E is the most common letter, followed by T and A, down to the least common letters (X, J, Q, and Z, respectively). Why should you treat them all as equal, given that they won’t be used equally?
Rather than using 7 bits for all ASCII characters, it would be more efficient for English text to use variable-length characters. This technique would use binary values sized less than 7 bits for commonly used characters and binary values sized larger than 7 bits for less frequently used letters. Unicode (UTF-8 and UTF-16) uses this method to some extent via distinct blocks of characters (1–4 octets), depending on how commonly used the characters are. The main complication is recognizing the boundaries between characters (because they’re not all 7 bits long anymore in this format).
Huffman coding takes variable-length encoding to a more extreme level. It works by assigning a unique code to each value based on the frequency with which that value is used, and it ensures that no code is the prefix of another code. How the unique codes are calculated is beyond the scope of this book, but it might lead to a table like table 8.2 (based on the English-language letter-frequency distribution).
Table 8.2. A subset of the ASCII codes with Huffman coding
| Binary | Huffman coding | Character |
|---|---|---|
| 01000001 | 1111 | A |
| 01000010 | 101000 | B |
| 01000011 | 01010 | C |
| 01000100 | 11011 | D |
| 01000101 | 100 | E |
| 01000110 | 01011 | F |
| 01000111 | 00001 | G |
In Huffman encoding, no code is used as the start of another code. The code 0101 isn't used to represent a letter, for example, as it could then be confused with the beginning of C (01010) or the beginning of F (01011). By choosing the codes carefully, it's possible to reliably decode the data: as long as you start at the beginning of the text, you read bits until you match a letter, start on the next letter, and continue until the whole text is decoded.
Using these codes as examples, if you wanted to encode the word face, you could use regular ASCII (4 x 7 bits = 28 bits) or Huffman coding (5 + 4 + 5 + 3 = 17 bits). Huffman encoding is smaller even in this simple example. With a longer piece of text, the gains would be considerable.
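The arithmetic is easy to automate. The following sketch counts the bits for face using the sample codes from table 8.2 (note that these are the frequency-based example codes above, not HPACK's own Huffman table, which appears later in table 8.4):

```perl
#!/usr/bin/perl
#Compare ASCII and Huffman bit counts for one word, using the sample
#codes from table 8.2 (not HPACK's Huffman table).
use strict;
use warnings;

my %huffman = (
    A => '1111', B => '101000', C => '01010', D => '11011',
    E => '100',  F => '01011',  G => '00001',
);

my $word = 'FACE';
my $huffman_bits = 0;
$huffman_bits += length($huffman{$_}) for split //, $word;
my $ascii_bits = 7 * length($word);   #7 bits per ASCII character

printf("ASCII  : %d bits\n", $ascii_bits);    #28
printf("Huffman: %d bits\n", $huffman_bits);  #17 (5 + 4 + 5 + 3)
```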
Huffman code compression is an extension of the lookup-table idea. Like the more regular lookup tables discussed earlier in this chapter, Huffman tables can be defined in advance based on the known structure that the data is likely to have (such as English-language text); they can be generated dynamically based on the data being encoded; or they can be a combination of both.
Lookback compression involves referencing repeated text in terms of its position relative to the current one. This type of compression is demonstrated particularly well on HTML text:
```
<html>
<body>
<h1>This is a title</h1>
<p>This is some text</p>
</body>
</html>
```
This HTML could be compressed like this:

```
<html>
<body>
<h1>This is a title</(-20,3)
<p>(-24,6)some text</(-19,2)
</(-58,4)
</(-73,4)
```
Each repeated bit of text is replaced by a reference stating how far back the decompressor can find the repeated text and how much of it to take. The reference (-20,3) says to go back 20 characters and take the next three characters, which is what should replace this reference. As you can see, this process works well for HTML text, in which closing tags repeat opening tags (although as there are few tags in this example, a lookup table might be better for HTML). You can also use this type of compression for HTTP headers, such as the accept header:
```
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
```
As you can see, html is repeated twice in this header, as are application, xml, and image, so these could be encoded with lookback references. A computer doesn't care about looking back for whole words only; I use whole words here just to make the explanation clearer. The sequence ml appears in both html and xml, for example, so a compressor might choose to reference that shorter repeat instead.
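Here's a minimal brute-force sketch of the idea in Perl. It greedily emits (-distance,length) references for repeats of four or more characters. Because it always takes the longest match it can find, the exact references it produces differ from the hand-picked ones shown above, and the textual (-d,l) markers are purely illustrative; real Deflate implementations use hash chains, bounded windows, and bit-packed output.

```perl
#!/usr/bin/perl
#A naive lookback compressor: replace any repeat of four or more
#characters with a (-distance,length) reference to earlier text.
use strict;
use warnings;

my $text = "<html> <body> <h1>This is a title</h1> <p>This is some text</p> </body> </html>";
my $min_match = 4;
my ($pos, $out) = (0, "");

while ($pos < length($text)) {
    my ($best_len, $best_dist) = (0, 0);
    #Try every earlier position and remember the longest match.
    for my $start (0 .. $pos - 1) {
        my $len = 0;
        $len++ while $pos + $len < length($text)
            && substr($text, $start + $len, 1) eq substr($text, $pos + $len, 1);
        ($best_len, $best_dist) = ($len, $pos - $start) if $len > $best_len;
    }
    if ($best_len >= $min_match) {
        $out .= "(-$best_dist,$best_len)";   #emit a lookback reference
        $pos += $best_len;
    } else {
        $out .= substr($text, $pos, 1);      #emit a literal character
        $pos++;
    }
}
print "$out\n";
```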
HTTP body compression generally is used for text data. Media is typically precompressed as part of its format and is often excluded from compression. JPEG, for example, is a compressed image format that shouldn't be further compressed by the web server; the image won't get smaller (and may even end up larger), so the extra processing is wasted. Text, on the other hand, works well with all of the preceding compression techniques. The algorithms used by web servers and browsers (deflate, gzip, and brotli) are fairly similar; they're variants of Deflate-based compression and use a combination of techniques to achieve good compression rates. When making a request, the browser informs the server which compression algorithms it supports by using the accept-encoding HTTP header:
accept-encoding: gzip, deflate, br
The server picks one of these algorithms, compresses the body, and in the response informs the browser which algorithm it used to compress the resource:
content-encoding: gzip
This technology allows new compression algorithms to be introduced (such as brotli), and these algorithms will be used only when both client and server support them.
Deflate-based compression has one major flaw: it has proved to be insecure. The problem is that an attacker can use the compressed length to guess the contents, particularly if the attacker can influence some of those contents. Although HTTP bodies can contain some sensitive data (such as a name or account number displayed on a page), most security concerns are about HTTP headers, as they contain cookies and other tokens used for authentication. Suppose that you have the following request:
```
:authority: www.example.com
:method: GET
:path: /secretpage
:scheme: https
cookie: token=secret
```
If you could get hold of that token value (secret), you’d be able to impersonate this user. You can’t do that, of course, because the message is encrypted—and ideally, the cookie is marked as HttpOnly,[2] so it can’t be seen by any JavaScript, even if you could inject it into the page.
If you could get access to the page, however, you could send out the following requests with slightly different URLs and measure the length of the message sent:
```
https://www.example.com/secretpage?testtoken=a
https://www.example.com/secretpage?testtoken=b
https://www.example.com/secretpage?testtoken=c
...etc.
```
Because Deflate-based compression techniques work by recognizing and replacing repeat patterns, you may notice eventually that one test (testtoken=s) is shorter than the other tests because it repeats the first part of the real cookie (token=secret). Now you know the first letter of the token! You can repeat this process until you get the full token:
```
https://www.example.com/secretpage?testtoken=sa
https://www.example.com/secretpage?testtoken=sb
https://www.example.com/secretpage?testtoken=sc
https://www.example.com/secretpage?testtoken=sd
https://www.example.com/secretpage?testtoken=se - this is shorter!
https://www.example.com/secretpage?testtoken=sea
https://www.example.com/secretpage?testtoken=seb
https://www.example.com/secretpage?testtoken=sec - this is shorter!
https://www.example.com/secretpage?testtoken=seca
https://www.example.com/secretpage?testtoken=secb
https://www.example.com/secretpage?testtoken=secc
...etc.
```
This process may sound long, but it’s easy to script and is a real practical attack known as CRIME (Compression Ratio Info-leak Made Easy).[3] This attack was demonstrated against SPDY, which used gzip for HTTP header compression.
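You can see the length oracle at work with a few lines of Perl and the standard Compress::Zlib module. This is only a sketch of the principle: the cookie value and probe format are made up, and with inputs this small the one-byte difference can disappear into padding, which is one reason real CRIME exploits amplify the signal across many requests.

```perl
#!/usr/bin/perl
#Toy demonstration of a CRIME-style length oracle: a guess that repeats
#part of the secret cookie tends to compress shorter. The cookie value
#and probe format are invented for illustration.
use strict;
use warnings;
use Compress::Zlib;   #standard module providing compress()

my $headers = "cookie: token=secret\n";

foreach my $guess ('a' .. 'z') {
    my $probe = $headers . "path: /secretpage?testtoken=$guess";
    printf("guess=%s compressed length=%d\n",
        $guess, length(compress($probe)));
}
#The guess 's' extends the repeat of "token=s", so its compressed
#length should be no larger, and often shorter, than the other guesses.
```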
Due to the insecurities demonstrated with CRIME, HTTP/2 needed to use a different compression method that wasn’t susceptible to such attacks. The HTTP Working Group created a new specification called HPACK (not an acronym) that was based on lookup tables and Huffman encoding but (crucially) not lookback-based compression.
HPACK[4] is a separate specification from HTTP/2. At one point, there were discussions about merging the two specifications, but in the end, the working group decided to keep them separate. The HTTP/2 specification is light on details about HPACK, deferring most details to the separate HPACK specification,[5] but it does state that header compression is part of HTTP/2 and that it’s stateful (more on why in section 8.4.2).
One interesting fact about HPACK is that unlike many HTTP specifications, it’s not flexible or designed to be extended. In fact, the HPACK specification explicitly says:[6]
The HPACK format is intentionally simple and inflexible. Both characteristics reduce the risk of interoperability or security issues due to implementation error. No extensibility mechanisms are defined; changes to the format are only possible by defining a complete replacement.
Although I’m not sure that I agree with calling HPACK simple, I agree that it’s unusually rigid for an internet specification, which (as the quote explains) was done for security reasons. Eventually, a new version of HPACK will undoubtedly be introduced (possibly QPACK as part of QUIC; see chapter 9). How this version will be implemented will need to be defined (likely a new setting on connection establishment), but for now, HPACK is fairly rigidly defined.
HPACK has a static table of 61 common HTTP header names (and in some cases values), part of which is shown in table 8.3. For the full table, see the HPACK specification.[7]
Table 8.3. Part of the HPACK static table
| Index | Header name | Header value |
|---|---|---|
| 1 | :authority | |
| 2 | :method | GET |
| 3 | :method | POST |
| 4 | :path | / |
| 5 | :path | /index.html |
| 6 | :scheme | http |
| 7 | :scheme | https |
| 8 | :status | 200 |
| 9 | :status | 204 |
| 10 | :status | 206 |
| 11 | :status | 304 |
| 12 | :status | 400 |
| 13 | :status | 404 |
| 14 | :status | 500 |
| 15 | accept-charset | |
| 16 | accept-encoding | gzip, deflate |
| 17 | accept-language | |
| 18 | accept-ranges | |
| 19 | accept | |
| ... | ... | ... |
| 60 | via | |
| 61 | www-authenticate | |
This table is used for both requests and responses, and it allows an HTTP message to efficiently compress commonly used header names, as well as some commonly used name/value pairs. As a result, the header

:method: GET

can be compressed to a single reference to index 2, because both the name and the value exist in the static table.
As another example, the header
:method: DELETE
doesn't exist in the table but can be compressed with a reference to index 2 for the header name and the encoded value of DELETE. That is, even the name/value pair entries (such as :method: GET) can be used to provide the name part only (:method). The reverse isn't true, however; there's no facility to look up a value associated with a different header name. A header1: GET header, for example, couldn't use the GET value from index 2.

It would be equally correct to encode this :method: DELETE header with index 3 and the encoded value of DELETE instead of index 2. Both entries refer to the :method header name, so both are valid. I return to this topic in section 8.6.
In addition to the static table, HPACK has a connection-level dynamic table, which starts at position 62 (after the static table) and grows up to the maximum table size defined by the SETTINGS_HEADER_TABLE_SIZE value in the SETTINGS frame (4,096 octets if not explicitly defined). When the maximum table size is reached, the oldest entries are evicted. New entries are added at the front of the dynamic table, so each existing entry's index is incremented when the table is written to. If a request contains the two custom headers
```
Header1: Value1
Header2: Value2
```
Header1 would be given table entry 62 initially. When Header2 was seen, Header1 would be moved to table entry 63, and Header2 would be added as table entry 62. That is, the table position for a header isn't fixed; it increments continually as new headers are added to the table, both in this request and in future requests. For this reason, HEADERS and CONTINUATION frames must be received in order to maintain the integrity of the dynamic table. TCP guarantees this order, which is why in HTTP/2 the dynamic table is unique to each TCP connection.
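The following sketch models this dynamic table behavior, including the eviction rule. The 32-octet-per-entry overhead comes from the HPACK specification; the add_entry helper name is my own:

```perl
#!/usr/bin/perl
#A minimal model of the HPACK dynamic table: new entries go in at
#index 62 and push older entries up; the oldest entries are evicted
#when the table's octet size would exceed the maximum. Per RFC 7541,
#an entry's size is name length + value length + 32 octets of overhead.
use strict;
use warnings;

my $max_size = 4096;   #SETTINGS_HEADER_TABLE_SIZE default
my @dynamic;           #$dynamic[0] is HPACK index 62
my $current_size = 0;

sub add_entry {
    my ($name, $value) = @_;
    my $entry_size = length($name) + length($value) + 32;
    #Evict the oldest entries (highest indexes) until the new entry fits.
    while (@dynamic && $current_size + $entry_size > $max_size) {
        my $old = pop @dynamic;
        $current_size -= length($old->[0]) + length($old->[1]) + 32;
    }
    unshift @dynamic, [$name, $value];   #becomes index 62
    $current_size += $entry_size;
}

add_entry('header1', 'Value1');
add_entry('header2', 'Value2');

for my $i (0 .. $#dynamic) {
    printf("index %d: %s: %s\n", $i + 62, $dynamic[$i][0], $dynamic[$i][1]);
}
#index 62: header2: Value2
#index 63: header1: Value1
```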
This process is complicated and best illustrated with a real example, which I provide in section 8.5. First, you need to understand a bit more about how these headers are referenced in static and dynamic tables.
Each header can be added to the dynamic table or not, at the encoder's choice. There are four types of HPACK header representations, described in the following sections.
The indexed header field representation (which starts with 1) is a straight lookup from the table (either static or dynamic), so it's used when both the header name and value already exist in the table. This representation consists of a table index value padded out to a minimum of 7 bits. Figure 8.2 shows the format.
For larger index values that don't fit in 7 bits, some additional logic exists.[8] I don't cover that logic in detail here, but suffice it to say that it involves setting all 7 bits to 1 and using additional octet(s) to specify the larger index value.
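For the curious, here's a sketch of that integer encoding (RFC 7541, section 5.1). The encode_int helper is my own; it returns the raw octets and leaves the caller to OR the representation's leading type bits into the first octet:

```perl
#!/usr/bin/perl
#HPACK prefix integer encoding: values that fit in the N-bit prefix are
#encoded directly; larger values fill the prefix with 1s and continue
#in 7-bit chunks, with the high bit of each octet marking continuation.
use strict;
use warnings;

sub encode_int {
    my ($value, $prefix_bits) = @_;
    my $max_prefix = (1 << $prefix_bits) - 1;
    return ($value) if $value < $max_prefix;
    my @octets = ($max_prefix);              #all prefix bits set to 1
    $value -= $max_prefix;
    while ($value >= 128) {
        push @octets, ($value % 128) + 128;  #continuation bit set
        $value = int($value / 128);
    }
    push @octets, $value;
    return @octets;
}

#Index 2 fits in a 7-bit prefix (add the leading 1 bit to get 82).
#1337 with a 5-bit prefix is the RFC's own worked example: 1f 9a 0a.
printf("2, 7-bit prefix   : %s\n", join(' ', map { sprintf '%02x', $_ } encode_int(2, 7)));
printf("1337, 5-bit prefix: %s\n", join(' ', map { sprintf '%02x', $_ } encode_int(1337, 5)));
```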
To encode :method: GET, for example, use index 2 (binary 10, or 000 0010 when padded out with leading zeros). When you add the leading 1 for the indexed header field representation, you end up with 1000 0010, or 82 in hexadecimal, as you can see if you use Wireshark to look at traffic (figure 8.3). Section 8.5 explains how to view these frames in Wireshark, so take these screenshots as examples for now.
The literal header field with incremental indexing type (which starts with 01) is used when the header value isn’t available in the table but should be added to the dynamic table for use later.
This type contains the header name (which may be an index reference to the header name already in the table or an actual header name not already in the table) and the header value.
If an indexed header name is used (the header name already exists in the table), the bits following 01 (padded to a minimum 6 bits) define the index value, followed by the header value itself, as shown in figure 8.4.
The header value string can be Huffman-encoded or not (depending on whether this process would make it shorter), and the 1-bit H value is set to 1 if Huffman encoding is used or 0 if it’s the ASCII value. Ideally, the shortest encoding should be used, so some headers may be Huffman-encoded and others may be straight ASCII text.
Figure 8.5 shows a real-world example. The :authority: header is index 1, so it should be encoded as 01000001, or 41 in hexadecimal, followed by the Huffman-encoded value (which I discuss in section 8.4.4).
Otherwise, for a new header name that isn’t in the lookup table, the 6 bits after the initial 01 are set to 0, and then the header name and value are given as length/value pairs, as shown in figure 8.6.
The first 8 bits, therefore, are 01000000 or 40. Figure 8.7 shows a real-world example that starts with 40.
The literal header field without indexing type (which starts with 0000) is used for items that are likely to change in each request (such as the path), which would make adding them to the dynamic table wasteful. This header type contains the header name (which may be an index reference to the name already in the table or the actual name), but the name and value aren't saved as entries in the dynamic table. The two formats (depending on whether the header name is referenced with a table index or given in full) are shown in figures 8.8 and 8.9.
Figure 8.10 shows an example. Because the encoded header starts with 00 in hex (0000 0000 in binary), you know that this header uses the second format (figure 8.9) and isn't referencing the header name (:path) from the table.
The :path name is Huffman-encoded into 84 b9 58 d3 3f, and the header value (/security/hsts-pixel.gif) is Huffman-encoded into 91 61 05...etc., as I discuss in section 8.4.4, giving the total compressed header as 00 84 b9 58 d3 3f 91 61 05 and so on.
This coding seems to be a little odd and wasteful, because the :path header already exists in the table (at positions 4 and 5). The header could have been encoded with format 1 and would have been 5 octets shorter:
- Format 1 (index 5): 05 91 61 05 and so on
- Format 2 (no index): 00 84 b9 58 d3 3f 91 61 05 and so on
If you try the same thing with Firefox, it uses index 5, but Chrome seems to prefer to send the header name for unknown reasons. This example shows you that clients may not encode in the way you always expect!
The literal header field never indexed type (which starts with 0001) is similar to the preceding type, except that the value must not be added to a dynamic table in any subsequent reencodings (such as when a server is acting as a proxy between two HTTP/2 implementations). This header type is used for sensitive information (such as a username or password) that you don't want stored in a shared HPACK table. The flag is an instruction covering not only this encoding but also any reencoding if the header is transmitted onward.
Should cookies be stored in an HPACK table?
Cookies are sensitive data and seem to be exactly what this last type was designed for. The downside is the reduced compression for cookies on subsequent requests. Cookies can be large and repeated, so ideally, they should be compressed.
Unlike with partial lookback compression, with HPACK the whole cookie needs to be guessed before the effect (a smaller request) can be seen. Some implementations (such as Firefox and nghttp)[a] use the never indexed type only for small cookies (less than 20 bytes), the theory being that larger cookies are harder to guess, making the compression gains worthwhile; these implementations index larger cookies so that they can be referenced by subsequent requests. Chrome seems to use the index type regardless of the length of the cookie, so it doesn't appear to use this never indexed representation at all.
If the sender chooses to use this type, the settings are similar to those of the preceding “without indexing” formats (figures 8.11 and 8.12), depending on whether the header is referenced with an index to the current table or given in full.
Huffman encoding depends on defining a table of codes to use for each character in the text. For HPACK, this table is defined in the specification, so both client and server know the values to use to encode and decode header names and values. Table 8.4 shows part of the HPACK Huffman codes. For the full table, see the HPACK specification.[9]
Table 8.4. Part of the HPACK Huffman encoding values
| Symbol | ASCII code | Huffman code (binary) | Length (bits) |
|---|---|---|---|
| ' ' | 32 | 010100 | 6 |
| '!' | 33 | 1111111000 | 10 |
| '"' | 34 | 1111111001 | 10 |
| '#' | 35 | 111111111010 | 12 |
| '$' | 36 | 1111111111001 | 13 |
| ... | ... | ... | ... |
| '0' | 48 | 00000 | 5 |
| '1' | 49 | 00001 | 5 |
| ... | ... | ... | ... |
| 'A' | 65 | 100001 | 6 |
| 'B' | 66 | 1011101 | 7 |
| 'C' | 67 | 1011110 | 7 |
| 'D' | 68 | 1011111 | 7 |
| 'E' | 69 | 1100000 | 7 |
| ... | ... | ... | ... |
| 'L' | 76 | 1100111 | 7 |
| ... | ... | ... | ... |
| 'T' | 84 | 1101111 | 7 |
| ... | ... | ... | ... |
To return to a previous example, look at the header

:method: DELETE

The value DELETE isn't in the static table, so it must be encoded. Using the codes from table 8.4, each letter encodes as follows:

| Letter | D | E | L | E | T | E |
|---|---|---|---|---|---|---|
| Huffman code | 1011111 | 1100000 | 1100111 | 1100000 | 1101111 | 1100000 |
When grouped into octets, the header becomes 1011 1111 1000 0011 0011 1110 0000 1101 1111 1000 00, with the final incomplete octet padded out with ones (the trailing 00 becomes 0011 1111). This translates to bf 83 3e 0d f8 3f in hex, which needs to be preceded by the Huffman flag and the length. The length is 6 octets in this case, which is 110 in binary, or 000 0110 when padded out to 7 bits. Adding the Huffman encoding bit as 1 at the beginning of the length octet (1000 0110, or 86), you end up with the fully encoded header value of 86 bf 83 3e 0d f8 3f.
HPACK Huffman encoding and decoding can easily be automated. The following listing shows one such implementation in Perl. Note that only part of the Huffman table is shown for space reasons. The full listing is available at the book’s GitHub page.[10]
Listing 8.1. A simple HPACK Huffman encoder
```perl
#!/usr/bin/perl
use strict;
use warnings;

#Read in the string to convert from the command line.
my ($input_string) = @ARGV;
if (not defined $input_string) { die "Need input string\n"; }

#Set up and populate a hash variable with all the Huffman lookup values.
#Note that only printable values are used in this simple example.
my %hpack_huffman_table;
$hpack_huffman_table{' '} = '010100';
$hpack_huffman_table{'!'} = '1111111000';
$hpack_huffman_table{'"'} = '1111111001';
$hpack_huffman_table{'#'} = '111111111010';
#...etc. (remaining table entries elided for space)
$hpack_huffman_table{'}'} = '11111111111101';
$hpack_huffman_table{'~'} = '1111111111101';

#Set up a binary string variable.
my $binary_string = "";

#Split the input string by character.
my @input_array = split(//, $input_string);

#For each input character, look up the code in the Huffman hash table
#and add it to the binary_string variable.
foreach (@input_array) {
    $binary_string = $binary_string . $hpack_huffman_table{$_};
}

#Pad out the binary string with 1s so its length is divisible by 8.
while (length($binary_string) % 8 != 0) {
    $binary_string = $binary_string . "1";
}

#Calculate the length in octets by dividing by 8.
my $string_length = length($binary_string) / 8;

#This simple implementation doesn't handle large strings
#(left as an exercise for the reader).
if ($string_length > 127) {
    die "Error string length > 127 which isn't handled by this program\n";
}

#Set the most significant bit (128) to indicate that Huffman encoding
#is used, and include the length (again, this simple version naively
#assumes 7 bits for the length).
printf("Huffman Encoding Flag + Length: %x\n", 128 + $string_length);

#Iterate through each 4-bit value and convert to hexadecimal.
printf("Huffman Encoding Value        : ");
for (my $count = 0; $count < length($binary_string); $count = $count + 4) {
    printf("%x", oct("0b" . substr($binary_string, $count, 4)));
}
printf("\n");
```
This code encodes strings into hexadecimal HPACK Huffman-encoded strings. Following is an example:
```
$ ./hpack_huffman_encoding.pl DELETE
Huffman Encoding Flag + Length: 86
Huffman Encoding Value        : bf833e0df83f
```
You could use a similar script to decode HPACK Huffman-encoded strings. This script could be enhanced to take in a list of headers and handle a dynamic table state. I leave both of these tasks as exercises for you to complete in your language of choice.
Although Huffman encoding is complicated for people to handle, it’s easy to implement in code and efficient for computers to encode and decode. As a result, it’s unlikely that you’ll want to encode or decode manually, as I’ve done in this chapter.
For some values, Huffman encoding may produce larger output than plain ASCII would. If you encode DELETE by using ASCII, for example, you can jump straight to hex (because each ASCII code is 1 octet long):

| Letter | D | E | L | E | T | E |
|---|---|---|---|---|---|---|
| ASCII hex code | 44 | 45 | 4C | 45 | 54 | 45 |
You can see that the ASCII coding version is also of length 6, and with the Huffman encoding flag set to 0, you precede the header with 06 and get 06 44 45 4C 45 54 45 as the fully encoded header.
In this case, there’s no difference between using Huffman encoding and not using it—both are 7 octets long. Even though all the Huffman codes used in this particular example are 7 bits (compared with 8-bit ASCII codes), padding is needed to round up to complete octets, so the codes ended up being the same size. For some other headers, ASCII encoding may be smaller, as would be the case if some infrequent characters from the Huffman table (longer than 8 bits) were used in the header. For this reason, the HPACK specification allows Huffman encoding to be used or not as the client sees fit, and the encoding can change with each header. Whichever encoding can express the value in the fewest octets should be used.
In general, however, Huffman encoding is often more efficient than ASCII. This efficiency is due in part to the fact that ASCII requires only 7 bits, but the full 8-bit octet is used, so 1 bit is wasted for every ASCII encoded value. Huffman encoding allows the use of variable-length encodings, so in theory, bits aren’t wasted. In this variable-length encoding, however, less frequently used characters use more than 8 bits, so those values may be more efficiently encoded in ASCII. These values should by definition be rare (assuming that the HPACK Huffman encoding table reflects real-life use). Finally, looking up from the static or dynamic tables is always going to be more efficient than encoding in either Huffman or ASCII format.
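Because either encoding is allowed per value, an encoder can simply compute both lengths and pick the winner. Here's a minimal sketch of that decision for DELETE, using only the code lengths needed for this example (all 7 bits, from table 8.4):

```perl
#!/usr/bin/perl
#Pick whichever encoding needs fewer octets; HPACK lets the encoder
#decide per value. Only the Huffman code lengths needed for this
#example are included.
use strict;
use warnings;

my %code_length = (D => 7, E => 7, L => 7, T => 7);

my $value = 'DELETE';
my $huffman_bits = 0;
$huffman_bits += $code_length{$_} for split //, $value;
my $huffman_octets = int(($huffman_bits + 7) / 8);   #round up with padding
my $ascii_octets = length($value);

#(Value octets only; both encodings need one more octet for the length.)
printf("Huffman: %d octets, ASCII: %d octets -> use %s\n",
    $huffman_octets, $ascii_octets,
    $huffman_octets < $ascii_octets ? "Huffman (H=1)" : "ASCII (H=0)");
```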
I presented a lot of theory in the preceding sections; in this section, I show you some real examples to make that theory concrete. Most HTTP/2-aware tools take care of HPACK for you and don't expose the extra work being done under the hood, so head back to Wireshark to see the raw data being sent on the wire (and the decoded version of that data). I cover Wireshark in chapter 4, so refer to that chapter to get it working.
Assuming that you have HTTP/2 traffic sniffing working, start looking at the HTTP/2 headers. Figure 8.13 shows one example of a request to Facebook with the first HEADERS frame (the second line in the window) selected.
At the bottom of the figure, you see that the Wireshark frame (a generic term for the packet and not to be confused with an HTTP/2 frame) is 361 bytes. When that frame is decrypted, it becomes 273 bytes, and when it’s decompressed, it becomes 498 bytes. Header compression even for this first frame is great, with a 45% saving (273/498 = 55%), though some of that saving is used up again when the data is encrypted.
When you strip out the HTTP/2 frame details (such as the header type, the flags, the weight, and so on), you get the encoded header in hexadecimal format:
82 41 8c f1 e3 and so on
Decode this header with your newfound knowledge. Because you’re dealing with variable-length Huffman encoding, looking at a header as octets isn’t useful, so convert it to binary:
82418cf1e3... = 1000 0010 0100 0001 1000 1100 1111 0001 1110 0011...
At this point, you can start to read the header. You know that there are four types of headers:
- Indexed header field representation (starts with 1)
- Literal header field with incremental indexing (starts with 01)
- Literal header field without indexing (starts with 0000)
- Literal header field never indexed (starts with 0001)
This header block starts with 1, so it’s the first type. The next 7 bits give the table index, which is 2, which equates to :method: GET—your first uncompressed header! This header was stored in 1 octet (82) as opposed to the 12 octets needed to encode each of those 12 characters in ASCII—a huge saving. So the first 8 bits are decoded:
```
1000 0010 0100 0001 1000 1100 1111 0001 1110 0011...
```
The next part starts with 01, so you know that it's a literal header field with incremental indexing. Because the 6 bits that follow aren't all zeros, you know that they hold an index reference to the header name. Figure 8.14 repeats the incremental format with the binary values filled in.
Now decode the first octets of the next header (0100 0001).
After you strip off 01, the index number is 00 0001 or 1, which is the :authority header in the table. The first 16 bits are decoded:
```
1000 0010 0100 0001 1000 1100 1111 0001 1110 0011 1100...
```
Now you need to find the value for that :authority header. The first bit of the next octet is 1, so you know that the upcoming value is Huffman-encoded and that the length is the remainder of that octet, or 000 1100 = 8 + 4 = 12 octets:

1000 1100 = Huffman-encoded string with a length of 12

The first 24 bits are now decoded:

```
1000 0010 0100 0001 1000 1100 1111 0001 1110 0011 1100...
```
When you look at the next 12 octets, you see the following:
```
1111 0001 1110 0011 1100 0010 1111 0010 1000 1100 1000 0101
1000 1100 1110 0111 1110 1010 1011 1001 0000 1111 0100 1111
```
Each bit is read until a unique Huffman value is found. I’ll do the Huffman table lookups to save you time (easy to program but much more difficult to do manually):
- 1111000 is uniquely identified as w.
- 1111000 is uniquely identified as w.
- 1111000 is uniquely identified as w.
- 010111 is uniquely identified as . (and so on).
- 00100 is uniquely identified as c.
- 00111 is uniquely identified as o.
- 101001 is uniquely identified as m.
- 111 is padding to fill out the last octet.
The full value is www.facebook.com, and when it’s coupled with the header name (:authority), you get the full header:
:authority: www.facebook.com
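This bit-by-bit matching is exactly what a decoder automates. Here's a minimal sketch that decodes the 12 value octets from this example; it includes only the ten codes needed for www.facebook.com, whereas a real decoder needs the full table from the HPACK specification:

```perl
#!/usr/bin/perl
#A minimal HPACK Huffman decoder sketch: read the bit string and
#accumulate bits until they match a known code (prefix codes make this
#unambiguous). Only the codes needed for this example are included.
use strict;
use warnings;

my %decode = (
    '1111000' => 'w', '010111' => '.', '100101' => 'f', '00011' => 'a',
    '00100'   => 'c', '00101'  => 'e', '100011' => 'b', '00111' => 'o',
    '1110101' => 'k', '101001' => 'm',
);

#The 12 Huffman-encoded octets from the example, as a bit string.
my $bits = unpack('B*', pack('H*', 'f1e3c2f28c858ce7eab90f4f'));

my ($buffer, $output) = ('', '');
foreach my $bit (split //, $bits) {
    $buffer .= $bit;
    if (exists $decode{$buffer}) {
        $output .= $decode{$buffer};
        $buffer = '';
    }
}
#Whatever remains in $buffer (all 1s) is padding.
print "$output\n";   #www.facebook.com
```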
This header is added to the dynamic table with index 62. The static index ends at 61 indexed values, so 62 is the next free value.
Continuing through the rest of the HEADERS frame in a similar way, you end up with the dynamic table shown in table 8.5.
Table 8.5. Dynamic table after the first HEADERS frame is received
| Index | Header | Value |
|---|---|---|
| 62 | accept-language | en-GB,en-US;q=0.9,en;q=0.8 |
| 63 | accept-encoding | gzip, deflate, br |
| 64 | accept | text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8 |
| 65 | user-agent | Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36 |
| 66 | upgrade-insecure-requests | 1 |
| 67 | cache-control | no-cache |
| 68 | pragma | no-cache |
| 69 | :authority | www.facebook.com |
As you can see, the :authority header has been pushed all the way down to index 69. The ordering of the headers, therefore, is important and may not match the headers shown in developer tools: Chrome orders the headers alphabetically in developer tools but doesn't send them alphabetically, for example. Note also that not all headers were added. The :scheme: https and :path: / pseudo-headers, for example, were already in the static table (at indexes 7 and 4, respectively), so they weren't added to the dynamic table.
The first request likely involves filling up the dynamic table, as these values won’t be in the table, so you won’t get optimum compression. Subsequent requests can use these values and get much better compression, as shown in figure 8.15.
I’ll skip the next received HEADERS frame and instead go to the next sent HEADERS frame. You can see the direction in which each request is traveling based on the source and destination IP addresses. Sent and received headers are handled separately but in an identical fashion. Here, I concentrate on the client sending side; everything applies equally on the server sending side, but with a separately managed dynamic header table.
For the second sent request, you start with 82, which is the same as before, and you know that it’s uncompressed/unencoded to :method: GET, so I won’t repeat myself here. The second header is more interesting. 41 is the :authority header, identical to the previous request, which is followed by the Huffman-encoded authority (facebook.com). What’s interesting is that this domain is different from the first request (facebook.com as opposed to www.facebook.com), so the previously stored dynamic value can’t be reused, and instead the value must be sent. This example also shows connection coalescing in action, because a separate HTTP/2 connection wasn’t needed for this request even though the domain is different. See chapter 6 for more information on connection coalescing.
The next few headers are interesting, and I demonstrate them with the user-agent header, as shown in figure 8.16.
As you can see, the first request requires the full long user-agent header,[11] using 94 octets to send it. In the second request, the user-agent header hasn't changed, so it can be sent in a single octet (c2) for a massive saving! This saving is even bigger compared with the 131 octets that would be needed for a plain-text HTTP/1.1 ASCII header.
[11] Check out https://webaim.org/blog/user-agent-string-history/ if you're curious about why this header is so long.
To see how c2 translates to the user-agent header, first you need to realize that a new header (:authority: facebook.com) was added and that it would have been stored in the dynamic table, shifting everything up as shown in table 8.6.
Table 8.6. Dynamic table after the second HEADERS frame is sent
| Index | Header | Value |
|---|---|---|
| 62 | :authority | facebook.com |
| 63 | accept-language | en-GB,en-US;q=0.9,en;q=0.8 |
| 64 | accept-encoding | gzip, deflate, br |
| 65 | accept | text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8 |
| 66 | user-agent | Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36 |
| 67 | upgrade-insecure-requests | 1 |
| 68 | cache-control | no-cache |
| 69 | pragma | no-cache |
| 70 | :authority | www.facebook.com |
Then you can unpack the user-agent header:
c2 = 1100 0010
The first 1 marks an indexed header field (a straight lookup of the header name and value), and the remaining 100 0010 translates to 66, which (as you see in table 8.6) is the previously stored user-agent header.
This works only if both the sender and receiver keep their dynamic HPACK tables in sync, which is possible only because the ordering of the HEADERS frames is preserved, thanks to TCP's guaranteed delivery. This process may seem complex, especially if you're doing it manually, but it's easily automated by computers. Using tools such as Wireshark is much less painful and error-prone, but now you know how to decode manually, should the need arise.
The gains from this easily automated complexity are impressive. In figure 8.15, you may have noticed that the decrypted SSL size was 109 bytes, compared with 273 bytes from the first request in figure 8.13. Even for this simple site, by loading three resources over this connection (the rest are loaded from a sharded domain) without any large cookies or complex headers, you achieve savings of 68% of the bytes you’d send in HEADERS frames, as shown in table 8.7.
Table 8.7. HPACK header savings
| Request | Decrypted SSL (bytes) | Decompressed headers (bytes) | Saving |
|---|---|---|---|
| 1 | 273 | 498 | 45% |
| 2 | 109 | 477 | 77% |
| 3 | 99 | 547 | 82% |
| Total | 481 | 1,522 | 68% |
As the first request is the least compressed, average gains increase the more the connection is used.
Before I close this chapter, consider a few more important points. First, there are several duplicated headers in the static and dynamic tables, some of which are shown in table 8.8.
Table 8.8. Examples of duplicate headers in HPACK static and dynamic tables

| Index | Header name | Header value |
|---|---|---|
| 2 | :method | GET |
| 3 | :method | POST |
| 4 | :path | / |
| 5 | :path | /index.html |
| 19 | accept (static table) | |
| 64 | accept (dynamic table, added in the example) | text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8 |
I mentioned earlier in this chapter that a :method of DELETE could be referenced with index 2 or 3, followed by the value DELETE, as both index 2 and 3 refer to the :method header. Similarly, the :path has two entries in the table for common :path values, and after the first request is received in the example, so does accept. The HPACK specification doesn’t give any guidance as to which index should be used in cases like this one. Senders can decide to use the first instance, the last instance, or one in the middle (if one exists), or they can choose not to refer to a previously defined header name index number and add another one (as Chrome did with the :path header name).
As another example of how browsers can handle encoding differently, Chrome and Firefox use slightly different methods for multiple headers within the same request.[12] If you send two cookie headers, for example, after the first header is encoded, you have two references to the cookie header name in the table: the original reference in the static table and another reference in the dynamic table for the value you encoded. To encode the second cookie header, Chrome uses the static table reference at index 32, and Firefox uses the cookie header it added to the dynamic table (at an index position greater than 62).
To be clear, all these methods are valid. As long as the reference ultimately resolves to the correct header name, the sender is free to use whichever index lookup it likes. All the examples I’ve given so far are for duplicate header names, with different values. But the specification explicitly states that entire name/value pairs can be duplicated exactly if the client so desires. The client can send the same header and value that are already in the table, with instructions to index them instead of referencing them from previously indexed values:[13]
The dynamic table can contain duplicate entries (entries with the same name and same value). Therefore, duplicate entries MUST NOT be treated as an error by a decoder.
This process would lead to less compression for that header the first time the duplicate header was used, but technically, it’s allowed.
Finally, it’s not required that the dynamic table be used by the sender. The nginx web server uses only the static table, for example,[14] presumably because it’s easier to implement and manage. A patch is available that implements full HPACK encoding.[15] This patch improves compression by 40% to 95%, according to the authors of the patch, but it isn’t included in the base nginx code at this writing.[16]
HPACK can be complicated and intimidating at first sight (the RFC for it is nearly the size of the whole HTTP/2 specification itself), but I’ve squeezed it into a single chapter. I hope that this chapter has taken some of the mystery out of HPACK and will make the RFC itself less intimidating to tackle, should you need to. Most HTTP/2 users and web developers can ignore the intricate details and accept that HTTP/2 headers are compressed in an efficient manner, leading to considerable space savings, especially on the request side, where headers are the majority of the data. Cloudflare, one of the largest CDNs, saw a 53% reduction in request data when HPACK was enabled[17]. On the response side, although HTTP headers can be large, they’re typically dwarfed by the HTTP bodies, so savings seem to be less significant. (Cloudflare saw only 1.4% savings on average.) But most consumer network connections have limited upload bandwidth compared with download, and requesting resources is the first stage, so the request side is arguably where the gains are more important anyway.
- There are different methods of compressing data.
- HTTP headers contain sensitive data such as cookies, so they can't use the same compression techniques as HTTP bodies, which can leak data through attacks such as CRIME.
- HPACK is a compression format specifically written for HTTP header compression in HTTP/2.
- HPACK format has a specific binary format that uses a predefined static table of common header names (and in some cases values) and a dynamic table created during the session.
- Values that aren’t table references can be transmitted in ASCII or Huffman-encoded format.
- Huffman-encoded format typically results in smaller values.
- There are multiple ways to send HTTP headers in HPACK, and browsers may encode HTTP headers differently.