Chapter 9. TCP, QUIC, and HTTP/3

This chapter covers

  • TCP inefficiencies
  • TCP optimizations
  • An introduction to QUIC
  • Differences between QUIC and HTTP/2

HTTP/2’s aim was to improve the inefficiencies inherent in the HTTP protocol, mainly by allowing a single, multiplexed connection. Under HTTP/1.1, the connection was vastly underused, because it could be used for only one resource at a time. If there were any delays in answering a request (such as because the server was busy generating that resource), the connection was blocked and not being used. HTTP/2 allows the connection to be used for multiple resources, so other resources can still use the connection in this scenario.

In addition to preventing wasted connections, HTTP/2 provides improved performance, because HTTP connections are inefficient in themselves. There’s a cost to creating an HTTP connection; otherwise, there’d be no real benefit in multiplexing. The costs aren’t due to HTTP itself, but to the two underlying technologies used to create this connection: TCP and TLS used to provide HTTPS.

In this chapter, I investigate these inefficiencies and show that although HTTP/2 is better at handling most of these inefficiencies, in certain scenarios, HTTP/2 can be slower than HTTP/1.1 because of these inefficiencies. Then I discuss QUIC, which makes several improvements.


9.1. TCP inefficiencies and HTTP

HTTP depends on a network connection that guarantees that data is delivered reliably and in order. Until recently, that guaranteed connection was achieved by using TCP (Transmission Control Protocol). TCP allows a connection between two endpoints (typically, browser and web server) and takes care of passing the messages, ensuring that they arrive, dealing with any retransmissions if the messages don’t arrive, and ensuring that the messages are ordered correctly before being passed to any application layer (HTTP). HTTP doesn’t need to implement any of these complications; instead, it assumes that these criteria have been met. The protocol was built on that assumption.

TCP enforces this guaranteed integrity by assigning a sequence number to each TCP packet and then rearranging the packets on arrival (if they’re received out of order) or rerequesting any missing sequence number packets (if a packet is detected to be missing). TCP works on a CWND basis (which also formed the basis of how HTTP/2 flow control works; see chapter 7), whereby the maximum amount of data that can be sent is decided on (the CWND size), and sent messages decrease this window and acknowledged packets increase it again. The window starts small but grows over time, as the capacity of the network proves to be able to handle the increased load. The window can also shrink if it appears that the client can’t keep up. This process works reasonably well, and TCP/IP has been the backbone of the internet because of it. The fundamental way that TCP works, however, also leads to five main problems with the protocol, at least where HTTP is concerned:

  • There’s a setup delay. Sequence numbers that are to be used by sender and receiver must be agreed on at the start of the connection.
  • The TCP slow start algorithm restricts the performance of TCP, as it’s cautious about how much data it sends to avoid retransmissions as much as possible.
  • Underuse of the connection causes TCP to throttle back. TCP scales the CWND back down if the connection isn’t fully used, as it can’t be sure that the network parameters haven’t changed since the last optimal CWND was established.
  • Lost packets cause TCP to throttle back. TCP assumes that all packet loss is due to congestion, which may not always be the case.
  • Packets may be queued. Packets received out of order may be held back to ensure that order is guaranteed.

These problems haven’t changed under HTTP/2, and some of them are reasons why using a single TCP connection is better under HTTP/2. The last two issues, however, can cause HTTP/2 to be slower than HTTP/1.1 under certain lossy conditions.

9.1.1. Setup delay in creating an HTTP connection

I discussed the TCP three-way handshake in chapter 2. That handshake, coupled with the HTTPS setup that's increasingly required by HTTP (and by all browsers for HTTP/2 connections), results in a significant delay before the first HTTP message is sent, as shown in figure 9.1.

Figure 9.1. TCP and HTTPS setup traffic required for an HTTPS connection

Depending on the size of the HTTPS handshake messages, it takes at least three round trips to set up a connection to a server (1.5 for TCP, 2 for HTTPS, with an overlap of 0.5) before you can send your first request. This diagram doesn’t include any DNS lookup, which is likely to add another delay.

These connection setup steps cause noticeable delays in real life, especially under HTTP/1.1, but also under HTTP/2. Figure 9.2 shows the waterfall diagram for Amazon from chapter 2, with all the connection delays highlighted.

Under HTTP/2, it’s considerably better to use a single connection, but an initial delay still occurs for each connection. Also, any separate domains that can’t be coalesced (see chapter 6) are subject to these delays. Amazon has upgraded to HTTP/2 since figure 9.2, but I still see connection delays for the initial connection and for any subsequent connection that can’t use the same HTTP/2 connection (because it doesn’t resolve to the same server or because it’s authenticated versus unauthenticated), as shown in figure 9.3.

Figure 9.2. Connection setup delays for Amazon under HTTP/1.1

HTTP/2 massively reduces the number of connections and therefore dramatically reduces the 15 or so connection delays shown in figure 9.2, but it would be better to resolve these three remaining delays, too.

Figure 9.3. Connection delays are greatly reduced, but remain under HTTP/2.

9.1.2. Congestion control inefficiencies in TCP

Even after the connection is made, TCP inefficiencies can cause other performance problems, primarily due to the guaranteed nature of TCP: all TCP packets are guaranteed to arrive in order. This seemingly simple statement requires several considerations to be built into the protocol, in particular congestion control.

Congestion control aims to prevent network collapse, when the network spends more time retransmitting dropped packets than sending new packets. This concept was close to becoming reality in the mid-1980s, when the internet started to take off.[1]

To prevent these problems, TCP was enhanced in the late 1980s with various congestion control features that continue to be tweaked and changed to this day. These congestion control algorithms and concepts introduced stability but also inefficiencies, especially for HTTP.

TCP slow start

The TCP slow start mechanism finds the optimal throughput of TCP over the network without swamping, and potentially endangering, the network. TCP is a cautious algorithm that starts at a low rate and builds up to full capacity, during which time it carefully monitors the connection and capacity that it thinks it can handle.

The amount of data that a TCP connection can send is based on the congestion window size. This congestion window starts conservatively, with 10 segments of 1460 bytes maximum segment size (MSS), or about 14 KB, for modern PCs and servers (a relatively new change,[2] because many servers are still on the four-segment size used previously). During slow start, the congestion window doubles in size with each round trip, as shown in table 9.1.

Table 9.1. Typical TCP slow start growth

Round trip    MSS     CWND size (segments)    CWND size (KB)
1             1460    10                      14
2             1460    20                      28
3             1460    40                      57
4             1460    80                      114
5             1460    160                     228

This doubling in size produces exponential growth, and after several round trips the window reaches the full capacity that the receiver has said it's willing to accept (see the first part of figure 9.4).
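
To make this growth concrete, the following short Python sketch reproduces the figures in table 9.1. It's a simplified model that ignores acknowledgment timing and assumes the window doubles cleanly on every round trip:

# Simplified model of TCP slow start: the congestion window starts at
# 10 segments and doubles on every round trip, as in table 9.1.
MSS = 1460           # maximum segment size in bytes
cwnd_segments = 10   # initial congestion window, in segments

for round_trip in range(1, 6):
    print(f"Round trip {round_trip}: {cwnd_segments} segments, "
          f"{cwnd_segments * MSS // 1024} KB")
    cwnd_segments *= 2   # slow start doubles the window each round trip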

Figure 9.4. TCP slow start ramps up to optimum capacity.

The capping limit is also often much lower than shown in figure 9.4, and 65 KB is common, as that was an initial limit under TCP (see the discussion of window scaling in section 9.1.4). After maximum capacity is reached, assuming that no packet loss occurred, the TCP congestion control enters the congestion avoidance phase and continues to increase, but at a much slower linear rate (as opposed to the exponential rate during slow start), until it starts to see dropped packets and is assumed to hit capacity, as shown in the second part of figure 9.4.

Is TCP slow start that slow?

Due to the exponential nature of TCP slow start, it’s not slow by most definitions. In fact, the congestion avoidance phase of TCP is a much slower rate of growth. Slow start refers to the initially small size, which is why it’s called slow start rather than slow growth.

It’s certainly slower than starting at the maximum that the server can send and scaling down. Every TCP connection goes through this growth, so the protocol is slow initially—deliberately, intentionally slow, but slow nonetheless.

Unfortunately for HTTP, the initial stage is where you’re likely to want full capacity. In a Facebook session, for example, the home page alone is 125 KB, which isn’t reached until the fourth round trip at least. After the initial download of the web page and all its assets, there’s often less need to download data, which often happens when TCP reaches its optimal capacity.

Squeezing as much as possible into the first 14 KB

One web performance tip that’s often touted is to fit all your critical resources into the first 14 KB of your HTML. The theory is that the first 14 KB will be downloaded in the first 10 TCP packets, preventing any TCP acknowledgment delays. Any critical inlined CSS, for example, should be included in the initial 14 KB (assuming that the browser is happy to start processing partial HTML pages, as many browsers do).

This situation changes under an HTTPS connection and in particular under an HTTP/2 connection, in which some of these initial 10 TCP packets would be used by the following (at least):

  • Two HTTPS responses (Server Hello and Change Cipher Spec)
  • Two HTTP/2 SETTINGS frames (the server sending one and acknowledging the client’s SETTINGS frame)
  • One HEADERS frame responding to the first request

That leaves five packets, or about 7 KB, in the best-case scenario. In reality, any of those messages could be larger than one TCP packet, using more than five packets. Also, after you add any WINDOW_UPDATE or PUSH_PROMISE frames, this figure might be smaller still.

Luckily, however, the client acknowledges those TCP packets as they’re sent, which increases the CWND size. In some ways, the initial delays due to HTTPS may mean that the CWND size is already larger by the time you use HTTP, though this fact is offset by the initial setup cost of HTTPS itself.

The main point of putting your critical resources high up in your HTML still stands, but in my opinion, there’s no need to get hung up on 14 KB under HTTPS or HTTP/2.

I state in chapter 2 that HTTP/2, with its single connection, has an advantage over HTTP/1.1, but this statement may not be 100% accurate when you get into the details. On one hand, HTTP/1.1 gets to download more initially due to the multiple connections (usually six per domain, more if sharding is used), so it effectively gets multiple initial CWNDs compared with HTTP/2 (assuming that all connections are opened at the same time), as shown in table 9.2.

Table 9.2. Typical TCP slow start growth with six connections

Round trip    MSS     CWND size (segments)    CWND size (KB)    CWND size for six connections (KB)
1             1460    10                      14                85
2             1460    20                      28                171
3             1460    40                      57                342
4             1460    80                      114               684
5             1460    160                     228               1368

On the other hand, if only one connection is used initially (as is often the case in downloading an HTML web page), any additional new connections under HTTP/1.1 start with the slower, lower limit than the single HTTP/2 connection, which has likely already reached full capacity. TCP slow start affects both versions of the protocol to some extent.

Idle connections degrade performance

The TCP slow start algorithm causes delays at the start of a connection, as well as when the connection is idle. TCP is cautious, and after a period of idleness, the network conditions may have changed, so TCP throttles back its CWND size and restarts the slow start mechanism to find the optimum CWND size again.

Unfortunately, web browsing is by its very nature made up of bursts of traffic (as you navigate to a new page) followed by periods of idleness (as you read the web page). Then the cycle is repeated, so resetting back to the start during idle periods can have a large effect on web browsing.

In the Amazon HTTP/1.1 example, the page is loaded from the main Amazon domain, but most of the assets used are from subdomains, leading to large periods of inactivity on the initial connection to the main domain, as highlighted in figure 9.5.

Although this inactivity is particularly bad for the first connection highlighted (as it’ll likely be used again on any subsequent page navigation), you can see in figure 9.5 that lots of other connections are underused. These gaps show inefficient usage of this connection, as highlighted in chapter 2, but from a TCP point of view, it’s worse than you may realize, as TCP throttles back the connection during these periods of inactivity. When those connections need to be used again (such as on page navigation), the process almost starts from the beginning again, although at least the TCP handshake and HTTPS handshake don’t need to be repeated if the connection is kept open.

Figure 9.5. Connection use by Amazon under HTTP/1

HTTP/2, with its use of a single connection per domain, fares much better in this situation. Each resource helps keep the single TCP connection active, so it’s less likely to be idle. This situation is particularly relevant if any connection regularly communicates back to the server through XHR polling, server-sent events, or similar technology. Such activity keeps the connection warmed up and ready for the next page navigation.

Packet loss degrades TCP performance

In addition to taking a while to get up to capacity at the beginning and when the connection has been idle for some time, TCP handles packet loss as an extreme event. It assumes that this event is due to capacity constraints and reacts sharply, halving the CWND and thereby halving the capacity (depending on the TCP congestion algorithm in use).[3] Then TCP uses the congestion avoidance algorithm to build up capacity and continues in a congestion avoidance phase (again depending on the TCP algorithm), as illustrated in figure 9.6.

Figure 9.6. TCP CWND size is affected by packet loss.

This halving of the CWND causes particular problems. Packet loss can occur for many reasons, and network congestion is only one of them. Mobile networks, for example, can be less reliable than wired connections and can lose packets at random, regardless of how much capacity the network has. Therefore, it can be wrong to assume that packet loss is purely due to congestion, and so should result in a dramatic reduction in capacity.

Newer TCP congestion control algorithms treat these network drops less extremely than the preceding example: they may drop the CWND by less than half, grow faster after packet loss (similar to slow start), or use alternatives to packet loss to decide capacity (such as treating average round-trip time as a better indicator than packet loss). But at a high level, the concepts are roughly similar. Packet loss results in slowdown of the network.
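
As a rough illustration of the textbook behavior sketched in figure 9.6, the following Python model grows the window exponentially during slow start, linearly during congestion avoidance, and halves it on each loss. Real algorithms such as CUBIC, PRR, and BBR behave differently, as noted above; the threshold and loss points here are arbitrary example values:

MSS_KB = 1.46              # one segment (1460 bytes), expressed in KB
cwnd = 10 * MSS_KB         # initial window, about 14 KB
ssthresh = 64.0            # slow start threshold in KB (example value)
loss_at = {8, 12}          # round trips on which a packet loss is detected

for rtt in range(1, 16):
    if rtt in loss_at:
        ssthresh = cwnd / 2                 # remember half the window at the time of loss
        cwnd = ssthresh                     # classic multiplicative decrease: halve the window
        phase = "packet loss!"
    elif cwnd < ssthresh:
        cwnd = min(cwnd * 2, ssthresh)      # slow start: exponential growth up to ssthresh
        phase = "slow start"
    else:
        cwnd += MSS_KB                      # congestion avoidance: roughly one segment per RTT
        phase = "congestion avoidance"
    print(f"RTT {rtt:2}: cwnd ~ {cwnd:6.1f} KB ({phase})")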

This problem is particularly severe for HTTP/2 with a single connection. A single packet lost in an HTTP/2 world causes all resources being downloaded to suffer. Compare this protocol with HTTP/1.1, which potentially has six independent connections. A single packet loss will slow one of those connections, but the other five will continue at full capacity.

A second packet loss, before the connection has recovered, could have even more dire consequences under HTTP/2, which halves down again to 25% capacity (again assuming the use of basic TCP congestion control). Under HTTP/1.1, this second packet loss could happen on the connection that already experienced the problem (also reducing capacity to 25%) or on a separate connection (reducing it to 50%), but the other TCP connections are unaffected. Table 9.3 shows the results of six resources being downloaded over HTTP/2 and under the two HTTP/1.1 scenarios.

As you see in table 9.3, the average capacity under HTTP/2 is down to 25% for this example, because the whole connection (over which all six resources are being downloaded) is affected, whereas the effect on six independent connections is a reduction to between 83% and 88%, depending on which connections were affected.

Table 9.3. Results of second packet loss on HTTP/2 versus HTTP/1.1 connections

Resource      HTTP/2    HTTP/1.1: same connection    HTTP/1.1: different connection
Resource 1    25%       25%                          50%
Resource 2    25%       100%                         50%
Resource 3    25%       100%                         100%
Resource 4    25%       100%                         100%
Resource 5    25%       100%                         100%
Resource 6    25%       100%                         100%
Average       25%       88%                          83%
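
The averages in the table are simple to verify. A quick Python check, under the naive assumption that each loss halves only the affected connection's capacity:

# Per-resource capacity after two packet losses, as percentages.
http2            = [25, 25, 25, 25, 25, 25]      # one connection, halved twice
http11_same_conn = [25, 100, 100, 100, 100, 100] # both losses hit one of six connections
http11_diff_conn = [50, 50, 100, 100, 100, 100]  # losses hit two different connections

for name, capacities in [("HTTP/2", http2),
                         ("HTTP/1.1, same connection", http11_same_conn),
                         ("HTTP/1.1, different connection", http11_diff_conn)]:
    print(f"{name}: average {sum(capacities) / len(capacities):.0f}%")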

If this connection is particularly bad, or if there’s a genuine capacity bottleneck, both HTTP/1.1 and HTTP/2 will suffer. The effect is always greater under HTTP/2, however; its single connection always bears the full brunt of any packet loss.

Packet loss causes items to queue

HTTP/2 multiplexing allows several streams of requests to be in flight in parallel on the same TCP connection. Doing the same thing under HTTP/1.1 requires multiple TCP connections. An HTTP/2-specific problem arises when you have packet loss as well as reduction in capacity. Suppose that you have three assets in flight, and they’re downloading, as shown in figure 9.7.

Figure 9.7. Several responses in flight

Now assume that a TCP packet from the first response—the style.css headers response being sent on stream 5—goes missing for some reason. In this case, the client won’t acknowledge that packet, and after a while, the server resends it. This retransmission is added to the end of the queue, as shown in figure 9.8. Note that the figure blurs the lines between HTTP/2 frames and TCP packets somewhat for simplicity’s sake.

Figure 9.8. TCP retransmission of part of an HTTP/2 frame

If no other packet losses occur, streams 7 and 9 will be received in their entirety before the retransmission arrives. Those responses must be queued, however, because TCP guarantees the order, so script.js and image.jpg can’t be used despite being downloaded in full. Under HTTP/1.1, this process would be carried out under three independent TCP connections, as shown in figure 9.9.

Figure 9.9. TCP retransmissions under HTTP/1.1 affect only the connection that needs the retransmission.

The browser, therefore, can process script.js and image.jpg as soon as they arrive; only style.css is delayed. In this example, the browser may wait until style.css is available, depending on whether it considers this resource to be a critical resource (as CSS often is). The point remains that HTTP/2 is adding a constraint here that isn’t present under HTTP/1.1 with multiple connections. Worse, if the connection is unable to queue up all out-of-sequence packets due to limited TCP buffer size, it may drop some packets, requiring them to be retransmitted as well!

HTTP/2 has solved the head-of-line (HOL) blocking issue at HTTP level, because with multiplexing, a single delayed response doesn’t prevent the HTTP connection from being used for other resources. HOL blocking is still present at TCP level, however. A single dropped packet from one stream effectively blocks all the other streams, even though they may not need to be held up.
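
A small Python sketch makes this TCP-level HOL blocking concrete. It models a receiver that, like TCP, releases data to the application only in sequence order; a single missing packet holds back everything behind it, even packets belonging to other HTTP/2 streams (a simplified model, not real TCP):

# In-order delivery, as TCP presents data to the application (HTTP).
# Packet 2 (part of style.css on stream 5) is lost and retransmitted last.
arrival_order = [1, 3, 4, 5, 6, 2]   # sequence numbers in the order they arrive
packets = {
    1: "stream 5: style.css (part 1)",
    2: "stream 5: style.css (part 2)",   # the lost, retransmitted packet
    3: "stream 7: script.js",
    4: "stream 7: script.js (end)",
    5: "stream 9: image.jpg",
    6: "stream 9: image.jpg (end)",
}

buffered = {}
next_expected = 1
for seq in arrival_order:
    buffered[seq] = packets[seq]
    # Deliver as much contiguous data as possible; anything after a gap must wait.
    while next_expected in buffered:
        print(f"delivered packet {next_expected}: {buffered.pop(next_expected)}")
        next_expected += 1
    if seq >= next_expected:
        print(f"packet {seq} buffered; still waiting for packet {next_expected}")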

9.1.3. Effect of TCP inefficiencies on HTTP/2

I’ve shown that TCP inefficiencies can cause problems for HTTP, but what is the real effect, and is it any different under HTTP/1 and HTTP/2?

I indicated earlier that HTTP/2 generally outperforms HTTP/1.1. Also, Google’s experiments with SPDY demonstrated a considerable speed gain in both laboratory experiments and the real world.

The effect of performance loss isn’t to be underestimated, however. Hooman Beheshti of Fastly did some experiments[4] with the WebPagetest tool and showed that HTTP/2 performs consistently worse than HTTP/1.1 when a consistent 2% packet loss occurs. Granted, a consistent 2% packet loss indicates a very poor network; most networks lose an occasional packet rather than experience this consistent level of loss. But the experiments show that HTTP/2 may not be the silver bullet for all scenarios. More in-depth studies[5] similarly showed the impact of packet loss and even went so far as to recommend using limited sharding under HTTP/2, which seems to be counterintuitive.

I was able to repeat Beheshti’s findings on some popular sites but not on others. If you want to repeat the tests, go to https://www.webpagetest.org/. On the Test Settings tab, choose a custom setting and set your packet loss, as shown in figure 9.10.

Figure 9.10. Testing packet loss in WebPagetest
Figure 9.11. Disabling HTTP/2 for Chrome

To test HTTP/2 versus HTTP/1.1, you can use Chrome and add the --disable-http2 command-line option, as shown in figure 9.11.

For Firefox, you must use a slightly different method. Enter the following on the Script tab, replacing the final navigate line with the site you want to test against:

firefoxPref network.http.spdy.enabled          false
firefoxPref network.http.spdy.enabled.http2    false
firefoxPref network.http.spdy.enabled.v3-1     false
navigate    https://www.fastly.com/

Be aware that you must use tabs, not spaces, between the parts of these settings, as shown in figure 9.12.

Figure 9.12. Disabling HTTP/2 on Firefox in WebPagetest

To prevent any bias or to keep individual results from skewing overall results, run the tests multiple times, under different network conditions and in different locations. Figure 9.13 shows the results of one test of ebay.com.

As you see in figure 9.13, HTTP/2 (top) is nearly half a second slower. Repeating the same test with zero packet loss shows that HTTP/2 is faster, as expected.

Figure 9.13. Loading eBay's home page with 2% packet loss over HTTP/2 and HTTP/1.1

It’s also possible to export the raw data by clicking Raw Page Data on the right side of the page, as shown in figure 9.14.

Figure 9.14. Exporting WebPagetest raw data to a CSV file

This feature can be handy when you’re making multiple runs and want to plot them in a graph. Also, you can click Test History and select both images to see a quick comparison of the effect, as shown in figure 9.15.

Figure 9.15. Choosing two results to compare

This feature gives you access to a wealth of views and data, including size by timeline and thumbnails, as shown in figure 9.16.

Figure 9.16. Comparing two WebPagetest runs

WebPagetest is a fantastic tool for running performance comparisons like this one. You can also host your own private instance, which is well worth looking into if you plan to perform a lot of checks or want to test development servers that aren’t available to the web version.
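
If you're running a lot of comparisons, you can also drive WebPagetest from its HTTP API instead of the web form. The following Python sketch assumes the public instance, an API key, and the third-party requests library; the parameter and field names reflect the public API at this writing and may differ on your instance:

import time
import requests   # third-party library: pip install requests

WPT = "https://www.webpagetest.org"
API_KEY = "YOUR_API_KEY"   # request a key from webpagetest.org

def median_load_time(url, runs=3):
    # Submit the test. Parameter and field names may vary between
    # WebPagetest versions; check your instance's API documentation.
    submit = requests.get(f"{WPT}/runtest.php",
                          params={"url": url, "runs": runs, "f": "json", "k": API_KEY})
    test_id = submit.json()["data"]["testId"]

    # Poll until the test finishes (statusCode 200 means complete, 1xx means pending).
    while True:
        result = requests.get(f"{WPT}/jsonResult.php", params={"test": test_id}).json()
        if result.get("statusCode") == 200:
            return result["data"]["median"]["firstView"]["loadTime"]
        time.sleep(10)

print("Median load time (ms):", median_load_time("https://www.example.com/"))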

Should you hold off on migrating to HTTP/2 due to TCP inefficiencies?

Given the fact that HTTP/2 performance can be worse under severe packet loss, should you hold off on moving to HTTP/2? Delaying probably would be overkill. Remember that HTTP/2 is faster than HTTP/1.1 under most scenarios. Should you hold up benefiting users because on some (ideally rare) occasions, it performs worse than HTTP/1.1?

This chapter discusses some real problems with HTTP and TCP, but not the likelihood that those problems will occur. The best measure is real-life metrics, rather than artificial scenarios like those in this chapter. As I stated earlier, a real-life network that experiences continual 2% packet loss is likely to be a poor network. Unfortunately, measuring packet loss in real life is more difficult; statistics are less readily available. Some scientific studies based on more realistic packet loss scenarios,[a] however, have shown that, as expected, in general HTTP/2 outperforms HTTP/1.1.

HTTP/2 implementations are still relatively new and will improve over time. The same is true of websites, which may optimize better for HTTP/2. Finally, TCP itself is still improving and can be optimized (see section 9.1.4). Some people have suggested using multiple TCP connections for HTTP/2 to work around some of these issues, but this workaround negates the reasons for moving to a single connection under HTTP/2. HTTP/2 mostly performs better because it uses a single connection.

In a specific scenario in which most of your users have poor network connections that can’t be improved, it may be prudent to remain on HTTP/1.1 or to shard HTTP/2 connections that won’t be coalesced. Ultimately, the best advice (as always) is to measure and test any changes.

9.1.4. Optimizing TCP

You’ve seen that TCP can greatly affect the performance of HTTP. In many ways, the inefficiencies in the HTTP protocol have been engineered out with HTTP/2, but performance bottlenecks that existed elsewhere are now more apparent. HTTP HOL blocking is no longer a problem in HTTP/2, thanks to multiplexing, but TCP HOL blocking has become a problem, especially in lossy environments.

The only two solutions to these problems are to improve TCP or to move away from it. The following sections look at the first solution; section 9.2 looks at the second. TCP has had several improvements over the years, some of which may already be in use in some environments.

Upgrading your operating system

The change with the biggest effect is upgrading your operating system. Although TCP is old, dating back to 1974, improvements and innovations are still being made as new research is completed and as computer use changes. Unfortunately, TCP usually is controlled by the low-level operating system, and you have little opportunity to change it outside the operating system. Therefore, the best way to ensure that you have optimum TCP usage is to ensure that you're running the latest version of your operating system.

In this section, I concentrate on Linux as an example, but these settings apply equally to other operating systems, including Windows and macOS, even if the settings aren’t in the same place or as easy to change. Where appropriate, I provide the Linux version in which a change was introduced. Table 9.4 shows the Linux kernel version included in some of the most popular Linux distributions.

Table 9.4. Linux kernel versions for popular distributions

Distribution            Linux kernel version
RHEL/Centos 6           2.6.32
RHEL/Centos 7           3.10.0
Ubuntu/Debian 14.04     3.13
Ubuntu/Debian 16.04     4.4
Ubuntu/Debian 18.04     4.15
Debian 8 Jessie         3.16
Debian 9 Stretch        4.9

Finding and changing TCP connection settings

Most TCP settings in Linux are available to view in the following directory:

/proc/sys/net/ipv4/

Despite the directory name, most of these settings apply to IPv6 TCP connections too. You can view the values with cat:

$ cat /proc/sys/net/ipv4/tcp_slow_start_after_idle
1

You can set the values with the sysctl command:

sysctl -w net.ipv4.tcp_slow_start_after_idle=0

Take care when changing any of these settings, however, because TCP is such a critical part of the system. I advise most readers not to change the settings. I suggest instead that readers use this knowledge to make sure that these settings are appropriate and use them as an argument for a whole operating-system upgrade, which should set the values to the best practice values at the time of the kernel release.
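
If you only want to review the relevant settings on a server in one pass, without changing anything, a short read-only script does the job. A minimal Python sketch (the setting names are the ones discussed in the rest of this section):

# Read-only snapshot of the TCP settings discussed in this section.
from pathlib import Path

SETTINGS = [
    "tcp_slow_start_after_idle",
    "tcp_window_scaling",
    "tcp_sack",
    "tcp_fastopen",
    "tcp_congestion_control",
    "tcp_available_congestion_control",
]

for name in SETTINGS:
    path = Path("/proc/sys/net/ipv4") / name
    value = path.read_text().strip() if path.exists() else "not available"
    print(f"{name} = {value}")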

Increasing the Initial CWND size

TCP slow start requires a round trip to increase the CWND size, so the initial CWND size matters. It started at 1 TCP packet, increased over the years to 2 and then 4, and as of Linux kernel 2.6.39 the default increased from 4 to 10. This setting is usually hardcoded into the kernel code, so it's not advisable to change it except by upgrading the operating system.

Allowing window scaling

Traditionally, TCP allows a maximum window size of 65,535 bytes, but later versions allow a scaling factor to be applied to this value, in theory allowing window sizes of up to 1 GB. This setting was made the default in Linux kernel 2.6.8, so it should be on for most readers, but to make sure, you can check it this way:

$ cat /proc/sys/net/ipv4/tcp_window_scaling
1

Using Selective Acknowledgment

Selective Acknowledgment (SACK) allows TCP to acknowledge receipt of packets out of order to avoid resending them if another packet is dropped. If packets 1–10 are sent, but packet 4 is dropped, you can acknowledge 1–3 and 5–10. That way, only packet 4 must be resent. Without this feature, packets 4–10 would need to be resent in this example. Confirm that this feature is set to 1 (on) with this command:

$ cat /proc/sys/net/ipv4/tcp_sack
1

Disabling slow start restart

This setting still defaults to a potentially wrong value, at least for web servers, so you may want to consider changing it. A TCP connection throttles back after an idle period, on the assumption that network conditions may have changed and that previous assumptions may no longer hold. Web traffic, however, is by its nature intermittent, with bursts of traffic as users browse the site, pause to read the web page, and potentially browse to other pages, so leaving this setting enabled may not be optimal for web servers.

The setting usually is enabled by default:

$ cat /proc/sys/net/ipv4/tcp_slow_start_after_idle
1

To disable it, use the following command:

sysctl -w net.ipv4.tcp_slow_start_after_idle=0

As I stated earlier, you shouldn’t change your system TCP settings lightly. But depending on what your server is used for (such as a dedicated web server), changing this setting may be worth considering.

Using TCP fast open

TCP fast open allows an initial packet of traffic to be sent with the initial SYN part of the TCP three-way handshake. This method prevents some of the setup delay associated with TCP (see section 9.1.1). For security reasons, this packet can be sent only on TCP reconnections rather than on initial connections, and both client and server support are required. TCP fast open effectively allows HTTP (or HTTPS) messages to be sent earlier in the handshake, as shown in figure 9.17.

Figure 9.17. TCP and HTTPS reconnection handshake with and without fast open

You can check Linux support of this feature as follows:

$ cat /proc/sys/net/ipv4/tcp_fastopen
0

The setting usually is disabled (set to 0). Table 9.5 lists some options for this setting.

Table 9.5. TCP fast open settings

Value    Meaning
0        Disabled
1        Enabled for outgoing connections
2        Enabled for incoming connections
3        Enabled for both outgoing and incoming connections

You can change this setting with the following command:

echo "3" > /proc/sys/net/ipv4/tcp_fastopen

Support for this feature was added in Linux 3.7 and enabled by default in version 3.13, though IPv6 support wasn’t added until Linux 3.16.

In addition to enabling this setting at the operating-system level, you must configure your server software to use it. On the web-server side, nginx allows this setting[6] but requires compile options and configuration, so it's not enabled by default. Windows IIS[7] supports it, but Apache makes no mention of it in the documentation, so presumably it doesn't support it. Other, less-common servers may not support this feature. On the client side, the setting can be enabled in Edge[8] and Chrome on Android, but at this writing, it isn't supported in Chrome for Windows or macOS[9] and is switched off in Firefox.[10]
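
If you run server software that you build yourself, fast open is typically requested per listening socket with the TCP_FASTOPEN socket option, in addition to the kernel setting above. A minimal Python sketch, assuming Linux (the constant may not be exposed by older Python versions, so the code falls back to the raw Linux option number):

import socket

# TCP_FASTOPEN is exposed by Python on Linux in recent versions;
# fall back to the raw Linux option number (23) if it isn't.
TCP_FASTOPEN = getattr(socket, "TCP_FASTOPEN", 23)

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
# The option value is the maximum queue length of pending fast-open requests.
server.setsockopt(socket.IPPROTO_TCP, TCP_FASTOPEN, 16)
server.bind(("0.0.0.0", 8080))
server.listen(128)
print("Listening with TCP Fast Open requested (if the kernel allows it)")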

The gains from TCP Fast Open are truly impressive. Google has stated[11] that “based on traffic analysis and network emulation, we show that TCP Fast Open would decrease HTTP transaction network latency by 15% and whole-page load time over 10% on average, and in some cases up to 40%.” Support of this relatively new addition to TCP (the RFC was published in 2014)[12] has been slow, however. Given these complexities, TCP Fast Open probably is one to watch for in the future rather than change now.

Using congestion control algorithms, PRR, and BBR

TCP has various congestion control algorithms that control how TCP reacts when packet loss is experienced. Most TCP implementations use the CUBIC algorithm[13] (the default since Linux kernel 2.6.19). This algorithm was enhanced by Proportional Rate Reduction (PRR)[14] congestion avoidance (the default since 3.2), which reduces the halving of the congestion control window on packet loss.[15] A detailed description of the differences is beyond the scope of this book, but suffice it to say that a better algorithm can significantly improve performance. Use this command to see the current algorithm in use:

$ cat /proc/sys/net/ipv4/tcp_congestion_control
cubic

The available congestion control algorithms are available here:

$ cat /proc/sys/net/ipv4/tcp_available_congestion_control
reno cubic

An even newer algorithm, Bottleneck Bandwidth and Round-trip propagation time (BBR), has been shown to improve performance further,[16] particularly for HTTP/2 connections.[17] BBR was created by Google and is available in Linux kernel 4.9; it requires no client-side changes. To enable it in Linux kernels that have it (version 4.9 or later), use the following commands:

# Dynamically load the tcp_bbr module if it's not loaded already
sudo modprobe tcp_bbr
# Add fair queue traffic policing, which BBR works better with
echo "net.core.default_qdisc=fq" | sudo tee -a /etc/sysctl.conf
# Change the TCP congestion control algorithm to BBR
echo "net.ipv4.tcp_congestion_control=bbr" | sudo tee -a /etc/sysctl.conf
# Reload the settings
sudo sysctl -p

Some researchers,[18] however, claim that BBR potentially isn’t a nice player on the network, particularly when running alongside other non-BBR traffic, and can take an unfair proportion of network resources.

9.1.5. The future of TCP and HTTP

I’ve shown you some of the complications of TCP—a seemingly simple protocol that’s far more complex than most people realize. Like HTTP/1.1, TCP has some built-in inefficiencies that users may only now be starting to experience, as the inefficiencies in higher-level protocols such as HTTP are addressed, and as demands on networks continue to increase.

The protocol is still evolving, albeit quite slowly. Although new options and congestion control algorithms are being created all the time, and browsers are being upgraded to take advantage of these features if available, it takes some time for them to make it into the network stacks of servers. New features usually are tied to the fundamentals of the operating system, so they require a full operating system upgrade. It may be possible to turn on some of these settings manually if you’re running a version of the operating system in which the features have been introduced but not been made defaults, but it’s often better to upgrade the operating system. This area is a specialized one, and although I touched on a few recent innovations here that are likely to be beneficial for the next few years, it’s usually better to allow the maintainers of the operating system, who have the necessary skills and knowledge, to decide what these settings should be.

Also, I didn’t touch on all the network pipes and plumbing that usually lie between a user’s web browser and the web server. Even if both sides support some of these relatively new TCP features, if anything sitting between them doesn’t, there’s potential for a problem. Much as HTTP proxies downgrade connections to HTTP/1.1 even when both ends support HTTP/2, innovation in this area can be held back by so-called middleboxes. Because TCP is an old algorithm, some of these middleboxes have certain expectations about how TCP is used and don’t react well, or don’t allow it to be used in new, unexpected ways.

For these reasons and more, some people are questioning whether TCP is the right underlying protocol for HTTP and whether a new protocol is the way to go—a protocol designed from the ground up for the current (and future) needs of HTTP without the baggage of the past or dependency on the operating system. One such protocol is QUIC.


9.2. QUIC

QUIC (pronounced quick) is a new UDP-based protocol invented at Google (Google again!) that aims to replace TCP and other parts of the traditional HTTP stack to address many of the inefficiencies mentioned in this chapter. HTTP/2 introduced some TCP-like concepts (such as packets and flow control), but QUIC takes these concepts to the next level and replaces TCP.

What does QUIC stand for?

QUIC originally was an acronym for Quick UDP Internet Connections, as shown in most of the Google Chromium documentation when the protocol was introduced.[a], [b], [c] During formalization, the QUIC Working Group decided to drop this acronym,[d] and the QUIC specification explicitly notes, “QUIC is a name, not an acronym.”[e]

Many sources still use the acronym, however. A member of the Working Group amusingly stated, “QUIC isn’t an acronym. You’re expected to shout it ;).”[f]

QUIC was created with the following features in mind:[19]

  • Dramatically reduced connection establishment time
  • Improved congestion control
  • Multiplexing without head-of-line (HOL) blocking
  • Forward error correction
  • Connection migration

The first three reasons should be obvious from the TCP (and HTTPS) drawbacks discussed in this chapter. The last two reasons are interesting additions that further address these problems.

Forward error correction (FEC) looks to reduce the need for packet retransmission by including part of a QUIC packet in neighboring packets. The idea is that if only a single packet is dropped, it should be possible to reassemble that packet from the packets that were delivered successfully. The process has been compared with “RAID 5 on the network level.”[20] I said earlier that packets can be lost randomly rather than as a sign that the connection has hit its limits, and FEC aims to correct for exactly this kind of loss. FEC adds redundancy and overhead, but given that HTTP requires guaranteed delivery (unlike, say, video streaming protocols, in which packets may be dropped without much effect), the gains may be worth the small overhead. At this writing, this feature of QUIC is still experimental[21] and won't be in the initial version of QUIC, as it's explicitly called out as out of scope in the QUIC Working Group charter.[22]
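
As a toy illustration of the idea (not the actual QUIC FEC scheme), the following Python sketch shows how a single XOR parity packet lets a receiver rebuild any one lost packet in a group without a retransmission:

# Toy forward error correction: one XOR parity packet per group of packets.
def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

packets = [b"packet-1", b"packet-2", b"packet-3"]   # equal-sized payloads

# The sender computes a parity packet over the group and sends it too.
parity = packets[0]
for p in packets[1:]:
    parity = xor_bytes(parity, p)

# Packet 2 is lost in transit; the receiver has the rest plus the parity packet.
received = [packets[0], None, packets[2]]
recovered = parity
for p in received:
    if p is not None:
        recovered = xor_bytes(recovered, p)

print(recovered)   # b'packet-2', rebuilt without waiting for a retransmission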

Connection migration aims to reduce connection setup overhead by allowing a connection to move between networks. Under TCP, the connection is linked to the IP address and port on either side. Changing the IP address requires establishing a new TCP connection. This requirement was acceptable when TCP was invented, because IP addresses were viewed as being unlikely to change during the lifetime of a session. Now, with multiple networks (wired, wireless, and mobile), this situation can no longer be taken for granted. QUIC, therefore, allows you to start your session over Wi-Fi at home and then move to a mobile network without having to restart your session. You should even be able to use both your Wi-Fi and mobile networks at the same time for one QUIC connection via a technique known as multipath that allows increased bandwidth. Again, this multipath feature won’t be available in the first release, but connection migration should be.
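
Conceptually, this is possible because QUIC identifies a connection by a connection ID carried in every packet, rather than by the source and destination addresses and ports that TCP uses. The following simplified Python sketch (illustrative only, not real QUIC packet handling) shows the difference in how a server might key its sessions:

# Simplified server-side demultiplexing: TCP-style keying versus QUIC-style keying.
tcp_sessions  = {}   # keyed by (client IP, client port): breaks when the address changes
quic_sessions = {}   # keyed by connection ID: survives an address change

def handle_quic_packet(connection_id, client_addr, payload):
    # The same connection ID maps to the same session, whatever address it arrives from.
    session = quic_sessions.setdefault(connection_id, {"bytes_received": 0})
    session["current_addr"] = client_addr     # where to send replies now
    session["bytes_received"] += len(payload)
    return session

# The client starts on home Wi-Fi, then moves to a mobile network mid-session.
handle_quic_packet("c0ffee", ("192.0.2.10", 51000), b"request part 1")
session = handle_quic_packet("c0ffee", ("198.51.100.7", 43210), b"request part 2")
print(session)   # one session despite the address change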

9.2.1. Performance benefits of QUIC

In April 2015, Google published a blog post[23] on the performance benefits of QUIC, including the following:

  • 75% of connections take advantage of the zero-round-trip connection time.
  • Google Search saw a 3% improvement in mean page load time, and page load time was reduced by a second on the slowest networks. These figures may not seem like much, but remember that Google Search is a massively optimized site on which any improvement is significant.
  • YouTube users reported 30% fewer rebuffers when using QUIC.

The measurements presumably were compared with HTTP/2 and SPDY. At that time, 50% of Chrome traffic to Google used QUIC; that percentage is likely to have grown considerably since then. Because QUIC was supported only by Chrome and Google until recently (see section 9.2.6), its use is limited. W3Tech, for example, says that slightly more than 1% of sites use QUIC at this writing,[24] though other measures say that this figure translates to 7.8% of traffic volume,[25] of which 98% is Google.

9.2.2. QUIC and the internet stack

QUIC replaces more than TCP. Figure 9.18 shows where QUIC fits into the traditional HTTP technology stack.

Figure 9.18. Where QUIC fits into the HTTP technology stack

As you see in figure 9.18, QUIC replaces most of what TCP traditionally provides (the setup, reliability, and congestion control parts), all of HTTPS (to improve the setup delays), and even part of HTTP/2 (the flow control and header compression parts).

QUIC aims for a one-round-trip connection setup by performing the connection layer (TCP in the traditional world) and encryption layer (TLS in the traditional world) at the same time. To do so, it uses many of the concepts and innovations that have been added to TCP (such as Fast Open) and TLS (such as TLSv1.3).

At a higher level, QUIC doesn't replace HTTP/2, but it takes over some of the transport-layer pieces, leaving a lighter HTTP/2 implementation running on top. As with the move from HTTP/1.1 to HTTP/2, the core syntax of HTTP that most higher-level developers need to care about stays the same under QUIC, and the concepts introduced in HTTP/2 (such as multiplexed streams, header compression, and server push) still exist in much the same way; QUIC simply takes care of some lower-level details. The move from HTTP/1.1 to HTTP/2 contained bigger changes for developers, and all the concepts remain the same under QUIC, so everything you've read and learned in this book isn't wasted! The protocol is still a multiplexed, stream-based binary protocol, and some of the specifics used to achieve this now fall under QUIC rather than HTTP/2. To reflect the changes from HTTP/2, to differentiate it from QUIC itself, and to show that this is the latest version of HTTP, it has been agreed that HTTP over QUIC will be called HTTP/3 (discussed further in section 10.3 of chapter 10).[26]

9.2.3. What UDP is and why QUIC is built on it

QUIC is based on the User Datagram Protocol (UDP), which is a lightweight protocol compared with TCP but is similarly built on top of the Internet Protocol (IP). TCP adds reliability on top of IP for the network connection, including retransmission, congestion control, and flow control. These features normally are good and necessary, but under HTTP/2 they aren't all wanted at the connection level, which is what produces the unnecessary TCP HOL blocking issue described earlier.

UDP is basic compared with TCP. It has the concept of ports, similar to that of TCP, so several UDP-based services can run on the same computer. It also has an optional checksum so that the integrity of UDP packets can be checked. Except for those two features, there’s not much to the protocol. Reliability, ordering, and congestion control don’t exist, and if you want them, they have to be built by the application. If a UDP packet is lost, it won’t automatically be resent. If a UDP packet arrives out of order, it’s still seen by the higher-level application. UDP was originally used for applications that didn’t need delivery guarantees (such as video, in which some frames could be dropped without too much loss in service). UDP is also perfect for a multiplexed protocol such as HTTP/2 if that higher-level protocol wants to implement better solutions to these problems than those available in TCP.
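
To show how little UDP itself provides, here's a minimal Python sketch of a UDP sender and receiver; there's no handshake, no acknowledgment, no retransmission, and no ordering, which is exactly the gap QUIC fills in user space:

import socket

# Receiver: bind a UDP port and read whatever datagrams happen to arrive.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 9999))

# Sender: fire datagrams at that port. There's no handshake, no acknowledgment,
# no retransmission, and no ordering guarantee; lost datagrams are simply gone.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for i in range(3):
    sender.sendto(f"datagram {i}".encode(), ("127.0.0.1", 9999))

for _ in range(3):
    data, addr = receiver.recvfrom(1500)   # 1500 bytes: a typical Ethernet MTU
    print(data, "received from", addr)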

Why not improve TCP?

The most obvious question is why not improve TCP? TCP is still innovating, and the problems could be engineered out by further improvements. The main drawback is the speed of implementation of any such improvements. TCP is such a core protocol that it’s nearly always baked into operating systems, and although some changes can be made to configure it or some improvements can be made on the server side, operating-system upgrades are required to benefit from most TCP improvements. The problem isn’t that operating systems can’t innovate; it’s the length of time required for those innovations to be widely deployed. TCP Fast Open is a prime example; it offers huge benefits, but isn’t used yet by the vast majority of internet browsers or servers.

This slowness to innovate is exacerbated by the internet infrastructure, which makes certain assumptions about protocols such as TCP and reacts badly when those assumptions are broken. This problem is known as protocol ossification, whereby innovation is stifled because of these assumptions. By moving away from TCP, QUIC hopes to have greater freedom and fewer constraints.

Why not use SCTP?

Instead of building a new transport protocol on top of UDP or waiting for innovations in TCP to become more widespread, QUIC could have used Stream Control Transmission Protocol (SCTP).[27] This protocol shares many characteristics with QUIC, such as stream-based reliable messaging, but it already exists and has been an internet standard since 2007.

Unfortunately, existing as a standard isn't enough to ensure use, and adoption of SCTP is low, primarily because TCP has been good enough until now. Therefore, moving to SCTP is likely to take as long as upgrading TCP, and even after such a move, innovation in the protocol would likely stall again. QUIC also aims to improve stream-level congestion control and to address other issues that affect HTTP, such as the HTTPS handshake delay, the impact of packet loss, and connection migration.

Why not use IP directly?

Another option that the QUIC designers could have used was to build directly on IP, because the requirements of the Transport layer are light. IP provides little more than a source and a destination IP address; everything else can be built on top of it.

But using IP directly has the same problems as using SCTP. The protocol would have to be implemented at operating-system level, because few applications get direct access to IP packets. Also, QUIC traffic needs to be directed at a particular application, so it needs ports, which UDP provides. Many clients can open separate HTTP connections over QUIC at the same time (Chrome and Firefox running side by side, for example, and perhaps also an unrelated program that uses HTTP). Without ports, some QUIC-controlling application would be required to read all QUIC packets and route them to each application as appropriate.

Advantages of UDP

UDP is a basic protocol that’s also implemented in the kernel. Anything built on top of it needs to be built in the Application layer, known as the user space. Being outside the kernel allows quick innovation by deploying the application on either side. Google uses QUIC in all its services when you use Chrome, so opening developer tools and navigating to a Google site shows you the current version of QUIC in use (version 43, at this writing), as shown in figure 9.19.

Figure 9.19. Viewing the deployed version of QUIC on www.google.com

In the few short years that QUIC has been around, Google has created 43 versions of it.[28] As it did when deploying SPDY, Google was able to deploy changes to the main client used to browse the web (Chrome) and some of the most popular servers easily and then innovate without users noticing. As of 2017, an estimated 7% of the internet uses QUIC,[29] though this figure is likely to represent mostly Google sites.

Rolling out QUIC so quickly was possible only by using UDP rather than trying to force adoption or changes in existing protocols, which would take time and likely would be blocked by much of the current infrastructure of the internet. Using the light and limited UDP allowed Google to build and innovate the protocol as it saw fit, because it could control both sides of the connection.

UDP isn’t without problems. It’s a common protocol, but not as common as TCP. DNS works over UDP, for example, because it’s a simple protocol that doesn’t need the complications or slowness of TCP (though there are moves to allow DNS to work over HTTPS, as discussed in chapter 10). Other applications (such as real-time video streaming and online video games) also use UDP, so it’s often supported by network infrastructure. TCP is far more common, however, and UDP is often blocked by firewalls and middleware traffic by default. In this case, Chrome gracefully falls back to HTTP/2 over TCP. This concern was a large one in the beginning, but experiments by Google showed that 93% of UDP traffic made it through, and that percentage has improved over time. Although some infrastructure blocks UDP traffic for HTTP (where port 443 is also used), the vast majority doesn’t. UDP is also easy to enable if it becomes common (as it is for Google services, at least).

The other problem with UDP is that user space isn't always as efficient as the highly optimized kernel space. Early measurements of QUIC showed that servers use up to 3.5 times the CPU of equivalent TLS/TCP-based servers.[30] Although that use has since been optimized down to roughly twice as much, the result still shows that UDP is a more expensive protocol and is likely to remain that way while it lives outside the kernel.

Will QUIC always use UDP?

In the original FAQ released when it launched QUIC,[a] Google stated, “We are hopeful that QUIC features will migrate into TCP and TLS if they prove effective.”

So perhaps UDP will be used for experimentation, and TCP will evolve with it at a slower pace. Will QUIC revert to TCP at some point? That question is difficult to answer, but my opinion is it’ll be difficult to give up the freedom to evolve. The internet seems to be in a period of innovation at the transport layer, and it seems unlikely that protocol developers will reach a point where they’re happy to stop innovating and settle down to a fixed, difficult-to-upgrade protocol.

Also, QUIC is implementing fundamental changes compared with TCP, and these changes won’t be easily adopted into TCP, even if the drive to do so existed.

More likely, HTTP will continue to be available over both TCP (HTTP/2) and UDP (QUIC and HTTP/3), but the TCP implementation will lag UDP in terms of features and performance.

9.2.4. Standardizing QUIC

QUIC started as a Google protocol and was announced publicly in June 2013.[31] Google evolved the protocol over the next two years, and in June 2015, the company submitted it to the Internet Engineering Task Force (IETF) as a proposed standard.[32] This submission occurred after Google’s last standard (SPDY) was formally adopted as HTTP/2, so the timing was good; many people associated with that standardization were free to work on QUIC. A few months later, the IETF QUIC Working Group was established to work on standardizing the protocol.[33]

The Two QUICs: gQUIC and iQUIC

Like SPDY, QUIC has continued to evolve under Google's stewardship while the standardization process runs its course. This evolution has led to two implementations at this writing: gQUIC (for Google QUIC) and iQUIC (for IETF QUIC). Google continues to run its production environment on gQUIC and continues to evolve and improve this protocol as it sees fit, without the need to get formal approval for each change. Like SPDY, gQUIC is expected to die out when iQUIC is formally standardized (expected to happen in early 2019), but for now, gQUIC is the only usable version of the protocol in production environments.

Only Chrome and Chromium-based browsers such as Opera implement QUIC (where it uses gQUIC), and gQUIC undergoes frequent change as the Google team changes it.[34] On the server side, all the Google services support gQUIC. Other web-server implementations at this writing include Caddy[35] and LiteSpeed,[36] but because they’re based on the evolving, nonstandardized gQUIC, they’re subject to keeping up with Google changes and may fall behind and stop working with Chrome.[37]

Differences between gQUIC and iQUIC

This topic is evolving as each protocol advances, but at this writing, one of the main differences between gQUIC and iQUIC is in the encryption layer. Google used a custom cryptography design, whereas iQUIC is using TLSv1.3.[38] This choice was made only because TLSv1.3 wasn’t available when QUIC was invented. Google stated that it will replace its custom cryptography design with TLSv1.3 when it’s formally approved,[39] which has now happened, so gQUIC and iQUIC will likely converge. A few other changes exist between the two protocols, which aren’t compatible, but at a conceptual level, except for the use of TLSv1.3, they’re similar.

The QUIC standards

At this writing, there’s no one QUIC standard, but six! Like HTTP/2, which is made up of two standards (HTTP/2 and HPACK), QUIC has separate standards for its main parts:

One more experimental document has been proposed: QUIC Spinbit[46] would add a single bit to be used for basic monitoring of encrypted QUIC connections. Two additional informational documents on using QUIC are available, for application developers[47] and for managing QUIC on the network.[48]

The IETF Working Group is working on these documents at this writing. Because the standard is still being worked on, these specifications (and even the number of them) are subject to change.

One important point to note is that QUIC is intended to be a general-purpose protocol; HTTP is only one use of it. Although HTTP currently is the main use case for QUIC and what the working group is concentrating on at present, the protocol is being designed with potential other use cases in mind.

9.2.5. Differences between HTTP/2 and QUIC

QUIC builds on HTTP/2, so many of the core concepts you’ve learned in this book will stand you in good stead when QUIC becomes a standard and use grows beyond Google servers and browsers. Some key differences exist, however, including the underlying UDP protocol. The following sections discuss other differences.

QUIC and HTTPS

HTTPS is built into QUIC; unlike HTTP/2, QUIC isn't available for unencrypted HTTP connections. This choice was made for the same practical and ideological reasons that led browsers to support HTTP/2 only over HTTPS (see chapter 3).

On the practical side, encrypting the data ensures that parties that are unfamiliar with the protocol won’t unwittingly interfere with or make assumptions about the protocol. Although this situation may not seem to be a problem now (no infrastructure should be expecting HTTP traffic over UDP), it has already caused problems for QUIC, with middlebox vendors making assumptions that no longer held true as QUIC evolved.[49] As the protocol evolves, it will become even more important to prevent the ossification experienced under TCP, with assumptions being made by middleboxes inspecting TCP traffic. QUIC aims to encrypt as much as possible. A proposal to allow a single unencrypted bit to allow middleboxes to monitor traffic[50] was met with much consternation,[51] and at this writing, no firm conclusion has been reached (though the proposal is included as a working draft, as mentioned earlier in this chapter).

Establishing a QUIC connection

HTTP/2 established several methods to negotiate the HTTP/2 protocol, including ALPN, the Upgrade header, prior knowledge, and the Alt-Svc HTTP header or HTTP/2 frame. All these methods assume the use of TCP initially, however. Because QUIC is based on UDP, a web browser connecting to a web server would have to start a connection over TCP and then upgrade to QUIC.[52] This process introduces a dependency on HTTP over TCP and therefore negates one of the key benefits of QUIC (dramatically reduced connection establishment time). Alternatives include trying both TCP and UDP at the same time or accepting the initial performance hit, perhaps remembering next time that the server uses QUIC. Regardless, the ALPN and Alt-Svc identifier h3 will be registered for HTTP/3 (this was originally hq, for HTTP over QUIC, before the HTTP/3 name was agreed on). This identifier should be used only for the official iQUIC when it becomes standardized; current gQUIC implementations shouldn't use this reserved value.[53]

QPACK

HPACK, which is used for header compression, depends on the guaranteed nature of TCP to ensure that HTTP header frames are received in order, so that the dynamic table can be maintained correctly on both sides, as shown in figure 9.20.

Figure 9.20. HPACK compression example

Request 2 uses header indices defined in request 1 (62 and 63). If part of request 1 is lost, so that the header can’t be read in full, the state of the dynamic table can’t be known, so request 2 can’t be processed until the missing packets are received, as, otherwise, the incorrect references could be used. QUIC aims to remove the need for guaranteed in-order packet delivery at connection level to allow streams to be processed independently, but HPACK still requires this guarantee (at least for header frames), reintroducing HOL blocking, which is the very problem it’s trying to solve.
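
A stripped-down Python sketch of the dependency: request 2 refers to dynamic-table entries that request 1 was supposed to create, so it can't be decoded safely until every packet of request 1 has arrived. This models only the indexing idea, not the real HPACK wire format, and the header values are made-up examples:

# Minimal model of an HPACK-style dynamic table shared across requests.
STATIC_TABLE_SIZE = 61
dynamic_table = []   # entry 62 is dynamic_table[0], entry 63 is dynamic_table[1], ...

def define_header(name, value):
    # Request 1 sends the header literally; both sides add it to the dynamic table.
    dynamic_table.insert(0, (name, value))

def lookup(index):
    # Request 2 refers to headers only by their index.
    return dynamic_table[index - STATIC_TABLE_SIZE - 1]

# Request 1 arrives in full: the table now holds entries 62 and 63.
define_header(":authority", "www.example.com")
define_header("user-agent", "ExampleBrowser/1.0")

# Request 2 can now be decoded from its indices alone...
print(lookup(63), lookup(62))

# ...but if the packet carrying request 1's header definitions had been lost,
# these lookups would be meaningless until the retransmission arrived, so the
# decoder must stall: head-of-line blocking at the header-compression layer.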

HTTP/3, therefore, needed a variation on HPACK, which was called QPACK (for obvious reasons). This variation is complex and is still being defined at this writing, but it appears to introduce the concept of acknowledged headers. If a sender needs to use an unacknowledged header, it can use it (and risk being blocked on that stream) or can send the header with literals (preventing blocking at the cost of less efficient compression for that header value).

QPACK introduces a few other changes. A bit defines whether the static or dynamic table is used (rather than explicitly counting from 61, as per HPACK). Also, headers can be duplicated more easily and efficiently to allow key headers (such as :authority and user-agent) to remain near the top of the dynamic table and be transferred in fewer bits.

Other differences

There are a few other changes in the frames and streams used by QUIC.[54] Some of the Transport layer protocols’ frames are removed from the HTTP/3 layer (such as PING and WINDOW_UPDATE frames) and moved to the core QUIC-Transport layer, which isn’t HTTP-specific (which makes sense as these frames are likely to be used for non-HTTP protocols over QUIC). Also, the CONTINUATION frame, which was little used in HTTP/2, has been dropped from HTTP/3. There are also some frame formatting changes, but because the protocol is still evolving at this writing, I won’t discuss them here. Conceptually, nearly all of HTTP/2 remains in one format or another, and readers who have made it this far will have a good grounding in QUIC and HTTP/3 when they’re formally standardized and become available for client and server implementations.

9.2.6. QUIC tools

Because QUIC hasn’t yet been standardized, only gQUIC is available in the wild, though many developers are working on iQUIC implementations.[55] Often, the best tool to use to see QUIC is Chrome when it’s connected to a Google server. A net-export page similar to HTTP/2 (see section 4.3.1) is available. When you click a QUIC session, you see a screen like figure 9.21.

Figure 9.21. Viewing QUIC data from Chrome

Other tools, such as Wireshark, have some support for gQUIC, as shown in figure 9.22.

Figure 9.22. gQUIC in Wireshark

Because gQUIC isn’t standardized and still being changed by Google, it needs to keep up with any changes Google makes. In my experience, you may find malformed packets or encrypted payloads that can’t be read by non-Google tools for this reason.

9.2.7. QUIC implementations

The story is similar if you want to implement a QUIC server. Caddy had an implementation of gQUIC based on the QUIC implementation in the Go programming language, but that implementation has been turned off as of this writing in the current release version.[56] It’s available through compiling Caddy from source code and should make it to the next release. The Go version[57] is usually kept up-to-date, so if you download the latest version, Chrome should be able to speak gQUIC to it. Similarly, LiteSpeed has had a QUIC implementation since June 2017[58] and has kept its implementation up-to-date, but the open source version doesn’t support it yet, so it’s not a great tool for experimenting with QUIC unless you’re already using LiteSpeed. LiteSpeed also open sourced a QUIC client[59] that could be useful. More recently, Akamai announced gQUIC support on its content delivery network platform in May 2018,[60] and in June 2018, Google announced gQUIC support for its Google Cloud Platform load balancer,[61] so those who use that platform get gQUIC straight from the horse’s mouth, so to speak.

9.2.8. Should you use QUIC?

Unlike SPDY, gQUIC hasn’t been taken up by much of the wider community, which seems unlikely to happen now that iQUIC is being standardized. At this point, it’s difficult to recommend QUIC except when using the Google cloud platform. For anyone who wants to experiment with QUIC, Go probably is the best option, but it probably shouldn’t be used in production for browsers. The browser implementation in Chrome is liable to change quite a bit, and Chrome switches off older versions of gQUIC in browsers quickly after rolling out new versions.

After iQUIC is standardized, I expect more implementations to crop up; there are fewer production implementations at this stage of standardization than there were for SPDY. I suspect that the roll-out of QUIC and HTTP/3 will take longer than the roll-out of HTTP/2, as it's a much bigger change and because it uses UDP rather than TCP. QUIC is a protocol to watch for the future, and a few years after standardization, I expect developers to be where they are now with HTTP/2, with use rapidly increasing and eventually becoming the majority player on the web landscape. QUIC adoption for a lot of web traffic may happen quickly, with a few players (such as Google) and CDNs serving the majority of traffic, but the long tail of smaller companies and servers will likely remain on the older TCP and HTTP/2 (or even HTTP/1.1) stack for some time.

Summary

  • The current HTTP network stack has several inefficiencies in the TCP and HTTPS layers.
  • Because of TCP connection establishment and cautious congestion control, it takes time for a TCP connection to reach maximum capacity, and HTTPS handshaking adds more time.
  • Innovations that resolve these inefficiencies exist, but on the TCP side in particular, they’re slow to roll out.
  • QUIC is a new protocol built on UDP.
  • By using UDP, QUIC aims to innovate much faster than TCP can.
  • QUIC builds on HTTP/2 and uses many of the same concepts with additional innovations.
  • QUIC isn’t intended for HTTP only; it may also be used for other protocols in the future.
  • HTTP over QUIC will be called HTTP/3.
  • QUIC is available in two versions: Google QUIC (gQUIC), which is available in a limited fashion but isn't standardized, and IETF QUIC (iQUIC), which is currently being standardized.
  • gQUIC is expected to be replaced by iQUIC when it's approved, much as SPDY was replaced by HTTP/2.