QUIC and HTTP/3: The Next Step in Web Performance

A comprehensive guide to configuring your site

I've been deploying web sites for many years, and pentesting many more, where I see plenty of misconfiguration. Throughout this journey, I've witnessed firsthand the significant impact that protocol updates can have. The introduction of HTTP/2 fundamentally altered how we approach web service delivery, prioritizing efficiency and speed. Now, HTTP/3 is poised to do the same, ushering in a new era of web performance and potentially even security.

How did we get here?

Since the invention of the web in 1991, we've seen steady progress in the capabilities of the fundamental building blocks of the web: HTTP, HTML, and URLs.

  • HTTP/0.9: 1991, no RFC
    • GET only, HTML only
  • HTTP/1.0: 1996, RFC1945
    • POST and other verbs, MIME
  • HTTP/1.1: 1997, RFC2068, RFC2616
    • Keepalive, pipelining, host header, updated in 2014
  • HTTP/2: 2015, RFC7540
    • SPDY, binary protocol, multiplexed streams
  • HTTP/3: 2022, RFC9114
    • QUIC, UDP-based transport

What did HTTP/2 change?

Binary protocol

Switching to a binary protocol represented a major shift in HTTP's architecture, and made several other options possible. This enhanced protocol was initially available in the form of "SPDY", and was implemented in several browsers and servers as an experimental extension that eventually evolved into HTTP/2.
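
To make that shift concrete, every HTTP/2 frame starts with a fixed 9-byte binary header: a 24-bit payload length, an 8-bit frame type, 8 bits of flags, and a reserved bit followed by a 31-bit stream identifier. Here's a minimal Python sketch that packs and parses such a header; the values are illustrative, not taken from a real capture.

    import struct

    def parse_frame_header(header: bytes):
        """Decode the fixed 9-byte HTTP/2 frame header (RFC 7540, section 4.1)."""
        if len(header) != 9:
            raise ValueError("HTTP/2 frame headers are exactly 9 bytes")
        # 24-bit length (split as 1 + 2 bytes), 8-bit type, 8-bit flags,
        # then a reserved bit plus a 31-bit stream identifier.
        length_hi, length_lo, frame_type, flags, stream_id = struct.unpack("!BHBBI", header)
        length = (length_hi << 16) | length_lo
        stream_id &= 0x7FFFFFFF  # the top bit is reserved
        return length, frame_type, flags, stream_id

    # A HEADERS frame (type 0x1) with END_HEADERS (0x4) set, on stream 1, 16-byte payload.
    example = struct.pack("!BHBBI", 0x00, 16, 0x1, 0x4, 0x1)
    print(parse_frame_header(example))  # (16, 1, 4, 1)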

Header compression

Text-based protocols don't lend themselves well to operations like compression and encryption, and the move to binary framing allowed HTTP/2 to compress HTTP headers (with HPACK), not just the body.
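
As a rough illustration of the difference this makes, the sketch below uses the third-party hpack package (from the hyper/h2 ecosystem) to encode a small header block; the header values are made up, and the package choice is my assumption rather than anything this article depends on.

    # pip install hpack
    from hpack import Encoder, Decoder

    encoder = Encoder()
    decoder = Decoder()

    headers = [(":method", "GET"), (":path", "/"), (":scheme", "https"),
               ("user-agent", "example-client/1.0")]

    wire = encoder.encode(headers)
    print(len(wire), "bytes on the wire")  # far smaller than the plain-text form
    print(decoder.decode(wire))            # round-trips back to the header list

Because HPACK keeps a dynamic table of previously seen headers on each connection, repeated headers on later requests compress down to just a few bytes each.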

Multiplexing

HTTP initially attempted to improve performance for parallel response delivery by allowing multiple TCP connections (typically defaulting to six per domain in most browsers). However, this approach also increased memory consumption and latency due to each connection requiring a full TCP and TLS handshake. This overhead is readily apparent in browser developer tools. Multiplexing, introduced in HTTP/2, addressed this by enabling the transfer of multiple resources over a single TCP connection concurrently. This marked a significant improvement over the pipelining and keepalive mechanisms of HTTP/1.1. Multiplexing allows for dynamic rescheduling of resource delivery, enabling a critical but smaller JSON response to bypass a larger, less important image download, even if it was requested later.
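
From the client's point of view, multiplexing is mostly invisible; the sketch below uses the third-party httpx library with its optional HTTP/2 support to issue two requests that share one connection. The URLs are placeholders, and httpx is just one of several clients that can do this.

    # pip install "httpx[http2]"
    import httpx

    with httpx.Client(http2=True) as client:
        # When the server negotiates HTTP/2, both requests are multiplexed over
        # a single TCP+TLS connection instead of opening one connection each.
        small = client.get("https://example.com/api/data.json")
        large = client.get("https://example.com/assets/hero.jpg")
        print(small.http_version, large.http_version)  # "HTTP/2" when negotiated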

Server push

Server push eliminated some round trips, for example allowing multiple image or JavaScript sub-resources to be speculatively bundled into the response to a single request for an initial HTML document. Despite its promise, especially for mobile applications, this approach has seen little use in practice.

TLS-only

Despite a great deal of push-back from corporate interests, and despite the fact that the specification technically allows HTTP/2 over unencrypted connections, browser makers rejected that premise entirely: all popular implementations only support HTTP/2 over HTTPS, raising the security floor for everyone.

What problems does HTTP/2 have?

Head-of-line blocking

In early HTTP, every resource transfer required setting up a new TCP connection. HTTP/1.1 added pipelining and keepalive, allowing multiple requests and responses to use the same connection, removing a chunk of overhead. HTTP/2's multiplexing extended this, allowing dynamic reordering and reprioritisation of those resources within the connection, but both mechanisms are subject to the same problem. If the transfer at the front of the queue is held up, all of the responses queued up on that connection will stall, a phenomenon known as head-of-line blocking. With HTTP/2 this happens at the TCP layer: a single lost packet holds up every stream multiplexed on that connection until it is retransmitted.
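
The effect is easy to see with a toy in-order receiver (an illustration only, not real TCP): segment 1 is delayed, so segments 2 to 5 sit in the buffer undelivered even though they arrived promptly.

    # Toy model of in-order delivery: nothing can be handed to the application
    # until the missing segment at the front of the queue finally arrives.
    arrivals = [2, 3, 4, 5, 1]          # order in which segments reach the receiver
    buffer, next_expected, delivered = {}, 1, []

    for seq in arrivals:
        buffer[seq] = f"segment {seq}"
        while next_expected in buffer:  # only in-order data can be released
            delivered.append(buffer.pop(next_expected))
            next_expected += 1
        print(f"after segment {seq}: delivered so far = {delivered}")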

Network switching

An individual HTTP client connection is usually identified by the combination of client and server IP addresses and port numbers. When a client transitions between networks, for example moving from WiFi to mobile when leaving the house, the client's half of that identity changes. This necessitates a completely new TCP connection with the new values, incurring the overhead of setting up TCP and TLS from scratch. In situations where connections change rapidly, for example on a high-speed train where connections are handed off between cell towers, or in high-density networks such as a stadium, this can result in clients continuously reconnecting, with a dramatic impact on performance.
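
You can see that identity directly on a socket: the sketch below prints the local and remote halves of the 4-tuple that defines a TCP connection (example.com is just a placeholder host). Change networks and the local address changes, so the connection cannot survive the move.

    import socket

    # The (local IP, local port, remote IP, remote port) 4-tuple is the
    # connection's identity; QUIC instead uses connection IDs that survive
    # a change of address.
    with socket.create_connection(("example.com", 443)) as sock:
        print("local :", sock.getsockname())   # e.g. ('192.168.1.10', 54321)
        print("remote:", sock.getpeername())   # e.g. ('93.184.216.34', 443)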

It's stuck with TCP

HTTP/2 is built on TCP, and as such inherits all of its shortcomings. TCP was designed 50 years ago, and while it has done remarkably well, it has problems on the modern Internet that its creators did not foresee. However, we are stuck with it: its implementation is typically buried in client and server operating system kernels, so we can't change it to suit one specific networking application, in this case HTTP.

TCP congestion control

One of the key things that can't easily be changed in TCP is the set of congestion control algorithms that kick in on busy networks. A great deal of research over the last 50 years has produced approaches to handling congestion that are superior to what most TCP stacks ship with, and our inability to deploy them widely is an ironic bottleneck of its own.
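
To illustrate how far outside HTTP's control this is: on Linux the congestion control algorithm is an operating-system setting, chosen per socket or system-wide. The sketch below assumes a Linux kernel with the bbr module available; on other platforms the option simply doesn't exist.

    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if hasattr(socket, "TCP_CONGESTION"):  # Linux only
        try:
            # Ask the kernel to use BBR for this socket; fails if it isn't loaded.
            sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, b"bbr")
            print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16))
        except OSError:
            print("bbr not available on this kernel")
    sock.close()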

What are QUIC and HTTP/3?

QUIC (originally a backronym for "Quick UDP Internet Connections", though that expansion is never used in practice) was started at Google in 2012. SPDY, which became HTTP/2, was a stepping stone towards improvements in the low-level protocols that we rely on to deliver the web. Fundamentally, QUIC is a reimagining of TCP. Because we can't replace TCP in every device in the world, it needed to be built on an existing lower-level protocol that provides a functional foundation, and a great fit for that is UDP, the User Datagram Protocol.

UDP is much simpler than TCP, and lacks features such as reliable delivery, connection identification, lost-packet retransmission, and packet reordering. The advantage is that it's very fast and has very little overhead. UDP is most commonly used for protocols that don't mind losing a bit of data here and there - you don't care much about a few pixels glitching in the middle of a video call's frame, or a little click in an audio call; it's more important that the stream keeps going. It's also used for DNS, in scenarios where you don't care which DNS server responds, so long as one of them does.
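
A minimal UDP sketch shows just how little the protocol does for you (127.0.0.1:9999 is a placeholder address): the datagram is handed to the network and the call returns immediately, with no handshake, acknowledgement, ordering, or retransmission. Everything QUIC needs beyond that, it has to build itself.

    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(b"hello", ("127.0.0.1", 9999))  # returns immediately, no handshake
    # If nothing is listening on that port, the data is silently lost; QUIC adds
    # reliability, ordering and retransmission on top of this bare datagram service.
    sock.close()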

At this point you're probably thinking, "but we need reliable delivery for the web!". That's true, and it's why we've used TCP to date. It provides the reliability guarantees we need, along with a bunch of other features we might not even use. But we can build reliable transports on top of unreliable...