Implementing HTTP from Scratch: An Overview
August 22, 2025
Recently, I decided to implement the HTTP/1.1 protocol from scratch in Common Lisp. The motivation is mainly educational; I want to understand what goes on under the hood when I use a web browser, or a lower-level extensible server like Hunchentoot. I have some vague aspirations of eventually supporting HTTP/2 if this goes well, but for now we'll consider that a mere pipe dream.
This is the first of a series of posts where I'll summarize the project and explain some of the foundations to the best of my ability. Please keep in mind as you read: I am by no means an expert on this subject. These posts are part of the learning process, and are therefore likely to contain inaccuracies and mistakes.
For far more extensive user-friendly documentation on HTTP, you may want to peruse the lovely and open-source MDN Web Docs HTTP reference pages, maybe starting with the Overview of HTTP. Or go straight to the sources:
Ok, so what actually are you talking about?
Zooming out a sec
The internet is just a bunch of computers talking to each other. In general, all web pages (and the internet at large, and data on "the cloud") are stored at a physical location, on physical memory, on someone else's computer(s).[1]
When a computer is configured to listen for incoming requests and reply back with content, it is referred to as a server. Any computer with network capability can act as a server, including your laptop or phone, though there's also specialized hardware for this purpose. Don't be fooled: they're just big, fancy computers. Like your laptop, but these go to eleven. The particular software on the computer doing the listening may also be referred to as a server, and there may be more than one per computer (e.g. a Minecraft server).
The computer making the request, such as the device you're reading this on, is referred to as the client. When you visit dieggsy.com, your phone (the client) is requesting information from a different computer (the server) pointed to by that URL. The server decides whether you should get any information, and what it should be. The particular software on the computer requesting the information may also be referred to as a client, and there may be more than one per computer. Think of all the apps you use that access the internet, like the browser itself, as clients.
The information is requested and received using the HyperText Transfer Protocol (HTTP).
The HTTP stack
HTTP is a protocol by which computers communicate. It's essentially an agreed-upon set of rules describing the syntax of instructions and the behavior of the client and server. It is not the only network communication protocol, just a very popular one that most people with computers and phones interact with on a daily basis. It's also not the lowest level protocol; it sits one or a few layers below what you see in your web browser (Mozilla Firefox, Google Chrome, Safari, etc.), and sits on top of a few other technologies.
Below is a diagram representing the classic HTTP tech stack (modified from the MDN Web Docs link in the footnotes).
                       ╭ ╭────────┬───────┬────────────╮ ╮
      Client requests, ┤ │  HTML  │  CSS  │ JavaScript │ ├ Client receives,
          Server sends ╰ ╰────────┴───────┴────────────╯ ╯ displays web pages
          │              ╭─────────────────────────────╮
          ╰── using ──── │            HTTP             │ ────────╮
                         ╰─────────────────────────────╯         │
               ╭ ╭─────╮ ╭──────┬──────────────────────╮    transmitted
   Turns names ┤ │ DNS │ │      │   TLS (encryption)   │        via
into addresses ╰ ├─────┤ │      └──────────────────────┤ ╮       │
                 │ UDP │ │             TCP             │ │       │
                 ╰─────╯ ╰─────────────────────────────╯ ├───────╯
                 ╭─────────────────────────────────────╮ │
                 │                 IP                  │ │
                 ╰─────────────────────────────────────╯ ╯
(You might have to scroll right on mobile)
Let's break it down, starting from the bottom up. Each of these could be a whole blog post of its own, and one I'm even less qualified to write. I'll link to relevant Wikipedia pages if you want to do more reading.
IP
IP stands for Internet Protocol. It's sort of like the "mailman" of the internet. The Internet Protocol concerns how to actually take the data from computer A to computer B by hopping across devices in the network. IP addresses are very much like house addresses; they indicate how to find a specific computer across the net. They are traditionally represented as a sequence of four period-separated integers like 192.168.0.4 (the IPv4 format).
TCP
TCP stands for Transmission Control Protocol. It's basically a verification layer on top of IP. It establishes connections between computers, and is in charge of ensuring data is transferred correctly. Most famously, it ensures that no data is lost in transmission. If a TCP server or client does not receive acknowledgement that some data was received, it is re-sent. Data is also re-sent if it appears to be corrupted.
TCP introduces the idea of port numbers, which represent separate software interfaces on which software servers can be established. An IP address along with a port number can identify a particular software server running on a computer; the same IP address with a different port number may identify a different software server on the same computer. They can handle completely independent connections with completely independent clients and data.
As a programmer, you'll find TCP is actually already useful for communication between computers! The operating system (Linux, Android, macOS, iOS, that other one) tends to provide the basic interface for this, and the programmer can just "use" it. We can ignore all the layers above this and use TCP to send arbitrary data between two computers:[2] text, video, chat, you name it. All data is sent as a stream of bytes, which, rather than a piranha-infested river, is just a way of thinking about data transmission as a sort of… stream of bytes.[3]
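To make "just use it" concrete, here is a minimal sketch in Python (not the Common Lisp this project uses) of sending arbitrary bytes over TCP. It assumes everything runs on one machine: a throwaway echo server and a client talk over 127.0.0.1. The names echo_once and send_and_receive are made up for this illustration.

```python
import socket
import threading


def echo_once(server_sock: socket.socket) -> None:
    """Accept one connection and send back whatever bytes arrive."""
    conn, _addr = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:  # empty read: the client finished sending
                break
            conn.sendall(data)


def send_and_receive(payload: bytes) -> bytes:
    """Start a throwaway echo server on this machine and talk to it over TCP."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server_sock.listen(1)
        port = server_sock.getsockname()[1]
        worker = threading.Thread(target=echo_once, args=(server_sock,))
        worker.start()

        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(payload)
            client.shutdown(socket.SHUT_WR)  # signal that we are done sending
            chunks = []
            while True:
                chunk = client.recv(1024)
                if not chunk:  # the server closed its side
                    break
                chunks.append(chunk)
        worker.join()
    return b"".join(chunks)
```

Calling send_and_receive(b"you name it") should hand back the same bytes. Note that nothing here says what the bytes mean; that is the gap the layers above TCP fill.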
The reason the stack doesn't end here is that we need publicly agreed-upon standards for how to secure data, what it should look like, and more information about how it is sent and received. Otherwise we'd have a Network of Babel situation.
TLS
TLS stands for Transport Layer Security. TCP is not particularly secure on its own. It does not provide clients with a way to verify the identity of the server, or ensure that intermediate devices cannot tamper with the data sent between devices. TLS provides both by encrypting and signing data sent over TCP connections. In simple terms, encryption is when the server garbles data so that only a particular client knows how to read it. Signing is a way of sending along a stamp of approval identifying that the server is authentic.
I only have a slightly deeper understanding of TLS than I've described. It's a fairly complex subject and shouldn't be taken lightly in real-life public-facing software. Luckily, there are software libraries written by experts that take care of this for us, such as OpenSSL.
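As a rough illustration of leaning on such a library, here is a sketch using Python's standard ssl module, which is itself a wrapper around OpenSSL. The function names are hypothetical, and open_tls_connection needs a real network and a real TLS server to do anything, so treat it as a shape rather than a recipe.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS configuration with the library's safe defaults."""
    # The defaults verify the server's certificate and check that it
    # matches the hostname we asked for.
    return ssl.create_default_context()


def open_tls_connection(host: str, port: int = 443) -> ssl.SSLSocket:
    """Open a TCP connection, then layer TLS on top of it."""
    context = make_client_context()
    tcp_sock = socket.create_connection((host, port))
    try:
        # The handshake (identity check, key agreement) happens here.
        return context.wrap_socket(tcp_sock, server_hostname=host)
    except Exception:
        tcp_sock.close()
        raise
```

The point of the sketch is the layering: the TLS socket is the plain TCP socket with encryption and identity checks wrapped around it, which matches the TLS box sitting inside the TCP box in the diagram.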
TLS is not technically required by basic TCP or HTTP,[4] but modern web browsers strongly push for it for security reasons, for example by warning you about unencrypted pages.
DNS
DNS stands for Domain Name System. It is not technically required in this stack. Computers would be happy to talk to each other using only IP addresses and port numbers, but that's not particularly useful to humans. Humans would like to remember websites they visit, or associate them with known entities like companies or people.
DNS describes how human names, called domain names, are assigned to IP addresses. In simple terms, once you have a computer with an IP address, you can register a domain name for it, like dieggsy.com. This typically costs money and there's a whole market around it.[5]
Domain names are usually registered under a particular Top Level Domain (TLD), like .com or .org. When you register your domain, like dieggsy.com, you also get everything to the left of it, so you're free to point foo.dieggsy.com or foo.bar.dieggsy.com (and so on) to the same IP address, or a different one. You can also point totally different domain names to the same IP address.
When your computer connects to dieggsy.com, it's actually first doing a DNS lookup to determine what the actual IP address of that domain name is, then a connection is established using TCP (optionally with TLS). Oh, and DNS lookup does happen over the Internet Protocol, but traditionally using UDP instead of TCP, and that's all I'll say about that for now.
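Since the OS handles the lookup, asking for it from a program is short. A minimal Python sketch (again, not the project's Common Lisp), where resolve is a hypothetical helper name and the answer depends entirely on your network and resolver:

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the IP addresses the OS resolver reports for a hostname."""
    results = socket.getaddrinfo(hostname, None, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in results:
        ip = sockaddr[0]  # first element of the address tuple is the IP string
        if ip not in addresses:  # keep order, drop duplicates
            addresses.append(ip)
    return addresses
```

Something like resolve("dieggsy.com") would return whatever addresses the domain currently points to. Passing a string that is already an IP address simply returns it, since there is nothing to look up.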
Wait, what about port numbers?
DNS doesn't actually include port numbers. The standard way of specifying them is to specify the port number after the IP address or domain, such as dieggsy.com:80, or more familiarly, to use a protocol with an implied or agreed-upon port number before the domain, like http://dieggsy.com. HTTP defaults to port 80 as a standard. HTTPS, which is just HTTP using TLS, defaults to port 443 as a standard.
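The "explicit port, else the protocol's default" rule can be sketched in a few lines of Python. The helper name host_and_port and the DEFAULT_PORTS table are made up for this illustration, and it only knows the two schemes mentioned above.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes discussed above.
DEFAULT_PORTS = {"http": 80, "https": 443}


def host_and_port(url: str) -> tuple[str, int]:
    """Work out which host and port a URL asks us to connect to."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"no host in URL: {url!r}")
    port = parts.port  # None unless the URL spells one out
    if port is None:
        port = DEFAULT_PORTS[parts.scheme]  # fall back to the scheme's default
    return parts.hostname, port
```

So http://dieggsy.com maps to port 80, https://dieggsy.com to port 443, and a URL with an explicit port such as http://dieggsy.com:8080 keeps the port it names.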
HTTP, finally
Got it. Cool. Fundamentals. Neat. What's HTTP?
As I mentioned, we could use TCP (+ TLS) to reliably communicate between computers. But we'd have to come up with a language to speak with. How should you ask the server for information? You could send the sequence of characters "Tell me what I want to know, or else...", but it's highly unlikely the server speaks English (though I wouldn't put it beyond kids these days with their newfangled toys). And once you know how to ask, how should you receive it? If the server sends you a document, or an image, it arrives as a stream of bytes. How do you know when you've received the whole document, or what kind of image it is, or whether it's an image at all?
Enter HTTP. A specific British man called Timothy Berners-Lee came around and said, "hey guys, why don't we do it like this?" And we said, "Sure thing TimBL, that sounds swell." And now he's Sir Timothy Berners-Lee. Yep, guess the Queen of England thought it was pretty swell too.
I'll try to keep this relatively simple, since I plan to write more detailed posts about this subject as my project progresses. I'm going to omit some details, which might annoy you if you already know about this stuff.
The HTTP message
The core of the HTTP protocol is the HTTP message. It's a particular way to organize the data used to send instructions and extra information between client and server, in addition to the raw bytes of the actual content itself. The HTTP standard defines the syntax of messages, such as allowed characters and exact format.
HTTP messages come in two flavors:
- Requests are used by the client to give the server instructions, such as asking for a website or downloading and uploading files. There are other types of requests though, including for deleting data (imagine you're removing a social media post or account).
- Responses are used by the server to send back data, including whether it understood or accepted the request, the kind of data it's sending back, and the actual bytes of the data itself (like website contents or files).
Here's what a typical, simple request might look like if you visit this exact page (ignoring TLS for simplicity):
GET /http-implementation-overview.html HTTP/1.1
Host: dieggsy.com
To recap:
- You type or click on http://dieggsy.com/http-implementation-overview.html.
- Your browser looks up what IP address dieggsy.com is associated with (using DNS).
- It starts a TCP connection to the server on that IP address using port 80 (because of the http:// bit).
The browser now quite literally sends the server the text above, as bytes over the TCP connection.[6] GET is the name of the method used to ask for content. There are other methods; for example, POST is used to upload data.
What the above GET request says is: "Please send me the content you understand to be under the path /http-implementation-overview.html, using HTTP protocol version 1.1. I visited using the dieggsy.com domain."
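Here is a small Python sketch of turning that request into the bytes that go over the wire. One detail not covered above, which comes from the HTTP/1.1 specification: each line ends with a carriage return plus line feed, and an empty line marks the end of the headers. The function name build_get_request is hypothetical.

```python
def build_get_request(path: str, host: str) -> bytes:
    """Assemble a bare-bones HTTP/1.1 GET request as bytes."""
    lines = [
        f"GET {path} HTTP/1.1",  # method, path, protocol version
        f"Host: {host}",         # the domain the client used
        "",                      # empty line: end of the headers
        "",                      # (joining adds the final line ending)
    ]
    return "\r\n".join(lines).encode("ascii")
```

For the page above, build_get_request("/http-implementation-overview.html", "dieggsy.com") produces the two visible lines followed by an empty one. Real browsers send many more headers than this.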
If there's a server listening at that address and port number, it may then literally send the following text as bytes. (A listening server is not guaranteed; its absence is sometimes what we mean when we say a website is "down".)
HTTP/1.1 200 OK
Content-Length: 2000
Content-Type: text/html

<!DOCTYPE html>
<html lang="en">
<head>
...
[truncated for brevity]
What the server is saying is: "Using HTTP protocol 1.1, I understood your request and have accepted. I'm sending you 2000 bytes of data, and it is HTML text."
200 is just the agreed upon status code for "you got it, no problem."
The Content-Length is important so that the client knows that it has received the full content and can stop reading bytes from the TCP connection stream.[7] The Content-Type is also important, so that your browser can determine what to do with the data: display it as a web page, show a video, download it as a file, etc. The request and response may also contain additional metadata about the information requested or sent; these are called headers. Host, Content-Length, and Content-Type are all headers.
Then follow the actual bytes of content, called the message body. In this case, it's the HTML that makes up the website (that's the <!DOCTYPE html>... part).
Once the browser is done reading the entire HTML document, it can render it as the nice website you're used to.
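To show how a client might pick such a response apart, here is a simplified Python sketch. It assumes the whole response is already in memory and that the body length is given by Content-Length; it ignores the other ways of delimiting a body and most error handling. parse_response is a made-up name, not part of any library or of this project.

```python
def parse_response(raw: bytes) -> tuple[str, int, str, dict[str, str], bytes]:
    """Split a complete HTTP/1.1 response into its parts."""
    # The first empty line separates the headers from the body.
    head, _sep, rest = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)

    headers = {}
    for line in header_lines:
        name, _colon, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()

    # Content-Length says how many body bytes belong to this message.
    length = int(headers.get("content-length", "0"))
    body = rest[:length]
    return version, int(code), reason, headers, body
```

Given a response like the one above, this yields the version string, the status code 200, the reason "OK", a dictionary of headers, and exactly as many body bytes as Content-Length promised.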
Wait wait, what's this HTML/CSS/JavaScript stuff?
Speed run:
- HTML is just a particular text format that the browser understands and uses to render and format websites using some default rules. It was actually also invented by Sir TimBL himself. HTML on its own is sufficient for very simple, relatively nice but rather plain-looking websites with formatting like bold and italicized text, bullet lists, and large titles, among other things.
- CSS is a separate but complementary text format understood by the browser to style HTML, to add colors, introduce additional spacing and layout, and overall make it look pretty (or hideous depending on your taste).
- JavaScript is a programming language that the browser runs to execute scripts used for a large variety of tasks from complex interactivity to "web apps" and in-browser games. Congrats, your devices are constantly running other people's shitty code. Don't worry about it too much, it's usually sandboxed (heavily restricted).
Web pages don't need to use HTML (or CSS, or JavaScript). The above response could just as well have been the following in its entirety:
HTTP/1.1 200 OK
Content-length: 12
Content-type: text/plain

Hello there!
In which case the browser would happily display to you just the following text literally, with no additional decoration:
Hello there!
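Putting the pieces together, a server that always answers with this exact response fits in a short Python sketch. This is a toy under loud assumptions: it handles one connection, ignores what the request actually asks for, and runs client and server on the same machine. It is not the design of the server this project will build, and serve_one and fetch_hello are hypothetical names.

```python
import socket
import threading

# The plain-text response from above, with explicit line endings.
RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 12\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"Hello there!"
)


def serve_one(server_sock: socket.socket) -> None:
    """Accept one connection, read the request headers, send the response."""
    conn, _addr = server_sock.accept()
    with conn:
        request = b""
        while b"\r\n\r\n" not in request:  # read until the end of the headers
            chunk = conn.recv(1024)
            if not chunk:
                break
            request += chunk
        conn.sendall(RESPONSE)


def fetch_hello() -> bytes:
    """Run the toy server on a free local port and fetch its response."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server_sock.listen(1)
        port = server_sock.getsockname()[1]
        worker = threading.Thread(target=serve_one, args=(server_sock,))
        worker.start()
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
            received = b""
            while True:
                chunk = client.recv(1024)
                if not chunk:  # the server closed the connection
                    break
                received += chunk
        worker.join()
    return received
```

The client here cheats by reading until the server hangs up. A proper client would stop after the 12 bytes announced by the length header, which is what makes it possible to reuse one connection for several requests.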
HTTP, the rest
The rest of the protocol describes additional semantics of requests, including behavior like:
- How to manage connections
- Recommended or required behavior like handling errors in HTTP messages, or requests for non-existent or restricted content.
- Handling of additional metadata including ways to indicate compressed data
- Basic authentication schemes (login)
- Alternate ways to encode and transfer the content data
- Extensions to the standard
I'll elaborate more on some of these in later posts.
Everything I've described so far pertains to version HTTP/1.1 of the protocol, which is still pretty widely used and supported. HTTP/2 is somewhat similar semantically, but contains upgrades to the syntax of messages and efficiency of data transfer. To my understanding, HTTP/3 is significantly different; it even uses a different transport protocol.
Project goals and scope
You've sure been talking a lot. What are you actually doing?
The goal is to write my own server and client software in Common Lisp. Here are the rules:
- I must write any features or capability defined by the HTTP standard myself.
- I may use existing code and libraries for additional features or helper code not specified by the HTTP standard.
- I must focus on standard compliance and correctness
- I should not prioritize performance (speed), especially not before the server is fully capable (whoops)
Here are some of the components I will implement myself (some of which I haven't discussed yet):
- HTTP/1.1 message parsing
- Chunked Transfer encoding; I'll cover this in a later post
- A minimal and extensible HTTP server that behaves in accordance with the HTTP/1.1 standard and is capable of handling multiple connections in parallel. The server will be able to serve content to clients, including web browsers.
- A low-level, minimal and extensible HTTP client that behaves in accordance with the HTTP/1.1 standard. The client will be able to download content from the web and upload content to it.
I may implement support for:
- WebSockets
- HTTP/2 (again, pipe dream)
I may use existing libraries to support:
- TLS
I will not be implementing:
- DNS, TCP, or IP (of course, since these are handled by the OS)
- Support for HTTP versions below HTTP/1.1 (sheesh, update your software)
- Data compression (…though this would be a fun aside, separately)
- Anything even remotely close to a browser that renders content or understands HTML, CSS, or JavaScript.
Anyway, that's it. Wish me luck. If you don't hear from me on this again, maybe I gave up. See you on the flip.
Footnotes:
1. Okay, this is oversimplified, but it's a useful simplification. Websites can also be stored or duplicated across multiple computers, or generated on the fly, or generated locally by your browser using JavaScript. Still, the instructions to do this ultimately sit on someone else's computer(s).
2. TCP can also facilitate communication between a software client and server on the same computer. This is kinda no different to TCP connections between different computers; the client software just requests data from its own computer's IP address on a particular port. There's a special IP address that always refers to "this computer" (127.0.0.1) that prevents the data having to go through other network devices at all.
3. Think of it like a river of binary data flowing between computers over the network, in chunks of 8 bits called a byte. We generally don't have to think too much about how to turn data (text, files) into bytes; we have existing interfaces that do this for us.
4. HTTP/3 does require it, but it's also built on top of UDP and not TCP, so that's fun.
5. In fact you don't need an IP address to register a domain. You can just register it and hold on to it without using it. You may do this in good faith, for example because you have plans for a future website. Some people do this to re-sell domains that are or may become high-value; this is called domain squatting. Don't do that, probably.
6. If you're curious about how text becomes or is represented as bytes, congrats, you've discovered character encoding. Godspeed.
7. In fact, there are other ways to indicate to the client when to stop reading the data, but we'll cover that later.