If you were creating a web app from scratch today, what database would you use? Probably the most frequent answer I see to this is Postgres, although there are a wide range of common answers: MySQL, MariaDB, Microsoft SQL Server, MongoDB, etc. Today I want you to consider: what if SQLite would do just fine?

For those who are unfamiliar, SQLite is an implementation of SQL as a library — this means that rather than running a SQL server that your program talks to, you embed the SQL implementation directly in your program, and it uses a single file on disk as a backing store. SQLite is an incredibly popular database — you likely have dozens or hundreds of SQLite databases on whatever device you're reading this on, regardless of whether that's a smartphone or a tablet, and regardless of what manufacturer or browser it is. Historically, use of SQLite has usually been constrained to user programs (like browsers or apps), rather than servers. However, due to improvements in SQLite, hardware performance, and third-party libraries over the past ~decade, it now makes a lot of sense to use SQLite for many web projects.

In this post, I'll compare a typical "multitier architecture" with a web server and separate database server (Postgres, etc.) to running that same app all on a single machine using SQLite. I won't be comparing to more "cloudy" solutions (that'll be the subject of another post), but I think it's still useful to consider this sort of thing even if you end up using some sort of IaaS product. We'll look at scalability, performance, availability, operational complexity, and a few other things.

How scalable is SQLite, really?

There are three main metrics on which we might want a database to be able to scale: the total amount of data it can store, the read throughput, and the write throughput.

Write throughput is the area where SQLite struggles the most, but there's not a ton of compelling data online about how it fares, so I got some of my own: I spun up an Equinix m3.large.x86 instance, and ran a slightly modified1 version of the SQLite kvtest2 program on it. Writing 512-byte blobs as separate transactions, in WAL mode with synchronous=normal3, temp_store=memory, and mmap enabled, I got 13.78μs per write, or ~72,568 writes per second. Going a bit larger, at 32KB writes, I got 303.74μs per write, or ~3,292 writes per second. That's not astronomical, but it's certainly way more than most websites being used by humans need. If you had 10 million daily active users, each one could get more than 600 writes per day with that.

Looking at read throughput, SQLite can go pretty far: with the same test above, I got a read throughput of ~496,770 reads/sec (2.013μs/read) for the 512 byte blob. Other people also report similar results — Expensify reports that you can get 4M QPS if you're willing to make some slightly more involved changes and use a beefier server. Four million QPS is enough that every internet user in the world could make ~70 queries per day, with a little headroom left over4. Most websites don't need that kind of throughput.
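The real numbers above come from SQLite's C kvtest program, but the same kind of measurement is easy to sketch with Python's built-in sqlite3 module (expect noticeably lower numbers from Python's interpreter overhead). This uses the same configuration described above — WAL mode, synchronous=NORMAL, temp_store=MEMORY, mmap enabled, 512-byte blobs, one transaction per write:

```python
import os
import sqlite3
import tempfile
import time

# Same PRAGMAs as the benchmark described in the text.
path = os.path.join(tempfile.mkdtemp(), "bench.db")
db = sqlite3.connect(path, isolation_level=None)  # autocommit mode
db.execute("PRAGMA journal_mode=WAL")
db.execute("PRAGMA synchronous=NORMAL")
db.execute("PRAGMA temp_store=MEMORY")
db.execute("PRAGMA mmap_size=268435456")  # 256MB
db.execute("CREATE TABLE kv (k INTEGER PRIMARY KEY, v BLOB)")

blob = os.urandom(512)
n = 1000

start = time.perf_counter()
for i in range(n):
    # In autocommit mode, each INSERT is its own transaction.
    db.execute("INSERT INTO kv (k, v) VALUES (?, ?)", (i, blob))
write_secs = time.perf_counter() - start

start = time.perf_counter()
for i in range(n):
    db.execute("SELECT v FROM kv WHERE k = ?", (i,)).fetchone()
read_secs = time.perf_counter() - start

print(f"{n / write_secs:,.0f} writes/sec, {n / read_secs:,.0f} reads/sec")
```

The table name and blob sizes here are arbitrary; the point is just that the whole "tuned" configuration is a handful of PRAGMA statements.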

While mixed read/write workloads will be slightly slower than these numbers, the main problem that SQLite had with mixed read/write workloads (that writes would block reads) was fixed with WAL mode around a decade ago.
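The WAL-mode behavior is easy to see for yourself: a reader is not blocked by an in-progress write transaction, it just sees the last committed snapshot. A minimal sketch (file and table names here are arbitrary):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

writer = sqlite3.connect(path, isolation_level=None)
writer.execute("PRAGMA journal_mode=WAL")
writer.execute("CREATE TABLE kv (k INTEGER PRIMARY KEY, v TEXT)")
writer.execute("INSERT INTO kv VALUES (1, 'old')")

reader = sqlite3.connect(path, isolation_level=None)

# Open a write transaction and modify the row without committing.
writer.execute("BEGIN IMMEDIATE")
writer.execute("UPDATE kv SET v = 'new' WHERE k = 1")

# In WAL mode this read succeeds immediately and sees the
# pre-write snapshot, rather than blocking on the writer.
seen_during_write = reader.execute("SELECT v FROM kv WHERE k = 1").fetchone()[0]

writer.execute("COMMIT")
seen_after_commit = reader.execute("SELECT v FROM kv WHERE k = 1").fetchone()[0]
print(seen_during_write, seen_after_commit)  # -> old new
```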

If the server I rented seems too expensive to you, I also tested it out with the cheapest ($5/month) machine I could get on DigitalOcean, and ran the 512-byte test on it. While one should expect more volatile results, since it's a shared host, I got 35.217μs/write and 3.340μs/read (28,395 writes/sec, and 299,401 reads/sec) — ~2.6× the time for writes and ~1.7× the time for reads — not bad for ~3% the price.

In terms of the amount of data that it can store, SQLite scales pretty well: it can hold up to 281TB of data, which is I think more than the sum total storage space of every disk I've owned in my life. While Postgres can store a theoretically unlimited amount of data in total, it has a lower limit on table size than SQLite does — 32TB with the default block size, up to 128TB with the maximum allowed block size.

How do these numbers compare to real apps? Let's look at two examples: recurse.social, a Mastodon instance serving ~100 users on Postgres, and Lobsters5, a news aggregation site with ~14.5k users running on MariaDB.

The Lobste.rs database is the slightly larger of the two, weighing in at around 2200MB (largely composed of the story_texts, votes, and comments tables, in that order). It sees around 28 SELECTs / sec, 0.1 INSERTs / sec, 0.1 UPDATEs / sec, and 0.004 DELETEs / sec6. This is quite small — well within the capabilities of any competent database system on reasonable hardware7 — when someone looked at it a few years ago, they concluded that Lobsters would need to see about 2,000× the traffic before the database was a scaling bottleneck on an 8-core machine.

recurse.social, despite the much smaller number of users, still has a reasonably large database, since it has to sync with many other servers due to the nature of ActivityPub and the Fediverse. The total size of the database is 1543 MB (largely contained in statuses, mentions, conversations, and status_stats), just 0.0005% of SQLite's maximum database size. It sees around 2.7 transactions per second (Postgres doesn't easily let you separate this out into read/write transactions, sadly) — numbers that are trivially achievable on SQLite with no performance tuning and ancient hardware.

You might think that it'll be harder to scale SQLite, since it forces your web server and your database server to be on the same machine, which in turn forces you to only have a single server, meaning that you can't scale by just adding more web servers. However, in most cases, it's noticeably less efficient to run your database on a separate server from your webserver, since it massively increases the latency and decreases the bandwidth available between the database and the webserver. Comparing point query latency on SQLite and Postgres, Postgres is 9.5× slower when running on the same machine as the one doing the query, 17.8× slower when running on a different machine in the same AWS Availability Zone, and 50× slower when running in a different AZ but the same region.8

A significant part of the reason that people discount SQLite as a web database server is that in the past, it probably wasn't a good choice! SQLite has relatively quietly gotten significantly faster over the past decade — on my laptop, a recent version of SQLite completes the speedtest1 suite ~4.1× faster than a version of SQLite from 2010 on the same hardware (5.328s now, vs 21.849s then). Running speedtest1 on a $5/month DigitalOcean instance is around 10.8× faster than running the same benchmark with 2010 SQLite on a VPS running on an Intel Xeon E5-2450L with an HDD, which was a fairly typical midrange server in ~2012.

Not only has low-end server hardware improved significantly, but the upper limits of how much you can scale by just buying a bigger (but still commodity) machine have massively increased. These days, you can get servers with 8TB of RAM, hundreds of cores, multiple 100Gbps NICs, and SSDs with speeds approaching the same order of magnitude as RAM, which makes being limited to a single machine much less worrisome from a scaling perspective.

In the past decade, high-end servers have gotten two or three orders of magnitude better on nearly every metric, while website traffic has remained relatively steady. At the same time, SQLite has made significant performance gains, and fixed many of the problems that prevented it from making sense as a webapp database — for instance, WAL mode (enabling concurrent reads and writes), JSON support, upsert support, improvements to ALTER TABLE, and strict typing. SQLite has been extremely impressive in the number of features and performance improvements that they've been able to add while maintaining extremely high levels of backwards compatibility and code quality9.
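Two of those features are worth a quick illustration. Here's a sketch using Python's built-in sqlite3 module — upsert requires SQLite 3.24+ (2018), and the JSON functions require the json1 extension, which is built in to modern SQLite; the table and column names are made up for the example:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, profile TEXT)")
db.execute("INSERT INTO users VALUES (1, 'ada', '{}')")

# Upsert: a second insert with a conflicting key updates the
# existing row instead of failing.
db.execute("""
    INSERT INTO users (id, name, profile) VALUES (1, 'grace', '{}')
    ON CONFLICT(id) DO UPDATE SET name = excluded.name
""")

# JSON support: modify and query a JSON document stored in a TEXT column.
db.execute("UPDATE users SET profile = json_set(profile, '$.lang', 'en') WHERE id = 1")
name, lang = db.execute(
    "SELECT name, json_extract(profile, '$.lang') FROM users WHERE id = 1"
).fetchone()
print(name, lang)  # -> grace en
```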

SQLite does have some legitimate scaling limitations, just like every system — if you need to write dozens of MB/s to the database, or have very complex or long-lived write transactions, it probably isn't for you. But the limits are likely higher than you expect!

But what about reliability?

"But Wesley!" I hear you shouting at your screen "If I run my website on one server, and that server goes down, then my entire website goes down!"