MangaDex is bigger than you think. No, really.
People are often surprised (and/or sneer) at the complexity of some of the technical details we mention, but we do not engage in complexity for fun's sake. The site is simply unreasonably popular in comparison to our budget, and throwing money at our problems just isn't an option.
For reference, MangaDex v3 was at times in the top 1000 most popular websites worldwide. And v5, while still in open beta and missing a lot of features and polish, has already climbed back quite a bit within the top 10 000.
over 24 hours - rate computed as a 30 seconds sliding window

In practice, we currently see peaks of above 2000 requests every single second during prime time. That is multiple billions of requests per month, or more than 10 million unique monthly visitors. And all of this before actually serving images.
In comparison, our ~$1500/month budget would have many who run platforms at a similar scale laugh you out of the room, thinking that surely you forgot a zero. As such, MangaDex is past the point of things "just working" by chance, and requires a mix of creativity, research and experimentation to keep alive and especially to keep growing.
Now is also a good time to remind ourselves of what MangaDex is1.
It is a platform driven by and for the scanlation community on which anyone can read manga for free, ad-free and with absolutely no compromise on image quality.
As we constantly fought to keep MangaDex v3 online and it went through a most painful end, one of the primary goals for the v5 rebuild was to ensure we could operate the platform in a secure and performant fashion in the future.
So, what do we refer to by "infrastructure"?
infrastructure, nounThe underlying foundation or basic framework [...] of a system or organization ~ Merriam Webster dictionary
Well, that is still quite vague. Here it will be used to denote the foundation upon which our applications lie.
In more concrete terms: having a database readily available, monitored and backed up for the backend to use would be infrastructure work, but what the backend actually uses it for (i.e.: end-user features) is out of scope and in the remit of the backend developers.
I like to keep in mind an end-goal when embarking on a complex project over a long period of time. It might be far off, and we might have to take unexpected detours on our way to it, but it's a good guide when making important architectural decisions, and every step taken towards it is a win.
The following, in this order, are what we see as the most important features of our infrastructure: security by design, observability, reliability and scalability.
With this out of the way, let's discuss some of the aspects of our current systems.