React is a programming model that makes it easy to build scalable and performant apps. We use React at Discord.
We love React.
However, back in November 2018, on what we’ll say was a rainy day for the sake of setting a mood, we deployed a seemingly routine React upgrade.
This upgrade led us on an unexpected journey through our codebase, the Chrome memory inspector, and the React source code.
We tried to upgrade React from 16.5 to 16.7. After we shipped the upgrade, we started getting reports from users that Discord was using a lot of memory: something, somewhere, was leaking.
At that time, we briefly investigated, traced the problem to our React upgrade, and quickly reverted it.
Upgrading to React 16.7 wasn’t a priority at the time, so reverting wasn’t a huge deal for us. The memory leak was gone, and Discord carried on.
When React 16.8 was released with hooks, Discord engineers wanted to use those hooks. We love hooks. We really wanted this upgrade.
However, we never figured out what went wrong with the failed 16.7 upgrade attempt. If the problem returned, we would start leaking again. For us to upgrade to 16.8 in good conscience, we needed to solve the memory leak.
Here is where we get into the nitty gritty.
Although it wasn’t always consistent, we could observe the memory leak by watching the V8 memory inspector while switching back and forth between two Discord channels. This is a route transition and causes many React components to unmount and mount.
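Watching the inspector by eye is noisy, so one way to make this kind of observation more repeatable is to sample the heap size after each round trip and look at the trend. The sketch below is only an illustration under stated assumptions, not a description of the tooling used in this investigation: `switchChannel` and `readHeapBytes` are hypothetical callbacks you would supply yourself (for example, a scripted navigation and whatever heap reading your environment exposes), and the "later half versus earlier half" comparison is a deliberately crude heuristic.

```javascript
// Hypothetical helper: switch back and forth between two channels `rounds`
// times and record the heap size reported after each round trip.
// `switchChannel` and `readHeapBytes` are supplied by the caller.
function sampleHeapAcrossTransitions(switchChannel, readHeapBytes, rounds) {
  const samples = [];
  for (let i = 0; i < rounds; i++) {
    switchChannel('a'); // unmounts one channel's components, mounts the other's
    switchChannel('b'); // and back again
    samples.push(readHeapBytes());
  }
  return samples;
}

// Crude heuristic (an assumption, not a standard metric): compare the average
// of the later half of the samples with the average of the earlier half.
function looksLikeLeak(samples, minGrowthBytes) {
  const half = Math.floor(samples.length / 2);
  if (half === 0) return false; // not enough data to compare
  const average = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length;
  const early = average(samples.slice(0, half));
  const late = average(samples.slice(samples.length - half));
  return late - early >= minGrowthBytes;
}
```

Because garbage collection runs on its own schedule, a single rising sample means little. Only sustained growth across many round trips suggests that something is being retained.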
V8 memory inspector snapshot of retained memory
We spent a lot of time in V8’s memory inspector. Memory cannot be freed as long as any “live” object holds a reference to it. In the above screenshot, you can see the memory leaked by a channel navigation.
The biggest offenders are React Fiber nodes (represented as minified Vr in the screenshot) and detached DOM elements. We followed the chains of references retaining them, expecting to find the answer at the end of each chain, but what we often found there were things like V8 internals that we didn’t understand.
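The rule that memory stays allocated while any live object references it can be illustrated with a toy model of reachability. This is a minimal sketch, not how V8 is implemented: the object names (`window`, `store`, `staleListener`, `fiber`, `detachedDiv`) are hypothetical, and the "heap" is just a map from each object's name to the names it references.

```javascript
// Toy model: `heap` maps an object name to the names of objects it references.
// Returns the set of names reachable from the given roots (a simple mark pass).
function reachableFrom(roots, heap) {
  const seen = new Set();
  const stack = [...roots];
  while (stack.length > 0) {
    const name = stack.pop();
    if (seen.has(name)) continue; // already visited, also guards against cycles
    seen.add(name);
    for (const ref of heap[name] || []) stack.push(ref);
  }
  return seen;
}

// Hypothetical retaining chain: a forgotten listener keeps a fiber alive,
// and the fiber keeps a DOM element that is no longer in the document.
const exampleHeap = {
  window: ['store'],
  store: ['staleListener'],
  staleListener: ['fiber'],
  fiber: ['detachedDiv'],
  detachedDiv: [],
};

const before = reachableFrom(['window'], exampleHeap);
console.log(before.has('detachedDiv')); // true: still retained

// Dropping the one stale reference makes everything behind it collectable.
const fixedHeap = { ...exampleHeap, store: [] };
const after = reachableFrom(['window'], fixedHeap);
console.log(after.has('detachedDiv')); // false: nothing live points to it
```

Reading a retainer chain in the inspector is the same walk in reverse: start from the leaked object and follow references back toward a root until you find the link that should not exist.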
We tried bisecting the app by commenting out swaths of the app to see if we could isolate the issue. We upgraded legacy code and removed the use of deprecated lifecycle methods. We learned some best practices, including not referencing DOM elements directly in props and state. We audited our older dependencies like