Staging, canaries, and development data are three concepts that help developers test changes before they reach every customer. This page dives into each one to explain the pros and cons of these approaches.

Staging

A staging environment is a set of remote servers whose sole purpose is to replicate production so that you can test changes before pushing them to customers. A staging environment can be used to test any kind of change, provided it replicates your production environment closely enough. Some areas you need to replicate:

This is not an exhaustive list, but it aims to show that every piece of every system needs to be representative of production. If it is not representative, you risk placing confidence in a change that does not actually work in production.

The issue with staging is that as your applications grow, you will likely be using dozens or hundreds of servers and more complex networking schemes. These become very difficult to replicate without spending as much time and money on staging as you do on production.

Moreover, data is an even more complex issue. You need to make sure that the data is copied over and is representative of the state of the system you are testing against. If a database change goes out between the time you copy the data and the time you test against it, you need to make sure that change is applied to staging as well. This can become practically impossible as your datasets grow larger (hundreds of GB, TB, or even PB in size). Even if you can copy the data over and have a representative dataset, you then run into privacy concerns. What happens if your staging system sends emails? You need to mock out those emails so that real customers never receive them. If there is personally identifiable information in any form, that brings legal concerns. Your staging environment is now subject to applicable privacy laws, such as the European Union's General Data Protection Regulation (GDPR). The systems can also fall within legal scope if a lawsuit leads to a subpoena.
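As an illustration of mocking out emails, here is a minimal sketch in Python. It assumes the application sends mail through a single interface and knows which environment it runs in. The names `EmailSender`, `RecordingEmailSender`, `make_email_sender`, and `notify_expiry` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


class EmailSender(Protocol):
    """Anything that can send an email (hypothetical interface)."""

    def send(self, to: str, subject: str, body: str) -> None: ...


@dataclass
class RecordingEmailSender:
    """Staging stand-in: keeps messages in memory instead of delivering them."""

    outbox: List[Dict[str, str]] = field(default_factory=list)

    def send(self, to: str, subject: str, body: str) -> None:
        self.outbox.append({"to": to, "subject": subject, "body": body})


def make_email_sender(environment: str, real_sender: EmailSender) -> EmailSender:
    # Assumption: only the environment named "production" may deliver real mail.
    if environment == "production":
        return real_sender
    return RecordingEmailSender()


def notify_expiry(sender: EmailSender, customer_email: str, domain: str) -> None:
    """Application code stays the same in every environment."""
    sender.send(customer_email, f"{domain} is expiring", "Please renew your domain.")
```

In staging, a test can then inspect `outbox` to confirm that a message would have been sent, without any customer receiving it. This only addresses outgoing mail; it does nothing about the personal data already sitting in the copied dataset.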

All of this is to say that replicating production becomes more and more difficult as you scale, and eventually borders on impossible. If you settle for a scaled-down version, you place undeserved trust in a system that was inadequately tested.

Staging systems, while great in theory, are not a feasible long-term investment.

Development Data

Development data refers to the practice of keeping a dataset available locally, in a known state, for developers to use while building features. This practice falls prey to many of the same issues as staging.

To justify trust in the system, you need representative data; otherwise you are not properly testing the various real-world scenarios to which your app is subjected. However, you now have privacy law concerns, including the aforementioned GDPR and legal scope issues. Your developers' laptops become attack vectors if the data is not correctly anonymized, which many studies have shown is incredibly difficult to do. Those laptops also become "in scope" for any lawsuits and can be confiscated as evidence.

Beyond the myriad legal and privacy concerns, there is also the issue that different teams need different datasets in different states. Suppose you run a website hosting firm that also handles domain registrations. One team may want domain records in various states of expiring, at very specific points in time. Another team doesn't care about domain records, but wants hosted pages in a variety of character encodings. This means you essentially need experts in each area who can handcraft datasets.

Randomly generated data would not adequately meet the needs of the various teams. Instead, a shared, hand-crafted dataset is recommended on a per-project basis, since each team is the expert in its own work. A tool that makes creating those datasets easy is a fantastic help.
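A hand-crafted dataset for the domain registration example might look like the following sketch. It builds every record relative to a supplied "now" so the expiry states stay meaningful whenever the dataset is loaded. The record shape, the domain names, and the chosen offsets are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class DomainRecord:
    """Hypothetical, minimal shape of a domain registration."""

    name: str
    expires_at: datetime


def build_expiry_dataset(now: datetime) -> List[DomainRecord]:
    """Return domains in hand-picked expiry states, relative to `now`."""
    offsets = {
        "expired-last-month.example": timedelta(days=-30),
        "expired-yesterday.example": timedelta(days=-1),
        "expires-tomorrow.example": timedelta(days=1),
        "expires-in-30-days.example": timedelta(days=30),
        "expires-next-year.example": timedelta(days=365),
    }
    return [DomainRecord(name, now + delta) for name, delta in offsets.items()]


def is_expired(record: DomainRecord, now: datetime) -> bool:
    """A record counts as expired once its expiry time has passed."""
    return record.expires_at <= now
```

Because the cases are listed explicitly, the team that owns domain expiry can add or adjust a state by editing one line, which is the kind of control random generation does not give.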

Canaries

Canaries are a fantastic way to gradually test code in production. They work by deploying new code to a small subset of servers, so that the "old" version of your app and the "new" version run at the same time.

This allows you to test the new version and make sure that nothing goes wrong as you gradually increase the number of servers running it. If something does go wrong, you have affected only a subset of your customers instead of the entire customer base. You can then abort and roll back to the old version.
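The gradual rollout and the abort decision can be sketched as two small functions. This is a simplified model, not a deployment tool: the server names, the error-rate comparison, and the `tolerance` and `step` defaults are assumptions chosen for the example.

```python
from typing import Dict, List


def assign_versions(servers: List[str], canary_percent: int) -> Dict[str, str]:
    """Mark the first `canary_percent` percent of servers (rounded up) as "new"."""
    canary_count = (len(servers) * canary_percent + 99) // 100  # integer ceiling
    return {
        server: ("new" if index < canary_count else "old")
        for index, server in enumerate(servers)
    }


def next_canary_percent(
    current_percent: int,
    new_error_rate: float,
    old_error_rate: float,
    tolerance: float = 0.01,
    step: int = 25,
) -> int:
    """Return 0 to roll back, otherwise the next (larger) canary percentage."""
    if new_error_rate > old_error_rate + tolerance:
        return 0  # abort: every server goes back to the "old" version
    return min(100, current_percent + step)
```

For example, with ten servers and `canary_percent=10`, one server runs the "new" version. If its error rate stays within the tolerance of the "old" servers, the next step raises the share to 35 percent; if not, the function returns 0 and the rollout is rolled back.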

This does not replace testing, but it gives you a last line of defense against development and staging environments that do not represent the entirety of your production environment, as discussed in the previous two sections. This method requires that your app does not assume a single running version, so it may take some work and tooling to set up.
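One common place where the single-version assumption shows up is shared data. As a rough sketch, suppose the "new" version starts writing an extra `auto_renew` field on domain records (a hypothetical field, as is the JSON format used here). A reader that tolerates both shapes lets the two versions run side by side:

```python
import json
from typing import Any, Dict


def parse_domain_record(raw: str) -> Dict[str, Any]:
    """Read a record written by either the "old" or the "new" version."""
    data = json.loads(raw)
    return {
        # Written by both versions.
        "name": data["name"],
        # Only written by "new"; default to False for records from "old".
        "auto_renew": data.get("auto_renew", False),
        # Any other keys are ignored, so fields added later do not break this reader.
    }
```

The same idea applies in the other direction: the "old" version must not fail when it meets a field it does not recognize, otherwise a canary can break servers that were never updated.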