When we introduced the NGINX Modern Apps Reference Architecture (MARA) project last fall at Sprint 2.0, we emphasized our intent that it not be a “toy” like some architectures, but rather a solution that’s “solid, tested, and ready to deploy in live production applications running in Kubernetes environments”. For such a project, observability tooling is an absolute requirement. All the members of the MARA team have experienced firsthand how lack of insight into status and performance makes application development and delivery an exercise in frustration. We instantly reached consensus that the MARA has to include instrumentation for debugging and tracing in a production environment.
Another guiding principle of the MARA is a preference for open source solutions. In this post, we describe how our search for a multifunctional, open source observability tool led us to OpenTelemetry, and then detail the trade‑offs, design decisions, techniques, and technologies we used to integrate OpenTelemetry with a microservices application built with Python, Java, and NGINX.
We hope hearing about our experiences helps you avoid potential pitfalls and accelerate your own adoption of OpenTelemetry. Please note that post is a time‑sensitive progress report – we anticipate that the technologies we discuss will mature within a year. Moreover, though we point out current shortcomings in some projects, we are immensely grateful for all the open source work being done and excited to watch it progress.
As the application to integrate with an observability solution, we chose the Bank of Sirius, our fork of the Bank of Anthos sample app from Google. It’s a web app with a microservices architecture which can be deployed via infrastructure as code. There are numerous ways this application can be improved in terms of performance and reliability, but it’s mature enough to be reasonably considered a brownfield application. As such, we believe it is a good example for showing how to integrate OpenTelemetry into an application, because in theory distributed tracing yields valuable insights into the shortcomings of an application architecture.
As shown in the diagram, the services that back the application are relatively straightforward.
Our path to selecting OpenTelemetry was quite twisty and had several stages.
Before evaluating the available open source observability tools themselves, we identified which aspects of observability we care about. Based on our past experiences, we came up with the following list.