360 Degrees of Innovation - Interview with Serkan Özal, CTO of Thundra
Serkan Özal is the co-founder and CTO of Thundra. Based in Ankara, Turkey, Thundra is a technology company that provides an observability and security platform for serverless, container-based, and virtual machine-based applications. Before founding Thundra, Serkan was a software engineer and engineering manager at several technology companies, including OpsGenie (acquired by Atlassian) and Hazelcast.
Companies are increasingly migrating to microservice-based application architectures. Microservices are small, loosely coupled services that can be independently released. Typically, each microservice handles a separate task within the broader context of an application. Applications that reflect this architecture are essentially distributed systems consisting of sometimes hundreds of microservices interacting with each other. Hence, a single request can trigger a whole cascade of service calls; this makes debugging a real challenge as the root cause of a problem can be located anywhere within the cascade. The ability to trace service requests throughout the cascade in an end-to-end fashion is therefore crucial.
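As a rough illustration of end-to-end tracing, the sketch below passes one trace ID through a cascade of calls and then walks back from any span to the entry point. The service names, the `CALL_GRAPH` table, and the `Tracer` class are hypothetical and exist only for this example; this is not Thundra's API.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical call graph: which services each service calls.
CALL_GRAPH = {
    "gateway": ["orders"],
    "orders": ["payments", "inventory"],
    "payments": [],
    "inventory": [],
}


@dataclass
class Span:
    trace_id: str             # shared by every span of one request
    span_id: str              # unique per service call
    parent_id: Optional[str]  # span that triggered this call, None at the root
    service: str


@dataclass
class Tracer:
    spans: List[Span] = field(default_factory=list)

    def start_span(self, service: str, parent: Optional[Span] = None) -> Span:
        # Reuse the parent's trace ID so the whole cascade stays linked.
        trace_id = parent.trace_id if parent else uuid.uuid4().hex
        parent_id = parent.span_id if parent else None
        span = Span(trace_id, uuid.uuid4().hex, parent_id, service)
        self.spans.append(span)
        return span


def handle(tracer: Tracer, service: str, parent: Optional[Span] = None) -> Span:
    """Simulate a service handling a request and calling its downstreams."""
    span = tracer.start_span(service, parent)
    for downstream in CALL_GRAPH[service]:
        handle(tracer, downstream, parent=span)
    return span


def path_to_root(tracer: Tracer, span: Span) -> List[str]:
    """Follow parent links from a span back to the entry point."""
    by_id = {s.span_id: s for s in tracer.spans}
    path = [span.service]
    while span.parent_id is not None:
        span = by_id[span.parent_id]
        path.append(span.service)
    return list(reversed(path))
```

If a problem surfaces in `inventory`, `path_to_root` reconstructs the chain of calls that led there, which is the information needed to look for the root cause further up the cascade.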
This is where Thundra shines – providing application monitoring, distributed tracing, and debugging in microservice-based applications. Thundra agents collect detailed metrics from the application itself rather than from the underlying infrastructure. This allows us to identify the root cause of many problems much faster than with infrastructure monitoring tools.
Thundra supports automated local and distributed tracing, both for request-response and event-driven scenarios. In addition to automated tracing that works out of the box, we also support manual distributed tracing. Here, developers can define custom metrics to trace business flows end-to-end across multiple users acting on the same data entity. I’ll explain using the example of a blogging application. In a typical workflow, a single blog article may be created, edited, reviewed, and published by different users over multiple days or weeks. Down the line, a bug may be detected in the editing phase even though its root cause lies upstream, in the code that creates the blog post. Manual tracing can link the different activities performed on the same entity, in this case a specific blog post.
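The blog example can be sketched as follows, assuming each activity runs in its own trace and is tagged with the ID of the entity it touches. `Activity`, `EntityIndex`, and the `post-42` identifier are hypothetical names for this illustration, not part of Thundra.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Activity:
    trace_id: str  # each activity belongs to a separate request trace
    user: str
    action: str
    day: int       # simplified timestamp: day number within the workflow


class EntityIndex:
    """Links activities from unrelated traces through a shared entity ID."""

    def __init__(self) -> None:
        self._by_entity: Dict[str, List[Activity]] = defaultdict(list)

    def record(self, entity_id: str, activity: Activity) -> None:
        self._by_entity[entity_id].append(activity)

    def timeline(self, entity_id: str) -> List[Activity]:
        # Order by time so upstream activities appear before later ones.
        return sorted(self._by_entity[entity_id], key=lambda a: a.day)


index = EntityIndex()
index.record("post-42", Activity("trace-c", "carol", "review", day=9))
index.record("post-42", Activity("trace-a", "alice", "create", day=1))
index.record("post-7", Activity("trace-x", "dave", "create", day=2))
index.record("post-42", Activity("trace-b", "bob", "edit", day=4))

history = [a.action for a in index.timeline("post-42")]
```

With the activities linked this way, a bug seen during `edit` can be followed back to the earlier `create` trace for the same post, even though different users triggered them on different days.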
Once the problem has been localized using Thundra’s tracing capabilities, its debugging features allow developers to investigate the underlying issue. In particular, debugging can be carried out both offline in a local environment and online in the production environment. This is crucial for serverless applications because their behavior in a local environment can differ considerably from their behavior in the cloud.
In summary, Thundra supports users from development to production.
In early 2017, I joined OpsGenie as a software engineer and started working on a pilot project experimenting with AWS Lambda. The goal was to gain experience migrating the first service to serverless and then to continue migrating other services. We searched for an appropriate service to monitor serverless applications, but none of the existing tools suited our needs. Hence, we decided to build our own, resulting in the earliest version of what is now Thundra. When Atlassian acquired OpsGenie in 2018, it was decided that Thundra deserved its own separate company.
Serverless refers to a paradigm where servers are abstracted away from the developers. There are two types of serverless: first, function as a service, where small code blocks are deployed to a cloud platform, and executed upon defined events. Second, back-end as a service, where specific application components are provided by cloud providers and integrated via APIs.
Using serverless always involves a trade-off between advantages and drawbacks. In my opinion, there are three main benefits. First, it reduces the burden for both developers and operations. For example, you don’t have to worry about scalability or patching security issues because the provider takes care of these responsibilities. Second, the pay-for-what-you-use cost model reduces the risk of paying for unused resources. Third, it obliges developers to adopt a microservice approach.
On the flip side, serverless also comes with certain downsides; three of them pose the biggest issues in my book. First, despite improvements, slow cold starts are still a problem for many use cases. Second, although Java is prevalent among enterprises and legacy applications, it is not very popular in the serverless space because its large overhead leads to especially lengthy cold starts. Third, the tooling ecosystem for development and operations still has room for improvement.
In general, migrating a traditional architecture to serverless requires significant rearchitecting. A simple lift-and-shift would result in performance issues at run-time. Typically, the main pillars of rearchitecting are splitting applications into microservices and changing from synchronous to asynchronous inter-service communication. For existing applications, companies should complete both steps before migrating to serverless. At the same time, it is a good idea to take the first steps into serverless with greenfield, non-mission-critical applications. This can be as simple as scheduled tasks in the DevOps area. As a company gains more experience with serverless, it will start to see an increase in convenience, which builds the confidence to continue the journey.
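The change from synchronous to asynchronous communication can be sketched in a few lines. This is a minimal sketch in which an in-process `queue.Queue` stands in for a managed message queue; `resize_image`, `upload_sync`, and `upload_async` are hypothetical names chosen for the example.

```python
import queue
import threading
from typing import List

jobs: "queue.Queue" = queue.Queue()  # stand-in for a message queue
results: List[str] = []


def resize_image(name: str) -> str:
    """Hypothetical downstream service doing the actual work."""
    return f"{name}:resized"


def upload_sync(name: str) -> str:
    # Synchronous: the caller waits until the downstream service finishes.
    return resize_image(name)


def upload_async(name: str) -> str:
    # Asynchronous: the caller only enqueues a message and returns at once.
    jobs.put(name)
    return "accepted"


def worker() -> None:
    # Consumer side: processes messages independently of the caller.
    while True:
        name = jobs.get()
        if name is None:  # sentinel value used to stop the worker
            jobs.task_done()
            break
        results.append(resize_image(name))
        jobs.task_done()


def start_worker() -> threading.Thread:
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

In the asynchronous variant the caller no longer depends on how long the downstream work takes, which is why this style tends to suit event-driven serverless functions better than chained blocking calls.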