Christian Vögele & Thomas Zwickl
26 September 2018
Reading time: 4 min

Modular Open-Source APM Kit Experiment

In our recent blog post we introduced the OpenAPM initiative. With OpenAPM you can design an open-source Application Performance Management (APM) landscape that fits your technology stack. In this blog post we present a proof of concept for a modular open-source APM solution: a small-scale experiment built from selected open-source tools.

Experiment Setup

The goal of our experiment is to monitor an application by collecting traces. Furthermore, we want to identify the most important business transactions by their number of requests and their response times. Because common open-source tracers do not detect business transactions, we implemented a Business Transaction Analyzer. We therefore decided to use an OpenAPM stack consisting of Jaeger, Apache Kafka, our Business Transaction Analyzer, a database writer, Elasticsearch, and Grafana.

AppFin is our in-house financial planning application and serves as the monitored system in our small experiment. The resulting system collects traces from the monitored system and sends them to Apache Kafka, from where they are first streamed to the business transaction analyzer. The business transaction analyzer groups the traces and extracts the business transaction name. Each trace is tagged with the business transaction name that was found and is sent back to Kafka, where it is picked up by the database writer. The database writer flattens the traces and stores them in Elasticsearch to make them available to Grafana. Grafana can then be used to visualize the traces and business transactions in a dashboard.
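The grouping and tagging step can be illustrated with a minimal Python sketch. It is not the code of our analyzer: the trace shape, the field names, and the rule "the operation name of the root span is the business transaction name" are assumptions made for this example only, and the operation name findAllProjects is hypothetical.

```python
from collections import defaultdict


def business_transaction_name(trace):
    """Return the operation name of the root span (assumed naming rule)."""
    for span in trace["spans"]:
        # Assumption: a span without a parent is the root span of the trace.
        if span.get("parent_id") is None:
            return span["operation"]
    return "unknown"


def tag_traces(traces):
    """Tag every trace with a business transaction name and group by it."""
    groups = defaultdict(list)
    for trace in traces:
        name = business_transaction_name(trace)
        # Copy the trace and add the tag instead of mutating the input.
        tagged = {**trace, "business_transaction": name}
        groups[name].append(tagged)
    return dict(groups)


# Hypothetical sample traces, not real data from AppFin.
traces = [
    {"trace_id": "t1", "spans": [
        {"span_id": "a", "parent_id": None, "operation": "Open Project Tab"},
        {"span_id": "b", "parent_id": "a", "operation": "findAllProjects"},
    ]},
    {"trace_id": "t2", "spans": [
        {"span_id": "c", "parent_id": None, "operation": "Open Project Tab"},
    ]},
]

grouped = tag_traces(traces)
```

In the pipeline described above, the tagged traces would be written back to Kafka rather than kept in memory; the sketch leaves out the messaging part on purpose.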

Starting Point

For our experiment we assume the following situation:

  • Apache Kafka is set up and running
  • The system under test is pre-configured with the Jaeger agent to facilitate instrumentation and monitoring
  • The Jaeger collector is running and receiving traces from the agent
  • Elasticsearch is set up correctly
  • Grafana is running

In our experiment, we use a modified version of the Jaeger collector that posts to Elasticsearch and Apache Kafka at the same time. AppFin serves as the sample application. For simplicity, we run the Jaeger collector, Grafana, and Elasticsearch on the same host, but the example can easily be transferred to a distributed setup.

Setting Up the Business Transaction Analyzer

The business transaction analyzer is the tool that periodically reads all traces, groups them, and tags each trace with a business transaction name. Before we can start the analyzer, we have to configure it using the following application.yml:


To start the analyzer, use the following command inside the repository:

Setting Up the Database Writer

The database writer is a tool that periodically reads all tagged traces, flattens them so that Grafana can read them, and stores them in Elasticsearch. This additional writer is necessary because the data initially received from Jaeger contains nested objects, which Grafana cannot read. To show values in Grafana, we therefore first need to flatten the nested objects and store them in Elasticsearch. Before we can start the writer, we have to configure it using the following application.yml:


To start the writer, use the following command inside the repository:
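As an aside on the flattening step described in this section: the idea can be illustrated with a short Python sketch that is independent of the configuration and start command. This is not the code of our database writer. The function name, the dotted key scheme, and the field names in the sample span are assumptions chosen for illustration.

```python
def flatten(obj, prefix="", sep="."):
    """Flatten nested dicts and lists into a single dict with dotted keys."""
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        # List elements are addressed by their index, e.g. "tags.0.key".
        items = enumerate(obj)
    else:
        # Scalar value: the accumulated path becomes the key.
        return {prefix: obj}
    flat = {}
    for key, value in items:
        path = f"{prefix}{sep}{key}" if prefix else str(key)
        flat.update(flatten(value, path, sep))
    return flat


# Hypothetical nested span, loosely modelled on a trace document.
span = {
    "operation": "findAllProjects",
    "duration_ms": 70,
    "process": {"service": "AppFin"},
    "tags": [{"key": "business_transaction", "value": "Open Project Tab"}],
}

flat_span = flatten(span)
```

After flattening, every value sits under a top-level key such as process.service or tags.0.value, which is the kind of structure a dashboard query can address directly.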

Visualizing the Results in Grafana

Before we can use Grafana, we need to import the dashboard and configure the data source so that it connects to Elasticsearch. The dashboard can be found here.

Traces

The traces chart shows how many traces we received at what time. It helps to find out when your application is used most heavily and to pinpoint the times at which performance may degrade because of too many requests.
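The computation behind such a chart is a count of traces per time bucket. The following Python sketch shows that idea in isolation; the one-minute bucket size, the function name, and the sample timestamps are assumptions for illustration and are not taken from our dashboard.

```python
from collections import Counter
from datetime import datetime


def traces_per_minute(start_times):
    """Count traces per one-minute bucket; inputs are ISO 8601 strings."""
    buckets = Counter()
    for ts in start_times:
        # Truncate the timestamp to the start of its minute.
        minute = datetime.fromisoformat(ts).replace(second=0, microsecond=0)
        buckets[minute.isoformat()] += 1
    # Sort by bucket start so the result reads like a time series.
    return dict(sorted(buckets.items()))


# Made-up start times of three traces.
counts = traces_per_minute([
    "2018-09-26T10:00:05",
    "2018-09-26T10:00:40",
    "2018-09-26T10:01:10",
])
```

Here two traces fall into the 10:00 bucket and one into the 10:01 bucket.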


Operations

With the operations chart, you can see all available operations that were captured with the tracer and the respective average response time in milliseconds for each operation.
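The aggregation shown in this chart is an average of span durations grouped by operation name. A minimal Python sketch of that aggregation, assuming flattened spans with the hypothetical fields operation and duration_ms and made-up sample values:

```python
from collections import defaultdict


def average_duration_ms(spans):
    """Return the average duration in milliseconds per operation name."""
    totals = defaultdict(lambda: [0.0, 0])  # operation -> [sum, count]
    for span in spans:
        entry = totals[span["operation"]]
        entry[0] += span["duration_ms"]
        entry[1] += 1
    return {op: total / count for op, (total, count) in totals.items()}


# Made-up spans; operation names are hypothetical.
spans = [
    {"operation": "findAllProjects", "duration_ms": 60},
    {"operation": "findAllProjects", "duration_ms": 80},
    {"operation": "findAllEmployees", "duration_ms": 13},
]

averages = average_duration_ms(spans)
```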

Business Transaction

A business transaction is a collection of operations that describes the end-to-end processing path used to fulfill a service request, such as opening a project tab. By default, Jaeger does not define any business transactions, only operations. For this reason, we created a business transaction analyzer that fetches all traces from Kafka and analyzes them in order to find operations and group them into business transactions. The chart below shows the result after running the business transaction analyzer. We can see that the business transaction Open Project Tab takes the longest, at around 233 milliseconds on average.

If we want to take a closer look at the business transaction Open Project Tab, we can select it from the list of available business transactions. The chart below shows all operations that make up the business transaction Open Project Tab. This chart can help to find the culprit responsible for the high latency of a business transaction. In our example, we can see that finding all known projects takes around 70 ms, finding all employees around 13 ms, and executing the query around 82 ms.
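The drill-down described above amounts to filtering spans by business transaction and ranking their operations by average duration. The Python sketch below illustrates this; the field names and operation names are hypothetical, and the sample durations simply reuse the approximate values mentioned in the text.

```python
from statistics import mean


def slowest_operations(spans, business_transaction):
    """Rank the operations of one business transaction by average duration."""
    by_operation = {}
    for span in spans:
        if span["business_transaction"] == business_transaction:
            by_operation.setdefault(span["operation"], []).append(
                span["duration_ms"]
            )
    averages = {op: mean(values) for op, values in by_operation.items()}
    # Slowest operation first, so the likely culprit is at the top.
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Illustrative spans; operation names are assumptions for this example.
spans = [
    {"business_transaction": "Open Project Tab",
     "operation": "findAllProjects", "duration_ms": 70},
    {"business_transaction": "Open Project Tab",
     "operation": "findAllEmployees", "duration_ms": 13},
    {"business_transaction": "Open Project Tab",
     "operation": "executeQuery", "duration_ms": 82},
    {"business_transaction": "Other Transaction",
     "operation": "findAllProjects", "duration_ms": 500},
]

ranking = slowest_operations(spans, "Open Project Tab")
```

Spans belonging to other business transactions are ignored, so a slow operation elsewhere does not distort the ranking for the selected transaction.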

Conclusion

In this blog post, we showed with a small-scale proof of concept that several open-source tools can be combined into a modular open-source APM solution. Furthermore, we extended the existing tools with a simple application that processes traces in order to detect business transactions.

Tell us what you think! If you are interested in OpenAPM or APM in general and would like to know more, talk to us in the comment section below or reach out to us via email at apm@novatec-gmbh.de.
