Ivan Senić
05. July 2017

Can the opentracing scene benefit from new tracer implementations?

Version 1.7.11 of the open-source APM tool inspectIT introduced support for remote tracing based on HTTP and JMS communication. inspectIT based the tracing functionality on the opentracing.io approach, fully implementing the opentracing.io Java API as part of its java-agent-sdk project. This made it number 8 on the official opentracing.io list of supported tracers. As new tracer implementations join a list that has long been led by Zipkin, the question arises whether the open-source tracing scene can benefit from new tracers. In this blog post I will try to answer that question with a short comparison between inspectIT and Zipkin.

Language independence

As inspectIT is made for tracing JVM-based systems only, this is already a huge win for Zipkin, which officially supports Go, Java, JavaScript, Ruby and Scala, with a number of additional community libraries for other languages as well. If your stack is not Java-only, it's hard to consider inspectIT as an option: some parts of your system would be untraceable, and you would lose the complete end-to-end picture that we aim for.

Integration

If we stick to the Java universe, let's look at how you can integrate the tracers into your application. Zipkin has great support for Spring Boot applications, so if you have such an application it's enough to add a specific dependency and you are ready to go. If this is not the case, it gets more complicated, as you need to add the proper Brave filters and interceptors to get automatic tracing for your HTTP endpoints. In any case, you'll need to change your code base in some way, either by altering your dependencies or by going deeper into the application's configuration.

inspectIT, on the other hand, uses the Java agent approach and bytecode modification to add measurement and tracing points to your code. It's enough to start your Java app with the inspectIT agent attached through the JVM's -javaagent option, and all the frameworks supported by inspectIT will be instrumented. The benefit is that you don't need to change the source code, so you can, for example, analyse applications that are not yours, or just as easily start your app without inspectIT.
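
To make the Java agent approach a bit more concrete, here is a minimal sketch of what such an agent looks like from the JVM's point of view. It is not inspectIT's code: the class names and the com/example/ prefix are made up for illustration, and the transformer only records which classes it was offered instead of rewriting their bytecode. The only real API used is java.lang.instrument from the JDK.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical agent class; name and package prefix are illustrative only.
final class SketchAgent {

    // Names of matching classes the transformer was offered (demo purposes).
    static final List<String> SEEN = new CopyOnWriteArrayList<>();

    // The JVM calls premain before the application's main() when it is
    // started with -javaagent:<agent jar>. The jar manifest has to name
    // this class in its Premain-Class attribute.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoggingTransformer("com/example/"));
    }

    // Sees every class as it is loaded and may return modified bytecode.
    static final class LoggingTransformer implements ClassFileTransformer {
        private final String prefix; // internal form, e.g. "com/example/"

        LoggingTransformer(String prefix) {
            this.prefix = prefix;
        }

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            if (className != null && className.startsWith(prefix)) {
                SEEN.add(className);
                // A real agent would rewrite classfileBuffer here to add
                // its measurement and tracing points.
            }
            return null; // null tells the JVM to keep the class unchanged
        }
    }
}
```

Because the hook lives entirely in the JVM start-up options, removing the agent again is a matter of dropping the -javaagent option; the application code never refers to it.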

Data

When you start your application with either of the tracers, you start opening the black box(es) and gain some insight into the way your application performs. Both Zipkin and inspectIT provide traces that show the complete time of each trace, how much time was spent on each node you are tracing, and who is communicating with whom, and that's pretty cool. However, inspectIT comes with one small benefit here: the set of default instrumentation profiles that also instrument other important parts of your application. For example, alongside your tracing data inspectIT will also collect all SQL statements executed during the request, thus providing more detailed trace information out of the box.

Difference between trace details in Zipkin and inspectIT when calling the same use case with default configurations

More data

However, you will soon discover that without additional hacking the data provided by both tools is not enough for a meaningful performance diagnosis. Sooner or later you will want to add further measurement points to make your traces more detailed. Sticking to the opentracing.io API, you can do this easily by creating spans inside your source code and explicitly declaring which parts of the application should be additionally traced. Typically you start a span before the code of interest, optionally tag it, and finish it once that code has run.
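
The following is a minimal sketch of that start, tag and finish pattern. To keep it runnable without any dependency it does not use the real library: SketchSpan is a hypothetical stand-in that only mimics the shape of an opentracing.io span, and ReportService, the operation name and the tag key are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a span; NOT the io.opentracing.Span interface.
final class SketchSpan {
    final String operationName;
    final Map<String, String> tags = new LinkedHashMap<>();
    private final long startNanos = System.nanoTime();
    private long durationNanos = -1; // stays -1 until finish() is called

    SketchSpan(String operationName) {
        this.operationName = operationName;
    }

    SketchSpan setTag(String key, String value) {
        tags.put(key, value);
        return this;
    }

    // Records the elapsed time once; repeated calls are ignored.
    void finish() {
        if (durationNanos < 0) {
            durationNanos = System.nanoTime() - startNanos;
        }
    }

    boolean isFinished() {
        return durationNanos >= 0;
    }

    long durationNanos() {
        return durationNanos;
    }
}

// Hypothetical application class with one explicitly traced method.
final class ReportService {

    static SketchSpan generateReport(String customerId) {
        SketchSpan span = new SketchSpan("generate-report");
        try {
            span.setTag("customer.id", customerId);
            buildReport(customerId); // the work we want to see in the trace
        } finally {
            span.finish(); // finish even if buildReport throws
        }
        return span;
    }

    private static void buildReport(String customerId) {
        // placeholder for the real business logic
    }
}
```

With the real opentracing.io Java API the span would come from a Tracer, along the lines of tracer.buildSpan("generate-report").start(), followed by span.setTag(...) and span.finish(). The exact builder methods have changed between API versions, so check the version you are using.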


The idea of the opentracing.io movement, that developers should know what should be traced in the same way they know what should be logged, is really great. Still, if you already have a huge existing code base, it can be quite a hassle to define all the spans you would like to have. inspectIT can solve this problem in no time, because it offers a UI-based configuration interface that lets you quickly bind measurement points to any Java method, so you get the duration of that method's executions together with your tracing data. In addition, inspectIT provides a dynamic instrumentation feature that lets users add and remove instrumentation points without restarting the JVM, similar to the hot-code deployment we are all used to.

What about the user interface?

So far we have seen that both tools have some advantages, but what about the way they present the data to users? Zipkin has a web-based user interface, which nowadays seems much more acceptable to users than the fat, Eclipse-based UI client that inspectIT runs. My impression is that the Zipkin web UI currently offers more filtering possibilities for searching traces than inspectIT, while inspectIT uses its fat-client features to present traces in a more "fancy" way by showing more icons, having a styled details box and offering multiple navigation options. One additional benefit of Zipkin is that it provides an overview of the node dependencies, so you can clearly see what the system being traced looks like.

Zipkin dependencies view showing the system layout

Sampling rate (!)

If you are running an application under high load, you will start wondering how much overhead the tracer(s) will bring to your production system. Zipkin has a great feature here: the sampling rate. With a sampling rate you can define that not all user requests are traced, but only a portion of them (for example 1%). That can significantly decrease the overhead introduced by the tracing tool. The sampling rate approach is based on the Google Dapper paper, which concludes that if a problem exists in a high-throughput system, then the same problem will surface multiple times and will be part of one of the captured traces.

New Dapper users often wonder if low sampling probabilities – often as low as 0.01% for high-traffic services – will interfere with their analyses. Our experience at Google leads us to believe that, for high-throughput services, aggressive sampling does not hinder most important analyses. If a notable execution pattern surfaces once in such systems, it will surface thousands of times.
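
To show what a sampling rate means in code, here is a small sketch of a sampler that decides from the trace ID whether a request is traced. This is not Zipkin's or Brave's implementation; the class name RateSampler and the bucket resolution are assumptions made for the example.

```java
// Hypothetical sampler; not Zipkin's or Brave's actual implementation.
final class RateSampler {

    // 10,000 buckets give a resolution of 0.01%.
    private static final long BUCKETS = 10_000L;

    private final long boundary; // number of buckets that get sampled

    RateSampler(double rate) {
        if (rate < 0.0 || rate > 1.0) {
            throw new IllegalArgumentException("rate must be within [0, 1]");
        }
        this.boundary = Math.round(rate * BUCKETS);
    }

    // The decision depends only on the trace ID, so the same trace always
    // gets the same answer from every sampler configured with the same rate.
    boolean isSampled(long traceId) {
        // floorMod keeps the bucket non-negative for negative trace IDs too.
        return Math.floorMod(traceId, BUCKETS) < boundary;
    }
}
```

With new RateSampler(0.01), trace IDs are spread over 10,000 buckets and only the first 100 buckets are traced, which is 1% for evenly distributed IDs. In practice the decision is usually made once, where the trace starts, and then propagated with the trace context so that downstream services do not have to decide again.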

inspectIT does not provide a sampling rate feature at the moment; it collects all traces and is thus more suitable for services with lower volume.

Sunny future

What I like about Zipkin is its simplicity and maturity, especially if your system is composed of applications and services built with different technologies. For JVM-based systems, inspectIT does provide some nice features on top of tracing, especially when it comes to showing additional data and injecting user-defined measurement points. On the other hand, it lacks some features, such as the sampling rate. The cool thing about inspectIT is that it aims to provide more than just tracing, so users can benefit from its other features like business context or monitoring support. The big win for inspectIT could be the new end user monitoring feature coming out in the 1.8 version line, where inspectIT will automatically inject a JavaScript agent into the HTML pages delivered by your Java Servlet and automatically start traces from the browser side.

Anyhow, both tools have nothing less than a sunny future ahead of them.

P.S. Don’t forget to star Zipkin and inspectIT on GitHub if you like what their maintainers provide for the open source community 🙂
