Nadja Hagen & Tobias Dehn
11. November 2020
7 min

Data visualization with Kafka - How to connect Grafana to your cluster

LinkedIn, Netflix, Twitter, Yahoo and PayPal - to name just a few of the companies using Apache Kafka for data streaming, messaging, log aggregation or event sourcing. As you can imagine, the collected data volumes grow large and complex very quickly. Kafka was designed to be highly scalable and fault-tolerant, which makes it well suited for Big Data applications. But once you have collected all this data, how do you visualize it quickly and easily to get an overview?

To answer this question, we want to provide a short tutorial on how to connect your Kafka cluster to Grafana and visualize Kafka data on a Grafana dashboard. The instructions assume a Docker-based setup of your Kafka cluster. If you do not have one yet, the Confluent documentation can help you.

In the following, we will step through these points:

  1. The right data sink for connecting Kafka and Grafana
  2. Include and configure your database in your docker-compose file
  3. Configure the Kafka connector between Kafka and your data sink
  4. Include and configure Grafana in your docker-compose file
  5. Create a data source for Grafana
  6. Start the application
  7. Create a dashboard for Grafana

The right data sink for connecting Kafka and Grafana

Depending on the data sink you already have connected to your Kafka cluster, you might be able to skip this step. Grafana supports various data sources.
This tutorial will use PostgreSQL, but other relational DBMSs work very similarly.

Unfortunately, MongoDB is not among the supported data sources (at least not yet), so if you have stored your data in MongoDB and depend on it, you will have to add another data source or give some custom plugins on GitHub a try.

Include and configure your database in your docker-compose file

First of all, add a PostgreSQL service to your docker-compose file. This could look like the following:

docker-compose.yaml
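The original snippet is not reproduced here, so the following is a minimal sketch of what the PostgreSQL service could look like. The service name 'postgres_db' and the database name 'example' are referenced again in later steps; the image tag and credentials are assumptions and should be adjusted to your setup.

```yaml
services:
  # ... your existing Kafka services ...
  postgres_db:
    image: postgres:13            # image tag is an assumption
    ports:
      - "5432:5432"               # change the host port here if needed
    environment:
      POSTGRES_USER: postgres     # credentials reused in the connector and Grafana config
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: example        # database name referenced in the connection URL later
```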

If needed, you can choose a different port number. You should also specify the credentials and the name of your database, which you will need in the next steps.

Configure the Kafka connector between Kafka and your data sink

Next, you need a connector to connect Kafka with the PostgreSQL database as a data sink. If you already keep a folder with all your connector configuration files, add the configuration file below there; alternatively, you can simply place the file in the root directory of your project. The latter can become a bit messy once your project starts to grow, but it is fine for a start. The step "Start the application" explains in more detail how to register the connector with your cluster via the command line.

connect-jdbc-sink.json
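The original file is not shown here; below is a minimal sketch of a JDBC sink connector configuration. It assumes the service name, credentials and database name from the compose sketch above, and the topic name 'example-topic' is a placeholder for your own topic.

```json
{
  "name": "connect-jdbc-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "tasks.max": "1",
    "topics": "example-topic",
    "connection.url": "jdbc:postgresql://postgres_db:5432/example",
    "connection.user": "postgres",
    "connection.password": "postgres",
    "auto.create": "true",
    "insert.mode": "insert"
  }
}
```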

In the "name" field, you can choose whatever name suits your project best. The "topics" field specifies the topic whose data should be stored in the database and later displayed with Grafana. Remember that if your data has to be transformed or processed in any way before it can be displayed in Grafana, this should happen beforehand; Kafka stream processing can be used for that purpose.
The connection URL consists of several parts: the last part ('example') is the name of your database as chosen in the first step, and the host and port ('postgres_db:5432') also depend on your configuration in docker-compose.yaml. If you configured anything differently than in this tutorial, adjust these values, and do the same for the connection user and password.
Kafka Connect offers additional configuration options for connectors that can be very useful depending on your use case. For example, if you want your topic data to be compacted in your database instance, you can use the "upsert" insert mode, as sketched below.
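A sketch of the relevant options inside the connector's "config" block: upsert mode requires a primary key configuration, and the key field name 'id' used here is purely hypothetical and depends on how your records are keyed.

```json
"insert.mode": "upsert",
"pk.mode": "record_key",
"pk.fields": "id"
```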

Include and configure Grafana in your docker-compose file

In the third step, add Grafana to your docker-compose.yaml file. The configuration could look like the following:

docker-compose.yaml
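Again, the original snippet is not reproduced here; the following is a minimal sketch of the Grafana service. The image tag and the anonymous-login variables are assumptions, and the mount paths assume the folder structure created in the next step.

```yaml
  grafana:
    image: grafana/grafana:7.3.1    # image tag is an assumption
    ports:
      - "3000:3000"
    environment:
      GF_INSTALL_PLUGINS: grafana-worldmap-panel   # installs the world map panel on startup
      # the following three variables skip the login screen; they are optional
      GF_AUTH_ANONYMOUS_ENABLED: "true"
      GF_AUTH_ANONYMOUS_ORG_ROLE: Admin
      GF_AUTH_DISABLE_LOGIN_FORM: "true"
    volumes:
      # mount the provisioning folders created in the next steps
      - ./grafana/provisioning/datasources:/etc/grafana/provisioning/datasources
      - ./grafana/provisioning/dashboards:/etc/grafana/provisioning/dashboards
```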

With the environment variable 'GF_INSTALL_PLUGINS', you can add plugins to Grafana which are then installed automatically when your application starts. In this example, I included the world map panel.
The last three environment variables are not strictly needed; they are included here for simplicity, so that you do not have to log in each time you start the application.
The two volume mount paths depend on your project's folder structure and could of course look different if you decide to deviate from this tutorial at this point. The next step explains what the mounted provisioning files have to look like.

Create a data source for Grafana

To mount the files into your container, first create a folder named 'grafana' in the root directory of your project. Inside it, create a subfolder named 'provisioning', and within that, two more subfolders named 'dashboards' and 'datasources'. Your directory should now look like this:
Project and Folder Structure 1
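Assuming the connector file lives in the project root, the layout sketched so far would look roughly like this:

```
.
├── docker-compose.yaml
├── connect-jdbc-sink.json
└── grafana/
    └── provisioning/
        ├── dashboards/
        └── datasources/
```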

Within the ‘datasources’ folder, you create a file called ‘datasource.yaml’. Do not forget to eventually adjust the database name, user name and password or connection URL.

datasource.yaml
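A minimal sketch of the provisioning file, assuming the PostgreSQL service name, credentials and database name from the docker-compose sketch above:

```yaml
apiVersion: 1

datasources:
  - name: PostgreSQL
    type: postgres
    access: proxy
    url: postgres_db:5432        # host and port of the database service
    database: example
    user: postgres
    secureJsonData:
      password: postgres
    jsonData:
      sslmode: disable           # fine for a local Docker setup
```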

Start the application

If not already done, you first have to make sure that the JDBC connector is successfully added to your Kafka cluster before starting the application. Run the command below in your terminal. If you placed the connector file in a subfolder instead of the root directory, adjust the file path in the command accordingly:

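A typical way to register the connector is a POST request against the Kafka Connect REST API; port 8083 is the default and an assumption about your setup.

```bash
curl -X POST -H "Content-Type: application/json" \
     --data @connect-jdbc-sink.json \
     http://localhost:8083/connectors
```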

If you automated the start of your application with a bash script, do not forget to include this step in the script.
At this point, everything is set up to access your Kafka data from Grafana. However, no dashboard is created automatically during setup; creating one via the Grafana user interface is the easiest way to get started. To do so, first start your application, either with your bash script or simply with docker-compose:

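For example:

```bash
docker-compose up -d
```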

Once the application is running, open "localhost:3000" in your browser. Navigate to 'Configuration' on the left and select 'Data Sources'; your PostgreSQL database should be listed there.

Grafana: Datasources

Create a dashboard for Grafana

To create your first dashboard, navigate to 'Dashboards' and then to 'Create your first dashboard' on the right. Click on 'Add new panel'. Now you can configure your panel as needed: on the right, you can set various preferences, and at the bottom, you can use either the query builder or the SQL editor to fetch the data from the database (the latter is less error-prone in my opinion, but this also depends on your SQL knowledge). Make sure to select the right option for 'Format as': 'Time series' or 'Table'. When everything looks the way you want, click on the preferences icon in the top right corner.
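For illustration, a query in the SQL editor could look like the following; the table and column names are hypothetical and depend on your topic's schema. With 'Format as' set to 'Time series', Grafana expects a time column and at least one numeric value.

```sql
SELECT
  created_at AS "time",   -- timestamp column, aliased for Grafana
  temperature             -- numeric value to plot
FROM example_topic
WHERE $__timeFilter(created_at)
ORDER BY 1
```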

Grafana: Create a Dashboard

Grafana: Export the JSON Model

Optionally, adjust some further settings, then click on 'JSON Model' and copy the displayed content. Next, create a file called 'dashboards.json' inside the 'dashboards' folder and paste the copied content into it.
Afterward, create another file called 'dashboards.yaml' in the same folder and add the following:

dashboards.yaml
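A minimal sketch of a dashboard provider configuration; the provider name is arbitrary, and the path must match where the dashboards folder is mounted inside the container (here, the mount from the Grafana compose sketch above):

```yaml
apiVersion: 1

providers:
  - name: 'default'
    orgId: 1
    folder: ''
    type: file
    options:
      path: /etc/grafana/provisioning/dashboards   # folder containing dashboards.json
```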

Your folder structure should now look like this:
Project and Folder Structure 2
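With the two new files added, the layout from the earlier sketch becomes:

```
.
├── docker-compose.yaml
├── connect-jdbc-sink.json
└── grafana/
    └── provisioning/
        ├── dashboards/
        │   ├── dashboards.json
        │   └── dashboards.yaml
        └── datasources/
            └── datasource.yaml
```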

Lastly, restart your application and the dashboard should now be visible in Grafana. Of course, you can also create additional dashboards and add them to your configuration. To give you an impression of how a dashboard could look with the world map plugin, I added a screenshot as an example:

Grafana: Example World Map Panel

Summary

In this tutorial, we briefly went through all the steps to set up a Grafana dashboard with data from a Kafka topic. The described steps can of course also be transferred to other tasks, such as adding other connectors or creating different data sources or dashboards.
I tried to keep everything beginner-friendly; nevertheless, your setup may differ slightly, in which case you might have to adjust a few lines.
Are some points unclear? Feel free to leave a comment or ask questions! Which visualization tools do you use? Would you like to read more tutorials on Kafka-related topics?
