Florian Ernst
13 March 2023
8 min

Kubernetes Service Meshes - Linkerd

The Linkerd Service Mesh as deployed on Kubernetes: how to install, and how to get some insights out of it.

This is one in a small series of blog posts about Kubernetes Service Meshes; please check the series overview for reference. While the previous post introduced the general picture and the basic concepts, here I will deal with a specific implementation and its respective details.

All contents presented here will be covered in more detail as part of the Novatec training “Docker und Kubernetes in der Praxis”. And of course all technologies shown have more features than merely those presented here; I will only provide a glimpse into them.

Linkerd

Linkerd is fully open source, licensed under Apache License 2.0, and is a Cloud Native Computing Foundation incubating project. Linkerd is the very first Service Mesh project and the one that gave birth to the term service mesh itself, cf. The Service Mesh: What Every Software Engineer Needs to Know about the World’s Most Over-Hyped Technology. It is developed in the open in the Linkerd GitHub organization.

Buoyant, the original creators of Linkerd, offer support, training, and enterprise products around the open-source Linkerd tool.

Linkerd has three basic components: a data plane (with proxies written in Rust and specialized for Linkerd in order to be as small, lightweight, and safe as possible), a control plane, and a UI. One runs Linkerd by:

  • Installing the CLI on a local system
  • Installing the control plane into a cluster (any cluster will do as long as I have sufficient permissions, even tiny local ones running via Minikube or KinD)
  • Adding workloads to Linkerd’s data plane

Once a service is running with Linkerd, one can use Linkerd’s UI to inspect and manipulate it.

Installation

CLI

As per Linkerd | Getting Started I can download the CLI directly. At the time of this writing this is version 2.12.4 and thus
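something along these lines, as shown in the Getting Started guide,

  # download the CLI and add it to the PATH
  curl --proto '=https' --tlsv1.2 -sSfL https://run.linkerd.io/install | sh
  export PATH=$HOME/.linkerd2/bin:$PATH
  # verify the installed client version
  linkerd version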


will serve to install the CLI to my local system.

Cluster

Using this CLI I can then deploy the linkerd components into my cluster just as per Linkerd | Getting Started and Linkerd | Using extensions:
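Roughly, the combined sequence from those two guides looks like this (including the viz extension, which provides the dashboard and the on-cluster metrics stack):

  # ensure the cluster fulfills the requirements
  linkerd check --pre
  # install the Linkerd CRDs, then the control plane itself
  linkerd install --crds | kubectl apply -f -
  linkerd install | kubectl apply -f -
  linkerd check
  # install the viz extension (dashboard, metrics)
  linkerd viz install | kubectl apply -f -
  linkerd check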


The various check commands serve to ensure, first, that my cluster fulfills the requirements and, later, that the installation went as expected. Make sure to execute them to see what has been checked and deployed – even though you might not understand the full output right now.

Please note that there are also instructions for Linkerd | Installing Linkerd with Helm, but then I would need to provide certificates myself, so I will stick to using linkerd install.

Integrate workloads

Now that the Linkerd control plane is running in my cluster I can start integrating workloads. For that I can simply mark a namespace for auto-injection of the data plane sidecar proxies via a specific annotation:
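For example, assuming the workloads live in a namespace called todoapp (the namespace name is just an example for this post):

  # the linkerd.io/inject annotation switches on automatic sidecar injection
  kubectl annotate namespace todoapp linkerd.io/inject=enabled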


This will ensure that any future deployments to this namespace will be handled by Linkerd. And if I already have any deployments in this namespace I could simply trigger their integration via
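a restart of the affected workloads, for example, since any new pods created in the annotated namespace automatically get the proxy injected:

  # restart all deployments in the example namespace so their pods get re-created
  kubectl rollout restart deployment -n todoapp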


I will now deploy two sample applications that together build a simple todo list. First a backend:
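The exact manifests do not matter here; a minimal sketch with a hypothetical image name could look like this:

  # hypothetical image and port, only for illustration
  kubectl create deployment todobackend -n todoapp --image=registry.example.com/todobackend:latest --port=8080
  kubectl expose deployment todobackend -n todoapp --port=8080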


And then a frontend:
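Again only a sketch with hypothetical names and ports, this time exposed via a LoadBalancer service so the app can be reached from outside the cluster:

  kubectl create deployment todoui -n todoapp --image=registry.example.com/todoui:latest --port=8090
  kubectl expose deployment todoui -n todoapp --type=LoadBalancer --port=80 --target-port=8090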

Examine the meshed workloads

When I now examine the newly-deployed workloads I can see what Linkerd has automatically added:
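For instance, simply listing the pods already hints at the change:

  # READY should now show 2/2 containers for each meshed pod
  kubectl get pods -n todoapp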


Please note that each pod now consists of two containers, even though only a single one was defined above. Let’s inspect those a bit to take a look under the hood:
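For example, the container and init container names can be queried directly, or one of the pods can be described in full (the label selector matches the hypothetical todoui deployment from above):

  # list the regular containers of a todoui pod
  kubectl get pod -n todoapp -l app=todoui -o jsonpath='{.items[0].spec.containers[*].name}'
  # list the init containers of the same pod
  kubectl get pod -n todoapp -l app=todoui -o jsonpath='{.items[0].spec.initContainers[*].name}'
  # or simply take a look at the full pod description
  kubectl describe pod -n todoapp -l app=todoui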


So, not only did the pod receive an additional container in which the linkerd-proxy runs, but there was also an init container executed before any other containers were started that merely took care of redirecting all traffic to the linkerd-proxy container. Neat.

Feel free to check the full command output, of course. I have redacted the output somewhat to pinpoint specific parts, but you will see that Linkerd has added quite a lot of additional information altogether.

Overhead

Of course, this comes at a certain price per pod for the data plane proxy; check e.g.
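the per-container resource usage of the meshed pods (this assumes the metrics-server is available in the cluster, and again uses the example namespace):

  kubectl top pods -n todoapp --containers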


12Mi RAM and some CPU cycles already, without handling any real work at the moment. Once real work starts, the overhead can get quite high, as shown in various benchmarks. But of course there is also the control plane as well as the various helper applications.
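Their consumption can be checked in the same way:

  kubectl top pods -n linkerd --containers
  kubectl top pods -n linkerd-viz --containers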

Investigate network traffic metrics

First let’s put some load on the deployed and meshed todo app, so I execute in a separate command window:
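a simple curl loop against the frontend, for example (the external IP is hypothetical; kubectl get svc todoui -n todoapp reveals the real one):

  # hit the todoui frontend once per second via its LoadBalancer IP
  while true; do curl -s -o /dev/null http://192.0.2.10/; sleep 1; done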


Of course, feel free to also access the todo app frontend by pointing your browser of choice at that URL. What does Linkerd know about this traffic now? For my deployments:
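The viz extension’s stat command answers this (still using the example namespace from above):

  linkerd viz stat deployments -n todoapp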


So all my deployments are meshed (i.e. have the linkerd-proxy sidecars added to them), and I see the so-called golden metrics (by default for the past minute):

  • success rate of requests (i.e. ratio of non-4XX and non-5XX requests)
  • requests per second
  • latency percentiles

There are various filtering options available; just check linkerd viz stat.

And now for specific Pods in our namespace, filtered by connections to a specific Service:
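With stat this looks roughly as follows (service and namespace names as assumed above):

  linkerd viz stat pods -n todoapp --to svc/todobackend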

Investigate requests

But what is actually going on in my namespace?
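The top command gives a live, aggregated view of the requests flowing through the meshed deployments, e.g.:

  linkerd viz top deploy -n todoapp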


More details can be retrieved by tapping into a deployment:
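For the frontend deployment this could look like:

  linkerd viz tap deploy/todoui -n todoapp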


So yes, the curl loop from the separate command window can be traced:

  • line 2: a curl GET on / reaches the todoui pod running on 10.244.0.17
  • lines 3, 4, 5: this pod sends a GET on /todos/ to the todobackend pod running on 10.244.0.15 and receives a response with status 200
  • lines 6, 7: the todoui pod sends a response with status 200 back to curl

And in addition to the various methods, paths and metrics Linkerd also indicates that the cluster-internal connection from todoui to todobackend is encrypted using mutual TLS (mTLS), i.e. with both sender and receiver using certificates to identify each other and to encrypt the communication – as handled by the sidecar proxies.

By the way, Linkerd readily reports the same when asked suitably:
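The edges command lists the connections between meshed workloads, including whether they are secured via mTLS:

  linkerd viz edges pods -n todoapp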


In addition to the connection already observed above, the connections from Linkerd’s Prometheus monitoring instance and from the tap interface (which gathered the data above) are secured as well.

The initial connection to the LoadBalancer IP of the todoui service was not encrypted, though; see the tls=no_tls_from_remote above. This is because the Linkerd Service Mesh only handles cluster-internal connections. However, it is easily possible to secure external connections using an Ingress Controller and then also integrate this Ingress Controller into the Service Mesh, but that is out of scope here.

Access the Linkerd dashboard

I might prefer a different interface for accessing all this data on the fly. For that I can easily use the Linkerd dashboard, i.e. its web interface.

So I’ll open a port-forwarding to this service:
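The dashboard is served by the web service in the linkerd-viz namespace:

  kubectl port-forward -n linkerd-viz svc/web 8084:8084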


Now I will be able to access this through my local browser at http://localhost:8084/, and I will see the application like in the following pictures:

Linkerd Dashboard showing Deployments


Linkerd Dashboard showing Connections


So everything I could query via the CLI I will find here as well.

And yes, there also exists a linkerd viz dashboard command, but the above way via kubectl could be used even if the linkerd binary were not available locally, hence the mention.

Further outlook

Of course, the above showed only a rather limited subset of Linkerd’s features.

For instance, Linkerd’s traffic split functionality allows one to dynamically shift arbitrary portions of traffic destined for a Kubernetes service to a different destination service. This feature can be used to implement sophisticated rollout strategies such as canary deployments and blue/green deployments, for example, by slowly easing traffic off of an older version of a service and onto a newer version.

Check Traffic Split for further details.
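As a rough sketch, assuming a second backend version were deployed behind a service called todobackend-v2 (the exact apiVersion and weight format depend on the Linkerd and SMI versions in use), such a split could look like this, applied via kubectl apply -f:

  # hypothetical TrafficSplit sending 10% of the traffic to the new version
  apiVersion: split.smi-spec.io/v1alpha1
  kind: TrafficSplit
  metadata:
    name: todobackend-split
    namespace: todoapp
  spec:
    service: todobackend
    backends:
    - service: todobackend
      weight: 900m
    - service: todobackend-v2
      weight: 100m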

Linkerd can also be configured to emit trace spans from the proxies, which will then be collected via Jaeger, allowing one to see exactly how much time requests and responses spend inside each service. However, distributed tracing requires more changes and configuration. Still, out of the box Linkerd already provides

  • live service topology and dependency graphs
  • aggregated service health, latencies, and request volumes
  • aggregated path / route health, latencies, and request volumes

Check Distributed tracing with Linkerd for further details.

Conclusion

The Linkerd Service Mesh readily provides the basics one would expect from a Service Mesh, and more, such as:

  • cluster-internal network observability
  • fully-handled cluster-internal transfer encryption via mTLS
  • cluster-internal traffic shaping

In doing so, Linkerd remains relatively resource-efficient, so if a Service Mesh is needed and its feature set suffices, Linkerd will be a good choice.

Image Sources: (c) Samuel Zeller on Unsplash
