03. January 2023
4 min read

How Kotlin Coroutines Can Boost Your System's Performance

Does your system suffer from performance bottlenecks caused by multiple sequential outgoing calls? Then this blog post is for you! It shows how you can leverage Kotlin's coroutines to easily convert multiple sequential calls into parallel ones, without blocking a single thread, and thereby drastically increase your system's performance.

If the teaser appeals to you, I’m sure you have already touched on topics like asynchrony, thread starvation or parallelization. Although there are a couple of frameworks out there that offer great support for these topics, such as Project Reactor or ReactiveX, I find myself struggling with their complexity over and over again when writing code. That’s because of the big paradigm shift that comes with coding in a reactive manner. Working with Kotlin’s coroutines feels more conventional, which made it easier for me to adopt them. They also balance complexity and feature diversity very well.

This post requires some basic knowledge of Kotlin and coroutines. If you haven’t familiarized yourself with these technologies yet, I strongly recommend doing so before you continue reading.

Now, let’s have a look.

A common scenario

Imagine a common scenario in today’s world of microservice architectures: your system has contracts with a bunch of other systems, all returning the same data structure.
Instead of calling all involved systems sequentially, one after the other, you can leverage Kotlin’s powerful coroutines to make the calls in parallel like this:
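The original snippet is not reproduced here, so what follows is a minimal sketch of what such a function could look like; the names `SystemResult`, `callSystem` and `collectFromAllSystems` are assumptions, and the simulated `delay` stands in for a real HTTP client call. The numbered comments correspond to the annotations below.

```kotlin
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.supervisorScope

// Hypothetical result type shared by all called systems.
data class SystemResult(val system: String, val payload: String)

// Stands in for a real HTTP call; the delay simulates network latency.
suspend fun callSystem(system: String): SystemResult {
    delay(1000)
    return SystemResult(system, "data from $system")
}

suspend fun collectFromAllSystems(systems: List<String>): List<SystemResult> { // (1)
    val deferredResults = supervisorScope {                                    // (2)
        systems.map { system ->
            async {                                                            // (3)
                callSystem(system)                                             // (4)
            }
        }
    }
    return deferredResults.awaitAll()                                          // (5)
}

fun main() = runBlocking {
    // All four simulated calls run in parallel, so this takes ~1s, not ~4s.
    println(collectFromAllSystems(listOf("system1", "system2", "system3", "system4")))
}
```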

(1) As this function is a suspend function, all scopes in this function inherit their CoroutineContext and CoroutineScope from their parent by default. If declaring this function as suspending is not feasible for your use case, you can easily provide your own CoroutineContext and CoroutineScope by leveraging the library’s built-in helpers, for example: CoroutineScope(SupervisorJob()).async { ... }.

(2) Using a supervisorScope. Unlike a classic coroutineScope, a failure of one of the scope’s children neither affects its siblings nor cancels the scope itself. Instead, custom error-handling policies can be installed. Where launch is used, a CoroutineExceptionHandler can be installed in the scope’s CoroutineContext. In our case, where async is used, we can simply wrap awaitAll() in a try-catch block.

(3) This is the most critical part. For each system call, a new coroutine (think of it as a lightweight thread) is immediately scheduled on the scope’s dispatcher and the given block is executed on it, which means the calling thread is not blocked. As async is executed within a map block, the result returned by supervisorScope will be a List<Deferred<SystemResult>>. A Deferred can be collected later on by calling one of the await() functions.

(4) The block which will be executed. In this case, performing an HTTP call.

(5) Awaiting the results. This is especially interesting when it comes to error handling. awaitAll() awaits all the deferred results and throws as soon as one of them has failed. With this method, there is no way to access the results that already completed successfully. If you do want to keep the successful results, you need to call await() on each element of the list and wrap it in a try-catch block. You can find this alternative on github.
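The alternative from github is not reproduced here, but a sketch of the per-element approach could look like this (the helper name `awaitSuccessful` is an assumption):

```kotlin
import kotlinx.coroutines.Deferred

// Awaits each Deferred individually so that successful results survive
// even when some of the calls have failed.
suspend fun <T> awaitSuccessful(deferred: List<Deferred<T>>): List<T> =
    deferred.mapNotNull {
        try {
            it.await()
        } catch (e: Exception) {
            null // log the failure and skip this system
        }
    }
```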

What are the benefits of executing code like this?

Imagine a scenario in which your system needs to call 4 different systems. When executing the calls sequentially, the total time needed to collect all data from the remote systems is the sum of the individual response times: t_total = t_1 + t_2 + … + t_n.

With parallel execution as shown above, it is determined by the slowest call alone: t_total = max(t_1, …, t_n).

Especially when working with legacy systems, response times tend to be rather bad (think ≥ 500 ms). Let’s do the simple math for an example in which all involved systems answer within 1 second on average.
Executing the calls sequentially results in a total of 4 seconds, whereas the parallel execution takes – yes, that’s right – 1 second. That’s a decrease of 75%!
If that’s not convincing enough for you yet, consider scenarios with 10 or even 20 involved systems. You can do the math yourself. 😉

Is there a concise way of testing as well?

Well, I have good news for you, because there is! Let’s have a look at the testing side of things.

For those tests to work you need at least the following dependency:
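The dependency snippet is missing from the extracted text; for coroutine tests this is the kotlinx-coroutines-test artifact, declared for example in build.gradle.kts (the version below is a placeholder, pick the one matching your coroutines version):

```kotlin
dependencies {
    testImplementation("org.jetbrains.kotlinx:kotlinx-coroutines-test:1.7.3")
}
```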


First, we would like to verify that the overall execution time is determined by the response time of the slowest system.
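The original test snippet is not included here; a minimal sketch using runTest’s virtual clock might look like this (class name, function names and the concrete delays are assumptions; the numbered comments correspond to the annotations below):

```kotlin
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.delay
import kotlinx.coroutines.supervisorScope
import kotlinx.coroutines.test.runTest
import kotlin.test.Test
import kotlin.test.assertEquals

@OptIn(ExperimentalCoroutinesApi::class)
class ParallelExecutionTest {

    @Test
    fun `execution time is determined by the slowest system`() = runTest { // (1)
        // Simulated systems with different response times.                  (2)
        suspend fun slowSystem(): String { delay(1000); return "slow" }
        suspend fun fastSystem(): String { delay(100); return "fast" }

        val start = currentTime
        val results = supervisorScope {
            listOf(async { slowSystem() }, async { fastSystem() })
        }.awaitAll()

        assertEquals(listOf("slow", "fast"), results)
        // Total time equals the slowest call, not the sum of both.          (3)
        assertEquals(1000, currentTime - start)
    }
}
```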

(1) Allows the immediate execution of coroutines on a virtual clock and makes it possible to call suspending functions in the first place. Details can be found in Kotlin’s excellent documentation.
(2) Simulating different response times of called systems.
(3) Asserting that the total time is not the sum of all systems called but determined by the slowest responding system.

In addition, we would like to test if there is a supervisorScope in place to avoid that multiple coroutines are cancelling each other, if one of them fails.
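Again, the original snippet is missing; a sketch of such a test could look like this (names are assumptions, and `error(...)` stands in for a failing HTTP call; the numbered comments correspond to the annotations below):

```kotlin
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.delay
import kotlinx.coroutines.supervisorScope
import kotlinx.coroutines.test.runTest
import kotlin.test.Test
import kotlin.test.assertTrue

class SupervisorScopeTest {

    @Test
    fun `a failing child does not cancel its siblings`() = runTest {
        var system2Called = false

        // awaitAll() rethrows system1's failure, so we capture it.          (1)
        val thrown = runCatching {
            supervisorScope {
                listOf(
                    async<String> { error("system1 is down") },
                    async { delay(100); system2Called = true; "ok" }
                )
            }.awaitAll()
        }.exceptionOrNull()

        assertTrue(thrown is IllegalStateException)
        // system2 completed despite system1's failure.                      (2)
        assertTrue(system2Called)
    }
}
```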

(1) When using awaitAll() in production code without catching its exception, this becomes necessary for the test to pass. An alternative can be found on github.
(2) Although system1 fails with an exception, an HTTP call to system2 is still performed. That’s because the supervisorScope in place does not cancel its other children if one of them fails. Changing supervisorScope to a plain coroutineScope would make this test fail.


In this blog post we covered a common use case in which we leveraged Kotlin’s coroutines to perform multiple HTTP calls in a parallel, asynchronous and non-blocking way. We had a look at the potential performance boost and at how to cover that behavior with automated tests. Besides the performance gain, one major advantage of Kotlin’s coroutines is that both production and test code are concise to write and comparably easy to comprehend, making them a good candidate for performance-critical Kotlin applications.

You can find the whole example on github.

Stay tuned & happy coding! 🙂
